
Citation 
 Permanent Link:
 http://ufdc.ufl.edu/AA00034746/00001
Material Information
 Title:
 Overcomplete wavelet representations with applications in image processing
 Creator:
 Fan, Jian, 1954
 Publication Date:
 1997
 Language:
 English
 Physical Description:
 xii, 115 leaves : ill. ; 29 cm.
Subjects
 Subjects / Keywords:
 Approximation ( jstor )
Dyadics ( jstor )
Fourier transformations ( jstor )
Frequency response ( jstor )
Image filters ( jstor )
Image processing ( jstor )
Mathematical vectors ( jstor )
Signals ( jstor )
Symmetry ( jstor )
Wavelet analysis ( jstor )
Computer and Information Science Engineering thesis, Ph. D ( lcsh )
Dissertations, Academic -- Computer and Information Science and Engineering -- UF ( lcsh )
 Genre:
 bibliography ( marcgt )
theses ( marcgt ) nonfiction ( marcgt )
Notes
 Thesis:
 Thesis (Ph. D.)--University of Florida, 1997.
 Bibliography:
 Includes bibliographical references (leaves 111-114).
 Additional Physical Form:
 Also available online.
 General Note:
 Typescript.
 General Note:
 Vita.
 Statement of Responsibility:
 by Jian Fan.
Record Information
 Source Institution:
 University of Florida
 Holding Location:
 University of Florida
 Rights Management:
 The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for nonprofit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
 Resource Identifier:
 028019760 ( ALEPH )
37851420 ( OCLC )

OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING
By
JIAN FAN
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1997
Copyright 1997
by
Jian Fan
ACKNOWLEDGEMENTS
First, I would like to express my deep gratitude to my advisor, Andrew Laine, for all his support and encouragement. I thank him for introducing me to the field of wavelets, and I appreciate the opportunity to work with him.
I would like to thank Dr. Jerry C. L. Chen and Mrs. Shirley H. W. Chen of Palo Alto, California, for their help and support. Without their support, I might not have had the opportunity to pursue my graduate studies in the United States.
I would like to thank Drs. Gerhard Ritter, Sartaj Sahni, Dave Wilson and John G. Harris for being on my thesis committee, and Dr. Baba C. Vemuri for attending my defense. Their time and thoughtful recommendations were greatly appreciated.
I would also like to thank Dr. David Burchfield of the Department of Pediatrics, University of Florida, for providing me a graduate research assistantship from January 1991 through December 1991.
During my early years at the University of Florida, Ming Jiang and Min Shi gave me much help. I also benefited greatly from discussions with my fellow graduate students, Shuwu Song, Iztok Koren, Sergio Schuler, Wuhai Yang, Chunming Chang and Hongchi Shi.
In addition, I would like to thank my manager, Catherine Hunt of Hewlett-Packard Company, San Diego, for her support. I have also benefited from Hewlett-Packard Company's educational assistance program. I appreciate the efforts of John Shumate, who helped polish the writing of this dissertation.
Finally, I would like to thank my mother Yuzhen Chen and my late father Laikun Fan for their inspiration and encouragement, as well as my wife Weihua Liu and my son Xiying Fan for their understanding, love, and patience.
This research was supported in part by the Whitaker Foundation and the U.S. Army Medical Research and Development Command, grant number DAMD17-93-J-3003.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS ............................ iii
LIST OF TABLES ............................. .... viii
LIST OF FIGURES ................................ ix
ABSTRACT .................................... xi
CHAPTERS
1 INTRODUCTION ............................. 1
 1.1 Motivations of the Research ..................... 1
 1.2 Review of Related Work ....................... 2
 1.3 Overview of the Thesis ....................... 4
2 SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS 5
 2.1 Introduction ............................. 5
 2.2 Fourier Transforms .......................... 6
 2.2.1 Fourier Transform (FT) .................... 7
 2.2.2 Discrete-time Fourier Transform (DTFT) .......... 7
 2.2.3 Short-time Fourier Transform (STFT) ............ 8
 2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT) .. 8
 2.2.5 Connection Between Short-time Fourier Transform and Filter Banks ... 9
 2.2.6 Generalized Short-time Fourier Transform .......... 10
 2.3 Wavelet Transforms ......................... 10
 2.3.1 Continuous Wavelet Transform (CWT) ............ 11
 2.3.2 Discrete Wavelet Transform (DWT) ............. 12
 2.3.3 Discrete-time Wavelet Transform (DTWT) ......... 13
 2.4 Generalized Discrete-time Filter Bank Transforms ........ 15
 2.4.1 Convolution Operator on Complete Representations .... 17
 2.4.2 Convolution Operator on Overcomplete Representations .. 19
 2.4.3 Approximation of Convolution Operators on Overcomplete Representations ... 19
 2.4.4 Aliasing Enhancement on Complete Representations .... 20
 2.5 Summary and Discussion ...................... 22
3 OVERCOMPLETE WAVELET REPRESENTATIONS .......... 23
 3.1 Introduction ............................. 23
 3.2 Overcomplete Wavelet Transforms and Filters .......... 23
 3.2.1 Overcomplete Wavelet Packet Representations ....... 23
 3.2.2 Dyadic Wavelet Representations ............... 25
 3.2.3 Filters of Autocorrelation Functions of Wavelets ...... 28
 3.3 Connection to Complete Wavelet Representations ........ 29
 3.4 Time-frequency Interpretation and the Uncertainty Principle .. 29
 3.5 Two-dimensional Extensions .................... 34
4 OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS ... 37
 4.1 Introduction ............................. 37
 4.2 Gain Operators ........................... 37
 4.2.1 Minimum Mean Square Error (MMSE) Approximation ... 38
 4.2.2 Time-frequency Tradeoff and a Greedy Algorithm ..... 40
 4.3 Shrinking Operators ......................... 42
 4.4 Envelope Detectors ......................... 44
 4.4.1 Envelope Detection by Hilbert Transform .......... 47
 4.4.2 Envelope Detection by Zero Crossings ............ 49
 4.4.3 Comparison Between the Two Detectors ........... 50
 4.4.4 Comparison with Other Energy Operators .......... 50
 4.4.5 Two-dimensional Envelope Detection ............. 50
5 APPLICATION I: TEXTURE SEGMENTATION ............ 53
 5.1 Introduction ............................. 53
 5.2 Texture Feature Extraction ..................... 54
 5.3 Considerations on Filter Selection ................. 55
 5.4 The Basic Isodata Clustering Algorithm .............. 57
 5.5 Postprocessing ............................ 58
 5.6 Experimental Results ........................ 58
 5.6.1 One-dimensional Signals .................... 58
 5.6.2 Natural Textures ........................ 59
 5.6.3 Synthetic Textures ....................... 62
 5.7 Summary and Discussion ...................... 62
6 APPLICATION II: IMAGE DEBLURRING ............... 66
 6.1 Introduction ............................. 66
 6.2 Review of Some Deblurring Techniques .............. 67
 6.2.1 Modified Inverse Filters .................... 67
 6.2.2 Wiener Filters ......................... 68
 6.2.3 Wavelet-Vaguelette Inversion ................. 69
 6.2.4 Discussion ........................... 70
 6.3 Discrete-time Overcomplete Wavelet Packet Inversion ...... 72
 6.3.1 One-dimensional Formulation ................. 72
 6.3.2 Two-dimensional Extension .................. 74
 6.4 Experimental Results ........................ 75
 6.4.1 One-dimensional Signals .................... 78
 6.4.2 Two-dimensional Images .................... 81
 6.5 Summary and Discussion ...................... 82
7 CONCLUSION ............................... 86
APPENDIXES
A PROOFS RELATED TO DISCRETE-TIME WAVELETS ....... 88
B NUMERICAL COMPUTATION OF BATTLE-LEMARIÉ WAVELET ... 95
 B.1 Functions in Finite Terms ..................... 95
 B.2 Numerical Computation Using FFT ................ 99
C NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR ... 103
D BOUNDARY TREATMENTS OF FINITE-LENGTH SEQUENCES ... 107
 D.1 Periodic Extensions ......................... 107
 D.2 Closure of Symmetry Under Convolution ............. 107
REFERENCES .................................. 111
BIOGRAPHICAL SKETCH ........................... 115
LIST OF TABLES
3.1 Uncertainty Factors of Analyzing Filters ..................... 34
5.1 Boundary Accuracy of the Segmentation Results ................ 64
6.1 Performance of ID Deblurring Examples ..................... 81
6.2 Performance of 2D Deblurring Examples ..................... 82
B.1 Coefficients of g24(x) (a) and Truncated Impulse Response of hi(n) (b). 100
LIST OF FIGURES
2.1 A Filter Bank Implementation of STFT ..................... 10
2.2 Fast Discrete-time Wavelet Transform (First Two Levels) .......... 13
2.3 Generalized Discretetime Filter Bank Transform ................ 15
2.4 Example of General Binary-tree-structured Filter Bank of Five Channels . 16
2.5 The Overcomplete Wavelet Transform for the Example of Figure 2.4 ... 17
2.6 Example of the Aliasing-enhancement Anomaly ................ 21
3.1 Binary Tree for Overcomplete Wavelet Packet Decomposition ....... 24
3.2 Binary Tree for Dyadic Wavelet Decomposition ................ 26
3.3 Examples of the Function Ψa,b(ω) ........................ 27
3.4 Two Examples of the Overcomplete Wavelet Representation ........ 30
3.5 Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations ... 32
3.6 Uncertainty Factors (drawn on top of each band) of Channel Filters ... 33
4.1 Examples of Gain Vector Approximation ..................... 41
4.2 Examples of the Minimum Tree Approximation with the Maximum
Depth of Seven Levels ...... ......................... 43
4.3 An Example of Overcomplete Wavelet Denoising ................ 45
4.4 Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear
Smoothing Using the Signal of (row 1,col 1) of Figure 4.3 ...... .46
4.5 Two Examples of Power Density Distribution of Overcomplete Wavelet
Packet Representations ............................... 48
4.6 An FIR Hilbert Transformer .............................. 49
4.7 Comparison of the Two Envelope Detectors ................... 51
4.8 Frequency Response of Equivalent Complex Quadrature Filters (level 1). 52
5.1 Segmentation of a 1D Signal Consisting of Triangular Waveform and Sinusoid ... 59
5.2 Segmentation of a Noisy 1D Signal Consisting of Two Pure Tone Segments ... 60
5.3 Segmentation Results of an Image Consisting of Natural Textures ..... 61
5.4 Segmentation of a Synthetic Gaussian Lowpassed Texture Image ...... 63
5.5 Segmentation of a Synthetic FM Texture Image ................ 63
5.6 Segmentation of a Synthetic Texture Image with Line Patterns ....... 64
6.1 Log Frequency Responses of Filters for a Gaussian Blur ........... 76
6.2 Log Frequency Responses of Filters for a Uniform Blur ........... 77
6.3 One-dimensional Deblurrings of a Gaussian Blur ............... 79
6.4 One-dimensional Deblurrings of a Uniform Blur ................ 80
6.5 Deblurring a Gaussian-blurred Lena Image ................... 83
6.6 Deblurring a Uniform-blurred Lena Image ................... 84
A.1 Equivalent Structures for (a) Convolution and Decimation and
(b) Expansion and Convolution ......................... 91
A.2 Illustration of Adding One More Channel by Splitting a Channel into
Two ........ ................................... 92
B.1 Transition Band of Filters Hp(ω) with p = 1, 3, 9 ............... 101
C.1 Channel Bandwidths of Overcomplete Wavelet Representations ...... 104
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
OVERCOMPLETE WAVELET REPRESENTATIONS
WITH APPLICATIONS IN IMAGE PROCESSING By
Jian Fan
August 1997
Chairman: Dr. Andrew F. Laine
Major Department: Computer and Information Science and Engineering
Orthogonal wavelet transforms have been applied to the field of signal and image processing with promising results in compression and denoising. The coefficients of such a transform constitute a complete representation of a signal without redundancy. However, there are applications where complete representations are disadvantageous. In this thesis, we examine classes of Fourier transforms and wavelet transforms in terms of their efficacy in representing convolution operators. We have identified two shortcomings associated with complete representations in the discrete-time domain: (1) the lack of translation invariance and (2) a possible aliasing-enhancement anomaly.
On the other hand, our analysis showed that overcomplete wavelet representations do not bear those shortcomings of their nonredundant counterparts. Our framework of overcomplete wavelet representations includes construction algorithms and prototype filters, a spatial-frequency interpretation, and three operations. Capabilities of spatial-frequency localization were quantitatively evaluated using uncertainty factors. Associated with gain, shrinking and envelope operators, algorithms for convolution, denoising and analysis of power density distribution were presented and analyzed.
The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. We demonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. We showed that gain and shrinking operators may be used for image deblurring, and discussed limitations of the methodology.
CHAPTER 1
INTRODUCTION
1.1 Motivations of the Research
In the past decade, wavelets have stimulated great interests in many fields of applied mathematics and engineering, generating a significant amount of research activities. Within these fields, traditional theories and techniques have been reevaluated to explore opportunities for wavelet applications.
Although continuous-time wavelets possess mathematical elegance, discrete-time wavelet transforms are of special importance for practical applications. Among them, discrete-time orthogonal wavelet transforms have gained popularity. The coefficients of such a transform constitute a complete representation of a signal without redundancy. Complete representations have produced promising results in image compression and denoising. However, there are applications where complete representations are disadvantageous. Understanding the strengths and weaknesses of various representations and the relationships among them is essential for satisfying intellectual inquiries and developing new methodologies. In this thesis, we considered both classes of Fourier-based representations and wavelet-based representations, with emphasis on the discrete-time domain. We found that overcomplete wavelet representations served to bridge the two classes.
The merit of a particular representation should be judged by its capability to facilitate solving problems. Two important but difficult problems, segmentation of textured images and image deblurring, were selected as benchmarks. Accordingly, we value a representation by its capability to extract certain features from signals, efficacy of representing an operation and invariance under certain operations.
With the understanding of various representations and the problems at hand, we developed a framework of discrete-time overcomplete wavelet representations. The framework included construction algorithms and prototype filters, a time-frequency interpretation and three operations. Algorithms associated with gain operators, shrinking operators and envelope detectors were developed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. We demonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. For the deblurring application, our formulation extended the wavelet-vaguelette inversion into the discrete-time domain and was applicable to more blurring sources.
1.2 Review of Related Work
On the general topic of wavelet theory, a brief history and comprehensive review were included in Jawerth and Sweldens' survey [32]. The papers by Daubechies [11] and Mallat [42, 41] were considered influential for the recent developments. The paper by another wavelet pioneer, Strang [59], was also explanatory. Orthonormal wavelet packet bases, a generalization of orthogonal wavelets, were introduced by Coifman and Meyer [10]. The book by Daubechies [13] included in-depth mathematical discussions and real design examples. The book by Vaidyanathan [63], which was written for a signal processing audience, provided much insight into the relationship between wavelets and multirate filter banks.
The lack of translation invariance of discrete-time orthogonal wavelets was recognized as a major shortcoming for pattern recognition by many researchers [41, 59]. Various proposals were made to address the issue. Multiscale edge representations were investigated by Mallat and Zhong [44], Mallat and Hwang [43], and Berman and Baras [5]. Steerable filters with shiftability were studied by Simoncelli et al. [58]. An orthogonal representation with invariance of the subband structure was proposed by Pesquet et al. [52].
Using energy distributions in subbands of a wavelet packet decomposition as signatures for texture classification was presented by Laine and Fan [36]. A similar approach was also reported by Chang and Kuo [8]. Earlier results using envelopes of an overcomplete wavelet packet representation for texture segmentation were reported by Laine and Fan [35]. A simplified version of Chapter 5 of this thesis was later published [37]. The overcomplete wavelet representation was also used by Unser [62] with an ad hoc feature extraction algorithm. A methodology combining Markov random fields and the orthogonal wavelet packet transform was described by Bello [4].
The nonlinear wavelet shrinkage for denoising was due to Donoho and Johnstone [17]. The approach was applied to power spectrum estimation [48] and to medical image processing [1]. Nonlinear denoising based on the wavelet maxima representation by Mallat and Hwang shared some similar ideas [43], although tracing singularities through the scale space turned out to be much more difficult, especially in a two-dimensional space.
Manipulating wavelet representations to enhance certain features is obviously a good idea. Jawerth et al. [31] reported image enhancement using weight factors to modify coefficients of orthogonal wavelet transforms. Laine et al. [38] demonstrated various approaches for enhancement of digital mammographic images. Fan and Laine [20] pointed out the connection between linear gain enhancement using overcomplete dyadic wavelet representations and unsharp masking, and introduced a nonlinear function for contrast enhancement. However, due to the lack of an objective definition of image quality, those enhancement algorithms were mostly ad hoc and hard to evaluate. On the other hand, image deblurring is considered to be a well-defined problem. The approach adopted by Banham et al. [3] is to decompose the linear minimum mean square error (LMMSE) filters into the orthogonal transform domain, with the advantage of reduced reliance on the assumption of global stationarity. However, their proposal was basically linear filtering, and no optimal subband selection was considered. The mathematical framework of the wavelet-vaguelette inversion was due to Donoho [14, 16]. Its limitations were that it was derived in the continuous-time domain and was inapplicable to some common blurring sources such as Gaussian and uniform blurs.
1.3 Overview of the Thesis
This thesis is organized as follows:
Chapter 2 reviews signal representations derived from the Fourier transform and the wavelet transform in the context of being operated upon by a convolution operator. The scrutiny reveals that complete representations suffer from not being translation invariant and from the existence of the aliasing-enhancement anomaly, and are thus inappropriate for certain applications.
Chapter 3 presents the framework of the overcomplete wavelet representation with two particular construction algorithms and filters. The time-frequency localization property of overcomplete wavelet representations is examined under the uncertainty principle.
Chapter 4 introduces gain, shrinking and envelope operators applicable to an overcomplete wavelet representation. Algorithms for determining specific parameters are detailed, with examples and comparisons shown. Both Chapters 5 and 6 are applications of the theory of overcomplete wavelet representations developed in previous chapters.
Chapter 5 deals with the topic of image texture segmentation. The segmentation problem is formulated in the paradigm of spatialfrequency analysis, and a texture feature based on the concept of power density distribution is proposed.
Chapter 6 addresses the problem of image deblurring using the gain operator for deconvolution and the shrinking operator for nonlinear denoising.
Chapter 7 concludes the research and proposes future directions.
CHAPTER 2
SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS
2.1 Introduction
For a given signal x, it is often beneficial to transform (map) x into another form (domain), denoted x ↦ X. X is called the representation of x in the transform domain. Instead of directly dealing with the original signal x, we may deal with its representation X. By doing so we usually transform a particular problem into a different form and make it easier to solve. Clearly, different applications demand different representations.
The applications considered in this thesis are texture image segmentation and image deblurring. Accordingly, the translation invariance and the efficacy of representations are the most important factors. Both properties may be revealed by their behavior under a convolution operator. With these in mind, we shall review Fourier transforms and wavelet transforms of one-dimensional signals. To facilitate our discussions, we use the following basic definitions:
Definition 2.1.1 (Translation Operator) For a continuous function x(t), a translation operator is defined as T_s[x(t)] = x(t - s); for a discrete function x(n), a translation operator is defined as T_d[x(n)] = x(n - d).
Definition 2.1.2 (Convolution Operator) For a continuous function f(t), a convolution operator is defined as T_f[x(t)] = x(t) * f(t) = ∫_{-∞}^{+∞} x(τ)f(t - τ)dτ; for a discrete function x(n), a convolution operator is defined as T_f[x(n)] = x(n) * f(n) = Σ_m x(m)f(n - m). The functions f(t) and f(n) are called the kernel. Notice that a translation operator is a special convolution operator with kernel f(t) = δ(t - s) (f(n) = δ(n - d)). Therefore, we will use T for both translation and convolution operators.
Definition 2.1.3 (Eigenfunction and Eigenvalue) For a linear operator T, if there exists a function e such that T[e] = λe, where λ is a constant, the function e is called an eigenfunction of T, and the value λ is called the eigenvalue.
Definition 2.1.4 (Translation Invariant) For a given signal x, a translation operator T and a mapping operator M, if the two operators commute such that

M[T(x)] = T[M(x)],

the mapping is called translation invariant, and the representation X is said to be a translation-invariant representation.
2.2 Fourier Transforms
Frequency-domain representation by the Fourier transform has played a major role in signal analysis and linear system theory [31, 49, 63]. The efficacy of the Fourier representation is due to the fact that its basis functions, the complex exponentials, are eigenfunctions of a convolution operator [49, page 39]. For a linear convolution operator T_f, the eigen-equation is

T_f[e^{jωt}] = F(ω)e^{jωt},

where the eigenvalue F(ω) is the Fourier transform of f(t). An almost identical form can be derived for the discrete-time Fourier transform. Fourier representations are particularly powerful for signal deblurring since they transform a deconvolution problem into the simple form of divisions.
However, Fourier representations may not be good at capturing frequency evolution over time. As a time variable is added to Fourier transforms, the basis functions of short-time Fourier transforms are no longer eigenfunctions of a convolution operator.
2.2.1 Fourier Transform (FT)
The Fourier transform pair may be written as:

X(ω) = ∫_{-∞}^{+∞} x(t)e^{-jωt}dt (FT),

x(t) = (1/2π) ∫_{-∞}^{+∞} X(ω)e^{jωt}dω (Inverse FT).

A linear convolution operator T_f can be characterized by

y(t) = T_f[x(t)] = (1/2π) ∫_{-∞}^{+∞} X(ω)T_f[e^{jωt}]dω = (1/2π) ∫_{-∞}^{+∞} X(ω)F(ω)e^{jωt}dω,

or,

Y(ω) = X(ω)F(ω).

This shows that the Fourier representation of the convolution operator T_f is the Fourier transform of its kernel. The beauty of this representation is that in the Fourier domain, a convolution (deconvolution) is simply a multiplication (division) of the two representations.
For a translation operator with kernel f(t) = δ(t - s), F(ω) = e^{-jωs}, and therefore |Y(ω)| = |X(ω)|, which means that the magnitude of the Fourier transform is translation invariant.
2.2.2 Discretetime Fourier Transform (DTFT)
The discrete-time Fourier transform may be written as [49]:

X(e^{jω}) = Σ_{n=-∞}^{+∞} x(n)e^{-jωn} (DTFT),

x(n) = (1/2π) ∫_{-π}^{π} X(e^{jω})e^{jωn}dω (Inverse DTFT).

Similar to its continuous counterpart,

y(n) = T_f[x(n)] = (1/2π) ∫_{-π}^{π} X(e^{jω})F(e^{jω})e^{jωn}dω,

or,

Y(e^{jω}) = X(e^{jω})F(e^{jω}).

For a translation operator with f(n) = δ(n - d), F(e^{jω}) = e^{-jωd}, and thus
|Y(e^{jω})| = |X(e^{jω})|.
2.2.3 Shorttime Fourier Transform (STFT)
In order to capture the frequency evolution of a nonstationary signal, a window w(t) is introduced into the Fourier transform. The short-time Fourier transform may be written as:

X(ω, t) = ∫_{-∞}^{+∞} x(τ)w(t - τ)e^{-jωτ}dτ (STFT),

x(t) = (1/(2πw(0))) ∫_{-∞}^{+∞} X(ω, t)e^{jωt}dω (Inverse STFT),

where we assume w(0) ≠ 0.
For a convolution operator T_f,

T_f[w(t - τ)e^{jωt}] = F(ω, t - τ)e^{jωt},

where F(ω, t) is the STFT representation of the kernel f(t). This means that in general the windowed complex exponentials are no longer eigenfunctions of a convolution operator. Consequently, the STFT representation of y(t) = T_f[x(t)] is another convolution, of either

Y(ω, t) = ∫_{-∞}^{+∞} F(ω, t - τ)x(τ)e^{-jωτ}dτ,

or,

Y(ω, t) = ∫_{-∞}^{+∞} X(ω, t - τ)f(τ)e^{-jωτ}dτ.

For a translation operator with f(t) = δ(t - s), Y(ω, t) = X(ω, t - s)e^{-jωs}, and thus |Y(ω, t)| = |X(ω, t - s)|. Therefore, the magnitude of the STFT is a time-invariant representation.
2.2.4 Discretetime Shorttime Fourier Transform (DTSTFT)
The discrete-time short-time Fourier transform may be written as [49, 53]:

X(e^{jω}, n) = Σ_{k=-∞}^{+∞} x(k)w(n - k)e^{-jωk} (DTSTFT) (2.1)

x(n) = (1/(2πw(0))) ∫_{-π}^{π} X(e^{jω}, n)e^{jωn}dω. (Inverse DTSTFT) (2.2)

For a discrete convolution operator T_f and y(n) = T_f[x(n)], we have

Y(e^{jω}, n) = Σ_{k=-∞}^{+∞} X(e^{jω}, n - k)f(k)e^{-jωk},

or,

Y(e^{jω}, n) = Σ_{k=-∞}^{+∞} F(e^{jω}, n - k)x(k)e^{-jωk}.

For a translation operator T_f with f(n) = δ(n - d), |Y(e^{jω}, n)| = |X(e^{jω}, n - d)|. Thus, the magnitude of the DTSTFT is also a translation-invariant representation.
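This translation behavior of the DTSTFT magnitude can be verified numerically from (2.1). The following sketch is an illustration added here, not part of the thesis; it uses a circularly wrapped window so the shift relation is exact, and a frequency on the DFT grid so the phase factor wraps cleanly.

```python
import numpy as np

rng = np.random.default_rng(3)
N, d = 64, 7
omega = 2 * np.pi * 5 / N           # on the DFT grid, so e^{-j*omega*N} = 1
x = rng.standard_normal(N)
w = np.hanning(N)                   # an arbitrary window, indexed circularly

def dtstft_sample(x, w, omega):
    # X(e^{j omega}, n) = sum_k x(k) w(n - k) e^{-j omega k}, with n - k circular
    N = len(x)
    return np.array([
        sum(x[k] * w[(n - k) % N] * np.exp(-1j * omega * k) for k in range(N))
        for n in range(N)
    ])

X0 = dtstft_sample(x, w, omega)
Xd = dtstft_sample(np.roll(x, d), w, omega)   # DTSTFT of the translated input
# |Y(e^{j omega}, n)| = |X(e^{j omega}, n - d)|: the magnitude is just shifted.
assert np.allclose(np.abs(Xd), np.abs(np.roll(X0, d)))
```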
2.2.5 Connection Between Shorttime Fourier Transform and Filter Banks
The discrete-time short-time Fourier transform X(e^{jω}, n) is a two-dimensional function of a continuous variable ω and an integer variable n. Is it possible to have only finitely many samples X(e^{jω_k}, n) in the frequency domain, and yet still be able to recover the original signal x(n)? It turns out to be possible by carefully designing channel filters [63, 53].
For a given frequency ω_k, the STFT coefficients of (2.1) can be rewritten as

X(e^{jω_k}, n) = e^{-jω_k n} Σ_{m=-∞}^{+∞} x(m)w(n - m)e^{jω_k(n - m)} = e^{-jω_k n} [x(n) * v_k(n)],

where v_k(n) = w(n)e^{jω_k n}. This means that a frequency point of the STFT can be computed using a filter with frequency response V_k(ω) = W(ω - ω_k) and a complex multiplier e^{-jω_k n}. Note that |X(e^{jω_k}, n)| = |x(n) * v_k(n)|.
For a filter bank consisting of M such channels satisfying

Σ_{k=0}^{M-1} V_k(ω) = Σ_{k=0}^{M-1} W(ω - ω_k) = 1, (2.3)

the original function x(n) can be recovered by replacing the integral of (2.2) with the following summation:

Σ_{k=0}^{M-1} X(e^{jω_k}, n)e^{jω_k n} = Σ_{k=0}^{M-1} [x(n) * v_k(n)] = x(n) * [Σ_{k=0}^{M-1} v_k(n)] = x(n).
Figure 2.1 shows a filter bank structure for the analysis/synthesis process of the STFT.
Figure 2.1: A Filter Bank Implementation of STFT.
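The channel identity X(e^{jω_k}, n) = e^{-jω_k n}[x(n) * v_k(n)] behind this filter bank structure is easy to check numerically. The sketch below is an illustration added here (the Hanning window and the particular frequency are arbitrary choices, not from the thesis): it computes one STFT channel both by the direct sum (2.1) and by filtering with v_k(n) = w(n)e^{jω_k n} followed by demodulation.

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 128, 33                      # signal length, window length
x = rng.standard_normal(N)
w = np.hanning(L)                   # analysis window w(n), support 0..L-1
wk = 2 * np.pi * 0.1                # one analysis frequency omega_k
n_idx = np.arange(N)

# Direct DTSTFT sample (2.1): X(e^{j wk}, n) = sum_m x(m) w(n - m) e^{-j wk m}
X_direct = np.array([
    sum(x[m] * w[n - m] * np.exp(-1j * wk * m)
        for m in range(max(0, n - L + 1), n + 1))
    for n in n_idx
])

# Filter bank route: convolve with v_k(n) = w(n) e^{j wk n}, then demodulate.
vk = w * np.exp(1j * wk * np.arange(L))
X_filter = np.exp(-1j * wk * n_idx) * np.convolve(x, vk)[:N]

assert np.allclose(X_direct, X_filter)
```

Since the demodulator e^{-jω_k n} has unit magnitude, |X_filter| equals |x * v_k|, which is why the magnitude can be read directly off the filter outputs.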
2.2.6 Generalized Short-time Fourier Transform
We have already pointed out that the STFT magnitude can be computed by a filter bank without modulators and demodulators (see Figure 2.1). We can further relax the constraint that the channel filters are frequency-shifted versions of a single lowpass filter w(n), and define a generalized short-time Fourier transform pair as

u_k(n) = Σ_{m=-∞}^{+∞} x(m)v_k(n - m), 0 ≤ k ≤ M - 1,

x(n) = Σ_{k=0}^{M-1} u_k(n).
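A minimal numerical sketch of this generalized transform (added for illustration, not from the thesis): build M channel filters whose frequency responses V_k(ω) form a partition of unity by splitting the DFT bins into M ideal bands, so that the channel outputs u_k = x * v_k sum back to x exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 64, 4
x = rng.standard_normal(N)

# Split the N DFT bins into M disjoint bands; the indicator of each band is
# V_k(w), so sum_k V_k(w) = 1 everywhere (the analogue of condition (2.3)).
bands = np.array_split(np.arange(N), M)
u = []
for band in bands:
    Vk = np.zeros(N)
    Vk[band] = 1.0
    # u_k(n) = (x * v_k)(n), computed as a circular convolution via the DFT.
    u.append(np.fft.ifft(np.fft.fft(x) * Vk))

# Synthesis is simply the sum of the channel outputs: x(n) = sum_k u_k(n).
x_rec = np.real(sum(u))
assert np.allclose(x_rec, x)
```

Ideal bandpass channels are used here only because they make the partition-of-unity condition trivial to satisfy; any set of filters with Σ_k V_k(ω) = 1 would reconstruct the same way.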
2.3 Wavelet Transforms
Wavelet transforms have become a powerful tool for analyzing nonstationary signals in recent years. They have several major advantages over the short-time Fourier transform.
First, wavelet transforms fix the ratio of bandwidth to center frequency (called constant-Q) of the channel frequency responses, instead of fixing the frequency resolution (determined by the bandwidth of the window) as in the STFT. In other words, wavelet transforms use a narrow-bandwidth window for low-frequency signals and a wide-bandwidth window for high-frequency signals. The idea behind this is that low-frequency components represent slow changes in the time domain, for which time resolution is not critical while higher resolution in the frequency domain is more desirable. In the opposite case, a wide-bandwidth window for high-frequency components means better resolution in the time domain to capture abrupt changes.
Second, recent developments of wavelet packet transforms and other signal-adaptive wavelet-type transforms are much more flexible and powerful in exploiting the time-frequency concept for wide-ranging applications.
2.3.1 Continuous Wavelet Transform (CWT)
For a continuous time signal x(t), a continuous wavelet transform pair may be written as [41, 12]:
X(a, ) J X(u)y!(u  T)du (CVT), (2.4) oo d2
X (t) = CIX(,T'),=t0T(7 Ivcs 1 ) 25
where x* denotes complex conjugate of x, V'a(t) = , (t) is called the basic wavelet. Parameter T is the translation factor and a is the dilation factor. Informally speaking. when a increases, function Ya(t T) expands and takes longtime behavior into account. In the opposite, when a decreases, function '( (t  T) contracts and only focuses on the shorttime behavior. In order to recover the original signal from its CWT, the wavelet 'y(t) must satisfy the admissibility condition: = p P)I'du < +oc.
However, a wavelet function ψ_a(t) is generally not an eigenfunction of a convolution operator T_f, as

T_f[ψ_a(t)] = ∫_{-∞}^{+∞} f(τ)ψ_a(t - τ)dτ

does not equal λψ_a(t).
Since the inverse CWT of (2.5) can be rewritten as:

x(t) = (1/C_ψ) ∫_0^{+∞} [X(a, t) * ψ_a(t)] da/a²,

where * denotes convolution in the variable t, we have:

T_f[x(t)] = (1/C_ψ) ∫_0^{+∞} [(f(t) * X(a, t)) * ψ_a(t)] da/a².
If we denote W as the operator of the continuous wavelet transform, the above result shows that:

W T_f[x(t)] = f(t) * X(a, t) = ∫_{-∞}^{+∞} f(τ)X(a, t - τ)dτ = T_f W[x(t)].

Therefore, the CWT is a translation-invariant representation.
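This translation invariance survives discretization. The sketch below is an added illustration (a sampled Mexican-hat wavelet and circular convolution stand in for the continuous transform): it computes one row X(a, τ) of a discretized CWT as a convolution and checks that shifting the input merely shifts the coefficients.

```python
import numpy as np

rng = np.random.default_rng(4)
N, s = 256, 21
x = rng.standard_normal(N)

def cwt_row(x, a):
    # One row X(a, tau): convolve x with the sampled, scaled wavelet psi_a.
    N = len(x)
    u = (np.arange(N) - N // 2) / a
    psi = (1.0 - u**2) * np.exp(-u**2 / 2) / a   # Mexican hat, 1/a normalization
    psi = np.roll(psi, -N // 2)                  # center it at index 0 (circular)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(psi)))

for a in (2.0, 8.0):
    # CWT of the shifted signal equals the shifted CWT, at every scale.
    assert np.allclose(cwt_row(np.roll(x, s), a), np.roll(cwt_row(x, a), s))
```

The check passes trivially because each row is a convolution, and convolution commutes with translation; this is exactly the property that the decimated transforms below lose.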
2.3.2 Discrete Wavelet Transform (DWT)
The discrete wavelet transform pair on real orthogonal wavelet bases may be written as [63]:

p_j(n) = \int_{-\infty}^{+\infty} x(t)\, 2^{j/2}\psi(2^j t - n)\, dt \quad (DWT),

x(t) = \sum_{j=0}^{+\infty} \sum_{n=-\infty}^{+\infty} p_j(n)\, 2^{j/2}\psi(2^j t - n) \quad (Inverse\ DWT),

where the functions \{2^{j/2}\psi(2^j t - n)\} constitute an orthonormal basis.
Notice that the DWT maps a continuous-time function x(t) into discrete-time sequences p_j(n). Clearly, it is not a translation-invariant representation.
As in the case of the CWT, the basis functions 2^{j/2}\psi(2^j t - n) are generally not eigenfunctions of convolution operators. However, T_f[2^{j/2}\psi(2^j t - n)] may be decomposed in the DWT space:

T_f[2^{j/2}\psi(2^j t - n)] = \sum_{l=0}^{+\infty} \sum_{m=-\infty}^{+\infty} \eta_{j,l}(n, m)\, 2^{l/2}\psi(2^l t - m),

where the transform coefficients may be calculated by:

\eta_{j,l}(n, m) = \int T_f[2^{j/2}\psi(2^j t - n)]\, 2^{l/2}\psi(2^l t - m)\, dt
= \int \left[ \int f(\tau)\, 2^{j/2}\psi(2^j(t - \tau) - n)\, d\tau \right] 2^{l/2}\psi(2^l t - m)\, dt
= 2^{(j+l)/2} \int f(\tau) \left[ \int \psi(2^j(t - \tau) - n)\, \psi(2^l t - m)\, dt \right] d\tau.
Figure 2.2: Fast Discrete-time Wavelet Transform, First Two Levels.
Note: Down arrow = decimation and up arrow = expansion.
We can derive the DWT representation of T_f[x(t)] to be

T_f[x(t)] = \sum_{j=0}^{+\infty} \sum_{n=-\infty}^{+\infty} p_j(n)\, T_f[2^{j/2}\psi(2^j t - n)]
= \sum_{l=0}^{+\infty} \sum_{m=-\infty}^{+\infty} \left[ \sum_{j=0}^{+\infty} \sum_{n=-\infty}^{+\infty} p_j(n)\, \eta_{j,l}(n, m) \right] 2^{l/2}\psi(2^l t - m).

Therefore, \{\eta_{j,l}(n, m)\} can be seen as the representation of the operator T_f under the basis \{2^{j/2}\psi(2^j t - n)\}.
2.3.3 Discretetime Wavelet Transform (DTWT)
The discrete-time wavelet transform (DTWT) cannot be obtained by sampling the continuous-time wavelet transforms, nor by simply mimicking the continuous ones. Assuming a discrete "mother wavelet" \psi(n), a discrete-time version of dilation \psi(2^k n - m) does not produce the frequency response \Psi(e^{j 2^k \omega}). Moreover, a dilation \Psi(e^{j 2^k \omega}) in the frequency domain is generally not a bandpass function but a multiband function. The fundamental difference lies in both the 2\pi-periodic characteristic of the frequency response and the indexing constraint of discrete-time series.
A fast algorithm based on the concept of multiresolution was found by S. Mallat [42] and is illustrated in Figure 2.2.
If we consider only one level, we have
d_0(n) = \sum_m x(m)\, g(2n - m), \qquad (2.6)
c_0(n) = \sum_m x(m)\, h(2n - m), \qquad (2.7)
x(n) = \sum_l c_0(l)\, \tilde{h}(n - 2l) + \sum_l d_0(l)\, \tilde{g}(n - 2l). \qquad (2.8)
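Equations (2.6)-(2.8) can be checked numerically. The sketch below is an illustration under my own assumptions (the orthogonal Haar pair chosen via (2.11), circular extension), not the thesis implementation; it verifies one level of perfect reconstruction.

```python
import numpy as np

SQ = 1.0 / np.sqrt(2.0)
# Orthogonal Haar pair via (2.11): g(n) = (-1)^n h(1-n), h~(n) = h(-n), g~(n) = g(-n)
h  = {0: SQ, 1: SQ}      # decomposition lowpass  h(n)
g  = {0: SQ, 1: -SQ}     # decomposition highpass g(n)
ht = {0: SQ, -1: SQ}     # reconstruction lowpass  h~(n)
gt = {0: SQ, -1: -SQ}    # reconstruction highpass g~(n)

def analyze(x):
    """c0(n) = sum_m x(m) h(2n-m); d0(n) = sum_m x(m) g(2n-m), circular."""
    N = len(x)
    c = np.zeros(N // 2); d = np.zeros(N // 2)
    for n in range(N // 2):
        for k, v in h.items():
            c[n] += x[(2 * n - k) % N] * v
        for k, v in g.items():
            d[n] += x[(2 * n - k) % N] * v
    return c, d

def synthesize(c, d, N):
    """x(n) = sum_l c0(l) h~(n-2l) + sum_l d0(l) g~(n-2l), circular."""
    x = np.zeros(N)
    for l in range(N // 2):
        for k, v in ht.items():
            x[(2 * l + k) % N] += c[l] * v
        for k, v in gt.items():
            x[(2 * l + k) % N] += d[l] * v
    return x

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
c, d = analyze(x)
assert np.allclose(synthesize(c, d, 16), x)   # perfect reconstruction
```

The same pair of routines is reused conceptually throughout the chapter: scaling c0 and d0 by unequal factors before synthesis is precisely the aliasing-enhancement experiment discussed later.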
Therefore, the decomposition filters h(n) and g(n) and the reconstruction filters \tilde{h}(n) and \tilde{g}(n) must satisfy:

\sum_m g(2n - m)\, \tilde{g}(m - 2l) = \delta(n - l),
\sum_m h(2n - m)\, \tilde{h}(m - 2l) = \delta(n - l),
\sum_m g(2n - m)\, \tilde{h}(m - 2l) = 0, \qquad (2.9)
\sum_m h(2n - m)\, \tilde{g}(m - 2l) = 0,
\sum_l h(2l - m)\, \tilde{h}(n - 2l) + \sum_l g(2l - m)\, \tilde{g}(n - 2l) = \delta(m - n).
It can be proved (see Appendix A) that in the frequency domain these conditions are equivalent to:

G(e^{j\omega})\tilde{G}(e^{j\omega}) + G(e^{j(\omega+\pi)})\tilde{G}(e^{j(\omega+\pi)}) = 2,
H(e^{j\omega})\tilde{H}(e^{j\omega}) + H(e^{j(\omega+\pi)})\tilde{H}(e^{j(\omega+\pi)}) = 2,
G(e^{j\omega})\tilde{H}(e^{j\omega}) + G(e^{j(\omega+\pi)})\tilde{H}(e^{j(\omega+\pi)}) = 0, \qquad (2.10)
H(e^{j\omega})\tilde{G}(e^{j\omega}) + H(e^{j(\omega+\pi)})\tilde{G}(e^{j(\omega+\pi)}) = 0,
H(e^{j\omega})\tilde{H}(e^{j\omega}) + G(e^{j\omega})\tilde{G}(e^{j\omega}) = 2,
H(e^{j\omega})\tilde{H}(e^{j(\omega+\pi)}) + G(e^{j\omega})\tilde{G}(e^{j(\omega+\pi)}) = 0.

One possible choice is to choose h(n), g(n), \tilde{h}(n) and \tilde{g}(n) by:
g(n) = (-1)^n h(1 - n) \iff G(e^{j\omega}) = -e^{-j\omega} H^*(e^{j(\omega+\pi)}),
\tilde{h}(n) = h^*(-n) \iff \tilde{H}(e^{j\omega}) = H^*(e^{j\omega}), \qquad (2.11)
\tilde{g}(n) = g^*(-n) \iff \tilde{G}(e^{j\omega}) = G^*(e^{j\omega}).
In this case, the resulting basis is called an orthogonal basis. Otherwise, the resulting basis is called a biorthogonal basis.
It can be proved (see Appendix A) that the structure of Figure 2.2 is equivalent to a filter bank with the transform written as [63]:
d_k(n) = \sum_m x(m)\, \psi_k(2^{k+1} n - m), \quad 0 \le k \le M-2,
c_{M-1}(n) = \sum_m x(m)\, \phi_{M-1}(2^{M-1} n - m) \quad (DTWT),

x(n) = \sum_{k=0}^{M-2} \sum_m d_k(m)\, \tilde{\psi}_k(n - 2^{k+1} m) + \sum_m c_{M-1}(m)\, \tilde{\phi}_{M-1}(n - 2^{M-1} m) \quad (Inverse\ DTWT).
Particularly, for an orthonormal basis, it can be proved (see Appendix A) that u_k(n) = v_k^*(-n) and

\sum_{m=-\infty}^{+\infty} u_k(m - n_k l)\, u_c^*(m - n_c n) = \delta(k - c)\, \delta(l - n). \qquad (2.12)
Figure 2.3: Generalized Discrete-time Filter Bank Transform.
Since the DTWT can be included in a general framework to be presented next, we leave the more in-depth discussion to the coming section.
2.4 Generalized Discretetime Filter Bank Transforms
Previous sections showed that both the STFT and the DTWT can be implemented by a filter bank structure. If we ignore the modulator (e^{j\omega_k n}) and the demodulator (e^{-j\omega_k n}) of the STFT, the discrete-time versions of the two can be included in a generalized form:

w_k(n) = \sum_{m=-\infty}^{+\infty} x(m)\, v_k(n_k n - m), \quad 0 \le k \le M-1, \quad (GDTFBT) \qquad (2.13)

x(n) = \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, u_k(n - n_k m) \quad (Inverse\ GDTFBT). \qquad (2.14)
It was called the generalized filter bank transform [63]. Figure 2.3 illustrates the structure of the GDTFBT.
Depending on n_k, v_k(n) and u_k(n), the GDTFBT includes the following special cases.
1) Orthonormal wavelet representations: n_k = 2^{k+1} for 0 \le k \le M-2 and n_{M-1} = 2^{M-1}. In this case, the basis functions possess the orthonormality of (2.12).
Figure 2.4: Example of a General Binary-tree-structured Filter Bank of Five Channels.
2) Orthonormal wavelet packet representations. This is a generalization of the orthonormal wavelet obtained by applying the recursive decomposition-reconstruction structure to all branches. Therefore, n_k = 2^L for 0 \le k \le 2^L - 1.
3) Biorthogonal wavelet (packet) representations. The same as the orthonormal counterpart except that the constraints (2.11) are lifted.
4) Generalized binary-tree-structured filter bank representations. This is a further generalization of the (bi)orthogonal wavelet (packets). Every channel may be split into two with filters satisfying (2.9). Therefore, every n_k is a power of 2. Moreover, the filters used in one channel (node) may differ from those of another. Figure 2.4 shows an example. It is proved in Appendix A that the basis functions constructed this way satisfy the biorthonormality expressed by:

\sum_m u_c(m - n_c l)\, v_k(n_k n - m) = \delta(c - k)\, \delta(l - n). \qquad (2.15)
5) Overcomplete wavelet representations. In this case, no downsampling and upsampling are used, which means n_k = 1 for all channels in Figure 2.3. As an example, the overcomplete wavelet transform corresponding to Figure 2.4 is
shown in Figure 2.5.
Figure 2.5: The Overcomplete Wavelet Transform for the Example of Figure 2.4.
Representations 1)-4) above are called "complete". The reason for calling the representations of category 5) "overcomplete" or "redundant" is that they contain more analyzing functions (\{v_k(n)\}) than a basis. Since these analyzing and synthesizing functions must satisfy

\sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} u_k(n - m)\, v_k(m - l) = \delta(n - l), \quad or \quad \sum_{k=0}^{M-1} U_k(e^{j\omega})\, V_k(e^{j\omega}) = 1,

they must be linearly dependent. In other words, some functions must be expressible by others.
Complete and overcomplete representations behave rather differently under a convolution operator.
2.4.1 Convolution Operator on Complete Representations
In general, a basis function u_k(n - n_k m) is not an eigenfunction of a convolution operator. Using the same idea as in the case of the DWT, a convolution operator T_f can be represented by the basis functions \{u_k(n - n_k m)\}:

T_f[u_k(n - n_k m)] = \sum_{c=0}^{M-1} \sum_{b=-\infty}^{+\infty} \eta_{k,c}(m, b)\, u_c(n - n_c b),

\eta_{k,c}(m, b) = \sum_{l=-\infty}^{+\infty} \sum_{i=-\infty}^{+\infty} f(i)\, u_k(l - i - n_k m)\, v_c(n_c b - l). \qquad (2.16)
It can be proved that u_k(n - n_k m) is an eigenfunction of the operator T_f if and only if \eta_{k,c}(m, b) = \lambda\, \delta(k - c)\, \delta(m - b).
Proof: If \eta_{k,c}(m, b) = \lambda\, \delta(k - c)\, \delta(m - b), then

T_f[u_k(n - n_k m)] = \sum_{c=0}^{M-1} \sum_{b=-\infty}^{+\infty} \lambda\, \delta(k - c)\, \delta(m - b)\, u_c(n - n_c b) = \lambda\, u_k(n - n_k m).

In the other direction, if T_f[u_k(n - n_k m)] = \lambda\, u_k(n - n_k m), then

\eta_{k,c}(m, b) = \sum_{n=-\infty}^{+\infty} T_f[u_k(n - n_k m)]\, v_c(n_c b - n) = \sum_{n=-\infty}^{+\infty} \lambda\, u_k(n - n_k m)\, v_c(n_c b - n) = \lambda\, \delta(k - c)\, \delta(m - b),

where we utilized the property (2.15). \square
The set of coefficients \{\eta_{k,c}(m, b)\} is the representation of the operator T_f in the basis and is independent of the input signal. Since a convolution operation on a sequence x(n) can be derived as

T_f[x(n)] = \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, T_f[u_k(n - n_k m)]
= \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m) \sum_{c=0}^{M-1} \sum_{b=-\infty}^{+\infty} \eta_{k,c}(m, b)\, u_c(n - n_c b)
= \sum_{c=0}^{M-1} \sum_{b=-\infty}^{+\infty} \left[ \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, \eta_{k,c}(m, b) \right] u_c(n - n_c b),

the representation of T_f[x(n)] is thus

a_c(b) = \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, \eta_{k,c}(m, b). \qquad (2.17)
For N-periodic sequences x(n), u_k(n) and v_k(n), \eta_{k,c}(m, b) is \frac{N}{n_k}-periodic in the index m and \frac{N}{n_c}-periodic in the index b, and a_c(b) is \frac{N}{n_c}-periodic in the index b.
A representation in matrix form was derived by Banham et al [3].
For a translation operator with f(i) = \delta(i - d),

\eta_{k,c}(m, b) = \sum_l u_k(l - d - n_k m)\, v_c(n_c b - l),

which is generally not equal to \lambda\, \delta(k - c)\, \delta(m - b). Therefore, complete representations are not translation invariant.
2.4.2 Convolution Operator on Overcomplete Representations
The overcomplete representation of a convolution operator T_f can be obtained from (2.16) with n_c = 1 and n_k = 1:

\eta_{k,c}(m, b) = \sum_l \sum_i f(i)\, u_k(l - i - m)\, v_c(b - l) = (f * u_k * v_c)(b - m). \qquad (2.18)
Accordingly, the overcomplete representation of T_f[x(n)] can be obtained from (2.17):

a_c(n) = \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, \eta_{k,c}(m, n) = f(n) * w_c(n), \qquad (2.19)
which shows that the discrete overcomplete wavelet transform operator and the convolution operator commute, and thus overcomplete wavelet representations are translation invariant. To verify this, assume an overcomplete wavelet transform operator \mathcal{W} and a translation operator T_f with f(n) = \delta(n - d). The overcomplete wavelet representation of T_f[x(n)] is

\mathcal{W} T_f[x(n)] = T_f \mathcal{W}[x(n)] = T_f[\{w_k(n)\}_{0 \le k \le M-1}] = \{w_k(n - d)\}_{0 \le k \le M-1},

which means that the representation of a time-shifted signal is simply the translated version of the original representation with the same amount of time shift.
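This commutation is easy to illustrate numerically. The sketch below uses two hypothetical analyzing filters of my own choosing (a smoother and a differencer) and circular convolution; it is an illustration, not the thesis code.

```python
import numpy as np

def owt_analyze(x, filters):
    """Undecimated analysis: w_k(n) = sum_m v_k(m) x(n - m), circular."""
    N = len(x)
    return [np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(v, N)))
            for v in filters]

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
# hypothetical analyzing filters: a smoother and a differencer
filters = [np.array([0.25, 0.5, 0.25]), np.array([0.5, -0.5])]

d = 5                                   # translation amount
W_x  = owt_analyze(x, filters)          # W[x]
W_Tx = owt_analyze(np.roll(x, d), filters)   # W[T_f x]
for wx, wtx in zip(W_x, W_Tx):
    assert np.allclose(np.roll(wx, d), wtx)  # W T_f = T_f W
```

Shifting the input by any amount d simply shifts every coefficient sequence by d, exactly as the derivation above predicts.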
2.4.3 Approximation of Convolution Operators on Overcomplete Representations
In the frequency domain, (2.18) can be written as

\hat{\eta}_{k,c}(e^{j\omega}) = F(e^{j\omega})\, U_k(e^{j\omega})\, V_c(e^{j\omega}),

where \hat{\eta}_{k,c}(e^{j\omega}) is the discrete-time Fourier transform of \eta_{k,c}(n).
If the passbands of the filters U_k(e^{j\omega}) and V_c(e^{j\omega}) have little overlap, the following approximation is justified:

\hat{\eta}_{k,c}(e^{j\omega}) \approx \delta(k - c)\, F(e^{j\omega})\, U_k(e^{j\omega})\, V_k(e^{j\omega}).
Moreover, if the function F(e^{j\omega}) is real and the passbands of the channels are narrow enough, we may further approximate \hat{\eta}_{k,c}(e^{j\omega}) as

\hat{\eta}_{k,c}(e^{j\omega}) \approx \lambda_k\, \delta(k - c)\, U_k(e^{j\omega})\, V_k(e^{j\omega}),

or

\eta_{k,c}(n) \approx \lambda_k\, \delta(k - c)\, (u_k * v_k)(n),

where \lambda_k is a real number. In this case, (2.19) may be well approximated by

a_c(n) \approx \sum_{k=0}^{M-1} \lambda_k\, \delta(k - c)\, w_k(n) * u_k(n) * v_k(n) \approx \lambda_c\, w_c(n). \qquad (2.20)
This means that a convolution operator with real frequency response F(e^{j\omega}) may be approximated by a set of real multipliers \{\lambda_k\}. This is another advantage of overcomplete representations, since the approximation is generally not extendible to complete representations.
2.4.4 Aliasing Enhancement on Complete Representations
For complete representations, subsampling usually does not meet the Shannon sampling theorem, and thus produces aliasing distortions. Those aliasing components get canceled only by specially designed filters at the reconstruction stage. If we manipulate the representation in a way like (2.20), the aliasing distortions may not be canceled. For a concrete example, we consider a wavelet transform of Figure 2.2 with only one level. If we multiply the representations d_0(n) and c_0(n) by k_d and k_c, respectively, we can rewrite Equations (2.6)-(2.8) in the frequency domain:
D'(e^{j\omega}) = \frac{k_d}{2} \left[ X(e^{j\omega/2}) G(e^{j\omega/2}) + X(e^{j(\omega+2\pi)/2}) G(e^{j(\omega+2\pi)/2}) \right],
C'(e^{j\omega}) = \frac{k_c}{2} \left[ X(e^{j\omega/2}) H(e^{j\omega/2}) + X(e^{j(\omega+2\pi)/2}) H(e^{j(\omega+2\pi)/2}) \right],

X'(e^{j\omega}) = \frac{1}{2} \left[ k_d G(e^{j\omega})\tilde{G}(e^{j\omega}) + k_c H(e^{j\omega})\tilde{H}(e^{j\omega}) \right] X(e^{j\omega}) +
\frac{1}{2} \left[ k_d G(e^{j(\omega+\pi)})\tilde{G}(e^{j\omega}) + k_c H(e^{j(\omega+\pi)})\tilde{H}(e^{j\omega}) \right] X(e^{j(\omega+\pi)}).

The second term is the aliasing component. Unless k_d = k_c, the second term is generally not equal to zero even though the filters meet the perfect reconstruction
conditions (2.10). In this case, the system is not a linear time-invariant (LTI) system, but rather a linear periodically time-varying (LPTV) system [63]. It is easy to recognize that the system cannot be characterized by the familiar form Y(e^{j\omega}) = Q(e^{j\omega}) X(e^{j\omega}). Figure 2.6 shows an example of such an aliasing-enhancement effect.
Figure 2.6: Example of the Aliasing-enhancement Anomaly. Where:
(a) The original sinusoid signal;
(b) Magnitude of the DFT of the signal in (a);
(c) Reconstructed signal with orthogonal Haar filters
(H(e^{j\omega}) = 2^{-1/2}(1 + e^{-j\omega}), G(e^{j\omega}) = 2^{-1/2}(1 - e^{-j\omega})),
one-level decomposition and k_d = 6, k_c = 1;
(d) Magnitude of the DFT of the signal in (c).
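The anomaly is easy to reproduce. The sketch below (Haar filters as in the caption of Figure 2.6, with my own indexing conventions) scales the two subbands by k_d and k_c and reconstructs. With k_d = k_c the system is the identity; with k_d \ne k_c, a one-sample input shift no longer produces a one-sample output shift, while shifts by the decimation period 2 still commute, which is the LPTV behavior described above.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)

def haar_one_level(x, kd, kc):
    """Decimated one-level Haar analysis/synthesis with subband gains kd, kc."""
    N = len(x)
    xe = x[0::2]                    # x(2n)
    xo = np.roll(x, 1)[0::2]        # x(2n - 1), circular
    c = kc * S * (xe + xo)          # scaled lowpass subband
    d = kd * S * (xe - xo)          # scaled highpass subband
    y = np.empty(N)
    y[0::2] = S * (c + d)                          # y(2n)
    y[(np.arange(0, N, 2) - 1) % N] = S * (c - d)  # y(2n - 1)
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(128)
assert np.allclose(haar_one_level(x, 1, 1), x)   # perfect reconstruction
y  = haar_one_level(x, 6, 1)
y1 = haar_one_level(np.roll(x, 1), 6, 1)
y2 = haar_one_level(np.roll(x, 2), 6, 1)
assert not np.allclose(y1, np.roll(y, 1))   # shift by 1: not time invariant
assert np.allclose(y2, np.roll(y, 2))       # shift by 2: period-2 LPTV
```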
2.5 Summary and Discussion
We have reviewed signal representations based on Fourier and wavelet transformations and included them in a general framework called the generalized filter bank transform. We studied their behavior under a convolution operator. It revealed that the complex sinusoid functions of the Fourier transform are the only basis functions in this class which are eigenfunctions of the operator. In this case, a convolution operator is equivalent to multiplications in the representation (frequency) domain and the inverse operator can be simply expressed as divisions. For overcomplete representations, a convolution operator may be approximated by multiplications of coefficients by real factors. However, such an approximation cannot be extended to complete representations due to aliasing distortions. Moreover, we realized that discrete complete representations also lack translation invariance. Based on the characteristics of these representations, we may find the best applications for them.
1) Complete representations may be good for compression and progressive transmission. Lack of redundancy alone would be a major advantage for those applications. Throwing away some coefficients may be seen as a zeroing operation, which does not enhance aliasing components. After all, we are prepared to accept some distortions for those applications.
2) Overcomplete representations possess many desirable properties for restoration, enhancement, feature extraction and time delay estimation [52]. First, they are translation invariant, which is critical for pattern recognition. Second, they do not have the aliasing distortions. Third, they allow a much higher degree of flexibility in filter design.
CHAPTER 3
OVERCOMPLETE WAVELET REPRESENTATIONS
3.1 Introduction
The paradigm of overcomplete wavelet representations is a versatile and powerful tool shown to be advantageous for certain applications. In this chapter, we shall study overcomplete wavelet representations in detail, including issues of decomposition and reconstruction algorithms, filter selection and time-frequency interpretation.
3.2 Overcomplete Wavelet Transforms and Filters
In general, a discrete-time overcomplete wavelet transform (DTOWT) pair can be obtained by setting n_k to 1 in Equations (2.13)-(2.14), rewritten in the following:

w_k(n) = \sum_{m=-\infty}^{+\infty} v_k(m)\, x(n - m), \quad 0 \le k \le M-1, \quad (DTOWT) \qquad (3.1)

x(n) = \sum_{k=0}^{M-1} \sum_{m=-\infty}^{+\infty} w_k(m)\, u_k(n - m) \quad (Inverse\ DTOWT). \qquad (3.2)
More importantly, we need an algorithm to generate the analyzing and synthesizing filters v_k(n) and u_k(n). In this thesis, we restrict ourselves to the binary tree algorithm as illustrated in the previous chapter, and consider two particular classes: overcomplete wavelet packets and dyadic wavelets.
3.2.1 Overcomplete Wavelet Packet Representations
The construction algorithm of overcomplete wavelet packet representations corresponds to a complete binary tree, as shown in Figure 3.1. Though each tree node represents an analyzing filter, only a subset of all tree nodes is needed to achieve a perfect reconstruction. Notice that the position and the width of each rectangular node also represent the channel's position and bandwidth in the frequency domain.

Figure 3.1: Binary Tree for Overcomplete Wavelet Packet Decomposition:
(a) the tree; (b) illustration of frequency scaling of the prototype filters.
In general, the prototype filters H(\omega) and G(\omega) in each level can be distinct, as proved in Appendix A, as long as they satisfy the condition H(\omega)\tilde{H}(\omega) + G(\omega)\tilde{G}(\omega) = 1. The frequency response of each node can be calculated recursively:

P_0(\omega) = H(\omega), \quad P_1(\omega) = G(\omega);
N_{0,0}(\omega) = 1, \quad N_{l+1,m}(\omega) = P_c(2^l\omega)\, N_{l,\lfloor m/2 \rfloor}(\omega), \quad l \ge 0, \qquad (3.3)

where the subscript c of the filter P_c follows the periodic pattern 0, 1, 1, 0 and can be calculated by c = (m + \lfloor m/2 \rfloor) \bmod 2. The purpose of such an ordering is to make the index m of the node N_{l,m} proportional to its central frequency.
The reconstruction from two child nodes to their parent node can be expressed by the following formula:

N_{l,k}(\omega) = N_{l+1,2k}(\omega)\, \tilde{P}_{k \bmod 2}(2^l\omega) + N_{l+1,2k+1}(\omega)\, \tilde{P}_{(k+1) \bmod 2}(2^l\omega),

where we used the property

P_{k \bmod 2}(\omega)\tilde{P}_{k \bmod 2}(\omega) + P_{(k+1) \bmod 2}(\omega)\tilde{P}_{(k+1) \bmod 2}(\omega) = H(\omega)\tilde{H}(\omega) + G(\omega)\tilde{G}(\omega) = 1.
There are many possible choices of node sets from the tree that achieve perfect reconstruction. For example, for the tree shown in Figure 3.1 with a height of two [55, page 449], five sets are available: \{N_{0,0}\}, \{N_{1,0}, N_{1,1}\}, \{N_{1,0}, N_{2,2}, N_{2,3}\}, \{N_{2,0}, N_{2,1}, N_{1,1}\} and \{N_{2,0}, N_{2,1}, N_{2,2}, N_{2,3}\}. In general, let T(h) be the number of selections from a complete binary decomposition tree of height h. Since the tree may be split into two subtrees of height h-1, which each in turn have T(h-1) selections, we have the following recursive formula:

T(h) = [T(h-1)]^2 + 1, \quad h \ge 1, \quad and \quad T(0) = 1.

The first five numbers are T(1) = 2, T(2) = 5, T(3) = 26, T(4) = 677 and T(5) = 458330.
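The recursion is trivial to evaluate:

```python
def T(h):
    """Number of perfect-reconstruction node selections for a tree of height h."""
    return 1 if h == 0 else T(h - 1) ** 2 + 1

# reproduces the first five numbers quoted in the text
assert [T(h) for h in range(6)] == [1, 2, 5, 26, 677, 458330]
```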
All quadrature mirror filters (QMF), good for orthogonal wavelet transforms, are also good for overcomplete wavelet packet transforms. However, for real QMFs other than the Haar wavelet, I. Daubechies proved that the linear phase and finite impulse response (FIR) properties are mutually exclusive [13]. For example, Daubechies wavelet filters [11] are FIR but not linear phase. On the other hand, Battle-Lemarié wavelet filters are symmetric but not FIR, and may be written as [42]:

H_p(\omega) = \sqrt{ \frac{\Sigma_{4p+4}(\omega)}{2^{4p+4}\, \Sigma_{4p+4}(2\omega)} }, \qquad (3.4)

where p is a positive integer and

\Sigma_n(\omega) = \sum_k \frac{1}{(\omega + 2k\pi)^n}.

Note: An algorithm for numerical computation of Battle-Lemarié wavelets is included in Appendix B.
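A numerical sketch of (3.4) as reconstructed here (truncating \Sigma_n to |k| \le 100; this is not the Appendix B algorithm): since the Battle-Lemarié filters are orthogonal, the computed magnitude should satisfy |H_p(\omega)|^2 + |H_p(\omega+\pi)|^2 = 1, which provides a useful self-check.

```python
import numpy as np

def sigma_n(n, w, K=100):
    """Truncated Sigma_n(w) = sum_k 1/(w + 2*pi*k)^n (n even: fast convergence)."""
    k = np.arange(-K, K + 1)
    return np.sum(1.0 / (w[:, None] + 2.0 * np.pi * k[None, :]) ** n, axis=1)

def H_bl(p, w):
    """Battle-Lemarie lowpass magnitude per Eq. (3.4) as reconstructed here."""
    n = 4 * p + 4
    return np.sqrt(sigma_n(n, w) / (2 ** n * sigma_n(n, 2 * w)))

w = np.linspace(0.05, np.pi - 0.05, 200)   # avoid the poles of Sigma_n
Hp = H_bl(1, w)
assert np.allclose(Hp ** 2 + H_bl(1, w + np.pi) ** 2, 1.0, atol=1e-5)
```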
3.2.2 Dyadic Wavelet Representations
The building structure of a dyadic wavelet transform is a particular subset of the overcomplete wavelet packet transform. It decomposes the low-frequency branch only (assuming H(\omega) is lowpass), as shown in Figure 3.2.
Figure 3.2: Binary Tree for Dyadic Wavelet Decomposition.
Obviously, all filters good for overcomplete wavelet packet transforms are good for dyadic wavelet transforms. However, one may want to exploit the extra opportunity created by the more relaxed constraint on filters. The following filter class [44, 20] possesses distinct properties in the time domain:

H(\omega) = e^{jp\omega/2} \cos^{2n+p}(\omega/2),
\tilde{H}(\omega) = H^*(\omega),
G(\omega) = (-j)^p e^{jp\omega/2} \sin^{2-p}(\omega/2), \qquad (3.5)
\tilde{G}(\omega) = (j)^p e^{-jp\omega/2} \sin^p(\omega/2) \sum_{m=0}^{2n+p-1} \cos^{2m}(\omega/2),

where n \ge 0 is an integer and p \in \{0, 1\}.

In fact, for p = 0 the filter g(-1) = -0.25, g(0) = 0.5, g(1) = -0.25 is proportional to a discrete Laplacian operator, and for p = 1 the filter g(0) = 0.5, g(1) = -0.5 is proportional to a discrete gradient operator.
Notice that the filters within this class are linear phase and FIR, but not orthogonal. For p = 0, the filters H, \tilde{H}, G and \tilde{G} and their frequency scalings (e.g. H(2\omega)) are all symmetric. For p = 1, different prototype filters may be used for the frequency scalings to minimize the spatial shifting caused by the phase factor. If we denote H_l(\omega), \tilde{H}_l(\omega), G_l(\omega) and \tilde{G}_l(\omega) for the filters used in level l \ge 1, we find the following filters are either symmetric or antisymmetric as well as FIR:

H_l(\omega) = \cos^{2n+1}(2^{l-1}\omega), \quad \tilde{H}_l(\omega) = H_l(\omega),
G_l(\omega) = -j \sin(2^{l-1}\omega), \quad \tilde{G}_l(\omega) = j \sin(2^{l-1}\omega) \sum_{m=0}^{2n} \cos^{2m}(2^{l-1}\omega).
Figure 3.3: Examples of the Function \Theta_{a,b}(\omega) (a = 1, 2, 3; b = 4).
In both cases, the frequency response of a tree node N_{l,k} (l > 0 and k \in \{0, 1\}) can be derived in closed form. For instance, for the lowpass path,

N_{l,0}(\omega) = \prod_{m=0}^{l-1} H(2^m\omega) = e^{jp(2^l - 1)\omega/2} \left[ \frac{\sin(2^{l-1}\omega)}{2^l \sin(\omega/2)} \right]^{2n+p},

using the identity \prod_{m=0}^{l-1} \cos(2^m\omega/2) = \sin(2^{l-1}\omega)/(2^l \sin(\omega/2)).

For convenience, we define the function \Theta_{a,b}(\omega) as

\Theta_{a,b}(\omega) = \left[ \frac{\sin(a\omega/2)}{a\, \sin(\omega/2)} \right]^b,

so that N_{l,0}(\omega) = e^{jp(2^l - 1)\omega/2}\, \Theta_{2^l,\, 2n+p}(\omega). For b > 4, the function \Theta_{a,b}(\omega) is approximately Gaussian, as plotted in Figure 3.3.
Similarly, we can define a reconstruction tree with nodes

\tilde{N}_{l,0}(\omega) = \prod_{m=1}^{l} \tilde{H}_m(\omega), \quad \tilde{N}_{l,1}(\omega) = \tilde{N}_{l-1,0}(\omega)\, \tilde{G}_l(\omega).

A dyadic wavelet representation consists of leaf nodes. For example, a three-level representation consists of the nodes \{N_{3,0}, N_{3,1}, N_{2,1}, N_{1,1}\}. In terms of a filter bank, they can be viewed as multiple channels. The channel frequency responses can be derived as:

C_{l,0}(\omega) = \tilde{N}_{l,0}(\omega)\, N_{l,0}(\omega) = \Theta_{2^l,\, 4n+2p}(\omega),
C_{l,1}(\omega) = \tilde{N}_{l,1}(\omega)\, N_{l,1}(\omega) = \tilde{N}_{l-1,0}(\omega)\, N_{l-1,0}(\omega) \left[ 1 - H_l(\omega)\tilde{H}_l(\omega) \right].

The channel frequency responses for a three-level representation are \{C_{3,0}, C_{3,1}, C_{2,1}, C_{1,1}\}. Notice that the channel frequency responses C_{l,1}(\omega) are an approximate difference of two Gaussians (DOG) [47, page 63]. Two examples of channel filters are shown in Figure 3.6.
3.2.3 Filters of Autocorrelation Functions of Wavelets
From a filtering point of view, the idea is very simple. Since no downsampling-upsampling takes place in overcomplete wavelet transforms, it seems that we can simply combine decomposition and reconstruction into a single filtering step. To achieve this, one only needs to change the prototype filters into:

H_a(\omega) = H(\omega)\tilde{H}(\omega), \quad G_a(\omega) = G(\omega)\tilde{G}(\omega). \qquad (3.6)

The perfect reconstruction condition now takes the form H_a(\omega) + G_a(\omega) = 1.
As H(\omega) is connected to a wavelet, H_a(\omega) is connected to an autocorrelation function of the wavelet. For a rigorous mathematical treatment, refer to Saito's Ph.D. thesis [57].
In this case, a reconstruction step is simply a summation of all coefficients, as (3.2) degenerates to

x(n) = \sum_{k=0}^{M-1} w_k(n),

which takes the same form as the generalized STFT.
The most important property is that the filters H_a(\omega) and G_a(\omega) are always symmetric or antisymmetric, thus providing a simple way to construct symmetric FIR filters for certain applications.
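A minimal sketch with a Haar-derived pair of my own choosing (H_a(\omega) = \cos^2(\omega/2) and G_a(\omega) = \sin^2(\omega/2), so H_a + G_a = 1): filtering with both prototypes and simply summing the channel outputs returns the signal exactly.

```python
import numpy as np

ha = np.array([0.25, 0.50, 0.25])    # H_a(w) = cos^2(w/2), taps at n = -1, 0, 1
ga = np.array([-0.25, 0.50, -0.25])  # G_a(w) = sin^2(w/2); note ha + ga = delta

def cconv3(x, f):
    """Circular convolution with a centered 3-tap filter f = [f(-1), f(0), f(1)]."""
    return f[0] * np.roll(x, -1) + f[1] * x + f[2] * np.roll(x, 1)

rng = np.random.default_rng(1)
x = rng.standard_normal(32)
# reconstruction is a plain sum of the channel outputs:
assert np.allclose(cconv3(x, ha) + cconv3(x, ga), x)
```

Both filters here are symmetric FIR, illustrating the property stated above.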
3.3 Connection to Complete Wavelet Representations
In Appendix A, Proposition A.0.3 builds the bridge between an overcomplete wavelet representation and a complete wavelet representation. If both representations use exactly the same prototype filters H(\omega), G(\omega), \tilde{H}(\omega) and \tilde{G}(\omega), the only difference lies in the decimation. In this case, the complete wavelet representation is a subset of the corresponding overcomplete wavelet representation due to decimation. If the prototype filters used by an overcomplete wavelet representation do not satisfy (2.10), there is no corresponding complete wavelet representation.
3.4 Timefrequency Interpretation and the Uncertainty Principle
The set of transform coefficients \{w_k(n)\}_{0 \le k \le M-1} constitutes a time-frequency representation of the signal.
Figure 3.4 shows two overcomplete wavelet packet representations using the symmetric Lemarié filter of p = 1. It is apparent that overcomplete wavelet representations include both frequency and temporal information of the signals. The frequency resolution, or capability to separate different frequency components, of an overcomplete wavelet representation is determined by the V_k(\omega)'s. The narrower the passband of V_k(\omega) is, the higher the frequency resolution. On the other hand, the time resolution is determined by the distribution (or window size) of the impulse responses v_k(n)'s. The shorter the distribution is, the higher the time resolution it can achieve. Unfortunately, high resolution in the time and frequency domains are conflicting goals, as demonstrated in Figure 3.5. In this example, we see that as the passband narrowed, the impulse responses spread and consequently the boundary between
Figure 3.4: Two Examples of the Overcomplete Wavelet Representation.
Legend:
(a) An ideal square waveform;
(b) A linear chirp signal s(n) = 5\cos(0.02\pi n + 0.001\pi n^2).
In each image, the top half is the signal waveform, and the bottom half is the (0-255)-scaled overcomplete wavelet representation (coefficients) of the 64 leaf nodes of level six. For the bottom half, the horizontal axis is time and the vertical axis is frequency.
two impulses blurred. This phenomenon reflects a fundamental principle called the uncertainty principle [21, 51].
Time and frequency localization of a discrete lowpass function f(n) can be mathematically described by two quantities:

\sigma_n^2 = \frac{1}{E} \sum_n (n - \bar{n})^2 |f(n)|^2,

\sigma_\omega^2 = \frac{1}{2\pi E} \int_{-\pi}^{\pi} \omega^2 |F(e^{j\omega})|^2\, d\omega,

where E = \sum_n |f(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(e^{j\omega})|^2\, d\omega and \bar{n} = \frac{1}{E} \sum_n n\, |f(n)|^2.

The product of \sigma_n^2 and \sigma_\omega^2 is defined as the uncertainty factor:

U = \sigma_n^2\, \sigma_\omega^2. \qquad (3.7)

Liu and Akansu [40] proved that U \ge [E - |F(e^{j\pi})|^2]^2 / (4E^2), and thus for filters with F(e^{j\pi}) = 0, U \ge 0.25. That the uncertainty factor has a lower bound is significant and is called the uncertainty principle. It basically states that we cannot have filters with arbitrarily narrow bandwidth in the frequency domain and arbitrarily short duration in the time domain. As a result, we cannot achieve arbitrarily precise time and frequency localization simultaneously. (Numerical computation of uncertainty factors for filter banks is discussed in Appendix C.)
Figure 3.6 shows examples of analyzing filters of the overcomplete wavelet transform and their uncertainty factors. The uncertainty factors of the analyzing filters of the dyadic wavelet show that they are indeed very close to Gaussian functions. In general, the results show that the uncertainty factor varies from channel to channel, and level to level. As a statistical measure, we used the maximum, minimum and mean to describe higher levels. Such statistics for higher-level overcomplete wavelet packets are presented in Table 3.1, where Regular stands for the regular Lemarié filters of (3.4) and Autocorrelation stands for autocorrelation functions of the Lemarié filters.
We may catch some trends from the data on Lemarié-filter-spanned analyzing filters of overcomplete wavelet packet representations:
Figure 3.5: Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations. Where:
Row 1: A signal consisting of two pure-tone impulses with frequencies \omega_1 = 0.3\pi and \omega_2 = 0.32\pi: the spectrum (Col 1) and the waveform (Col 3).
Rows 2-5: level-2 representation.
Rows 6-13: level-3 representation.
For rows 2-13: column 1, frequency responses \{N_{l,k}\, |\, 0 \le k < 2^l\}; column 2, impulse responses of column 1 (dotted lines approximate envelopes); column 3, overcomplete wavelet representation. The prototype filter used was the Lemarié filter of p = 1.
Figure 3.6: Uncertainty Factors (drawn on top of each band) of Channel Filters. Where:
Row 1: Overcomplete wavelet packet of level 2 using the Lemarié filter of p = 1 (Col 1) and p = 9 (Col 2);
Row 2: Overcomplete wavelet packet of level 3 using the Lemarié filter of p = 1 (Col 1) and p = 9 (Col 2);
Row 3: Channel filters \{C_{4,0}, C_{4,1}, C_{3,1}, C_{2,1}, C_{1,1}\} of the dyadic wavelet using the filter class of (3.5) with n = 1 (Col 1) and p = 1, n = 1 (Col 2).
Table 3.1: Uncertainty Factors of Analyzing Filters

               Regular                     Autocorrelation
p  level   Umax     Umin    Umean      Umax    Umin    Umean
1    4     1.618    0.350   0.701      0.853   0.367   0.497
1    5     5.581    0.342   1.705      2.053   0.318   0.579
1    6    16.721    0.318   4.535      5.693   0.262   0.980
2    4     1.011    0.437   0.628      1.178   0.470   0.584
2    5     3.464    0.397   0.954      0.871   0.333   0.432
2    6     7.984    0.320   1.809      2.736   0.260   0.475
3    4     1.283    0.529   0.674      1.349   0.496   0.646
3    5     2.386    0.418   0.704      0.721   0.329   0.398
3    6     5.244    0.318   1.005      1.605   0.261   0.346
1) The joint time-frequency localization of those using low-order filters tends to deteriorate quickly as the level increases. This is due to the existence of small sidelobes.
2) Using filters of autocorrelation functions is more effective than using higher-order regular filters, if the no-reconstruction structure is acceptable for the application.
However, one should be very careful in using uncertainty factors as an optimality measure, for two main reasons. First, how can we come up with a single sensible number for a filter bank? Minimum U_{max}, minimum U_{mean}, or something else? Second, and more fundamentally, the uncertainty factor does not take the signal into account. For configurations with similar uncertainty factors, as in the case of Gaussian-like filters such as (3.5), the measure of the uncertainty factor alone would not tell us which one is optimal.
3.5 Twodimensional Extensions
So far we have only discussed one-dimensional representations. In order to apply overcomplete wavelet transforms to two-dimensional images, we need to extend the transform to two dimensions. The following two separable extensions can be implemented as separate 1D filtering of rows and columns.
1) Tensor product extension [42]. This extension was originally applied to the orthogonal wavelet transform. Only the 1D filters H(\omega) and G(\omega) are used. It directly exploits the perfect reconstruction property of quadrature mirror filters by extending

H(\omega)\tilde{H}(\omega) + G(\omega)\tilde{G}(\omega) = 1

into

[H(\omega_x)\tilde{H}(\omega_x) + G(\omega_x)\tilde{G}(\omega_x)]\, [H(\omega_y)\tilde{H}(\omega_y) + G(\omega_y)\tilde{G}(\omega_y)] = 1.

The above 2D extension can be rewritten as

HH(\omega_x, \omega_y)\widetilde{HH}(\omega_x, \omega_y) + HG(\omega_x, \omega_y)\widetilde{HG}(\omega_x, \omega_y) + GH(\omega_x, \omega_y)\widetilde{GH}(\omega_x, \omega_y) + GG(\omega_x, \omega_y)\widetilde{GG}(\omega_x, \omega_y) = 1,

where

HH(\omega_x, \omega_y) = H(\omega_x)H(\omega_y), \quad \widetilde{HH}(\omega_x, \omega_y) = \tilde{H}(\omega_x)\tilde{H}(\omega_y),
HG(\omega_x, \omega_y) = H(\omega_x)G(\omega_y), \quad \widetilde{HG}(\omega_x, \omega_y) = \tilde{H}(\omega_x)\tilde{G}(\omega_y),
GH(\omega_x, \omega_y) = G(\omega_x)H(\omega_y), \quad \widetilde{GH}(\omega_x, \omega_y) = \tilde{G}(\omega_x)\tilde{H}(\omega_y),
GG(\omega_x, \omega_y) = G(\omega_x)G(\omega_y), \quad \widetilde{GG}(\omega_x, \omega_y) = \tilde{G}(\omega_x)\tilde{G}(\omega_y).

Filters \{HH, HG, GH, GG\} are used for decomposition while filters \{\widetilde{HH}, \widetilde{HG}, \widetilde{GH}, \widetilde{GG}\} are used for reconstruction. The same construction method is applied to the frequency scaling of higher levels, with each node decomposed into four child nodes.
2) Two-orientational extension. For the particular dyadic wavelet filters (3.5) with emphasis on spatial operations (gradient and Laplacian), the component \{GG\} of the tensor product method may not be useful. Therefore, the following 2D extension was proposed by Mallat [44]:

HH(\omega_x, \omega_y) = H(\omega_x)H(\omega_y), \quad \widetilde{HH}(\omega_x, \omega_y) = \tilde{H}(\omega_x)\tilde{H}(\omega_y),
GX(\omega_x, \omega_y) = G(\omega_x), \quad \widetilde{GX}(\omega_x, \omega_y) = \tilde{G}(\omega_x)\, L(\omega_y),
GY(\omega_x, \omega_y) = G(\omega_y), \quad \widetilde{GY}(\omega_x, \omega_y) = L(\omega_x)\, \tilde{G}(\omega_y).

This 2D extension has only three components:
* a horizontal component GX;
* a vertical component GY;
* a lower-resolution component HH.

However, it introduces a new filter L(\omega), which is to be determined by the condition

HH \cdot \widetilde{HH} + GX \cdot \widetilde{GX} + GY \cdot \widetilde{GY} = 1.

It turns out that the formula for L(\omega) is

L(\omega) = \frac{1 + H(\omega)\tilde{H}(\omega)}{2}.

For filters of (3.5), L(\omega) = 0.5\, [1 + \cos^{2(2n+p)}(\omega/2)], which is a symmetric and FIR filter.
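Both extensions can be verified numerically. The sketch below is my own check on a frequency grid, with Haar-like prototypes normalized so that H\tilde{H} + G\tilde{G} = 1: it confirms the four-component tensor-product identity and the three-component identity with L(\omega) = (1 + H(\omega)\tilde{H}(\omega))/2.

```python
import numpy as np

w = np.linspace(0, 2 * np.pi, 64, endpoint=False)
H = 0.5 * (1 + np.exp(-1j * w)); Ht = np.conj(H)
G = 0.5 * (1 - np.exp(-1j * w)); Gt = np.conj(G)
assert np.allclose(H * Ht + G * Gt, 1.0)       # 1D perfect reconstruction

# tensor product: the four separable products sum to one over (wx, wy)
tensor = sum(np.outer(A * At, B * Bt)
             for A, At in ((H, Ht), (G, Gt))
             for B, Bt in ((H, Ht), (G, Gt)))
assert np.allclose(tensor, 1.0)

# two-orientational: HH*HH~ + GX*GX~ + GY*GY~ = 1 with L = (1 + H H~)/2
L = 0.5 * (1.0 + H * Ht)
hh = np.outer(H * Ht, H * Ht)    # HH(wx,wy) * HH~(wx,wy)
gx = np.outer(G * Gt, L)         # G(wx)G~(wx) * L(wy)
gy = np.outer(L, G * Gt)         # L(wx) * G(wy)G~(wy)
assert np.allclose(hh + gx + gy, 1.0)
```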
CHAPTER 4
OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS
4.1 Introduction
In most cases, a suitable signal representation is not the end of a problem-solving process. On the contrary, how to manipulate it to achieve a desired effect is the more challenging part.
For overcomplete wavelet representations, we have identified three operations: gain, shrinkage and envelope operators for our applications in image deblurring and segmentation. We demonstrated that a linear convolution may be approximated by a gain operator, and devised a tree pruning algorithm based on a tolerance of approximation error. We extended the wavelet shrinkage proposed by Donoho and Johnstone [17] to overcomplete wavelet representations for denoising. We built the connection between the power density distribution of an overcomplete wavelet representation and its envelopes, and presented two envelope detection algorithms.
4.2 Gain Operators
In Section 2.4.3, we determined that a convolution operator can be approximated by multiplication by real factors \{\lambda_k\}_{0 \le k \le M-1}. The reconstructed signal x'(n) in the frequency domain is then simply

X'(\omega) = \sum_{k=0}^{M-1} \lambda_k W_k(\omega) U_k(\omega) = \left[ \sum_{k=0}^{M-1} \lambda_k C_k(\omega) \right] X(\omega),

and thus the equivalent filter is

Q(\omega) = [\lambda_0\ \lambda_1\ \cdots\ \lambda_{M-1}]\, [C_0(\omega)\ C_1(\omega)\ \cdots\ C_{M-1}(\omega)]^t,

where [\lambda_0\ \lambda_1\ \cdots\ \lambda_{M-1}] is the gain vector, C_k(\omega) = V_k(\omega)U_k(\omega) are the channel filters, and M^t denotes the matrix transpose of M.
Two fundamental issues immediately arise:
1) What are the characteristics of the set of filters expressible by a gain operator? In other words, what kind of filters may be approximated by a gain operator? A broad answer is that a gain operator can only represent a symmetric filter. This is because the channel filters \{C_k(\omega)\}_{0 \le k \le M-1} are real functions, so any linear combination of them with real coefficients must also be a real function.
2) For a given frequency response F(\omega), how can we determine a tree configuration and a gain vector as the best approximation? Note that this question really asks for a criterion of "best" and an algorithm to find it.
4.2.1 Minimum Mean Square Error (MMSE) Approximation
The closeness of the approximation in terms of frequency response by a wavelet representation with channel frequency responses \{C_k(\omega)\}_{0 \le k \le M-1} may be measured by the mean square error:

\varepsilon = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ F(\omega) - \sum_{k=0}^{M-1} \lambda_k C_k(\omega) \right]^2 d\omega. \qquad (4.1)

It is well known that the optimal gain vector corresponding to the minimum mean square error may be found by setting all partial derivatives \partial\varepsilon / \partial\lambda_q to zero:

\frac{\partial\varepsilon}{\partial\lambda_q} = -\frac{1}{\pi} \int_{-\pi}^{\pi} \left[ F(\omega) - \sum_{k=0}^{M-1} \lambda_k C_k(\omega) \right] C_q(\omega)\, d\omega = 0, \quad 0 \le q \le M-1,

which is a set of linear equations:

\sum_{k=0}^{M-1} \left[ \int_{-\pi}^{\pi} C_k(\omega) C_q(\omega)\, d\omega \right] \lambda_k = \int_{-\pi}^{\pi} F(\omega) C_q(\omega)\, d\omega, \quad 0 \le q \le M-1. \qquad (4.2)

The minimum mean square error can be derived as

\varepsilon_{min} = \frac{1}{2\pi} \int_{-\pi}^{\pi} F^2(\omega)\, d\omega - \sum_{k=0}^{M-1} \lambda_k^o\, \frac{1}{2\pi} \int_{-\pi}^{\pi} F(\omega) C_k(\omega)\, d\omega,

where \{\lambda_k^o\}_{0 \le k \le M-1} is the optimal gain vector.
We may define a normalized minimum mean square error (NMMSE) as
c min
Ef'
where Ef  2i f F2(w)dw is the energy of the given frequency response. Since 6rin > 0, it must )e true that Ef > F>l'l' AO fF (w) C k () dw,
_ = k 27 !r ( ) (,) ,
and therefore 0 < K < 1 .
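As a concrete illustration, the normal equations (4.2) can be solved numerically on a sampled frequency grid. The following is a minimal numpy sketch, not the dissertation's implementation; the function name `optimal_gain_vector` and the brick-wall toy channels in the check are ours:

```python
import numpy as np

def optimal_gain_vector(F, C):
    """Solve a discrete stand-in for the normal equations (4.2).

    F : (W,) array, target frequency response sampled on a uniform grid.
    C : (M, W) array, channel frequency responses on the same grid.
    Returns the gain vector A (M,) and the normalized error (NMMSE).
    """
    G = C @ C.T                    # Gram matrix: sums replacing the integrals
    b = C @ F                      # right-hand side: sum_w F(w) C_q(w)
    A = np.linalg.solve(G, b)      # optimal gains
    resid = F - A @ C              # approximation residual
    nmmse = np.sum(resid ** 2) / np.sum(F ** 2)
    return A, nmmse

# Toy check: two ideal non-overlapping half-band channels reproduce
# any response that is constant on each band exactly.
w = np.linspace(0, np.pi, 512)
C = np.vstack([(w < np.pi / 2).astype(float), (w >= np.pi / 2).astype(float)])
F = 3.0 * C[0] + 0.5 * C[1]
A, nmmse = optimal_gain_vector(F, C)
```

With overlapping, non-ideal channel responses the Gram matrix is no longer diagonal, which is exactly why (4.2) must be solved as a coupled linear system.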
The criterion of mean square error provides a way to determine the optimal gain vector for a given frequency response F(ω) and a given configuration. However, the criterion alone suggests that the bottommost nodes are the best representation, for which ε̄ may reach its minimum among all possible configurations. This can be formally proved by the following proposition.

Proposition 4.2.1 Splitting a channel into two will not increase the error measure ε̄.
Proof: By contradiction. Assume that for the current configuration {C_k(ω) | 0 ≤ k ≤ M−1}, the optimal gain vector is {A_k^o} and the NMMSE is ε̄. We further assume that the channel q is split into two channels with indexes q_1 and q_2, with the optimal gain vector {A'_k^o} for the new configuration and the new NMMSE ε̄' > ε̄.

Without loss of generality, we can write:

C_{q1}(ω) = C_q(ω) H̃(2^L ω) H(2^L ω),   (4.3)
C_{q2}(ω) = C_q(ω) G̃(2^L ω) G(2^L ω).

Since the prototype filters satisfy H̃(ω)H(ω) + G̃(ω)G(ω) = 1, we have C_{q1}(ω) + C_{q2}(ω) = C_q(ω). For the new configuration, we can find a gain vector {A'_k = A_k^o, k ≠ q_1, k ≠ q_2; A'_{q1} = A'_{q2} = A_q^o} whose error equals ε̄. The optimal gain vector for the new configuration can do no worse, which contradicts the assumption that ε̄' > ε̄; therefore ε̄' ≤ ε̄.
Examples showing that higher levels correspond to better approximations can be visualized in Figure 4.1. Overcomplete wavelet packet representations with leaf nodes at levels 4, 5, 6, 7 were selected, corresponding to 16, 32, 64, 128 channels, respectively. The Lemarié filter of p = 3 was used.
4.2.2 Timefrequency Tradeoff and a Greedy Algorithm
Proposition 4.2.1 clearly states that the approximation error ε̄ of the frequency response decreases as the channel bandwidth decreases, as demonstrated by the examples in Figure 4.1. Therefore, by this criterion alone, the Fourier transform would be the best representation, since its approximation error is zero. This is certainly not the answer we are looking for.
What we have lost sight of is the other part of the picture: the time domain. As dictated by the uncertainty principle, time resolution decreases as frequency resolution increases. An overcomplete wavelet packet representation using the deepest leaves would be close to a Fourier representation and would thus lose its resolution in the time domain. Moreover, from a computational point of view, such a down-to-the-bottom decomposition is rather inefficient.
Figure 4.1: Examples of Gain Vector Approximation (frequency range [0, π] shown).
Where:
Row 1: a given frequency response F(e^{jω});
Rows 2 to 5: approximated frequency responses by overcomplete wavelet packet representations of levels 4, 5, 6, 7, overlaid with F(e^{jω}) (dotted curve), with ε̄ = 39.4%, 18.5%, 4.5%, 0.02%, respectively.
One possible tradeoff between time and frequency can be accomplished by setting a tolerance on the approximation error ε̄. Under an error tolerance, the goal could be to find a tree configuration with the minimum number of leaves. In the following, we present a greedy algorithm in the order of breadth-first search [45]. Notice that a tree node here represents a channel filter C_{l,m}(ω), and the relationship between children nodes and their parent node is characterized by (4.3).

Algorithm 4.2.2 (Minimum Tree) Given a frequency response F(ω) and an error tolerance ε_0, let the current leaves be L = {C_{0,0}(ω) = 1} and compute the current error ε_c using the set L.

Step 1. One by one, compute the error ε_k obtained by temporarily substituting a node in the current leaves L with its two children. Choose the pair with minimum ε_k to replace their parent in the set L and update the current error ε_c = min{ε_k}.

Step 2. If ε_c ≤ ε_0, stop and use the leaves L as the representation; else, go to Step 1.
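The greedy search above can be sketched as follows. This is a simplified illustration, not the dissertation's implementation: it assumes ideal brick-wall channel filters, so that each leaf is a frequency band, splitting a leaf halves its band, and the optimal per-band gain is simply the mean of F over the band. The function name and grid size are our choices:

```python
import numpy as np

def greedy_minimum_tree(F, eps0, max_depth=7):
    """Greedy minimum-tree pruning under an error tolerance eps0, sketched
    with ideal brick-wall channels on a length-W frequency grid F."""
    W = len(F)

    def error(leaves):
        # For disjoint ideal bands, the best gain on each band is the band
        # mean of F; the NMMSE is the total residual energy over ||F||^2.
        resid = sum(np.sum((F[lo:hi] - F[lo:hi].mean()) ** 2)
                    for lo, hi in leaves)
        return resid / np.sum(F ** 2)

    leaves = [(0, W)]                      # root channel covers everything
    err = error(leaves)
    while err > eps0:
        best = None
        for i, (lo, hi) in enumerate(leaves):
            # skip leaves that cannot be split further
            if hi - lo < 2 or np.log2(W // (hi - lo)) >= max_depth:
                continue
            mid = (lo + hi) // 2
            trial = leaves[:i] + [(lo, mid), (mid, hi)] + leaves[i + 1:]
            e = error(trial)
            if best is None or e < best[0]:
                best = (e, trial)          # keep the most helpful split
        if best is None:
            break                          # nothing left to split
        err, leaves = best
    return leaves, err
```

A piecewise-constant target response, for instance, is matched by splitting only at its discontinuity, which is the behavior the minimum-tree criterion rewards.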
Examples of overcomplete wavelet packet approximation using the minimum tree algorithm are shown in Figure 4.2. The improvement can be appreciated by comparing with Figure 4.1. In Figure 4.1, the representation using thirty-two channels at level 5 has approximation error ε̄ = 18.5%. However, for the same frequency response, the minimum tree algorithm was able to find a representation of only twenty-six channels and yet reduced the approximation error to ε̄ = 6%. In the other case, the minimum tree algorithm found a representation with merely seven channels and achieved an approximation error of 0.6%.
4.3 Shrinking Operators
Wavelet shrinkage was introduced by Donoho and Johnstone [17] for signal denoising and linear inverse problems. For a thorough introduction, refer to [15]. The description of the methodology was given in [15, page 173]; we quote: "Wavelet
Figure 4.2: Examples of the Minimum Tree Approximation with the Maximum Depth of Seven Levels.
Where:
Col 1: a given frequency response F_1(e^{jω}) (row 1, same as in Figure 4.1), the approximation result overlaid with F_1 (row 2) with ε̄ = 6%, and the selected 26 channels (row 3);
Col 2: a given frequency response F_2(e^{jω}) (row 1), the approximation result overlaid with F_2 (row 2) with ε̄ = 0.6%, and the selected 7 channels (row 3).
The overcomplete wavelet packet with Lemarié filter H_1(e^{jω}) was used in both cases, and the frequency range [0, π] is shown.
shrinkage refers to reconstructions obtained by wavelet transformation, followed by shrinking the empirical wavelet coefficients towards zero, followed by inverse transformation."
We extended the idea to overcomplete wavelet representations on the grounds that they differ from orthogonal wavelet representations mainly in having extra correlated coefficients.
A shrinking operator KIt can be written as
"aw  t, ?1 > t,
Ict [TV] = 0, 1 U <_ t, (4.4) IV + t, It <t
where t > 0 is a threshold. Notice that the shrinking operator is a nonlinear and memoryless operator, and is supposed to work in spatial domain.
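The operator (4.4) is elementwise soft thresholding; a one-line numpy sketch (the function name is ours):

```python
import numpy as np

def shrink(w, t):
    """Soft-thresholding shrinking operator of (4.4), applied elementwise:
    coefficients with |w| <= t are zeroed, the rest are pulled toward zero
    by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)
```

For example, `shrink(np.array([-3.0, -0.5, 0.2, 2.0]), 1.0)` yields `[-2.0, 0.0, 0.0, 1.0]`.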
The goal of a shrinking operation is to cut off the noise while keeping useful signals. Obviously, this is possible only if the amplitude of the desirable components is stronger than that of the noise components. Therefore, the threshold value t is dependent on the signal-to-noise ratio. Examples of choosing appropriate thresholds are discussed in Chapter 6.
To demonstrate the shrinking operation and the advantage of wavelet denoising, an example is included in Figure 4.3. The major advantage of wavelet shrinkage denoising, as compared with linear smoothing, is that it diminishes noise but does not smear sharp edges, as seen in the example. A comparison with a linear averaging filter with h = [1, 1, 1, 1, 1, 1, 1]/7 is presented in Figure 4.4.
4.4 Envelope Detectors
Energy distribution is a very important concept in the physical world. It is an indispensable tool for signal analysis, too.
For a harmonic current i(t) = A sin(ω_0 t + α) passing through a one-ohm pure resistance, the instantaneous power (energy per unit time) at time t is i²(t). The
Figure 4.3: An Example of Overcomplete Wavelet Denoising.
Where:
Col 1: a perfect impulse contaminated by white Gaussian noise (row 1) and its level-2 overcomplete wavelet packet representation (rows 2-5);
Col 2: the overcomplete wavelet packet representation after shrinking operations (rows 2-5) and the reconstructed signal (row 1).
Figure 4.4: Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row 1, col 1) of Figure 4.3.
Where:
Row 1: overcomplete wavelet shrinkage (left, the same as row 1, col 2 of Figure 4.3) and linear smoothing (right);
Row 2: local zoom-in with overlay of the original noisy signal.
average normalized power over a period T is thus

P = (1/T) ∫_0^T i²(t) dt = A²/2.

Therefore, the average normalized power of a harmonic signal is proportional to its amplitude squared.
For an arbitrary waveform i(t), one can decompose it into harmonic components using Fourier analysis:

i(t) = (1/2π) ∫_{−∞}^{∞} I(ω) e^{jωt} dω.

By the same token, the energy density (per unit time and per unit frequency) at frequency ω is proportional to the amplitude squared of the component I(ω)e^{jωt}, which is |I(ω)|². The function |I(ω)|² describes the power distribution in the frequency domain and is called the power density spectrum [49, page 66].
The above concepts can be extended to overcomplete wavelet representations by extending the concept of amplitude to the "envelope" [26, chapter 4], or "local amplitude." An overcomplete wavelet representation consists of bandpass components. Since any bandpass waveform may be represented by A(t) sin(ω_c t + α(t)) [26, page 229], where A(t) is the envelope and ω_c is the associated carrier frequency, an overcomplete wavelet representation {w_k(t) | 0 ≤ k ≤ M−1} has a corresponding set of envelopes {A(ω_k, t)}.
Thus, the squared envelopes {A²(ω_k, t)} shall describe the power density distribution of the overcomplete wavelet representation in the time-frequency plane (also called "phase space" in [12]). Figure 4.5 shows examples of such a distribution.
For time-frequency representations using complex Gabor filters [6], envelopes may be extracted by simply performing a modulus operation on the two quadrature components. However, for representations with real analyzing filters, such as overcomplete wavelet representations, more sophisticated envelope detection algorithms are needed. We next present two envelope detection algorithms and investigate their performance.
4.4.1 Envelope Detection by Hilbert Transform
The envelope of a narrowband bandpass signal can be computed via a corresponding analytic signal [51]. For a signal x(t), the analytic signal is defined by

x̂(t) = x(t) + j x̃(t),

where x̃(t) is the Hilbert transform of x(t),

x̃(t) = (1/π) ∫_{−∞}^{∞} x(τ) / (t − τ) dτ.

The envelope of the original signal x(t) is then simply the modulus of the analytic signal: A(t) = |x̂(t)|.

The frequency characteristics of the Hilbert transform can be expressed by

H(ω) = { −j,  ω ≥ 0;
          j,  otherwise. }
Figure 4.5: Two Examples of Power Density Distribution of Overcomplete Wavelet Packet Representations.
Legend:
(a) a linear chirp signal, the same as in Figure 3.4(b);
(b) a multisegment pure tone signal with frequencies ω_0 = 0.08π, ω_1 = 0.25π, ω_2 = 0.10π and ω_3 = 0.55π, in the order of occurrence.
The overcomplete wavelet representation and visual arrangements are the same as in Figure 3.4.
Figure 4.6: An FIR Hilbert Transformer.
Legend:
(a) the imaginary part of the frequency response (the real part is zero);
(b) the nonzero coefficients h(n), antisymmetric in n, with magnitudes
|h(±1)| = 6.286127232e-1, |h(±3)| = 1.889891117e-1, |h(±5)| = 9.117551207e-2, |h(±7)| = 4.525349255e-2, |h(±9)| = 1.946367728e-2.
Therefore, the Fourier transform of the analytic signal x̂(t) is

X̂(ω) = { 2X(ω),  ω ≥ 0;
          0,      otherwise. }
For implementations in the discrete-time domain, approximate FIR Hilbert transformers may be designed by windowing the ideal frequency response [49]. Figure 4.6 shows an example: a type-III FIR Hilbert transformer designed with parameters M = 18 and β = 2.629 [49, page 680]. This Hilbert transformer is antisymmetric, of length 19, and has only 10 nonzero coefficients.
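As an illustration, envelope detection with such an FIR Hilbert transformer might be sketched as below. The tap magnitudes are read off Figure 4.6 (their last digits are uncertain in this copy), and the function name is ours:

```python
import numpy as np

# Nonzero tap magnitudes of the 19-tap type-III FIR Hilbert transformer of
# Figure 4.6 (values transcribed from the figure; last digits uncertain).
_TAPS = {1: 6.286127232e-1, 3: 1.889891117e-1, 5: 9.117551207e-2,
         7: 4.525349255e-2, 9: 1.946367728e-2}

def hilbert_envelope(x):
    """Envelope |x + j*x_h| where x_h approximates the Hilbert transform
    of x via the antisymmetric 19-tap FIR filter."""
    h = np.zeros(19)
    for n, c in _TAPS.items():
        h[9 + n] = c          # antisymmetric around the center tap:
        h[9 - n] = -c         # h(-n) = -h(n), h(even n) = 0
    xh = np.convolve(x, h, mode="same")   # approximate Hilbert transform
    return np.hypot(x, xh)                # modulus of the analytic signal
```

On a mid-band pure tone the detected envelope stays close to the true amplitude, while near the band edges (the low-frequency weakness noted later in this section) the 19-tap approximation degrades.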
4.4.2 Envelope Detection by Zero Crossings
In this method, the maximum absolute value between two adjacent zero crossings is first found and then assigned to each point within the interval.
Algorithm 4.4.1 (Envelope by Zero Crossings) For a given array x(1:N), start from the index i = 1. Find the next index k which is either a zero-crossing point, or a zero-valued point x(k) = 0, or the other end point k = N, whichever is first encountered. Assign the maximum absolute value A_{i,k} = max{|x(n)| : i ≤ n ≤ k} to every point in the interval [i, k]. Set i = k and repeat until the end of the array is reached.
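A direct sketch of this zero-crossing detector in Python (the function name is ours; a 1D float array is assumed):

```python
import numpy as np

def zero_crossing_envelope(x):
    """Between consecutive zero crossings (or exact zeros, or the array
    ends), every sample receives the maximum absolute value observed in
    that interval."""
    x = np.asarray(x, dtype=float)
    env = np.empty_like(x)
    i = 0
    for k in range(1, len(x)):
        # a boundary occurs at an exact zero or when the sign changes
        if x[k] == 0.0 or x[k - 1] * x[k] < 0:
            env[i:k] = np.max(np.abs(x[i:k]))
            i = k
    env[i:] = np.max(np.abs(x[i:]))       # close out the final interval
    return env
```

For example, `zero_crossing_envelope([1, 2, -1, -3, 2])` gives `[2, 2, 3, 3, 2]`: each interval between sign changes is replaced by its peak magnitude.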
4.4.3 Comparison Between the Two Detectors
We compared the two envelope detectors by their working frequency range and robustness under noise perturbations. For pure-tone impulse signals, both did well in the middle range (ω ∈ [0.1π, 0.7π]). While the Hilbert transform was unacceptable at the low end (ω < 0.1π), the zero-crossing approach did poorly at the high end (ω > 0.7π). The test cases are shown in rows 1 and 2 of Figure 4.7. We constructed a noisy impulse by introducing a random frequency perturbation, s(n) = sin((ω_0 + 0.05·RANDN(n))n), where RANDN (a MATLAB function) is random noise with normal distribution, and filtered it with a Lemarié filter H_1(ω). This case is included in row 3 of Figure 4.7. We found that the zero-crossing method exhibited edge-preserving smoothing characteristics and was more robust to wideband noise.
4.4.4 Comparison with Other Energy Operators
A useful and simple energy operator was analyzed in [46]. The operator Ψ for a discrete signal x(n) is given as

Ψ[x(n)] = x²(n) − x(n + 1) x(n − 1).

However, this operator differs from envelope detectors in that its value on a pure tone signal is not only proportional to the amplitude squared but is also a function of frequency [46]:

Ψ[A sin(ω_0 n + α)] = A² sin²(ω_0).
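For comparison, this energy operator is a three-sample sliding computation; a sketch (our naming), whose output on a pure tone is the constant A² sin²(ω_0) as stated above:

```python
import numpy as np

def energy_operator(x):
    """Discrete energy operator of [46]: Psi[x](n) = x(n)^2 - x(n+1) x(n-1),
    evaluated on the interior samples of x."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[2:] * x[:-2]
```

The identity sin(a+ω) sin(a−ω) = sin²a − sin²ω makes the pure-tone result exact, which exposes the frequency dependence: a tone of amplitude 2 at ω_0 = 0.3 yields the constant 4 sin²(0.3), not 4.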
4.4.5 Two Dimensional Envelope Detection
Next we extended the 1D envelope detection algorithms to the analysis of two-dimensional image signals. In the frequency domain, a two-dimensional analytic
Figure 4.7: Comparison of the Two Envelope Detectors.
Where:
Col 1: input signals (top row, ω_0 = 0.1π, ω_1 = 0.5π; center row, ω_0 = 0.05π, ω_1 = 0.75π; bottom row, noisy signal with ω_0 = 0.1π);
Col 2: envelopes detected via the 19-tap Hilbert transformer;
Col 3: envelopes detected via the zero-crossing method.
signal may be obtained by setting an appropriate half plane to zero, based on its orientation. That is, for a 2D signal f(x, y), the Fourier transform of an analytic signal f̂(x, y) is either

F̂_x(ω_x, ω_y) = { 2F(ω_x, ω_y),  ω_x ≥ 0;
                   0,             otherwise, }

or

F̂_y(ω_x, ω_y) = { 2F(ω_x, ω_y),  ω_y ≥ 0;
                   0,             otherwise. }

For the 2D filters constructed by the tensor product method with halfband filters, the equivalent complex quadrature filters exhibit the frequency response shown in Figure 4.8. Notice that the lowpass and diagonal components may have an alternate arrangement by zeroing out the left or bottom half plane.
This separable property allows one to compute the envelope of a 2D signal using the 1D algorithms described earlier in a straightforward manner, as described in the following.
Figure 4.8: Frequency Response of Equivalent Complex Quadrature Filters (level 1).
Legend:
(a) lowpass channel;
(b) vertical channel;
(c) horizontal channel;
(d) diagonal channel.
The diagonal shadowed areas identify zeroed half planes.
Algorithm 4.4.2 (2D Envelope Detection) For a given 2D array u(m, n) and its orientation: if the orientation is horizontal, apply the 1D envelope detector columnwise; if the orientation is vertical, apply the 1D envelope detector rowwise; otherwise, apply the 1D envelope detector columnwise.
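Algorithm 4.4.2 reduces to dispatching a 1D detector along rows or columns; a sketch (our naming, with the 1D detector passed in as a function):

```python
import numpy as np

def envelope_2d(u, orientation, detect_1d):
    """Apply a 1D envelope detector along the direction dictated by the
    channel orientation: rowwise for vertical channels, columnwise for
    horizontal, diagonal, and omnidirectional channels."""
    u = np.asarray(u, dtype=float)
    if orientation == "vertical":
        return np.apply_along_axis(detect_1d, 1, u)   # along each row
    return np.apply_along_axis(detect_1d, 0, u)       # along each column
```

With a toy peak-hold detector `lambda x: np.full_like(x, np.max(np.abs(x)))`, a vertical channel is reduced row by row and a horizontal channel column by column, matching the dispatch rule above.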
CHAPTER 5
APPLICATION I: TEXTURE SEGMENTATION
5.1 Introduction
In this chapter, we shall apply overcomplete wavelet representations to the problem of segmentation of textured images.
However, it seems that we are back at the beginning. It may sound odd, but it is true that there is not even a precise definition of "texture," let alone of "texture segmentation" [61]. In spite of the difficulty, there is a general consensus that texture is a property of area. We felt that the concept of texture is parallel to the concept of "local frequency" for a nonstationary signal. The concept of local frequency is easy to understand: it describes the rate of change occurring in a window around a particular time. Yet there is no unique definition of local frequency. Therefore, our treatment of texture segmentation is to embed it into the paradigm of time-frequency analysis.
There have been many research works on texture segmentation using the concept of spatial-frequency representation. The analyzing functions used by researchers include:

* Laws microtexture masks (filters) [24]
* Complex prolate spheroidal sequences [64]
* Complex Gabor functions [6, 19]
* Real Gabor functions [30]
* The Wigner distribution [54]
More recently, wavelet and wavelet packet representations have been added to the list [34, 35, 36. 62, 9, 37].
A segmentation process usually consists of two distinct phases: feature extraction and clustering. Features for texture representation are of crucial importance for accomplishing segmentation [27]. In this chapter, we demonstrated that overcomplete wavelet packet representations and envelope detection make a good feature extraction scheme on a variety of both natural and synthetic textured images.
5.2 Texture Feature Extraction
The texture features we chose are the envelopes of overcomplete wavelet representations. The interpretation of the features is that their squares correspond to the spatial-frequency power density distribution, as pointed out in the last chapter. The feature extraction thus consists of two stages: 1) overcomplete wavelet packet decomposition and 2) envelope detection.
For 2D envelope detection on overcomplete wavelet packet representations, we classified each node in the decomposition tree into four possible categories, taking into account orientation, as follows:

* The root node is omnidirectional.
* A node last filtered by G_1(ω_x)H_1(ω_y) corresponds to vertical orientation. (Highpass filter G_1 is applied rowwise and lowpass filter H_1 columnwise.)
* A node last filtered by H_1(ω_x)G_1(ω_y) corresponds to horizontal orientation. (Lowpass filter H_1 is applied rowwise and highpass filter G_1 columnwise.)
* A node last filtered by G_1(ω_x)G_1(ω_y) corresponds to diagonal orientation. (Highpass filter G_1 is applied both rowwise and columnwise.)
* A node last filtered by H_1(ω_x)H_1(ω_y) has the same orientation as its parent. (Lowpass filter H_1 is applied both rowwise and columnwise.)
At the end of feature extraction, each sample of a signal has an attached feature vector. In our case, such a feature vector consists of the envelopes of the overcomplete wavelet packet representation. It is possible to apply a monotonic function to the feature space to transform it into another feature space. For example, the square function x² will transform the envelope space into a space of power density distribution, and the log function log(x) will transform the envelope space into a space of log spectra.
5.3 Considerations on Filter Selection
We raised the issue of filter selection in Chapter 3 on overcomplete wavelet representations. For the application of segmentation, we argued that symmetry, frequency response, and boundary accuracy are important factors in the selection of filters for feature extraction. In the following, we discuss these constraints in terms of overall performance.
" Symmetry. For accomplishing texture segmentation, accuracy in texture
boundary detection is crucial for reliable performance. In this application, filters with symmetry or antisymmuetry are clearly favored. Such filters have a linear phase response. where the delay (shift) is predictable. Alternatively.
filters with nonlinear phase may introduce complex distortion. Moreover. symmetric or antisymmetric filters are also advantageous in alleviating boundary
effects through simple methods of nirror extension (see Appendix D).
* Optimal Frequency Response. In order to derive an ideal filter frequency
response for the chosen feature, we considered a twoband filter bank with input signals of infinite length consisting of two segments with distinct pure tones.
The input signals can be written as:
( 1 A1 cos(w, dn + a,) , n < 0,
A2 cos(wU2n1 + a2) ,I > 0,
where .41 >0 and .42 >0. Except for the boundary (n =0). we derived the
feature vectors (envelopes of channel outputs) as.,
T= (eICG) = 2eft (A, IH(L,)I) A, IG(Lm)j) , n < 0, Tight (A2 IH(Lw2) ,A2 IG(W2)1) ,, > 0.
The angle θ between vectors T_left and T_right satisfies

cos(θ) = ( |H(ω_1)H(ω_2)| + |G(ω_1)G(ω_2)| ) / √[ (H²(ω_1) + G²(ω_1)) (H²(ω_2) + G²(ω_2)) ],

and is bounded by 0 ≤ θ ≤ π/2. The distance between the two classes in the feature space is then

D = √( ‖T_left‖² + ‖T_right‖² − 2 ‖T_left‖ ‖T_right‖ cos(θ) ).

For θ = π/2, the vectors T_left and T_right are orthogonal, and the distance D reaches its maximum. Clearly, the maximum distance in feature space between any two classes is optimal for segmentation and classification applications, in the sense that the classes are better separated and more robust to noise perturbations.
Notice that for H(ω_1) ≠ 0 and G(ω_2) ≠ 0, cos(θ) = 0 if and only if G(ω_1) = 0 and H(ω_2) = 0. This means that optimal filter banks should have no overlap in the frequency domain, and thus a filter H(ω) with optimal frequency response must be a perfect halfband filter. Indeed, [50] derived a similar optimal filter for image coding applications.
Of course, such ideal filters cannot be realized in practice. Based on the preceding discussion, filter bands overlapping in the frequency domain tend to reduce class distance in feature space. Therefore, a filter H(ω) with large stopband attenuation and a flat passband frequency response is desirable.
* Spatial Localization. There are two types of boundaries for signals of finite length: 1) boundaries of regions exhibiting distinct characteristics, and 2) the physical boundaries of a data segment. Feature vectors close to the boundaries will be affected. The size of the affected region depends on the length of the channel filters and the distribution of filter coefficients. Therefore, filters of short length and fast decay shall be better suited for boundary detection.
Unfortunately, the preceding three criteria cannot be satisfied simultaneously. As previously pointed out, quadrature mirror filters (QMF) with compact support cannot be symmetric or antisymmetric. This means that a symmetry constraint on any QMF will be in conflict with spatial localization goals. Moreover, large attenuation in the stopband requires a longer filter, which in turn degrades the filter's spatial localization. Again, we face the limitation of the uncertainty principle, and the previous discussion about the uncertainty factor of a filter bank is applicable here.
In this study, we chose the Lemarié filters for their symmetry and good frequency characteristics.
5.4 The Basic Isodata Clustering Algorithm
Segmentation algorithms accept a set of features as input and output a consistent labeling for each pixel. Basically, this is a multidimensional data clustering problem with no general algorithm available [23]. Clustering algorithms that have been previously used for texture segmentation can be divided into two categories: 1) supervised segmentation and 2) unsupervised segmentation.
In practice, unsupervised segmentation is often desirable and easy to validate. It is particularly useful in those cases where training samples are difficult to prepare (making supervised segmentation infeasible). Thus, we used the Basic Isodata clustering algorithm [18, page 201].
Algorithm 5.4.1 (Basic Isodata) Given a 2D image array x of structures containing feature vectors and label fields, and the number of classes N_c:

Step 1. Scan the representation matrix x in raster order. For every pixel encountered, randomly pick a label from the set {0, ..., N_c − 1} and assign it to the pixel.

Step 2. Compute the class centers {C_k | 0 ≤ k ≤ N_c − 1} as the mean feature vector of each class.

Step 3. Rescan the whole representation matrix x, and assign pixel (i, j) to class k if the Euclidean distance between the feature vector of the pixel and the class center C_k is the smallest.

Step 4. If no pixel changed its class in Step 3, stop; else, go to Step 2.
This algorithm differs from a K-means algorithm [29] in only one respect: Basic Isodata updates the class centers after a complete scan of the input feature set, while K-means updates after every reassignment. In our experiments, Basic Isodata outperformed K-means in almost all cases.
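A compact sketch of the Basic Isodata steps on an (H, W, d) feature array follows; the naming is ours, and the re-seeding of an empty class is our addition to keep the sketch robust:

```python
import numpy as np

def basic_isodata(features, n_classes, rng=None, max_iter=100):
    """Basic Isodata clustering: random initial labels, then alternate
    full-scan center updates (Step 2) and nearest-center reassignment
    (Step 3) until no label changes (Step 4)."""
    rng = np.random.default_rng(rng)
    H, W, d = features.shape
    X = features.reshape(-1, d)
    labels = rng.integers(0, n_classes, size=H * W)       # Step 1
    for _ in range(max_iter):
        # Step 2: class centers as the mean feature vector of each class
        centers = np.vstack([
            X[labels == k].mean(axis=0) if np.any(labels == k)
            else X[rng.integers(0, len(X))]               # re-seed empty class
            for k in range(n_classes)])
        # Step 3: assign every pixel to its nearest center (Euclidean)
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):            # Step 4
            break
        labels = new_labels
    return labels.reshape(H, W)
```

On two well-separated constant regions the iteration converges to a perfect two-class split regardless of the random initialization.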
5.5 Postprocessing
The Basic Isodata algorithm labels each pixel independently and does not take into account the high correlation between neighboring pixels. A postprocessing stage, such as relaxation labeling [25], can be used to incorporate a neighborhood constraint into the segmentation process.
For simplicity, we used median filtering as our postprocessing to simulate the benefit of a local constraint. In particular, a 5 × 5 median filter was repeatedly applied to an initial segmentation until no change of labeling occurred.
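This postprocessing can be sketched as repeated median filtering until the labeling is stable. The sketch below uses plain Python loops with edge padding; the naming is ours, and the cap on the number of passes is our safeguard, not mentioned in the text:

```python
import numpy as np

def median_postprocess(labels, size=5, max_passes=100):
    """Repeatedly apply a size x size median filter to an integer label
    image until no label changes (or max_passes is reached)."""
    lab = np.asarray(labels).copy()
    r = size // 2
    for _ in range(max_passes):
        padded = np.pad(lab, r, mode="edge")
        out = np.empty_like(lab)
        H, W = lab.shape
        for i in range(H):
            for j in range(W):
                # odd window (size*size elements): the median is a label
                out[i, j] = np.median(padded[i:i + size, j:j + size])
        if np.array_equal(out, lab):
            return out
        lab = out
    return lab
```

Isolated mislabeled pixels are absorbed by the local majority, while straight region boundaries are left in place, which is the "local constraint" behavior intended here.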
5.6 Experimental Results
Our segmentation algorithm was tested on both onedimensional signals and twodimensional textured images. Our test images included samples of two distinct families of textures. In all examples shown, straightforward envelopes of the overcomplete wavelet packet representation were used as texture feature.
5.6.1 Onedimensional Signals
Figure 5.1 shows the segmentation of a signal consisting of a sinusoidal segment and a triangular segment with the same period (sixteen samples) and amplitude.
Figure 5.1: Segmentation of a 1D Signal Consisting of a Triangular Waveform and a Sinusoid.
Where:
Row 1: the original signal and the segmentation result;
Rows 2-5: the overcomplete wavelet packet representation and envelopes.
With the envelopes of the four channels of the level-2 representation, the perfect result was achieved. Figure 5.2 is a segmentation example of a noisy signal consisting of two sinusoidal segments (ω = 0.3π and ω = 0.5π) contaminated by white Gaussian noise. The envelopes of the eight channels of the level-3 representation were able to achieve the perfect result.
5.6.2 Natural Textures
Here we used textures obtained from the Brodatz album [1] and a public archive. Each test sample was first histogram-equalized so that a segmentation based only on first-order statistics was not possible. Experimental results are displayed in Figure 5.3. Experimentally, we observed that the lower order Lemarié-Battle filter (p = 1) performed well in boundary detection (Figure 5.3 (b)), while the higher order Lemarié-Battle filter (p = 2) did a better job within non-boundary (internal) regions.
Figure 5.2: Segmentation of a Noisy 1D Signal Consisting of Two Pure Tone Segments.
Where:
Row 1: the original signal and the segmentation result;
Rows 2-9: the overcomplete wavelet packet representation and envelopes.
Figure 5.3: Segmentation Results of an Image Consisting of Natural Textures.
Where:
Row 1: test image T1 (256 × 256, 8-bit) consisting of D17, herringbone weave, and bark (true boundary overlaid for display only);
Row 2: clustering results using features extracted from a level-4 filter bank generated by (Col 1) the Lemarié-Battle filter of p = 1, (Col 2) the autocorrelation function of the Lemarié-Battle filter of p = 1, and (Col 3) the Lemarié-Battle filter of p = 2 (the zero-crossing algorithm was used for envelope detection);
Row 3: final segmentations after postprocessing.
5.6.3 Synthetic Textures
We tested the performance of our algorithm on several texture images synthesized from random noise, based on [27].
" Gaussian Lowpass and Bandpass Textures. The images were generated
by filtering a white Gaussian noise with mean of zero and standard deviation
of 30 with a Gaussian filter of frequency response (in polar coordinate): G(r, 0) = eï¿½ï¿½0i e F /(').
" Filtered Impulse Noise (FIN). The images were generated by filtering a
random impulse image I(m, n) generated by:
I (m, n)= 1.0. if RAN > T.
{ 0.0. if RAN < T.
where RAN E (0.0, 1.0) is a random number with uniform distribution, with
Gaussian impulse response:
g (t , '/'I) = Cm2,2/(2 .52) 2/(2.,
In both cases, the images were linearly scaled into the range [0, 255].
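The FIN construction can be sketched as below, assuming a separable Gaussian impulse response g(m, n) with widths S_x, S_y. The function name, the kernel truncation at 4S, and the use of 'same'-mode convolution are our choices for the sketch:

```python
import numpy as np

def fin_texture(shape, T, Sx, Sy, rng=None):
    """Filtered-impulse-noise texture: threshold uniform noise into a
    binary impulse image, blur it with a separable Gaussian, and linearly
    scale the result into [0, 255]."""
    rng = np.random.default_rng(rng)
    impulses = (rng.random(shape) > T).astype(float)   # I(m, n)

    def kernel(S):
        # truncated, normalized 1D Gaussian exp(-m^2 / (2 S^2))
        m = np.arange(-int(4 * S), int(4 * S) + 1)
        g = np.exp(-m ** 2 / (2 * S ** 2))
        return g / g.sum()

    # separable blur: columns direction (Sy) then rows direction (Sx)
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel(Sy), "same"),
                              1, impulses)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel(Sx), "same"),
                              0, out)
    # linear scaling into [0, 255]
    return 255.0 * (out - out.min()) / (out.max() - out.min())
```

Different (S_x, S_y) pairs produce the anisotropic blobs used in test image T3.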
Figure 5.4 shows a segmentation result on a Gaussian lowpass texture image, and Figure 5.5 shows a segmentation result on a filtered impulse noise (FIN) texture image. For this difficult test image T3, the algorithm achieved outstanding performance.
We also tested our algorithm on a texture image containing regular and sparse elements. Figure 5.6 demonstrates the accurate segmentation result.
5.7 Summary and Discussion
A quantitative comparison presenting the accuracy of our segmentation results on textured images is summarized in Table 5.1. This performance is consistent
Figure 5.4: Segmentation of a Synthetic Gaussian Lowpass Texture Image.
Legend:
(a) test image T2 (256 × 256, 8-bit): Gaussian LP; left: isotropic, F_0 = 0, S = 60; right: nonisotropic, F_0 = 0, S = 60, θ_0 = 0, B_0 = 0.175;
(b) initial segmentation (Lemarié-Battle filter of p = 1, level 4, and zero-crossing envelope detection);
(c) final segmentation after postprocessing.
Figure 5.5: Segmentation of a Synthetic FIN Texture Image.
Legend:
(a) test image T3 (256 × 256, 8-bit): filtered impulse noise; left: nonisotropic, T = 0.15, S_x = 1.0, S_y = 1.5; right: nonisotropic, T = 0.15, S_x = 2.0, S_y = 1.0;
(b) initial segmentation (Lemarié-Battle filter of p = 1, level 4, and zero-crossing envelope detection);
(c) final segmentation after postprocessing.
Figure 5.6: Segmentation of a Synthetic Texture Image with Line Patterns.
Legend:
(a) test image T4 (256 × 256, 8-bit);
(b) final segmentation (Lemarié-Battle filter of p = 1, level 3, and zero-crossing based envelope detection);
(c) detected boundary overlaid with the original image.
Table 5.1: Boundary Accuracy of the Segmentation Results.

Test image   Maximum ABE   Average ABE   σ
T1           11.0          3.2           2.6
T2           15.0          2.9           2.4
T3           11.0          2.9           2.6

ABE: Absolute Boundary Error (in pixels).
with the difficulty of segmentation perceived by human observers. We observed that boundary errors are dependent on shape, i.e., complex boundaries yielded higher variance.
Overcomplete wavelet packet representations and envelope features provide a versatile and flexible framework. The examples showed that satisfactory results can be achieved. Based on our experiments, the low-order Lemarié filter (p = 1) achieved the best results. However, we are aware that several important parameters must be determined manually for the segmentation algorithm to work. These include the configuration of the representation and the number of classes. In the face of lacking a precise definition of the problem, we believe that no mathematical solution is possible, and these parameters can only be designed case by case for particular applications.
Compared with Gabor filters, overcomplete wavelet packet representations possess several potential advantages for further exploration:

* channel filters cover the frequency domain exactly (providing a mathematically complete representation) without significant overlap, thus greatly reducing correlations between features extracted from distinct channels;
* adaptive pruning of a decomposition tree makes possible a reduction in computational complexity and in the length of feature vectors;
* by moving up the decomposition tree, feature vectors at distinct resolution levels can be obtained to increase the accuracy of boundary detection.
CHAPTER 6
APPLICATION II: IMAGE DEBLURRING
6.1 Introduction
Image deblurring is a special case of signal restoration. The goal is to recover an image which is blurred and usually degraded by noise. During the past two decades, a large amount of research work has been done in this area [33, 2, 28].
Unlike texture segmentation, image deblurring is a well-defined problem. The mathematical model commonly used to describe the degradation process is a convolution plus additive noise, written as

y(k_1, k_2) = d(k_1, k_2) * x(k_1, k_2) + n(k_1, k_2),

where x(k_1, k_2) is the original image, y(k_1, k_2) is the observed image, d(k_1, k_2) is the impulse response of the blurring operator, n(k_1, k_2) is the noise, and the symbol * stands for convolution. The deblurring problem is to restore the original image x(k_1, k_2) from the degraded image y(k_1, k_2). This is thus a linear inverse problem.
At first glance, the problem seems solvable. Assuming a known impulse response of the blurring process and the absence of noise, one may formally derive the solution in the frequency domain as

X(ω_1, ω_2) = Y(ω_1, ω_2)/D(ω_1, ω_2).

This is the simplest, most intuitive approach and is called inverse filtering [2, 28, 22]. However, the frequency response D(ω_1, ω_2) associated with a blurring process usually has zeros in the frequency plane, and its reciprocal is thus unbounded. From another point of view, blurring generally causes irreversible information loss even in the absence of noise. Taking noise into account, the inverse filtering approach
would produce a restoration:
    X'(ω_1, ω_2) = Y(ω_1, ω_2)/D(ω_1, ω_2) = X(ω_1, ω_2) + N(ω_1, ω_2)/D(ω_1, ω_2).
In this case, the closeness of the restored image X' to the original image X hinges on the strength of the noise N. Notice that the precise waveform of the noise is unknown and thus cannot be completely subtracted from the observed image Y.
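A small numerical sketch (NumPy, with a hypothetical blocky signal and a 7-tap uniform blur) makes the instability concrete: dividing by D amplifies the noise at frequencies where D is nearly zero, and the "restored" signal ends up farther from the original than the degraded one was.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
x = np.zeros(N)
x[96:160] = 1.0                          # a blocky original signal
D = np.fft.fft(np.ones(7) / 7.0, N)      # frequency response of a uniform blur
y = np.real(np.fft.ifft(np.fft.fft(x) * D))
y += 0.01 * rng.standard_normal(N)       # mild additive white noise

# Naive inverse filtering X' = Y/D: the near-zeros of D blow up the noise term
x_inv = np.real(np.fft.ifft(np.fft.fft(y) / D))

err_degraded = np.linalg.norm(y - x)
err_inverse = np.linalg.norm(x_inv - x)
print(err_inverse > err_degraded)        # inverse filtering increased the error
```

The signal, blur length, and noise level here are illustrative choices, not the thesis's experimental settings; the qualitative outcome is the point.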
Inverse problems associated with an unbounded inverse operator are said to be ill-posed [33, 14]. The traditional mathematical tools for dealing with such problems are regularization methods [60].
In this chapter, we shall apply overcomplete wavelet packet representations to the deblurring problem. Our approach explicitly suppresses noise using the nonlinear shrink operator and approximates an inverse filter using the gain operator.
6.2 Review of Some Deblurring Techniques
In this section, we shall review modified inverse filters, Wiener filters and the more recent theory of wavelet-vaguelette inversion.
6.2.1 Modified Inverse Filters
For blurs having zeros in the frequency plane, there are two modifications available to make a bounded inverse filter.
1. Gain-limited Inverse Filter. The idea is simply to set an upper bound on the magnitude of the frequency response, as described below:

                    1/R,            0 <= D(ω_x, ω_y) < R,
    F(ω_x, ω_y) =   1/D(ω_x, ω_y),  |D(ω_x, ω_y)| >= R,        (6.1)
                    -1/R,           -R < D(ω_x, ω_y) < 0,

where R is a positive real value and D(ω_x, ω_y) is a real filter.
2. Pseudo-inverse Filter [28, page 276]. This is the same as the gain-limited inverse filter except that the constant-gain region is set to zero:

    F(ω_x, ω_y) =   0,               |D(ω_x, ω_y)| < R,        (6.2)
                    1/D(ω_x, ω_y),   |D(ω_x, ω_y)| >= R,

where R is a positive real value.
Both approaches have their pros and cons. The advantage of the gain-limited inverse filter is that it is continuous in the frequency domain. The disadvantage is that the constant-gain band |D(ω_x, ω_y)| < R may significantly enhance wideband noise. In contrast, the pseudo-inverse filter completely wipes out anything outside its region of inversion, sacrificing continuity.
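As a sketch (NumPy, assuming a real-valued frequency response sampled on a grid), the two modified inverse filters of (6.1) and (6.2) can be written as:

```python
import numpy as np

def gain_limited_inverse(D, R):
    """Gain-limited inverse filter (6.1): invert where |D| >= R,
    clamp the gain at +/-1/R elsewhere (D is assumed real)."""
    D = np.asarray(D, dtype=float)
    F = np.where(D >= 0, 1.0 / R, -1.0 / R)      # constant-gain band
    invertible = np.abs(D) >= R
    F[invertible] = 1.0 / D[invertible]
    return F

def pseudo_inverse(D, R):
    """Pseudo-inverse filter (6.2): invert where |D| >= R, zero elsewhere."""
    D = np.asarray(D, dtype=float)
    safe = np.where(np.abs(D) >= R, D, 1.0)      # avoid dividing by ~0
    return np.where(np.abs(D) >= R, 1.0 / safe, 0.0)
```

The gain-limited filter is bounded by 1/R in magnitude everywhere and stays continuous for a continuous D; the pseudo-inverse simply abandons the band |D| < R.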
6.2.2 Wiener Filters
The idea of the Wiener filter is to choose a deblurring filter f that minimizes the mean square restoration error defined as [18, 2, 22]

    ε = E{ [x'(k_1, k_2) - x(k_1, k_2)]² },

where x' = f * (d * x + n) and E denotes mathematical expectation [49]. Assuming that the noise n is a zero-mean random field independent of x, ε can be evaluated as

    ε = E[ (f*d*x - x)² + 2(f*d*x - x)(f*n) + (f*n)² ]
      = E[ ((f*d - δ)*x)² ] + E[ (f*n)² ]
      = (1/4π²) ∫∫ [ |FD - 1|² Φ_xx + |F|² Φ_nn ] dω_1 dω_2,

where the integration is over [-π, π] × [-π, π], F, D, Φ_xx and Φ_nn are all functions of (ω_1, ω_2), and Φ_xx and Φ_nn are the power density spectra of the signal and noise [49, Section 2.10], respectively. Furthermore, the integrand can be factored as
    |FD - 1|² Φ_xx + |F|² Φ_nn
      = (F - 1/D)(F* - 1/D*) |D|² Φ_xx + FF* Φ_nn
      = (FF* - F/D* - F*/D + 1/|D|²) |D|² Φ_xx + FF* Φ_nn
      = FF* (|D|² Φ_xx + Φ_nn) - (F/D* + F*/D) |D|² Φ_xx + Φ_xx
      = |F - D* Φ_xx / (|D|² Φ_xx + Φ_nn)|² (|D|² Φ_xx + Φ_nn) + Φ_xx Φ_nn / (|D|² Φ_xx + Φ_nn).
Since the first term is greater than or equal to zero, and only the first term depends on F, the optimal deblurring filter (Wiener filter) is the one that makes it vanish:

    F(ω_1, ω_2) = D*(ω_1, ω_2) Φ_xx(ω_1, ω_2) / [ |D(ω_1, ω_2)|² Φ_xx(ω_1, ω_2) + Φ_nn(ω_1, ω_2) ],        (6.3)

which can be rewritten in terms of the signal-to-noise power ratio as

    F(ω_1, ω_2) = D*(ω_1, ω_2) / [ |D(ω_1, ω_2)|² + Φ_nn(ω_1, ω_2)/Φ_xx(ω_1, ω_2) ].
For white Gaussian noise, Φ_nn(ω_1, ω_2) = N_0. The power spectrum Φ_xx(ω_1, ω_2) characterizes the correlation property of the signal x. Since the correlation usually falls off as the sample distance increases, Φ_xx(ω_1, ω_2) is usually a lowpass function. If the signal has no correlation among its adjacent samples, Φ_xx(ω_1, ω_2) = S_0, and the signal-to-noise power ratio equals a constant, S_0/N_0 = 1/R. In this case, the Wiener filter reduces to

    F(ω_1, ω_2) = D*(ω_1, ω_2) / [ |D(ω_1, ω_2)|² + R ].        (6.4)

Compared with the gain-limited inverse filter and the pseudo-inverse filter, the Wiener filter of (6.4) possesses both merits of continuity and noise suppression.
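A sketch of the reduced Wiener filter (6.4); here D may be complex and R is the constant noise-to-signal power ratio:

```python
import numpy as np

def wiener_deconv_filter(D, R):
    """Reduced Wiener filter (6.4): F = conj(D) / (|D|^2 + R).
    Bounded everywhere (|F| <= 1/(2*sqrt(R))) and close to 1/D
    wherever |D|^2 >> R."""
    D = np.asarray(D, dtype=complex)
    return np.conj(D) / (np.abs(D) ** 2 + R)
```

Unlike the naive inverse, this filter goes smoothly to zero at the zeros of D instead of diverging.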
6.2.3 Wavelet-Vaguelette Inversion
The combination of wavelet-vaguelette decomposition (WVD) and nonlinear shrinkage of WVD coefficients was proposed by Donoho [14] for linear inversion problems involving dilation-homogeneous operators. An operator F with this property commutes with a dilation operator D_a up to a constant:

    F D_a = a D_a F,

where a is a constant and D_a is the dilation operator defined as D_a[f(t)] = f(at).
The WVD involves three sets of functions: an orthogonal wavelet basis {ψ_{j,k}(t) = 2^{j/2} ψ(2^j t - k)} and two near-orthogonal sets {u_{j,k}(t) = 2^{j/2} u(2^j t - k)} and {v_{j,k}(t) = 2^{j/2} v(2^j t - k)}. The three sets are related by the quasi-singular value relations

    F ψ_{j,k} = γ_j v_{j,k},        (6.5)
    F* u_{j,k} = γ_j ψ_{j,k},       (6.6)

where the quasi-singular values γ_j depend on the resolution index j but not on the spatial index k. Moreover, the two sets {u_{j,k}(t)} and {v_{j,k}(t)} satisfy the biorthogonality

    ∫ u_{j,k}(t) v_{j',k'}(t) dt = δ(j - j') δ(k - k')
and near-orthogonality

    ∫ ( Σ_{j,k} a_{j,k} u_{j,k}(t) )² dt ≍ Σ_{j,k} a_{j,k}²

(and similarly for {v_{j,k}(t)}).
Linking everything together, the reproducing formula of the WVD is expressed by

    x(t) = Σ_{j,k} [Fx, u_{j,k}] γ_j^{-1} ψ_{j,k}(t),

where [x, y] = ∫ x(t) y(t) dt is the inner product of x and y.
Finally, the WVD inversion of a measurement contaminated by white Gaussian noise, y(t) = Fx(t) + n(t), may be achieved via

    x'(t) = Σ_{j,k} K_{t_j}{ [y, u_{j,k}] γ_j^{-1} } ψ_{j,k}(t),        (6.7)

where K_{t_j} is the nonlinear shrinking operator defined by (4.4). Notice that the threshold t_j depends only on the resolution j.
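The shrinking operator K_t is defined in (4.4); as an illustration only, the familiar soft-threshold shrinkage (one common choice for such an operator) looks as follows:

```python
import numpy as np

def soft_shrink(w, t):
    """Soft-threshold shrinkage: zero out coefficients with |w| <= t and
    pull the survivors toward zero by t.  An illustrative stand-in for
    the operator K_t of (4.4), not necessarily its exact definition."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)
```

Small coefficients, which are dominated by noise, are removed entirely, while large coefficients survive almost unchanged.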
6.2.4 Discussion
The Wiener filter provides a powerful tool for linear inversion problems. Given the statistics of both the image and noise processes, one can design an optimal filter. However, the Wiener filter belongs to the linear filtering paradigm and works in the frequency domain. Its linearity forces a tradeoff between smoothing out noise and sharpening up edges.
In contrast, the wavelet-vaguelette approach deploys nonlinear shrinkage to combat noise in the time domain. It is characterized by three components: 1) dyadic analyzing and synthesizing functions determined by the blurring operator F; 2) level-dependent gain factors γ_j^{-1}; and 3) nonlinear shrinkage with level-dependent thresholds t_j. The nonlinearity enables it to achieve a better compromise between denoising and sharpening. However, it is limited to dilation-homogeneous operators only, excluding the most commonly encountered Gaussian and uniform blurs [14]. Moreover, the theory was developed in the continuous-time domain.
The limitation of the WVD inversion lies in its insistence on the quasi-singular value relations (Eqs. (6.5) and (6.6)). For a convolution operator F with kernel h(t), in the frequency domain (6.5) is equivalent to

    H(ω) Ψ(2^{-j} ω) = γ_j V(2^{-j} ω).        (6.8)

Let ω' = 2^k ω and plug it into (6.8); we have

    H(2^k ω) Ψ(2^{-(j-k)} ω) = γ_j V(2^{-(j-k)} ω).        (6.9)

On the other hand, directly plugging the level index j - k into (6.8) produces

    H(ω) Ψ(2^{-(j-k)} ω) = γ_{j-k} V(2^{-(j-k)} ω).        (6.10)

By comparing (6.9) and (6.10), we conclude that it must be true that

    H(2^k ω) / H(ω) = γ_j / γ_{j-k}   (independent of ω),

which is the characteristic of dilation-homogeneous operators in the frequency domain.
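For instance, a power-law frequency response H(ω) = ω^α (the characteristic shape of dilation-homogeneous convolution operators such as differentiation or fractional integration) satisfies this constraint exactly. A quick numerical check, with illustrative values of α and k:

```python
import numpy as np

alpha, k = 2.0, 3                 # illustrative exponent and level shift
w = np.linspace(0.1, 1.0, 50)     # grid avoiding w = 0
H = lambda w: w ** alpha          # power-law frequency response
ratio = H(2.0 ** k * w) / H(w)
# the ratio is the constant 2^(k*alpha), independent of w
print(np.allclose(ratio, 2.0 ** (k * alpha)))
```

A Gaussian response, by contrast, gives a ratio that still depends on ω, which is precisely why Gaussian blurs fall outside the WVD framework.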
6.3 Discrete-time Overcomplete Wavelet Packet Inversion
6.3.1 One-dimensional Formulation
There were two major considerations in our search for a deblurring algorithm. First, it should be formulated in the discrete-time domain. Second, it should be applicable to inhomogeneous blurring operators. The result was the discrete-time overcomplete wavelet packet representation equipped with a nonlinear shrinking operator and a gain operator. For a one-dimensional observation y(k) = d(k) * x(k) + n(k), our deblurring algorithm can be expressed by

    w_j(k) = Σ_n c_j(n) y(k - n),        (6.11)

    x'(k) = Σ_j λ_j Σ_n K_{t_j}[w_j(n)] u_j(k - n),        (6.12)

where {c_j} are the analysis filters, {u_j} the synthesis filters, and {λ_j} the gain vector. Note that λ_j K_{t_j}[w_j(n)] = K_{λ_j t_j}[λ_j w_j(n)],
which means that the order of the gain and shrinking operators can be exchanged. If the shrinking threshold is adapted to the magnitude of the coefficients, such a change of order does not alter the result. We prefer the order of "denoising" then "inversing" as expressed in (6.12) over the order of "inversing" then "denoising" of (6.7), since we believe the former is more logical.
The idea behind our method is the same as in the WVD inversion (6.7): cut off noise by nonlinear means and then sharpen the surviving coefficients. Although formula (6.12) looks virtually the same as the WVD inversion, it has several major differences. First, it uses a discrete-time overcomplete wavelet packet representation, which guarantees that it will not fall into the anomaly of aliasing enhancement. Second, its analyzing and synthesizing functions are not subject to the dyadic bandwidth constraint, and need not satisfy the quasi-singular value relations (Eqs. (6.5) and (6.6)).
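Put together, (6.11)-(6.12) read: analyze the observation into channels, shrink each channel, then apply the channel gain and synthesize. A minimal sketch (NumPy, circular convolution via the FFT, soft shrinkage standing in for K_t; the actual channel filters and gains come from the minimum-tree algorithm described below, and are hypothetical inputs here):

```python
import numpy as np

def owp_deblur(y, analysis, synthesis, gains, thresholds):
    """Overcomplete wavelet packet inversion per (6.11)-(6.12):
    w_j = c_j * y;  x' = sum_j lambda_j sum_n K_{t_j}[w_j(n)] u_j(k - n)."""
    Y = np.fft.fft(y)
    x_hat = np.zeros(len(y))
    for c_j, u_j, lam, t_j in zip(analysis, synthesis, gains, thresholds):
        w = np.real(np.fft.ifft(Y * np.fft.fft(c_j, len(y))))     # (6.11)
        w = np.sign(w) * np.maximum(np.abs(w) - t_j, 0.0)         # shrink
        x_hat += lam * np.real(                                   # (6.12)
            np.fft.ifft(np.fft.fft(w) * np.fft.fft(u_j, len(y))))
    return x_hat
```

With a single all-pass channel (c_0 = u_0 = unit impulse, λ_0 = 1, t_0 = 0) the pipeline reduces to the identity, which is a convenient sanity check.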
To complete the framework, we still need to resolve two issues. First, for a given blur with frequency response D(ω), how can we determine the channel configuration (analyzing and synthesizing filters) as well as the gain vector? Second, how should we choose the thresholds t_j for denoising?
We proposed the following algorithm for the channel structure and the gain vector:
1) Choose a gain-limited inverse filter or a Wiener filter to approximate.
2) Choose the prototype filters H(ω), H̃(ω), G(ω) and G̃(ω); the channel filters of Level 1 are then determined by C_{1,0}(ω) = H(ω)H̃(ω) and C_{1,1}(ω) = G(ω)G̃(ω).
3) Call Algorithm 4.2.2 (Minimum Tree) to determine both the tree configuration and the gain vector.
However, the issue of determining the thresholds t_j for denoising is much more difficult. Thresholding schemes for several particular applications were suggested by Donoho [15]. Generally speaking, knowledge about the signal and noise is needed to devise a thresholding scheme. By treating signal and noise as two random processes, the correlation property may be used to characterize them, as in the case of Wiener filtering. For wide-sense stationary white Gaussian noise and exponentially correlated Gaussian signals, the autocorrelation functions and power spectral densities may be written as

    r_xx(k) = σ_x² e^{-a|k|},   Φ_xx(ω) = σ_x² (1 - e^{-2a}) / (1 - 2e^{-a} cos ω + e^{-2a}),
    r_nn(k) = σ² δ(k),          Φ_nn(ω) = σ².        (6.13)

Note that Φ_xx(ω) is generally different from the power density spectrum |X(ω)|² of a particular realization x of the signal process. Under these assumptions, a possible thresholding scheme is
    t_j = a σ B_j / Φ_xx(ω_j),        (6.14)

where a is a scaling constant, B_j is the bandwidth of channel j, and ω_j is the center frequency of the channel.
6.3.2 Two-dimensional Extension
The one-dimensional overcomplete wavelet packet inversion algorithm needs to be extended to two dimensions in order to deal with images degraded by symmetric 2D blurs. Formally, the 2D version may be written as

    x'(k_1, k_2) = Σ_{c_1=0}^{M-1} Σ_{c_2=0}^{M-1} λ_{c_1,c_2} Σ_{n_1,n_2} K_{t_{c_1,c_2}}[w_{c_1,c_2}(n_1, n_2)] u_{c_1,c_2}(k_1 - n_1, k_2 - n_2).
However, an efficient 2D overcomplete wavelet packet representation is much more difficult to find than in the 1D case. In general, 2D separable overcomplete wavelet packets may not be an efficient representation. This is due to the fact that both 2D modified inverse filters and Wiener filters are nonseparable even for separable symmetric 2D blurs. For example, for a separable Gaussian blur

    D(ω_x, ω_y) = A e^{-(ω_x² + ω_y²)/σ²},

the pseudo-inverse filter and the Wiener filter are

    F(ω_x, ω_y) =   (1/A) e^{(ω_x² + ω_y²)/σ²},   ω_x² + ω_y² <= σ² ln(A/R),        (6.15)
                    0,                            otherwise,

and

    F(ω_x, ω_y) = 1 / [ A e^{-(ω_x² + ω_y²)/σ²} + (R/A) e^{(ω_x² + ω_y²)/σ²} ].

Both filters are isotropic and can be efficiently represented by donut-shaped channel filters.
In practice, a reasonable approximation may be achieved by applying the 1D algorithm separately to rows and columns. For the separable Gaussian blur, a separable pseudo-inverse filter may be constructed as F(ω_x, ω_y) = F_1(ω_x) F_1(ω_y), where the 1D filter F_1(ω) is

    F_1(ω) =   (1/A) e^{ω²/σ²},   |ω| <= σ √ln(A/R),
               0,                 otherwise.
It is clear that the rectangular region (|ω_x| <= σ √ln(A/R), |ω_y| <= σ √ln(A/R)) of the above 2D filter includes the disk ω_x² + ω_y² <= σ² ln(A/R) of (6.15). By choosing an appropriate parameter R, acceptable results may also be achieved by approximating 2D Wiener filters with 1D Wiener filters. For uniform blurs, gain-limited inverse filters are similar to Wiener filters.
The major problem of such separable approximations is that they may significantly boost diagonal frequency components. Figures 6.1 and 6.2 show examples of both 2D nonseparable and separable filters. The over-emphasis of the diagonal frequency region is particularly clear in the case of the Gaussian blur.
6.4 Experimental Results
To quantitatively compare deblurring results, we used an objective measure, the improvement in signal-to-noise ratio (ISNR), defined as [3]

    ISNR_p = 20 log_10 ( ||x - y||_p / ||x - x'||_p ),   p = 1, 2,

where ||·||_p is the L_p norm defined as [56]

    ||x||_p = ( Σ_{k=0}^{N-1} |x(k)|^p )^{1/p},

and x, y and x' are the original, degraded and deblurred signals, respectively.
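The measure is a direct transcription into code (a small helper sketch, NumPy):

```python
import numpy as np

def isnr(x, y, x_restored, p=2):
    """Improvement in SNR in dB: 20*log10(||x - y||_p / ||x - x'||_p).
    Positive values mean the restoration x' is closer to the original x
    than the degraded observation y was."""
    num = np.linalg.norm((np.asarray(x) - np.asarray(y)).ravel(), ord=p)
    den = np.linalg.norm((np.asarray(x) - np.asarray(x_restored)).ravel(), ord=p)
    return 20.0 * np.log10(num / den)
```

For example, a restoration that halves the error in every sample scores 20 log_10(2) ≈ 6.02 dB regardless of p.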
Moreover, a performance upper bound is useful as a gauge for comparison. The performance of an oracle Wiener filter may be considered the upper bound for Wiener filters. An oracle Wiener filter is a Wiener filter with Φ_nn(ω) and Φ_xx(ω) replaced by the power density spectra |N(ω)|² and |X(ω)|² of the particular realizations of the noise and the original signal. The filter may be written as

    F(ω) = D*(ω) / [ |D(ω)|² + |N(ω)|² / |X(ω)|² ].
Figure 6.1: Log Frequency Responses of Filters for a Gaussian Blur.
Where:
Row 1: Blurring filter
Row 2: 2D nonseparable filters: gain-limited, pseudo-inverse and
Wiener filter, from left to right
Row 3: 2D separable filters: gain-limited, pseudo-inverse and
Wiener filter, from left to right
Figure 6.2: Log Frequency Responses of Filters for a Uniform Blur.
Where:
Row 1: Blurring filter
Row 2: 2D nonseparable filters: gain-limited, pseudo-inverse and
Wiener filter, from left to right
Row 3: 2D separable filters: gain-limited, pseudo-inverse and
Wiener filter, from left to right
However, it is not necessarily the upper bound of the overcomplete wavelet packet inversion.
6.4.1 One-dimensional Signals
One-dimensional deblurring examples are shown in Figures 6.3 and 6.4. The ideal signal in both cases is a perfect multilevel blocky function, which is very challenging and revealing for any deblurring method. For the overcomplete wavelet packet representations used in both examples, the Lemarié filter with p = 1 was used. In Figure 6.3, the impulse response of the Gaussian blur was

    f(k) =   A exp[ -(k/(σN/2))² ],         0 <= k < N/2,
             A exp[ -((N - k)/(σN/2))² ],   N/2 <= k < N,

with σ = 0.03 and A = 1.0/Σ_k f(k). The impulse response of the uniform blur was f = [1, 1, 1, 1, 1, 1, 1]/7. In both cases, white noise with σ_n² = 0.01 was added such that the ratio of peak signal power to average noise power was SNR = 10 log_10(max_k |x(k)|²/σ_n²) = 21.6 dB. To make fair comparisons, the same Gaussian signal model of (6.13) with a = 1.0 and σ_x = 0.007 was used for both Wiener filtering and the overcomplete wavelet packet inversion. The thresholding scheme of (6.14) with a = 0.25 and a = 0.18 was used for the Gaussian and uniform blurs, respectively. The gain-limited inverse filter (6.1) with R = 0.04 and R = 0.05 was used for the Gaussian and uniform blurs, respectively. For the minimum tree algorithm, the maximum depth was set to Level 6. The algorithm found a configuration of twenty-five channels with ε = 5.7% in the case of the uniform blur, and twelve channels with ε = 0.004% in the case of the Gaussian blur.
Table 6.1 compares deblurring performance in terms of ISNR. For the two particular blurring sources and this signal, it shows that the overall performance of the overcomplete wavelet packet inversion is about the same as Wiener filtering by the measure of ISNR_2, and better than Wiener filtering by ISNR_1.
Figure 6.3: One-dimensional Deblurrings of a Gaussian Blur.
Where:
Row 1: The original signal
Row 2: The blur filter and the degraded signal
Row 3: The oracle Wiener filter and the deblurred signal
Row 4: The Wiener filter and the deblurred signal
Row 5: The gain-limited inverse filter and the deblurred signal
Row 6: The overall frequency response of the overcomplete wavelet
packet inversion and the deblurred signal
Row 7: Channels of the overcomplete wavelet packet representation
Note: The left column shows frequency responses. The central column shows the full-length signals. The right column shows zoom-ins of the feature region, where the deblurring results are overlaid with the degraded signal as a dotted line.
Figure 6.4: One-dimensional Deblurrings of a Uniform Blur.
Where:
Row 1: The original signal
Row 2: The blur filter and the degraded signal
Row 3: The oracle Wiener filter and the deblurred signal
Row 4: The Wiener filter and the deblurred signal
Row 5: The gainlimited inverse filter and the deblurred signal
Row 6: The overall frequency response of the overcomplete wavelet
packet inversion and the deblurred signal
Row 7: Channels of the overcomplete wavelet packet representation
Note: The left column shows frequency responses. The central column shows the full-length signals. The right column shows zoom-ins of the feature region, where the deblurring results are overlaid with the degraded signal as a dotted line.
Table 6.1: Performance of 1D Deblurring Examples

                      ISNR_2 (dB)                    ISNR_1 (dB)
Blur Type    OWF    WF     GLIF    OWPI     OWF    WF      GLIF     OWPI
Gaussian     2.37   1.88   -9.04   1.90     0.75   -1.45   -15.87   1.18
Uniform      5.17   3.90   -4.32   3.62     1.00   -2.40   -11.80   0.00
Note:
OWF=Oracle Wiener Filter
WF=Wiener Filter
GLIF=Gain-limited Inverse Filter
OWPI=Overcomplete Wavelet Packet Inversion
6.4.2 Two-dimensional Images
Examples of image deblurring using the standard Lena image are included in Figures 6.5 and 6.6. Separable blurring sources were used: the same 1D blurring filters used in the 1D examples were applied row-wise and column-wise to the original Lena image. A statistical signal model of (6.13) with a = 3.5 and σ_x = 0.008 was used for both Wiener filtering and the overcomplete wavelet packet inversion. In both cases, white Gaussian noise with σ_n² = 0.5 was added such that SNR = 44.3 dB. The thresholding scheme of (6.14) with a = 25.2 was used in both cases. The shrinking operator was applied to the 2D components. The gain-limited inverse filter (6.1) with R = 0.08 was used both for comparison and within the overcomplete wavelet packet inversion. A separable overcomplete wavelet packet inversion algorithm was used to approximate a 2D overcomplete wavelet packet inversion. For the case of the Gaussian blur, the approximation used 11 × 11 channels with ε = 0.005% (1D). For the case of the uniform blur, the approximation used 25 × 25 channels with ε = 3.9% (1D). The deblurred images were all linearly converted to the same intensity range as the original image for the calculation of the ISNR indexes.
Table 6.2 compares deblurring performance in terms of ISNR. For the two particular blurring sources and the Lena image, it shows that the overall performance of
Table 6.2: Performance of 2D Deblurring Examples

                      ISNR_2 (dB)                    ISNR_1 (dB)
Blur Type    OWF    WF     GLIF    OWPI     OWF    WF     GLIF    OWPI
Gaussian     3.20   3.17   -4.04   2.80     2.96   3.25   -6.82   2.69
Uniform      3.55   1.78   -4.78   1.76     1.46   0.60   -8.05   0.64
Note:
OWF=Oracle Wiener Filter
WF=Wiener Filter
GLIF=Gain-limited Inverse Filter
OWPI=Overcomplete Wavelet Packet Inversion
the overcomplete wavelet packet inversion is about the same as Wiener filtering by both measures, ISNR_2 and ISNR_1.
6.5 Summary and Discussion
We have demonstrated that the deblurring problem can be handled using an overcomplete wavelet packet representation. We showed that inverse filtering in the frequency domain and denoising in the time domain can be done in a single step. In other words, for this application the gain and shrinking operators can be combined into a single operation on the spatial-frequency representation. Moreover, our approach may be seen as an extension of the wavelet-vaguelette inversion into both the discrete-time and broader problem domains.
Although the one-step inversion and denoising may be conceptually appealing, its limitation is also apparent. Since nonlinear shrinkage works only within areas where the magnitude of the signal components is stronger than that of the noise components, the best representation for denoising is the one that maximizes the local signal-to-noise ratio. For example, if a sharp edge is decomposed into many channels, its components in some channels may be indistinguishable from noise. Therefore, the best representations for approximating the inverse filter and for denoising are generally different. Having only one choice of representation, the approach has to limit its application to certain signals or blurring sources.
Figure 6.5: Deblurring a Gaussian-blurred Lena Image. Legend:
(a) The original 256 x 256 image
(b) Degraded image with SNR = 44.3 dB
(c) Deblurred by the oracle Wiener filter
(d) Deblurred by a Wiener filter
(e) Deblurred by a gainlimited inverse filter
(f) Deblurred by a separable overcomplete wavelet packet inversion
Figure 6.6: Deblurring a Uniform-blurred Lena Image. Legend:
(a) The original 256 x 256 image
(b) Degraded image with SNR = 44.3 dB
(c) Deblurred by the oracle Wiener filter
(d) Deblurred by a Wiener filter
(e) Deblurred by a gainlimited inverse filter
(f) Deblurred by a separable overcomplete wavelet packet inversion
This limitation may be the reason that our preliminary experimental results showed roughly the same performance as the much simpler approach of Wiener filtering. For the Gaussian and uniform blurring sources, some channels with very narrow bandwidth are needed to achieve a satisfactory approximation using the gain vector. For those channels, the shrinking operation is very close to a multiplication.
CHAPTER 7
CONCLUSION
We examined both classes of Fourier-based and wavelet-based representations. We showed that orthogonal wavelet transforms lack translation invariance and efficacy in representing convolution operators. In particular, we demonstrated that aliasing distortion due to critical subsampling limits their applications in image enhancement and restoration. On the other hand, overcomplete wavelet representations, although not as efficient as nonredundant representations, avoid those shortcomings. Moreover, overcomplete wavelet representations can be conveniently analyzed using linear time-invariant system theory. In this thesis, we evaluated their capability of time-frequency localization using the uncertainty factor. We also identified gain, shrinking and envelope operators to approximate convolution operators, suppress noise and extract power density distributions. We investigated the criterion of minimum mean square error and designed an optimization algorithm. We extended the wavelet shrinkage to overcomplete representations. We also presented two envelope detection algorithms and compared their performances.
The framework of the overcomplete wavelet representation was applied to two difficult problems: segmentation of textured images and image deblurring. We demonstrated that channel envelopes as feature vectors produced satisfactory results on both natural and synthetic textures. We showed that the gain and shrinking operators may be used for image deblurring, and pointed out the limitations of that approach.
The issues of time-frequency representations, pattern recognition and noise suppression are very fundamental ones. Our knowledge about them will keep accumulating and evolving. This thesis increases our level of understanding and points to many opportunities for future research. The issue of time-frequency localization was touched upon but not fully explored. Although the uncertainty factor was used for the evaluation of individual channels, we lack a criterion of optimality for a representation as a whole. Once such a criterion is found, we may be able to search for the optimal representation of textured images, which was not explored here. Denoising is certainly another front. The optimal representation and the adaptive selection of thresholds are going to be intertwined and challenging.
APPENDIX A
PROOFS RELATED TO DISCRETE-TIME WAVELETS
Some basic proofs related to (bi)orthogonal discrete-time wavelets are included here in order to make this thesis self-contained.
Proposition A.0.1 A necessary condition for a pair of discrete sequences to be orthogonal in the sense of (A.1) is (A.2):

    Σ_k a(2n - k) b(k - 2l) = A δ(n - l),        (A.1)

    A(e^{jω}) B(e^{jω}) + A(e^{j(ω+π)}) B(e^{j(ω+π)}) = 2A,        (A.2)

where A is a constant, and A(e^{jω}) and B(e^{jω}) are the discrete-time Fourier transforms of a(n) and b(n), respectively.

Proof. By the change of variable k → k + 2l, the left side of (A.1) becomes Σ_k a[2(n - l) - k] b(k). Then it is equivalent to prove that

    Σ_k a(2l - k) b(k) = A δ(l).

Notice that the left side of the above is a 2-fold decimation of the series

    p(l) = Σ_k a(l - k) b(k).

Since p(l) is the convolution of a(l) and b(l), its discrete-time Fourier transform is known to be

    P(e^{jω}) = A(e^{jω}) B(e^{jω}).
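Condition (A.2) can also be checked numerically. A sketch for the Haar filter a(n) = (1/√2, 1/√2), paired with its time reverse b(n) = a(-n) — a standard example pair for which (A.1) holds with A = 1:

```python
import numpy as np

def dtft(c, w, n0=0):
    """DTFT of a finite sequence c whose first sample sits at index n0."""
    n = np.arange(len(c)) + n0
    return np.exp(-1j * np.outer(w, n)) @ c

a = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar lowpass filter, support {0, 1}
b = a[::-1]                               # b(n) = a(-n), support {-1, 0}

w = np.linspace(0.0, np.pi, 64)
lhs = dtft(a, w) * dtft(b, w, n0=-1) \
    + dtft(a, w + np.pi) * dtft(b, w + np.pi, n0=-1)
# (A.2): the aliased product equals the constant 2A, here A = sum_k a(k)^2 = 1
print(np.allclose(lhs, 2.0))
```

The ω-dependent parts of the two terms cancel, leaving the constant 2A on the whole grid.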

Full Text 
PAGE 1
OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING By JIAN FAN A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1997
PAGE 2
Copyright 1997 by Jian Fan
PAGE 3
ACKNOWLEDGEMENTS First, I would like to express my deep gratitude to my advisor, Andrew Laine, for all his support and encouragement. I thank him for introducing me to the field of wavelets, and I appreciate the opportunity to work with him. I would like to thank Dr. Jerry C. L. Chen and Mrs. Shirley H. W. Chen of Palo Alto, California, for their help and support. Without their support, I may not have the opportunity to pursue my graduate studies in the United States. I would like to thank Drs. Gerhard Ritter, Sartaj Sahni, Dave Wilson and John G. Harris for being on my thesis committee, and Dr. Baba C. Vemuri for attending my defense. Their time and thoughtful recommendations were greatly appreciated. I would also like to thank Dr. David Burchfield of the Department of Pediatrics, University of Florida, for providing me a graduate research assistantship from January 1991 through December 1991. During my early years at the University of Florida, Ming Jiang and Min Shi gave me much help. I also benefited greatly from discussions with my follow graduate students, Shuwu Song, Iztok Koren, Sergio Schuler, Wuhai Yang, Chunming Chang and Hongchi Shi. In addition, I would like to thank my manager Catherine Hunt of Hewlett Packard Company, San Diego, for her support. I have also benefited from HewlettPackard Company's educational assistance program. I appreciate the efforts of John Shumate who helped polish the writing of this dissertation. iii
PAGE 4
Finally, I would like to thank my mother Yuzhen Chen and my late father Laikun Fan for their inspiration and encouragement, as well as my wife Weihua Liu and my son Xiying Fan for their understanding, love, and patience. This research was supported in part by the Whitaker Foundation, and the U. S. Army Medical Research and Development Command, Grand number DAMD1793J3003. iv
PAGE 5
TABLE OF CONTENTS ACKNOWLEDGEMENTS iii LIST OF TABLES viii LIST OF FIGURES ix ABSTRACT xi CHAPTERS 1 INTRODUCTION 1 1.1 Motivations of the Research 1 1.2 Review of Related Work 2 1.3 Overview of the Thesis 4 2 SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS 5 2.1 Introduction 5 2.2 Fourier Transforms 6 2.2.1 Fourier Transform (FT) 7 2.2.2 Discretetime Fourier Transform (DTFT) 7 2.2.3 Shorttime Fourier Transform (STFT) 8 2.2.4 Discretetime Shorttime Fourier Transform (DTSTFT) . 8 2.2.5 Connection Between Shorttime Fourier Transform and Filter Banks 9 2.2.6 Generalized Shorttime Fourier Transform 10 2.3 Wavelet Transforms 10 2.3.1 Continuous Wavelet Transform (CWT) 11 2.3.2 Discrete Wavelet Transform (DWT) 12 2.3.3 Discretetime Wavelet Transform (DTWT) 13 2.4 Generalized Discretetime Filter Bank Transforms 15 2.4.1 Convolution Operator on Complete Representations ... 17 2.4.2 Convolution Operator on Overcomplete Representations 19 2.4.3 Approximation of Convolution Operators on Overcomplete Representations 19 2.4.4 Aliasing Enhancement on Complete Representations ... 20 2.5 Summary and Discussion 22 v
PAGE 6
3 OVERCOMPLETE WAVELET REPRESENTATIONS 23 3.1 Introduction 23 3.2 Overcomplete Wavelet Transforms and Filters 23 3.2.1 Overcomplete Wavelet Packet Representations 23 3.2.2 Dyadic Wavelet Representations 25 3.2.3 Filters of Autocorrelation Functions of Wavelets 28 3.3 Connection to Complete Wavelet Representations 29 3.4 Timefrequency Interpretation and the Uncertainty Principle . . 29 3.5 Twodimensional Extensions 34 4 OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS 37 4.1 Introduction 37 4.2 Gain Operators 37 4.2.1 Minimum Mean Square Error (MMSE) Approximation . . 38 4.2.2 Timefrequency Tradeoff and a Greedy Algorithm .... 40 4.3 Shrinking Operators 42 4.4 Envelope Detectors 44 4.4.1 Envelope Detection by Hilbert Transform 47 4.4.2 Envelope Detection by Zero Crossings 49 4.4.3 Comparison Between the Two Detectors 50 4.4.4 Comparison with Other Energy Operators 50 4.4.5 Two Dimensional Envelope Detection 50 5 APPLICATION I: TEXTURE SEGMENTATION 53 5.1 Introduction 53 5.2 Texture Feature Extraction 54 5.3 Considerations on Filter Selection 55 5.4 The Basic Isodata Clustering Algorithm 57 5.5 Postprocessing 58 5.6 Experimental Results 58 5.6.1 Onedimensional Signals 58 5.6.2 Natural Textures 59 5.6.3 Synthetic Textures 62 5.7 Summary and Discussion 62 6 APPLICATION II: IMAGE DEBLURRING 66 6.1 Introduction 66 6.2 Review of Some Deblurring Techniques 67 6.2.1 Modified Inverse Filters 67 6.2.2 Wiener Filters 68 6.2.3 WaveletVaguelette Inversion 69 6.2.4 Discussion 70 6.3 Discretetime Overcomplete Wavelet Packet Inversion 72 6.3.1 Onedimensional Formulation 72 6.3.2 Twodimensional Extension 74 6.4 Experimental Results 75 6.4.1 Onedimensional Signals 78 6.4.2 Twodimensional Images 81 6.5 Summary and Discussion 82 vi
PAGE 7
7 CONCLUSION 86 APPENDIXES A PROOFS RELATED TO DISCRETETIME WAVELETS 88 B NUMERICAL COMPUTATION OF BATTLELEMARIE WAVELET 95 B.l Functions in Finite Terms 95 B.2 Numerical Computation Using FFT 99 C NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR 103 D BOUNDARY TREATMENTS OF FINITELENGTH SEQUENCES . 107 D.l Periodic Extensions 107 D.2 Closure of Symmetry Under Convolution 107 REFERENCES Ill BIOGRAPHICAL SKETCH 115 vii
PAGE 8
LIST OF TABLES 3.1 Uncertainty Factors of Analyzing Filters 34 5.1 Boundary Accuracy of the Segmentation Results 64 6.1 Performance of ID Deblurring Examples 81 6.2 Performance of 2D Deblurring Examples 82 B.l Coefficients of g 2 4 (x) (a) and Truncated Impulse Response of hi(n) (b). 100 viii
PAGE 9
LIST OF FIGURES 2.1 A Filter Bank Implementation of STFT 10 2.2 Fast Discretetime Wavelet TransformFirst Two Levels 13 2.3 Generalized Discretetime Filter Bank Transform 15 2.4 Example of General Binarytreestructured Filter Bank of Five Channels. 16 2.5 The Overcomplete Wavelet Transform for the Example of Figure 2.4. 17 2.6 Example of the Aliasingenhancement Anomaly 21 3.1 Binary Tree for Overcomplete Wavelet Packet Decomposition 24 3.2 Binary Tree for Dyadic Wavelet Decomposition 26 3.3 Examples of the Function O a ,6(<^) 27 3.4 Two Examples of the Overcomplete Wavelet Representation 30 3.5 Examples of Timefrequency Localization by Overcomplete Wavelet Packet Representations 32 3.6 Uncertainty factors (drawn on top of each band) of Channel Filters. . 33 4.1 Examples of Gain Vector Approximation 41 4.2 Examples of the Minimum Tree Approximation with the Maximum Depth of Seven Levels 43 4.3 An Example of Overcomplete Wavelet Denoising 45 4.4 Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row l,col 1) of Figure 4.3 46 4.5 Two Examples of Power Density Distribution of Overcomplete Wavelet Packet Representations 48 4.6 A FIR Hilbert Transformer 49 4.7 Comparison of the Two Envelope Detectors 51 4.8 Frequency Response of Equivalent Complex Quadrature Filters (level 1). 52 ix
PAGE 10
5.1 Segmentation of a ID Signal Consists of Triangular Waveform and Sinusoid 59 5.2 Segmentation of a Noisy ID Signal Consists of Two Pure Tone Segments 60 5.3 Segmentation Results of a Image Consist of Natural Textures 61 5.4 Segmentation of a Synthetic Gaussian Lowpassed Texture Image ... 63 5.5 Segmentation of a Synthetic FIN Texture Image 63 5.6 Segmentation of a Synthetic Texture Image with Line Patterns. ... 64 6.1 Log Frequency Responses of Filters for a Gaussian Blur 76 6.2 Log Frequency Responses of Filters for an uniform Blur 77 6.3 Onedimensional Deblurrings of a Gaussian Blur 79 6.4 Onedimensional Deblurrings of an Uniform Blur 80 6.5 Deblurring a Gaussianblurred Lena Image 83 6.6 Deblurring an Uniformblurred Lena Image 84 A.l Equivalent Structures for (a) Convolution and Decimation and (b) Expansion and Convolution 91 A. 2 Illustration of Adding One More Channel bv Splitting a Channel into Two 92 B. l Transition Band of Filters H p (uj) with p = 1, 3, 9 101 C. l Channel Bandwidths of Overcomplete Wavelet Representations. . . . 104
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING

By Jian Fan
August 1997

Chairman: Dr. Andrew F. Laine
Major Department: Computer and Information Science and Engineering

Orthogonal wavelet transforms have been applied to the field of signal and image processing with promising results in compression and denoising. The coefficients of such a transform constitute a complete representation of a signal without redundancy. However, there are applications where complete representations are disadvantageous. In this thesis, we examine classes of Fourier transforms and wavelet transforms in terms of their efficacy in representing convolution operators. We identified two shortcomings associated with complete representations in the discrete-time domain: (1) the lack of translation invariance, and (2) a possible aliasing-enhancement anomaly. On the other hand, our analysis showed that overcomplete wavelet representations do not suffer from those shortcomings of their nonredundant counterparts. Our framework of overcomplete wavelet representations includes construction algorithms
and prototype filters, a spatial-frequency interpretation and three operations. Capabilities of spatial-frequency localization were quantitatively evaluated using uncertainty factors. Associated with the gain, shrinking and envelope operators, algorithms for convolution, denoising and analysis of power density distribution were presented and analyzed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and to image deblurring. We demonstrated that envelopes used as feature vectors performed well in segmenting both natural and synthetic textures. We showed that the gain and shrinking operators may be used for image deblurring, and discussed the limitations of the methodology.
CHAPTER 1
INTRODUCTION

1.1 Motivations of the Research

In the past decade, wavelets have stimulated great interest in many fields of applied mathematics and engineering, generating a significant amount of research activity. Within these fields, traditional theories and techniques have been reevaluated to explore opportunities for wavelet applications. Although continuous-time wavelets possess mathematical elegance, discrete-time wavelet transforms are of special importance for practical applications. Among them, discrete-time orthogonal wavelet transforms have gained popularity. The coefficients of such a transform constitute a complete representation of a signal without redundancy. Complete representations have produced promising results in image compression and denoising. However, there are applications where complete representations are disadvantageous. Understanding the strengths and weaknesses of various representations and the relationships among them is essential both for satisfying intellectual inquiries and for developing new methodologies. In this thesis, we considered both the class of Fourier-based representations and the class of wavelet-based representations, with emphasis on the discrete-time domain. We found that overcomplete wavelet representations served to bridge the two classes. The merit of a particular representation should be judged by its capability to facilitate solving problems. Two important but difficult problems, segmentation of textured images and image deblurring, were selected as benchmarks. Accordingly, we value a representation by its capability to extract certain features from signals, its efficacy in representing an operation, and its invariance under certain operations.
With an understanding of the various representations and the problems at hand, we developed a framework of discrete-time overcomplete wavelet representation. The framework included construction algorithms and prototype filters, a time-frequency interpretation and three operations. Algorithms associated with gain operators, shrinking operators and envelope detectors were developed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. We demonstrated that envelopes used as feature vectors performed well in segmenting both natural and synthetic textures. For the deblurring application, our formulation extended the wavelet-vaguelette inversion into the discrete-time domain and was applicable to more blurring sources.

1.2 Review of Related Work

On the general topic of wavelet theory, a brief history and comprehensive review were included in Jawerth and Sweldens' survey [32]. The papers by Daubechies [11] and Mallat [42, 41] were considered influential for the recent developments. The paper by another wavelet pioneer, Strang [59], was also explanatory. Orthonormal wavelet packet bases, a generalization of orthogonal wavelets, were introduced by Coifman and Meyer [10]. The book by Daubechies [13] included in-depth mathematical discussions and real design examples. The book by Vaidyanathan [63], written for a signal processing audience, provided much insight into the relationship between wavelets and multirate filter banks. The lack of translation invariance of discrete-time orthogonal wavelets was recognized as a major shortcoming for pattern recognition by many researchers [41, 59]. Various proposals were made to address the issue. Multiscale edge representations were investigated by Mallat and Zhong [44], Mallat and Hwang [43], and Berman and Baras [5]. Steerable filters with shiftability were studied by Simoncelli et al. [58].
An orthogonal representation with invariance of the subband structure was proposed by Pesquet et al. [52].
Using the energy distribution in subbands of a wavelet packet decomposition as signatures for texture classification was presented by Laine and Fan [36]. A similar approach was also reported by Chang and Kuo [8]. Earlier results using envelopes of the overcomplete wavelet packet representation for texture segmentation were reported by Laine and Fan [35]. A simplified version of Chapter 5 of this thesis was later published [37]. The overcomplete wavelet representation was also used by Unser [62] with an ad hoc feature extraction algorithm. A methodology combining Markov random fields and the orthogonal wavelet packet transform was described by Bello [4]. The nonlinear wavelet shrinkage for denoising was due to Donoho and Johnstone [17]. The approach was applied to power spectrum estimation [48] and to medical image processing [1]. Nonlinear denoising based on the wavelet maxima representation by Mallat and Hwang shared some similar ideas [43], although tracing singularities through scale space turned out to be much more difficult, especially in a two-dimensional space. Manipulating wavelet representations to enhance certain features is obviously a good idea. Jawerth et al. [31] reported image enhancement using weight factors to modify the coefficients of orthogonal wavelet transforms. Laine et al. [38] demonstrated various approaches for the enhancement of digital mammographic images. Fan and Laine [20] pointed out the connection between linear gain enhancement using overcomplete dyadic wavelet representations and unsharp masking, and introduced a nonlinear function for contrast enhancement. However, due to the lack of an objective definition of image quality, those enhancement algorithms were mostly ad hoc and hard to evaluate. On the other hand, image deblurring is considered to be a well-defined problem. The approach adopted by Banham et al.
[3] is to decompose the linear minimum mean square error (LMMSE) filters into the orthogonal transform domain, with the advantage of reduced reliance on the assumption of global stationarity. However, their proposal was basically linear filtering, and no optimal subband selection was
considered. The mathematical framework of the wavelet-vaguelette inversion was due to Donoho [14, 16]. Its limitations were that it was derived in the continuous-time domain and was inapplicable to some common blurring sources such as Gaussian and uniform blurs.

1.3 Overview of the Thesis

This thesis is organized as follows: Chapter 2 reviews signal representations derived from the Fourier transform and the wavelet transform in the context of being operated upon by a convolution operator. The scrutiny reveals that complete representations suffer from a lack of translation invariance and from the aliasing-enhancement anomaly, and are thus inappropriate for certain applications. Chapter 3 presents the framework of the overcomplete wavelet representation with two particular construction algorithms and prototype filters. The time-frequency localization property of overcomplete wavelet representations is examined under the uncertainty principle. Chapter 4 introduces gain, shrinking and envelope operators applicable to an overcomplete wavelet representation. Algorithms for determining specific parameters are detailed, with examples and comparisons shown. Both Chapters 5 and 6 are applications of the theory of overcomplete wavelet representations developed in the previous chapters. Chapter 5 deals with the topic of image texture segmentation. The segmentation problem is formulated in the paradigm of spatial-frequency analysis, and a texture feature based on the concept of power density distribution is proposed. Chapter 6 addresses the problem of image deblurring using the gain operator for deconvolution and the shrinking operator for nonlinear denoising. Chapter 7 concludes the research and proposes future directions.
CHAPTER 2
SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS

2.1 Introduction

For a given signal x, it is often beneficial to transform (map) x into another form (domain), denoted x → X. X is called the representation of x in the transform domain. Instead of directly dealing with the original signal x, we may deal with its representation X. By doing so we usually transform a particular problem into a different form and make it easier to solve. Clearly, different applications demand different representations. The applications considered in this thesis are texture image segmentation and image deblurring. Accordingly, translation invariance and the efficacy of representations are the most important factors. Both properties may be revealed by their behavior under a convolution operator. With these in mind, we shall review Fourier transforms and wavelet transforms of one-dimensional signals. To facilitate our discussions, we use the following basic definitions:

Definition 2.1.1 (Translation Operator) For a continuous function x(t), a translation operator is defined as T_s[x(t)] = x(t − s); for a discrete function x(n), a translation operator is defined as T_d[x(n)] = x(n − d).

Definition 2.1.2 (Convolution Operator) For a continuous function x(t), a convolution operator is defined as T_f[x(t)] = x(t) * f(t) = ∫_{−∞}^{∞} x(τ) f(t − τ) dτ; for a discrete function x(n), a convolution operator is defined as T_f[x(n)] = x(n) * f(n) = ∑_{m=−∞}^{∞} x(m) f(n − m). The functions f(t) and f(n) are called the kernel. Notice that a translation operator is a special convolution operator with kernel f(t) = δ(t − s)
(f(n) = δ(n − d)). Therefore, we will use T for both translation and convolution operators.

Definition 2.1.3 (Eigenfunction and Eigenvalue) For a linear operator T, if there exists a function e such that T[e] = λe, where λ is a constant, the function e is called an eigenfunction of T, and the value λ is called the eigenvalue.

Definition 2.1.4 (Translation Invariant) For a given signal x, a translation operator T and a mapping operator M, if the two operators commute, such that M[T(x)] = T[M(x)], the mapping is called translation invariant, and the representation X is said to be a translation invariant representation.

2.2 Fourier Transforms

Frequency-domain representation by the Fourier transform has played a major role in signal analysis and linear system theory [51, 49, 63]. The efficacy of the Fourier representation is due to the fact that its basis functions, the complex exponentials, are eigenfunctions of a convolution operator [49, page 39]. For a linear convolution operator T_f, the eigenequation is: T_f[e^{jωt}] = F(ω) e^{jωt}, where the eigenvalue F(ω) is the Fourier transform of f(t). An almost identical form can be derived for the discrete-time Fourier transform. Fourier representations are particularly powerful for signal deblurring, since they transform a deconvolution problem into the simple form of divisions. However, Fourier representations may not be good at capturing frequency evolution over time. As a time variable is added to Fourier transforms, the basis functions of short-time Fourier transforms are no longer eigenfunctions of a convolution operator.
2.2.1 Fourier Transform (FT)

The Fourier transform pair may be written as:

X(ω) = ∫_{−∞}^{∞} x(t) e^{−jωt} dt (FT),
x(t) = (1/2π) ∫_{−∞}^{∞} X(ω) e^{jωt} dω (Inverse FT).

A linear convolution operator T_f can be characterized by

y(t) = T_f[x(t)] = (1/2π) ∫_{−∞}^{∞} X(ω) T_f[e^{jωt}] dω = (1/2π) ∫_{−∞}^{∞} X(ω) F(ω) e^{jωt} dω,

or, Y(ω) = X(ω) F(ω). This shows that the Fourier representation of the convolution operator T_f is the Fourier transform of its kernel. The beauty of this representation is that in the Fourier domain, a convolution (deconvolution) is simply a multiplication (division) of the two representations. For a translation operator with kernel f(t) = δ(t − s), F(ω) = e^{−jωs}, and therefore |Y(ω)| = |X(ω)|, which means that the magnitude of the Fourier transform is translation invariant.

2.2.2 Discrete-time Fourier Transform (DTFT)

The discrete-time Fourier transform pair may be written as [49]:

X(e^{jω}) = ∑_{n=−∞}^{∞} x(n) e^{−jωn} (DTFT),
x(n) = (1/2π) ∫_{−π}^{π} X(e^{jω}) e^{jωn} dω (Inverse DTFT).

Similar to its continuous counterpart,

y(n) = T_f[x(n)] = (1/2π) ∫_{−π}^{π} X(e^{jω}) F(e^{jω}) e^{jωn} dω,

or, Y(e^{jω}) = X(e^{jω}) F(e^{jω}). For a translation operator with f(n) = δ(n − d), F(e^{jω}) = e^{−jωd}, and thus |Y(e^{jω})| = |X(e^{jω})|.
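The discrete-time convolution theorem above can be checked numerically: for finite-length sequences, a zero-padded FFT gives exact samples of the DTFT, so Y = X·F holds sample by sample. A minimal sketch (the signal values, lengths, and shift used here are arbitrary choices for illustration, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(32)          # finite-length signal x(n)
f = rng.standard_normal(8)           # convolution kernel f(n)
y = np.convolve(x, f)                # y(n) = (x * f)(n), length 32 + 8 - 1

# Sample the DTFT of each sequence on a common grid by zero-padding to the
# length of y; for finite sequences the FFT then gives exact DTFT samples.
N = len(y)
X = np.fft.fft(x, N)
F = np.fft.fft(f, N)
Y = np.fft.fft(y, N)

# Convolution theorem: Y(e^{jw}) = X(e^{jw}) F(e^{jw})
assert np.allclose(Y, X * F)

# Translation kernel f(n) = delta(n - d) leaves the DTFT magnitude unchanged.
d = 5
xp = np.concatenate([x, np.zeros(d)])    # zero-padded so the shift is linear
xd = np.roll(xp, d)                      # x(n - d)
assert np.allclose(np.abs(np.fft.fft(xd)), np.abs(np.fft.fft(xp)))
```

The zero-padding length N ≥ len(x) + len(f) − 1 is what makes the circular convolution implied by the FFT product coincide with the linear convolution.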
2.2.3 Short-time Fourier Transform (STFT)

In order to capture the frequency evolution of a nonstationary signal, a window w(t) is introduced into the Fourier transform. The short-time Fourier transform pair may be written as:

X(ω, t) = ∫_{−∞}^{∞} x(τ) w(t − τ) e^{−jωτ} dτ (STFT),
x(t) = (1/(2π w(0))) ∫_{−∞}^{∞} X(ω, t) e^{jωt} dω (Inverse STFT),

where we assumed w(0) ≠ 0. For a convolution operator T_f,

T_f[w(t − τ) e^{jωt}] = F(ω, t − τ) e^{jωt},

where F(ω, t) is the STFT representation of the kernel f(t). This means that in general the windowed complex exponentials are no longer eigenfunctions of a convolution operator. Consequently, the STFT representation of y(t) = T_f[x(t)] is another convolution of either

Y(ω, t) = ∫_{−∞}^{∞} F(ω, t − τ) x(τ) e^{−jωτ} dτ,

or,

Y(ω, t) = ∫_{−∞}^{∞} X(ω, t − τ) f(τ) e^{−jωτ} dτ.

For a translation operator with f(t) = δ(t − s), Y(ω, t) = X(ω, t − s) e^{−jωs}, and thus |Y(ω, t)| = |X(ω, t − s)|. Therefore, the magnitude of the STFT is a translation-invariant representation.

2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT)

The discrete-time short-time Fourier transform pair may be written as [49, 53]:

X(e^{jω}, n) = ∑_{k=−∞}^{∞} x(k) w(n − k) e^{−jωk}, (DTSTFT) (2.1)
x(n) = (1/(2π w(0))) ∫_{−π}^{π} X(e^{jω}, n) e^{jωn} dω. (Inverse DTSTFT) (2.2)
For a discrete convolution operator T_f and y(n) = T_f[x(n)], we have

Y(e^{jω}, n) = ∑_{k=−∞}^{∞} X(e^{jω}, n − k) f(k) e^{−jωk},

or,

Y(e^{jω}, n) = ∑_{k=−∞}^{∞} F(e^{jω}, n − k) x(k) e^{−jωk}.

For a translation operator T_f with f(n) = δ(n − d), |Y(e^{jω}, n)| = |X(e^{jω}, n − d)|. Thus, the magnitude of the DTSTFT is also a translation-invariant representation.

2.2.5 Connection Between the Short-time Fourier Transform and Filter Banks

The discrete-time short-time Fourier transform X(e^{jω}, n) is a two-dimensional function of a continuous variable ω and an integer variable n. Is it possible to keep only finitely many frequency samples X(e^{jω_k}, n) and still be able to recover the original signal x(n)? It turns out to be possible by carefully designing channel filters [63, 53]. For a given frequency ω_k, the STFT coefficients of (2.1) can be rewritten as

X(e^{jω_k}, n) = e^{−jω_k n} ∑_{m=−∞}^{∞} x(m) w(n − m) e^{jω_k (n − m)} = e^{−jω_k n} [x(n) * v_k(n)],

where v_k(n) = w(n) e^{jω_k n}. This means that a frequency sample of the STFT can be computed using a filter with frequency response V_k(ω) = W(ω − ω_k) and a complex multiplier e^{−jω_k n}. Note that |X(e^{jω_k}, n)| = |x(n) * v_k(n)|. For a filter bank consisting of M such channels satisfying

∑_{k=0}^{M−1} V_k(ω) = ∑_{k=0}^{M−1} W(ω − ω_k) = 1, (2.3)

the original function x(n) can be recovered by replacing the integral of (2.2) with the following summation:

∑_{k=0}^{M−1} X(e^{jω_k}, n) e^{jω_k n} = ∑_{k=0}^{M−1} [x(n) * v_k(n)] = x(n) * [∑_{k=0}^{M−1} v_k(n)] = x(n).

Figure 2.1 shows a filter bank structure for the analysis-synthesis process of the STFT.
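Condition (2.3) can be demonstrated with a concrete window. The sketch below uses w(n) = sinc(n/M)/M, an assumed window chosen only for illustration (not one designed in the thesis). Since ∑_k W(ω − 2πk/M) = M ∑_m w(mM) e^{−jωmM}, the condition ∑_k V_k(ω) = 1 holds exactly when w(0) = 1/M and w(mM) = 0 for m ≠ 0, which this window satisfies even after truncation:

```python
import numpy as np

M = 8                                    # number of channels, w_k = 2*pi*k/M
n = np.arange(-64, 65)                   # finite window support
w = np.sinc(n / M) / M                   # w(0) = 1/M and w(mM) = 0 for m != 0

# Channel filters v_k(n) = w(n) e^{j w_k n}; summing over k telescopes to a
# unit impulse because sum_k e^{j 2 pi k n / M} is M for n = 0 mod M, else 0.
v = [w * np.exp(2j * np.pi * k * n / M) for k in range(M)]
v_sum = np.sum(v, axis=0)
delta = (n == 0).astype(float)
assert np.allclose(v_sum, delta, atol=1e-12)

# Consequently the channel outputs x(n) * v_k(n) sum back to x(n).
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = np.sum([np.convolve(x, vk) for vk in v], axis=0)
# The impulse sits at offset 64 of the filter support, so strip that delay.
assert np.allclose(y[64:64 + len(x)].real, x, atol=1e-9)
assert np.allclose(y[64:64 + len(x)].imag, 0, atol=1e-9)
```

Note that NumPy's `sinc` is the normalized sinc sin(πx)/(πx), which is what places the exact zeros at integer multiples of M.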
Figure 2.1: A Filter Bank Implementation of STFT.

2.2.6 Generalized Short-time Fourier Transform

We already pointed out that the STFT magnitude can be computed by a filter bank without modulators and demodulators (see Figure 2.1). We can further relax the constraint that the channel filters are frequency-shifted versions of a single lowpass filter v(n), and define a generalized short-time Fourier transform pair as

w_k(n) = ∑_{m=−∞}^{∞} x(m) v_k(n − m), 0 ≤ k ≤ M − 1,
x(n) = ∑_{k=0}^{M−1} w_k(n).

2.3 Wavelet Transforms

Wavelet transforms have become a powerful tool for analyzing nonstationary signals in recent years. They have several major advantages over the short-time Fourier transform. First, wavelet transforms fix the ratio of bandwidth to center frequency (called constant-Q) of the channel frequency responses, instead of the frequency resolution (determined by the bandwidth of the window) as in the STFT. In other words, wavelet transforms use a narrow-bandwidth window for low frequency signals and a wide-
bandwidth window for high frequency signals. The idea behind this is that low frequency components represent slow changes in the time domain, for which time resolution is not critical while higher resolution in the frequency domain is more desirable. In the opposite case, a wide-bandwidth window for high frequency components means better resolution in the time domain to capture abrupt changes. Second, the recent development of wavelet packet transforms and other signal-adaptive wavelet-type transforms is much more flexible and powerful in exploiting the time-frequency concept for wide-ranging applications.

2.3.1 Continuous Wavelet Transform (CWT)

For a continuous-time signal x(t), a continuous wavelet transform pair may be written as [41, 12]:

X(a, τ) = ∫_{−∞}^{∞} x(t) ψ*_a(t − τ) dt (CWT),
x(t) = ∫_0^∞ [X(a, ·) * ψ_a](t) da/a² (Inverse CWT),

where x* denotes the complex conjugate of x, ψ_a(t) = (1/√a) ψ(t/a), and ψ(t) is called the basic wavelet. The parameter τ is the translation factor and a is the dilation factor. Informally speaking, when a increases, the function ψ_a(t − τ) expands and takes long-time behavior into account. In the opposite case, when a decreases, the function ψ_a(t − τ) contracts and focuses only on short-time behavior. In order to recover the original signal from its CWT, the wavelet ψ(t) must satisfy the admissibility condition. However, a wavelet function ψ_a(t) is generally not an eigenfunction of a convolution operator T, as

T_f[ψ_a(t)] = f(t) * ψ_a(t)

does not equal λψ_a(t).
Since the inverse CWT of (2.5) can be rewritten as:

x(t) = ∫_{a=0}^{∞} [X(a, t) * ψ_a(t)] da/a²,

where * denotes convolution in the variable t, we have:

T_f[x(t)] = ∫_0^∞ [(f(t) * X(a, t)) * ψ_a(t)] da/a².

If we denote W as the operator of the continuous wavelet transform, the above result shows:

W T_f[x(t)] = ∫_{−∞}^{∞} f(ξ) X(a, τ − ξ) dξ = T_f W[x(t)].

Therefore, the CWT is a translation invariant representation.

2.3.2 Discrete Wavelet Transform (DWT)

The discrete wavelet transform pair on real orthogonal wavelet bases may be written as [63]:

μ_j(n) = ∫_{−∞}^{∞} x(t) 2^{−j/2} ψ(2^{−j} t − n) dt (DWT),
x(t) = ∑_{j=0}^{+∞} ∑_{n=−∞}^{+∞} μ_j(n) 2^{−j/2} ψ(2^{−j} t − n) (Inverse DWT),

where the functions {2^{−j/2} ψ(2^{−j} t − n)} constitute an orthonormal basis. Notice that the DWT maps a continuous-time function x(t) into discrete-time sequences μ_j(n). Clearly, it is not a translation invariant representation. As in the case of the CWT, the basis functions 2^{−j/2} ψ(2^{−j} t − n) generally are not eigenfunctions of convolution operators. However, T_f[2^{−j/2} ψ(2^{−j} t − n)] may be decomposed in the DWT space:

T_f[2^{−j/2} ψ(2^{−j} t − n)] = ∑_{l=0}^{+∞} ∑_{m=−∞}^{+∞} η_{j,l}(n, m) 2^{−l/2} ψ(2^{−l} t − m),

where the transform coefficients may be calculated by:

η_{j,l}(n, m) = ∫_{−∞}^{∞} T_f[2^{−j/2} ψ(2^{−j} t − n)] 2^{−l/2} ψ(2^{−l} t − m) dt
= ∫_{−∞}^{∞} [∫_{−∞}^{∞} f(ξ) 2^{−j/2} ψ(2^{−j}(t − ξ) − n) dξ] 2^{−l/2} ψ(2^{−l} t − m) dt.
Figure 2.2: Fast Discrete-time Wavelet Transform: First Two Levels. Note: down arrow = decimation and up arrow = expansion.

We can derive the DWT representation of T_f[x(t)] to be

T_f[x(t)] = ∑_{j=0}^{+∞} ∑_{n=−∞}^{+∞} μ_j(n) T_f[2^{−j/2} ψ(2^{−j} t − n)]
= ∑_{l=0}^{+∞} ∑_{m=−∞}^{+∞} [∑_{j=0}^{+∞} ∑_{n=−∞}^{+∞} μ_j(n) η_{j,l}(n, m)] 2^{−l/2} ψ(2^{−l} t − m).

Therefore, {η_{j,l}(n, m)} can be seen as the representation of the operator T_f under the basis {2^{−j/2} ψ(2^{−j} t − n)}.

2.3.3 Discrete-time Wavelet Transform (DTWT)

The discrete-time wavelet transform (DTWT) cannot be obtained by sampling the continuous-time wavelet transforms, nor by simply mimicking the continuous ones. Assuming a discrete "mother wavelet" v(n), a discrete-time version of the dilation, v(2^k n − m), does not produce a frequency response V(e^{j 2^k ω}). Moreover, a dilation V(e^{j 2^k ω}) in the frequency domain is generally not a bandpass function but a multiband function. The fundamental difference lies in both the 2π-periodic characteristic of the frequency response and the integer indexing constraint of discrete-time series. A fast algorithm based on the concept of multiresolution was found by S. Mallat [42] and is illustrated in Figure 2.2. If we consider only one level, we have

d_0(n) = ∑_{m=−∞}^{∞} x(m) g(2n − m), (2.6)
c_0(n) = ∑_{m=−∞}^{∞} x(m) h(2n − m), (2.7)
x(n) = ∑_{l=−∞}^{∞} c_0(l) h̃(n − 2l) + ∑_{l=−∞}^{∞} d_0(l) g̃(n − 2l). (2.8)

Therefore, the decomposition filters h(n) and g(n) and the reconstruction filters h̃(n) and g̃(n) must satisfy:

∑_m g(2n − m) g̃(m − 2l) = δ(n − l),
∑_m h(2n − m) h̃(m − 2l) = δ(n − l),
∑_m g(2n − m) h̃(m − 2l) = 0, (2.9)
∑_m h(2n − m) g̃(m − 2l) = 0,
∑_l h̃(n − 2l) h(2l − m) + ∑_l g̃(n − 2l) g(2l − m) = δ(m − n).

It can be proved (see Appendix A) that in the frequency domain those conditions are equivalent to:

G(e^{jω}) G̃(e^{jω}) + G(e^{j(ω+π)}) G̃(e^{j(ω+π)}) = 2,
H(e^{jω}) H̃(e^{jω}) + H(e^{j(ω+π)}) H̃(e^{j(ω+π)}) = 2,
G(e^{jω}) H̃(e^{jω}) + G(e^{j(ω+π)}) H̃(e^{j(ω+π)}) = 0, (2.10)
H(e^{jω}) G̃(e^{jω}) + H(e^{j(ω+π)}) G̃(e^{j(ω+π)}) = 0,
H̃(e^{jω}) H(e^{jω}) + G̃(e^{jω}) G(e^{jω}) = 2,
H̃(e^{jω}) H(e^{j(ω+π)}) + G̃(e^{jω}) G(e^{j(ω+π)}) = 0.

One possible choice is to relate h(n), g(n), h̃(n) and g̃(n) by:

g(n) = (−1)^n h̃(1 − n) ⇔ G(e^{jω}) = e^{−jω} H̃*(e^{j(ω+π)}),
h̃(n) = h*(−n) ⇔ H̃(e^{jω}) = H*(e^{jω}), (2.11)
g̃(n) = g*(−n) ⇔ G̃(e^{jω}) = G*(e^{jω}).

In this case, the resulting basis is called an orthogonal basis. Otherwise, the resulting basis is called a biorthogonal basis. It can be proved (see Appendix A) that the structure of Figure 2.2 is equivalent to a filter bank with the transform written as [63]:

d_k(n) = ∑_{m=−∞}^{∞} x(m) v_k(2^{k+1} n − m), 0 ≤ k ≤ M − 2,
c_{M−1}(n) = ∑_{m=−∞}^{∞} x(m) v_{M−1}(2^{M−1} n − m), (DTWT)
x(n) = ∑_{k=0}^{M−2} ∑_{m=−∞}^{∞} d_k(m) u_k(n − 2^{k+1} m) + ∑_{m=−∞}^{∞} c_{M−1}(m) u_{M−1}(n − 2^{M−1} m) (Inverse DTWT).

In particular, for an orthonormal basis, it can be proved (see Appendix A) that u_k(n) = v*_k(−n), and

∑_m u_k(m − n_k l) u*_c(m − n_c n) = δ(k − c) δ(l − n). (2.12)
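The perfect reconstruction conditions (2.10) and the one-level pair (2.6)-(2.8) can be sanity-checked with the orthogonal Haar filters that reappear in Figure 2.6. A sketch (the even-length test signal is an arbitrary choice; the analysis/synthesis sums are written in their pairwise form, which is what (2.6)-(2.8) reduce to for two-tap Haar filters):

```python
import numpy as np

# Orthogonal Haar pair; with this normalization H~H + G~G = 2 holds.
h = np.array([1.0, 1.0]) / np.sqrt(2)    # analysis lowpass h(n)
g = np.array([1.0, -1.0]) / np.sqrt(2)   # analysis highpass g(n) = (-1)^n h(1-n)

def dtft(f, w):
    # DTFT samples of a causal finite filter f on the frequency grid w
    nn = np.arange(len(f))
    return np.sum(f[None, :] * np.exp(-1j * w[:, None] * nn), axis=1)

wgrid = np.linspace(0, 2 * np.pi, 64, endpoint=False)
H, G = dtft(h, wgrid), dtft(g, wgrid)
Hp, Gp = dtft(h, wgrid + np.pi), dtft(g, wgrid + np.pi)
# For real h, the synthesis choice h~(n) = h(-n) gives H~ = conj(H), so the
# frequency-domain PR conditions (2.10) read:
assert np.allclose(np.conj(H) * H + np.conj(G) * G, 2)
assert np.allclose(np.conj(H) * Hp + np.conj(G) * Gp, 0, atol=1e-12)

# One analysis/synthesis level on an even-length signal.
rng = np.random.default_rng(2)
x = rng.standard_normal(64)
c0 = (x[0::2] + x[1::2]) / np.sqrt(2)    # pairwise form of (2.7)
d0 = (x[0::2] - x[1::2]) / np.sqrt(2)    # pairwise form of (2.6)
xr = np.empty_like(x)
xr[0::2] = (c0 + d0) / np.sqrt(2)        # pairwise form of (2.8)
xr[1::2] = (c0 - d0) / np.sqrt(2)
assert np.allclose(xr, x)                # perfect reconstruction
```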
Figure 2.3: Generalized Discrete-time Filter Bank Transform.

Since the DTWT can be included in the general framework to be presented next, we leave more in-depth discussions to the coming section.

2.4 Generalized Discrete-time Filter Bank Transforms

The previous sections showed that both the STFT and the DTWT can be implemented by a filter bank structure. If we ignore the modulator (e^{−jω_k n}) and the demodulator (e^{jω_k n}) of the STFT, the discrete-time versions of the two can be included in a generalized form:

w_k(n) = ∑_{m=−∞}^{∞} x(m) v_k(n_k n − m), 0 ≤ k ≤ M − 1, (GDTFBT) (2.13)
x(n) = ∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) u_k(n − n_k m) (Inverse GDTFBT). (2.14)

It was called the generalized filter bank transform [63]. Figure 2.3 illustrates the structure of the GDTFBT. Depending on n_k, v_k(n) and u_k(n), the GDTFBT includes the following special cases.

1) Orthonormal wavelet representations: n_k = 2^{k+1} for 0 ≤ k ≤ M − 2 and n_{M−1} = n_{M−2}. In this case, the basis functions possess the orthonormality of (2.12).
Figure 2.4: Example of a General Binary-tree-structured Filter Bank of Five Channels.

2) Orthonormal wavelet packet representations. This is a generalization of the orthonormal wavelet obtained by applying the recursive decomposition-reconstruction structure to all branches. Therefore, n_k = 2^L for 0 ≤ k ≤ 2^L − 1.

3) Biorthogonal wavelet (packet) representations. The same as the orthonormal counterpart except that the constraints (2.11) are lifted.

4) Generalized binary-tree-structured filter bank representations. This is a further generalization of the (bi)orthogonal wavelet (packets). Every channel may be split into two with filters satisfying (2.9). Therefore, every n_k is a power of 2. Moreover, the filters used at one channel (node) may differ from those at another. Figure 2.4 shows an example. It is proved in Appendix A that the basis functions constructed this way satisfy the biorthonormality expressed by:

∑_m u_c(m − n_c l) v_k(n_k n − m) = δ(c − k) δ(l − n). (2.15)

5) Overcomplete wavelet representations. In this case, no downsampling or upsampling is used, which means n_k = 1 for all channels in Figure 2.3. As an example, the overcomplete wavelet transform corresponding to Figure 2.4 is shown in Figure 2.5.
Figure 2.5: The Overcomplete Wavelet Transform for the Example of Figure 2.4.

Representations 1-4 above are called "complete." The reason for calling the representations of category 5 "overcomplete" or "redundant" is that they contain more analyzing functions ({v_k(n)}) than a basis. Since these analyzing and synthesizing functions must satisfy

∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} u_k(m − l) v_k(n − m) = δ(l − n), or ∑_{k=0}^{M−1} U_k(e^{jω}) V_k(e^{jω}) = 1,

they must be linearly dependent. In other words, some functions must be expressible by the others. Complete and overcomplete representations behave rather differently under a convolution operator.

2.4.1 Convolution Operators on Complete Representations

In general, a basis function u_k(n − n_k m) is not an eigenfunction of a convolution operator. Using the same idea as in the case of the DWT, a convolution operator T_f can be represented in terms of the basis functions {u_k(n − n_k m)}:

T_f[u_k(n − n_k m)] = ∑_{c=0}^{M−1} ∑_{b=−∞}^{∞} η_{k,c}(m, b) u_c(n − n_c b),

η_{k,c}(m, b) = ∑_{l=−∞}^{∞} ∑_{i=−∞}^{∞} f(l) u_k(i − l − n_k m) v_c(n_c b − i). (2.16)

It can be proved that u_k(n − n_k m) is an eigenfunction of the operator T_f if and only if η_{k,c}(m, b) = λ δ(k − c) δ(m − b).
Proof: If η_{k,c}(m, b) = λ δ(k − c) δ(m − b), then

T_f[u_k(n − n_k m)] = ∑_{c=0}^{M−1} ∑_{b=−∞}^{∞} λ δ(k − c) δ(m − b) u_c(n − n_c b) = λ u_k(n − n_k m).

In the other direction, if T_f[u_k(n − n_k m)] = λ u_k(n − n_k m), then

η_{k,c}(m, b) = ∑_{n=−∞}^{∞} T_f[u_k(n − n_k m)] v_c(n_c b − n) = ∑_{n=−∞}^{∞} λ u_k(n − n_k m) v_c(n_c b − n) = λ δ(k − c) δ(m − b),

where we utilized the property (2.15).

The set of coefficients {η_{k,c}(m, b)} is the representation of the operator T_f in the basis and is independent of the input signal. Since a convolution operation on a sequence x(n) can be derived as:

T_f[x(n)] = ∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) T_f[u_k(n − n_k m)]
= ∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) ∑_{c=0}^{M−1} ∑_{b=−∞}^{∞} η_{k,c}(m, b) u_c(n − n_c b)
= ∑_{c=0}^{M−1} ∑_{b=−∞}^{∞} [∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) η_{k,c}(m, b)] u_c(n − n_c b),

the representation of T_f[x(n)] is thus

α_c(b) = ∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) η_{k,c}(m, b). (2.17)
2.4.2 Convolution Operators on Overcomplete Representations

The overcomplete representation of a convolution operator T_f can be obtained from (2.16) with n_c = 1 and n_k = 1:

η_{k,c}(m, b) = ∑_{l=−∞}^{∞} ∑_{i=−∞}^{∞} f(l) u_k(i − l − m) v_c(b − i) = (f * u_k * v_c)(b − m). (2.18)

Accordingly, the overcomplete representation of T_f[x(n)] can be obtained from (2.17):

α_c(n) = ∑_{k=0}^{M−1} ∑_{m=−∞}^{∞} w_k(m) η_{k,c}(n − m) = f(n) * w_c(n), (2.19)

which shows that the discrete overcomplete wavelet transform operator and a convolution operator are commutable, and thus overcomplete wavelet representations are translation invariant. To verify this, assume an overcomplete wavelet transform operator W and a translation operator T_f with f(n) = δ(n − d); the overcomplete wavelet representation of T_f[x(n)] is:

W T_f[x(n)] = T_f W[x(n)] = T_f[{w_k(n)}] = {w_k(n − d)}.

2.4.3 Approximation of Convolution Operators on Overcomplete Representations

Taking the discrete-time Fourier transform of (2.18) gives

η̂_{k,c}(e^{jω}) = F(e^{jω}) U_k(e^{jω}) V_c(e^{jω}),

where η̂_{k,c}(e^{jω}) is the discrete-time Fourier transform of η_{k,c}(n). If the passbands of the filters U_k(e^{jω}) and V_c(e^{jω}) have little overlap, the following approximation is justified:

U_k(e^{jω}) V_c(e^{jω}) ≈ δ(k − c) U_c(e^{jω}) V_c(e^{jω}).
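The commutativity in (2.19) is easy to observe numerically: with n_k = 1 (no subsampling), each subband of a shifted input is exactly the shifted subband. A sketch using an undecimated one-level two-channel split (the filter pair below is an assumed illustration, not a filter designed in the thesis):

```python
import numpy as np

# Undecimated (overcomplete) one-level split: n_k = 1, no subsampling.
h = np.array([1.0, 1.0]) / 2.0           # assumed lowpass v_0(n)
g = np.array([1.0, -1.0]) / 2.0          # assumed highpass v_1(n)

def analyze(x):
    # w_k(n) = (x * v_k)(n), full linear convolution per channel
    return [np.convolve(x, h), np.convolve(x, g)]

rng = np.random.default_rng(3)
x = rng.standard_normal(128)
d = 7
x_shift = np.concatenate([np.zeros(d), x])   # T_d[x](n) = x(n - d)

for w, w_shift in zip(analyze(x), analyze(x_shift)):
    # W T_d[x] = T_d W[x]: each subband of the shifted input is exactly the
    # shifted subband, with zeros filling the first d samples.
    assert np.allclose(w_shift[d:], w)
    assert np.allclose(w_shift[:d], 0)
```

The same experiment run on a decimated bank fails whenever d is not a multiple of the subsampling factor, which is the translation-variance discussed above.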
Moreover, if the function F(e^{jω}) is real and the passbands of the channels c are narrow enough, we may further approximate η̂_{k,c}(e^{jω}) as:

η̂_{k,c}(e^{jω}) ≈ λ_c δ(k − c) U_c(e^{jω}) V_c(e^{jω}), or, η_{k,c}(n) ≈ λ_c δ(k − c) u_c(n) * v_c(n),

where λ_c is a real number. In this case, (2.19) may be well approximated by:

α_c(n) ≈ ∑_{k=0}^{M−1} λ_c δ(k − c) w_k(n) * u_c(n) * v_c(n) = λ_c w_c(n). (2.20)

This means that a convolution operator with a real frequency response F(e^{jω}) may be approximated by a set of real multipliers {λ_k}. This is another advantage of overcomplete representations, since we found that the approximation is generally not extendible to complete representations.

2.4.4 Aliasing Enhancement on Complete Representations

For complete representations, the subsampling usually does not meet the Shannon sampling theorem and thus produces aliasing distortions. Those aliasing components are canceled only by specially designed filters at the reconstruction stage. If we manipulate the representation in a way like (2.20), the aliasing distortions may not be canceled. For a concrete example, we consider the wavelet transform of Figure 2.2 with only one level. If we multiply the representations d_0(n) and c_0(n) by k_d and k_c respectively, we can rewrite Equations (2.6)-(2.8) in the frequency domain:

D′_0(e^{jω}) = (1/2) k_d [X(e^{jω/2}) G(e^{jω/2}) + X(e^{j(ω+2π)/2}) G(e^{j(ω+2π)/2})],
C′_0(e^{jω}) = (1/2) k_c [X(e^{jω/2}) H(e^{jω/2}) + X(e^{j(ω+2π)/2}) H(e^{j(ω+2π)/2})],
X′(e^{jω}) = (1/2) [k_d G(e^{jω}) G̃(e^{jω}) + k_c H(e^{jω}) H̃(e^{jω})] X(e^{jω})
+ (1/2) [k_d G(e^{j(ω+π)}) G̃(e^{jω}) + k_c H(e^{j(ω+π)}) H̃(e^{jω})] X(e^{j(ω+π)}).

The second term is the aliasing component. Unless k_d = k_c, the second term is generally not equal to zero even though the filters meet the perfect reconstruction
conditions (2.10). In this case, the system is not a linear time-invariant (LTI) system, but rather a linear periodically time-varying (LPTV) system [63]. It is easy to recognize that the system cannot be characterized by the familiar form Y(e^{jω}) = M(e^{jω}) X(e^{jω}). Figure 2.6 shows an example of such an aliasing-enhancement effect.

Figure 2.6: Example of the Aliasing-enhancement Anomaly, where: (a) the original sinusoid signal; (b) magnitude of the DFT of signal (a); (c) the reconstructed signal with orthogonal Haar filters (H(e^{jω}) = 2^{−1/2}(1 + e^{−jω}), G(e^{jω}) = 2^{−1/2}(1 − e^{−jω})), one-level decomposition and k_d = 6, k_c = 1; (d) magnitude of the DFT of signal (c).
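The experiment of Figure 2.6 can be reproduced in a few lines: a pure sinusoid is passed through a one-level decimated Haar transform, only the detail band is amplified (k_d = 6, k_c = 1), and the reconstruction acquires substantial energy at the mirror frequency π − ω_0, which the unmodified bank would have canceled. A sketch (the signal length and frequency bin are arbitrary choices):

```python
import numpy as np

N = 256
k0 = 32                                  # input frequency bin (w0 = 2*pi*k0/N)
n = np.arange(N)
x = np.sin(2 * np.pi * k0 * n / N)

# One-level decimated Haar analysis (pairwise form of Eqs. (2.6)-(2.7)).
c0 = (x[0::2] + x[1::2]) / np.sqrt(2)    # approximation band
d0 = (x[0::2] - x[1::2]) / np.sqrt(2)    # detail band
kd, kc = 6.0, 1.0                        # detail amplified, as in Figure 2.6

# Reconstruction (2.8) with the scaled subbands.
xr = np.empty(N)
xr[0::2] = (kc * c0 + kd * d0) / np.sqrt(2)
xr[1::2] = (kc * c0 - kd * d0) / np.sqrt(2)

X = np.abs(np.fft.rfft(x))
Xr = np.abs(np.fft.rfft(xr))
mirror = N // 2 - k0                     # aliased frequency pi - w0

assert X[mirror] < 1e-6                  # no mirror-frequency energy originally
assert Xr[mirror] > 0.3 * Xr[k0]         # strong aliasing component appears
```

Setting kd = kc = 1 restores perfect reconstruction and the mirror peak vanishes, which is exactly the k_d = k_c condition stated above.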
2.5 Summary and Discussion

We have reviewed signal representations based on Fourier and wavelet transformations and included them in a general framework called the generalized filter bank transform. We studied their behavior under a convolution operator. The study revealed that the complex sinusoids of the Fourier transform are the only basis functions in this class that are eigenfunctions of the operator. In that case, a convolution operator is equivalent to multiplications in the representation (frequency) domain, and the inverse operator can be simply expressed as divisions. For overcomplete representations, a convolution operator may be approximated by multiplying coefficients by real factors. However, such an approximation cannot be extended to complete representations due to aliasing distortions. Moreover, we observed that discrete complete representations also lack translation invariance. Based on the characteristics of these representations, we may find the best applications for them.

1) Complete representations may be good for compression and progressive transmission. The absence of redundancy alone is a major advantage for those applications. Throwing away some coefficients may be seen as a zeroing operation, which does not enhance aliasing components. After all, we are prepared to accept some distortions in those applications.

2) Overcomplete representations possess many desirable properties for restoration, enhancement, feature extraction and time delay estimation [52]. First, they are translation invariant, which is critical for pattern recognition. Second, they do not have aliasing distortions. Third, they allow a much higher degree of flexibility in filter design.
CHAPTER 3
OVERCOMPLETE WAVELET REPRESENTATIONS

3.1 Introduction

The paradigm of overcomplete wavelet representations is a versatile and powerful tool shown to be advantageous for certain applications. In this chapter, we shall study overcomplete wavelet representations in detail, including the decomposition and reconstruction algorithms, filter selection and time-frequency interpretation.

3.2 Overcomplete Wavelet Transforms and Filters

In general, a discrete-time overcomplete wavelet transform (DTOWT) pair can be obtained by setting n_k to 1 in Equations (2.13-2.14), rewritten in the following:

w_k(n) = Σ_{m=-∞}^{∞} v_k(m) x(n−m),  0 ≤ k ≤ M−1,  (DTOWT)  (3.1)

x(n) = Σ_{k=0}^{M−1} Σ_{m=-∞}^{∞} w_k(m) u_k(n−m)  (Inverse DTOWT).  (3.2)

More importantly, we need an algorithm to generate the analyzing and synthesizing filters v_k(n) and u_k(n). In this thesis, we restrict ourselves to the binary tree algorithm illustrated in the previous chapter, and consider two particular classes: overcomplete wavelet packets and dyadic wavelets.

3.2.1 Overcomplete Wavelet Packet Representations

The construction algorithm of overcomplete wavelet packet representations corresponds to a complete binary tree, as shown in Figure 3.1. Though each tree node represents an analyzing filter, only a subset of all tree nodes is needed to achieve perfect reconstruction. Notice that the position and the width of each rectangular
Figure 3.1: Binary Tree for Overcomplete Wavelet Packet Decomposition: (a) the tree, with level-2 nodes N_{2,0}, N_{2,1}, N_{2,2}, N_{2,3}; (b) illustration of the frequency scaling of the prototype filters H(ω), G(ω), H(2ω) and G(2ω).

node also represents the channel's position and bandwidth in the frequency domain. In general, the prototype filters H(ω) and G(ω) in each level can be distinct, as proved in Appendix A, as long as they satisfy the condition H(ω)H̃(ω) + G(ω)G̃(ω) = 1. The frequency response of each node can be calculated recursively:

P_0(ω) = H(ω),  P_1(ω) = G(ω);  N_{0,0}(ω) = 1;  N_{l+1,m}(ω) = P_c(2^l ω) N_{l,⌊m/2⌋}(ω),  l ≥ 0,  (3.3)

where the subscript c of the filter P_c(ω) follows the periodic pattern 0, 1, 1, 0, and can be calculated by c = (m + ⌊m/2⌋) mod 2. The purpose of this ordering is to make the index m of the node N_{l,m} proportional to its central frequency. The reconstruction from two child nodes to their parent node relies on the property

P_{k mod 2}(ω) P̃_{k mod 2}(ω) + P_{(k+1) mod 2}(ω) P̃_{(k+1) mod 2}(ω) = H(ω)H̃(ω) + G(ω)G̃(ω) = 1.
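The ordering rule c = (m + ⌊m/2⌋) mod 2 can be checked directly; the sketch below is ours (the function name is not from the thesis):

```python
def child_filter_index(m: int) -> int:
    """Subscript c of the filter P_c that produces node N_{l+1,m} from its
    parent N_{l, m//2}; follows the periodic pattern 0, 1, 1, 0 in m."""
    return (m + m // 2) % 2

# Filter indices for the eight level-3 nodes, ordered by central frequency:
pattern = [child_filter_index(m) for m in range(8)]
print(pattern)  # -> [0, 1, 1, 0, 0, 1, 1, 0]
```

The alternation 0, 1, 1, 0 compensates for the spectral flipping that the highpass branch introduces, so that m increases with the node's central frequency.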
There are many possible choices of filter sets {v_k(n)} from the tree that achieve perfect reconstruction. For example, for the tree shown in Figure 3.1 with height two [55, page 449], five sets are available: {N_{0,0}}, {N_{1,0}, N_{1,1}}, {N_{1,0}, N_{2,2}, N_{2,3}}, {N_{2,0}, N_{2,1}, N_{1,1}} and {N_{2,0}, N_{2,1}, N_{2,2}, N_{2,3}}. In general, let T(h) be the number of selections from a complete binary decomposition tree of height h. Since the tree may be split into two subtrees of height h−1, each in turn having T(h−1) selections, and the root alone is one additional selection, we have the recursive formula

T(h) = [T(h−1)]² + 1,  h ≥ 1,  with T(0) = 1.

The first five numbers are T(1) = 2, T(2) = 5, T(3) = 26, T(4) = 677 and T(5) = 458330.

All quadrature mirror filters (QMF), suitable for the orthogonal wavelet transform, are also suitable for overcomplete wavelet packet transforms. However, for real QMFs other than the Haar wavelet, I. Daubechies proved that the linear-phase and finite impulse response (FIR) properties are mutually exclusive [13]. For example, Daubechies wavelet filters [11] are FIR but not linear phase. On the other hand, Battle-Lemarié wavelet filters are symmetric but not FIR; they may be written as [42]

H_p(ω) = 2^{−p} [ Σ_{2p}(ω) / Σ_{2p}(2ω) ]^{1/2},  (3.4)

where p is a positive integer and Σ_{2p}(ω) = Σ_{n=−∞}^{+∞} 1/(ω + 2nπ)^{2p}. Note: An algorithm for numerical computation of Battle-Lemarié wavelets is included in Appendix B.

3.2.2 Dyadic Wavelet Representations

The building structure of a dyadic wavelet transform is a particular subset of the overcomplete wavelet packet transform. It decomposes the low-frequency branch only (assuming H(ω) is lowpass), as shown in Figure 3.2.
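The recursion for T(h) can be evaluated in a few lines (the function name is ours):

```python
def num_selections(h: int) -> int:
    """Number of perfect-reconstruction subsets selectable from a complete
    binary decomposition tree of height h: T(h) = T(h-1)**2 + 1, T(0) = 1."""
    t = 1
    for _ in range(h):
        t = t * t + 1
    return t

print([num_selections(h) for h in range(1, 6)])  # -> [2, 5, 26, 677, 458330]
```

The doubly exponential growth explains why an exhaustive search over configurations quickly becomes impractical for deep trees.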
Figure 3.2: Binary Tree for Dyadic Wavelet Decomposition.

Obviously, all filters suitable for overcomplete wavelet packet transforms are suitable for dyadic wavelet transforms. However, one may want to exploit the extra freedom created by the more relaxed constraint on the filters. The following filter class [44, 20] possesses a distinct property in the time domain:

H(ω) = e^{−jpω/2} cos^{2n+p}(ω/2),  H̃(ω) = H*(ω),
G(ω) = (j)^p e^{−jpω/2} sin^{2−p}(ω/2),  G̃(ω) = (−j)^p e^{jpω/2} sin^p(ω/2) Σ_{m=0}^{2n+p−1} cos^{2m}(ω/2),  (3.5)

where n ≥ 0 is an integer and p ∈ {0, 1}. In fact, for p = 0 the filter g(−1) = −0.25, g(0) = 0.5, g(1) = −0.25 is proportional to a discrete Laplacian operator, and for p = 1 the filter g(0) = 0.5, g(1) = −0.5 is proportional to a discrete gradient operator. Notice that filters within this class are linear phase and FIR, but not orthogonal. For p = 0, the filters H, H̃, G and G̃ and their frequency scalings (e.g. H(2^l ω)) are all symmetric. For p = 1, different prototype filters may be used for the frequency scalings to minimize the spatial shifting caused by the phase factor. If we denote by H_l(ω), H̃_l(ω), G_l(ω) and G̃_l(ω) the filters used at level l ≥ 1, we find that the following filters are either symmetric or antisymmetric as well as FIR:

H_l(ω) = cos^{2n+1}(2^{l−2}ω),  H̃_l(ω) = H_l(ω),
G_l(ω) = j sin(2^{l−2}ω),  G̃_l(ω) = G_l*(ω) Σ_{m=0}^{2n} cos^{2m}(2^{l−2}ω).
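The perfect reconstruction condition for this class is easy to verify numerically. The sketch below instantiates p = 0, n = 1, a choice made only for illustration; it checks H(ω)H̃(ω) + G(ω)G̃(ω) = 1 on a frequency grid and confirms that G(ω) = sin²(ω/2) is the Laplacian-type filter quoted above:

```python
import numpy as np

# Filter class (3.5) with p = 0, n = 1 (illustrative instance):
# H(w) = cos^2(w/2), G(w) = sin^2(w/2), Ht = H,
# Gt = sum_{m=0}^{2n+p-1} cos^{2m}(w/2).
w = np.linspace(-np.pi, np.pi, 1001)
n, p = 1, 0
H = np.cos(w / 2) ** (2 * n + p)
G = np.sin(w / 2) ** (2 - p)
Ht = H
Gt = sum(np.cos(w / 2) ** (2 * m) for m in range(2 * n + p))

# Perfect reconstruction: H*Ht + G*Gt = 1 at every frequency.
assert np.allclose(H * Ht + G * Gt, 1.0)

# G(w) = sin^2(w/2) = 0.5 - 0.5 cos(w), i.e. taps (-0.25, 0.5, -0.25).
assert np.allclose(G, 0.5 - 0.5 * np.cos(w))
```

The same check works for p = 1 after inserting the phase factors of (3.5).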
Figure 3.3: Examples of the Function Θ_{a,b}(ω).

In both cases, the frequency response of tree node N_{l,k} (l > 0 and k ∈ {0, 1}) can be derived as:

N_{l,0}(ω) = e^{−jpω/2} Π_{m=0}^{l−1} cos^{2n+p}(2^{m−1}ω) = e^{−jpω/2} [ sin(2^{l−1}ω) / (2^l sin(ω/2)) ]^{2n+p},

N_{l,1}(ω) = (j)^p e^{−jpω/2} sin^{2n+2}(2^{l−2}ω) / [ 2^{l−1} sin(ω/2) ]^{2n+p}.

For convenience, we define a function Θ_{a,b}(ω) as

Θ_{a,b}(ω) = [ sin(2^{a−1}ω) / (2^a sin(ω/2)) ]^b.

For b ≥ 4, the function Θ_{a,b}(ω) is approximately Gaussian, as plotted in Figure 3.3. Similarly, we can define a reconstruction tree with nodes Ñ_{l,1}(ω) = Ñ_{l−1,0}(ω)G̃_l(ω). A dyadic wavelet representation consists of leaf nodes. For example, a three-level representation consists of the nodes {N_{3,0}, N_{3,1}, N_{2,1}, N_{1,1}}. In terms of filter banks, they can be viewed as multiple channels. The channel frequency responses can be derived as:

C_{l,0}(ω) = N_{l,0}(ω)Ñ_{l,0}(ω) = Π_{m=1}^{l} H_m(ω)H̃_m(ω) = Θ_{l,4n+2p}(ω),

C_{l,1}(ω) = N_{l,1}(ω)Ñ_{l,1}(ω) = N_{l−1,0}(ω)Ñ_{l−1,0}(ω)G_l(ω)G̃_l(ω)
= N_{l−1,0}(ω)Ñ_{l−1,0}(ω) [1 − H_l(ω)H̃_l(ω)] = Θ_{l−1,4n+2p}(ω) − Θ_{l,4n+2p}(ω).

The channel frequency responses for a three-level representation are {C_{3,0}, C_{3,1}, C_{2,1}, C_{1,1}}. Notice that the channel frequency responses C_{l,1}(ω) are an approximate difference of two Gaussians (DOG) [47, page 63]. Two examples of channel filters are shown in Figure 3.6.

3.2.3 Filters of Autocorrelation Functions of Wavelets

From a filtering point of view, the idea is very simple. Since no downsampling-upsampling takes place in overcomplete wavelet transforms, we can simply combine decomposition and reconstruction into a single filtering step. To achieve this, one only needs to change the prototype filters into:

H_a(ω) = H(ω)H̃(ω),
G_a(ω) = G(ω)G̃(ω).  (3.6)

The perfect reconstruction condition now takes the form

H_a(ω) + G_a(ω) = 1.

As H(ω) is connected to a wavelet, H_a(ω) is connected to an autocorrelation function of the wavelet. For a rigorous mathematical treatment, refer to Saito's Ph.D. thesis [57]. In this case, a reconstruction step is simply a summation of all coefficients, as (3.2) degenerates to

x(n) = Σ_{k=0}^{M−1} w_k(n),

which takes the same form as the generalized STFT. The most important property is that the filters H_a(ω) and G_a(ω) are always symmetric or antisymmetric, thus providing a simple way to construct symmetric FIR filters for certain applications.
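As a sketch of this structure, the following one-level example uses the autocorrelation pair built from the p = 0, n = 1 class above (an illustrative choice, not a filter from the thesis's experiments): decomposition is plain convolution, and reconstruction is just the sum of the two channels.

```python
import numpy as np

# Autocorrelation-type prototype pair (3.6) from the p = 0, n = 1 class:
# Ha(w) = cos^4(w/2) <-> binomial taps [1,4,6,4,1]/16; Ga(w) = 1 - Ha(w).
ha = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
ga = -ha.copy()
ga[2] += 1.0            # Ga = delta - ha in the time domain

def decompose(x):
    """One analysis step: plain convolutions, no downsampling."""
    xe = np.pad(x, 2, mode="reflect")       # symmetric border extension
    wl = np.convolve(xe, ha, mode="valid")  # lowpass channel
    wh = np.convolve(xe, ga, mode="valid")  # bandpass channel
    return wl, wh

# Reconstruction degenerates to a plain sum of the channel outputs.
rng = np.random.default_rng(0)
x = np.sin(0.2 * np.arange(50)) + 0.3 * rng.standard_normal(50)
wl, wh = decompose(x)
assert np.allclose(wl + wh, x)
```

Because ha + ga equals a unit impulse, the two channels sum back to the input exactly, with no synthesis filtering step.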
3.3 Connection to Complete Wavelet Representations

In Appendix A, Proposition A.0.3 builds the bridge between an overcomplete wavelet representation and a complete wavelet representation. If both representations use exactly the same prototype filters H(ω), G(ω), H̃(ω) and G̃(ω), the only difference lies in the decimation. In this case, the complete wavelet representation is a subset of the corresponding overcomplete wavelet representation, due to decimation. If the prototype filters used by an overcomplete wavelet representation do not satisfy (2.10), there is no corresponding complete wavelet representation.

3.4 Time-frequency Interpretation and the Uncertainty Principle

The set of transform coefficients {w_k(n) | 0 ≤ k ≤ M−1} admits a time-frequency interpretation: each channel k covers the frequency band of its analyzing filter, while the time resolution of the channel is determined by the time-domain distribution of the v_k(n)'s. The shorter the distribution is, the higher the time resolution it can achieve. Unfortunately, high resolution in the time and frequency domains are conflicting goals, as demonstrated in Figure 3.5. In this example, we saw that as the passband narrowed, the impulse responses spread, and consequently the boundary between
Figure 3.4: Two Examples of the Overcomplete Wavelet Representation. Legend: (a) An ideal square waveform; (b) A linear chirp signal s(n) = 5cos(0.02πn + 0.001πn²). In each image, the top half is the signal waveform, and the bottom half is the (0-255)-scaled overcomplete wavelet representation (coefficients) of the 64 leaf nodes of level six. For the bottom half, the horizontal axis is time and the vertical axis is frequency.
two impulses blurred. This phenomenon reflects a fundamental principle called the uncertainty principle [21, 51]. Time and frequency localization of a discrete lowpass function f(n) can be mathematically described by two quantities,

σ_t² = (1/E) Σ_n (n − n̄)² f(n)²,  σ_ω² = (1/(2πE)) ∫_{−π}^{π} ω² |F(e^{jω})|² dω,

where E = Σ_n f(n)² = (1/2π) ∫_{−π}^{π} |F(e^{jω})|² dω. The uncertainty factor U = σ_t² σ_ω² is bounded below by [E − |F(e^{jπ})|²]² / (4E²), and thus for filters with F(e^{jπ}) = 0, U ≥ 0.25. That the uncertainty factor has a lower bound is significant; this is the uncertainty principle. It basically states that we cannot have filters with arbitrarily narrow bandwidth in the frequency domain and arbitrarily short duration in the time domain. As a result, we cannot achieve arbitrarily precise time and frequency localization simultaneously. (Numerical computation of uncertainty factors for filter banks is discussed in Appendix C.) Figure 3.6 shows examples of analyzing filters of the overcomplete wavelet transform and their uncertainty factors. The uncertainty factors of the analyzing filters of the dyadic wavelet show that they are indeed very close to Gaussian functions. In general, the results show that the uncertainty factor varies from channel to channel, and from level to level. As a statistical measure, we used the maximum, minimum and mean to describe higher levels. Such statistics for higher-level overcomplete wavelet packets are presented in Table 3.1, where "Regular" stands for the regular Lemarié filters of (3.4) and "Autocorrelation" stands for autocorrelation functions of Lemarié filters. We may observe some trends from the data on Lemarié-filter-based analyzing filters of overcomplete wavelet packet representations:
Figure 3.5: Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations. Where: Row 1: A signal consisting of two pure-tone impulses with frequencies ω = 0.3π and ω = 0.32π: the spectrum (Col 1) and the waveform (Col 3). Rows 2-5: level-2 representation. Rows 6-13: level-3 representation. For rows 2-13: column 1, frequency responses {N_{l,k}, 0 ≤ k < 2^l}; column 2, impulse responses of column 1 (dotted lines approximate envelopes); column 3, overcomplete wavelet representation. The prototype filter used was the Lemarié filter with p = 1.
Figure 3.6: Uncertainty Factors (drawn on top of each band) of Channel Filters. Where: Row 1: Overcomplete wavelet packet of level 2 using the Lemarié filter with p = 1 (Col 1) and p = 9 (Col 2); Row 2: Overcomplete wavelet packet of level 3 using the Lemarié filter with p = 1 (Col 1) and p = 9 (Col 2); Row 3: Channel filters {C_{4,0}, C_{4,1}, C_{3,1}, C_{2,1}, C_{1,1}} of the dyadic wavelet using the filter class of (3.5) with n = 1 (Col 1) and p = 1, n = 1 (Col 2).
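A direct numerical evaluation of the uncertainty factor U = σ_t²·σ_ω² can be sketched as follows. The helper is ours (a dense FFT stands in for the frequency-domain integral), and the test filter is an illustrative binomial lowpass, not one of the filters shown in the figure:

```python
import numpy as np

def uncertainty_factor(f, nfft=16384):
    """U = var_t * var_w for a real FIR lowpass filter f; the spectral
    mean frequency is taken as 0 (lowpass assumption)."""
    f = np.asarray(f, float)
    E = np.sum(f ** 2)
    n = np.arange(len(f))
    nbar = np.sum(n * f ** 2) / E
    var_t = np.sum((n - nbar) ** 2 * f ** 2) / E
    w = 2 * np.pi * np.fft.fftfreq(nfft)        # frequencies in [-pi, pi)
    F2 = np.abs(np.fft.fft(f, nfft)) ** 2
    var_w = np.sum(w ** 2 * F2) / np.sum(F2)    # Riemann sum of the integral
    return var_t * var_w

# The binomial filter [1,4,6,4,1]/16 has F(e^{j*pi}) = 0, so U must lie
# above the lower bound 0.25.
u = uncertainty_factor(np.array([1, 4, 6, 4, 1]) / 16)
print(round(u, 3))
```

For this filter U comes out only slightly above 0.25, consistent with its nearly Gaussian shape.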
Table 3.1: Uncertainty Factors of Analyzing Filters

                     Regular                     Autocorrelation
 p   level    U_max     U_min    U_mean     U_max     U_min    U_mean
 1     4      1.618     0.350    0.701      0.803       -        -
 1     5        -         -        -          -         -        -
 1     6     16.721     0.318    4.535      5.693     0.262    0.980
 2     4      1.011     0.437    0.628      1.178     0.470    0.584
 2     5      3.464     0.397    0.954      0.871     0.333    0.432
 2     6      7.984     0.320    1.809      2.736     0.260    0.475
 3     4      1.283     0.529    0.674      1.349     0.496    0.646
 3     5      2.386     0.418    0.704      0.721     0.329    0.398
 3     6      5.244     0.318    1.005      1.605     0.261    0.346

1) The joint time-frequency localization of analyzing filters built from low-order prototypes tends to deteriorate quickly as the level increases. This is due to the existence of small sidelobes. 2) Using filters of autocorrelation functions is more effective than using a higher-order regular filter, if the no-reconstruction structure is acceptable to the application. However, one should be very careful in using uncertainty factors as an optimality measure, for two main reasons. First, how can we come up with a single sensible number for a filter bank? Minimum U_max, or minimum U_mean, or something else? Second, and more fundamentally, the uncertainty factor does not take the signal into account. For configurations with similar uncertainty factors, as in the case of Gaussian-like filters such as (3.5), the uncertainty-factor measure alone would not tell us which one is optimal.

3.5 Two-dimensional Extensions

So far we have only discussed one-dimensional representations. In order to apply overcomplete wavelet transforms to two-dimensional images, we need to extend the transform to two dimensions. The following two separable extensions can be implemented as separate 1D filtering of rows and columns.
1) Tensor product extension [42]. This extension was originally applied to the orthogonal wavelet transform. Only the 1D filters H(ω) and G(ω) are used. It directly exploits the perfect reconstruction property of quadrature mirror filters by extending

H(ω)H̃(ω) + G(ω)G̃(ω) = 1

into

[H(ω_x)H̃(ω_x) + G(ω_x)G̃(ω_x)] [H(ω_y)H̃(ω_y) + G(ω_y)G̃(ω_y)] = 1.

The above 2D extension can be rewritten as

HH(ω_x,ω_y)H̃H̃(ω_x,ω_y) + HG(ω_x,ω_y)H̃G̃(ω_x,ω_y) + GH(ω_x,ω_y)G̃H̃(ω_x,ω_y) + GG(ω_x,ω_y)G̃G̃(ω_x,ω_y) = 1,

where

HH(ω_x,ω_y) = H(ω_x)H(ω_y),  H̃H̃(ω_x,ω_y) = H̃(ω_x)H̃(ω_y),
HG(ω_x,ω_y) = H(ω_x)G(ω_y),  H̃G̃(ω_x,ω_y) = H̃(ω_x)G̃(ω_y),
GH(ω_x,ω_y) = G(ω_x)H(ω_y),  G̃H̃(ω_x,ω_y) = G̃(ω_x)H̃(ω_y),
GG(ω_x,ω_y) = G(ω_x)G(ω_y),  G̃G̃(ω_x,ω_y) = G̃(ω_x)G̃(ω_y).

Filters {HH, HG, GH, GG} are used for decomposition, while filters {H̃H̃, H̃G̃, G̃H̃, G̃G̃} are used for reconstruction. The same construction method is applied to the frequency scalings of higher levels, with each node decomposed into four child nodes.

2) Two-orientational extension. For the particular dyadic wavelet filters (3.5), with their emphasis on spatial operations (gradient and Laplacian), the component GG of the tensor product method may not be useful. Therefore, the following 2D extension was proposed by Mallat [44]:

HH(ω_x,ω_y) = H(ω_x)H(ω_y),  H̃H̃(ω_x,ω_y) = H̃(ω_x)H̃(ω_y),
GX(ω_x,ω_y) = G(ω_x),  G̃X(ω_x,ω_y) = G̃(ω_x)L(ω_y),
GY(ω_x,ω_y) = G(ω_y),  G̃Y(ω_x,ω_y) = L(ω_x)G̃(ω_y).

This 2D extension has only three components:
• a horizontal component GX
• a vertical component GY
• a lower-resolution component HH

However, it introduces a new filter L(ω), which is determined by the condition

HH·H̃H̃ + GX·G̃X + GY·G̃Y = 1.

It turns out that the formula for L(ω) is

L(ω) = [1 + H(ω)H̃(ω)] / 2.

For the filters of (3.5), L(ω) = 0.5 [1 + cos^{4n+2p}(ω/2)], which is a symmetric FIR filter.
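The closed form for L(ω) can be verified numerically. The sketch below uses the p = 0, n = 1 instance of (3.5) (an illustrative assumption), for which H(ω)H̃(ω) = cos⁴(ω/2):

```python
import numpy as np

# Two-orientational extension: check HH*HtHt + GX*GtX + GY*GtY = 1 on a grid.
w = np.linspace(-np.pi, np.pi, 257)
wx, wy = np.meshgrid(w, w)

HHt = lambda v: np.cos(v / 2) ** 4        # H(w)*Ht(w) for p = 0, n = 1
GGt = lambda v: 1 - np.cos(v / 2) ** 4    # G(w)*Gt(w) = 1 - H(w)*Ht(w)
L   = lambda v: 0.5 * (1 + np.cos(v / 2) ** 4)

total = (HHt(wx) * HHt(wy)                # HH * HtHt
         + GGt(wx) * L(wy)                # GX * GtX
         + L(wx) * GGt(wy))               # GY * GtY
assert np.allclose(total, 1.0)
```

Writing a = H(ω_x)H̃(ω_x) and b = H(ω_y)H̃(ω_y), the sum is ab + (1−a)(1+b)/2 + (1+a)(1−b)/2 = 1, which is what the grid check confirms.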
CHAPTER 4
OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS

4.1 Introduction

In most cases, a suitable signal representation is not the end of a problem-solving process. On the contrary, how to manipulate it to achieve a desired effect is the more challenging part. For overcomplete wavelet representations, we have identified three operations for our applications in image deblurring and segmentation: gain, shrinkage and envelope operators. We demonstrated that a linear convolution may be approximated by a gain operator, and devised a tree pruning algorithm based on a tolerance on the approximation error. We extended the wavelet shrinkage proposed by Donoho and Johnstone [17] to overcomplete wavelet representations for denoising. We built the connection between the power density distribution of an overcomplete wavelet representation and its envelopes, and presented two envelope detection algorithms.

4.2 Gain Operators

In Section 2.4.3, we determined that a convolution operator can be approximated by multiplying the coefficients of each channel by real factors {λ_k | 0 ≤ k ≤ M−1}. The
reconstructed signal x'(n) in the frequency domain is simply

X'(ω) = Σ_{k=0}^{M−1} λ_k W_k(ω)U_k(ω) = X(ω) Σ_{k=0}^{M−1} λ_k V_k(ω)U_k(ω),

and thus the equivalent filter is

Q(ω) = [λ_0 λ_1 ⋯ λ_{M−1}] [C_0(ω) C_1(ω) ⋯ C_{M−1}(ω)]^t,

where [λ_0 λ_1 ⋯ λ_{M−1}] is the gain vector, C_k(ω) = V_k(ω)U_k(ω) are the channel filters, and M^t denotes the matrix transpose of M. Two fundamental issues immediately arise: 1) What are the characteristics of the set of filters expressible by a gain operator? In other words, what kind of filters may be approximated by a gain operator? A broad answer is that a gain operator can only represent a symmetric filter. This is because the channel filters {C_k(ω) | 0 ≤ k ≤ M−1} are themselves symmetric, and so is any linear combination of them with real gains. 2) How should the gain vector be chosen so that Q(ω) best approximates a given frequency response F(ω)? A natural criterion is the mean square error between F(ω) and Q(ω).
It is well known that the optimal gain vector corresponding to the minimum mean square error may be found by setting all partial derivatives to zero:

(1/2π) ∫_{−π}^{π} [ F(ω) − Σ_{k=0}^{M−1} λ_k C_k(ω) ] C_q(ω) dω = 0,  0 ≤ q ≤ M−1,

which is a set of linear equations:

Σ_{k=0}^{M−1} [ (1/2π) ∫_{−π}^{π} C_k(ω)C_q(ω) dω ] λ_k = (1/2π) ∫_{−π}^{π} F(ω)C_q(ω) dω,  0 ≤ q ≤ M−1.

The minimum mean square error can be derived as

E_min = E_f − Σ_{k=0}^{M−1} λ_k^o (1/2π) ∫_{−π}^{π} F(ω)C_k(ω) dω,

where {λ_k^o | 0 ≤ k ≤ M−1} is the optimal gain vector and E_f = (1/2π) ∫_{−π}^{π} F²(ω) dω. Since E_min ≥ 0, it must be true that E_f ≥ Σ_{k=0}^{M−1} λ_k^o (1/2π) ∫_{−π}^{π} F(ω)C_k(ω) dω, and therefore the normalized error ε = E_min / E_f satisfies 0 ≤ ε ≤ 1.

The mean-square-error criterion provides a way to determine the optimal gain vector for a given frequency response F(ω) and a given configuration. However, the criterion alone suggests that the bottommost nodes are the best representation, for which ε may reach its minimum among all possible configurations. This can be formally proved by the following proposition.

Proposition 4.2.1 Splitting a channel into two will not increase the error measure ε.
Proof: By contradiction. Assume that for the current configuration {C_k(ω) | 0 ≤ k ≤ M−1} the optimal gain vector is {λ_k^o} with error ε, and that splitting channel q into channels q1 and q2 yields a minimum error ε′ > ε. Without loss of generality, we can write:

C_{q1}(ω) = C_q(ω) H(2^l ω) H̃(2^l ω),  C_{q2}(ω) = C_q(ω) G(2^l ω) G̃(2^l ω).

Since the prototype filters satisfy H(ω)H̃(ω) + G(ω)G̃(ω) = 1, we have C_{q1}(ω) + C_{q2}(ω) = C_q(ω). For the new configuration, we can find a gain vector {λ'_k = λ_k^o, k ≠ q1, k ≠ q2; λ'_{q1} = λ'_{q2} = λ_q^o} with error ε′ = ε. This contradicts the assumption that ε′ > ε, and therefore ε′ ≤ ε.

Examples showing that higher levels correspond to better approximations can be seen in Figure 4.1. Overcomplete wavelet packet representations with leaf nodes at levels 4, 5, 6 and 7 were selected, corresponding to 16, 32, 64 and 128 channels, respectively. The Lemarié filter with p = 3 was used.

4.2.2 Time-frequency Tradeoff and a Greedy Algorithm

Proposition 4.2.1 clearly states that the approximation error ε of a frequency response decreases as the channel bandwidth decreases, as demonstrated by the examples in Figure 4.1. Therefore, by this criterion alone, the Fourier transform would be the best representation, since its approximation error is zero. This is certainly not the answer we are looking for. What we have lost sight of is the other part of the picture, in the time domain. As dictated by the uncertainty principle, time resolution decreases as frequency resolution increases. An overcomplete wavelet packet representation using the deepest leaves would be close to a Fourier representation and thus lose its resolution in the time domain. Moreover, from a computational point of view, such a down-to-the-bottom decomposition is rather inefficient.
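On a discrete frequency grid, the normal equations for the optimal gain vector reduce to a small linear system. The sketch below uses two illustrative autocorrelation-type channels and an assumed target response F(ω); none of these choices come from the thesis's experiments:

```python
import numpy as np

# Least-squares gain vector on a frequency grid.
w = np.linspace(-np.pi, np.pi, 2048, endpoint=False)
C = np.stack([np.cos(w / 2) ** 4,            # channel filters C_k(w)
              1 - np.cos(w / 2) ** 4])
F = np.exp(-2 * w ** 2)                      # assumed target response

A = C @ C.T / len(w)       # Gram matrix: <C_k, C_q>
b = C @ F / len(w)         # right-hand side: <F, C_q>
lam = np.linalg.solve(A, b)                  # optimal gain vector

Ef = np.mean(F ** 2)
Emin = np.mean((F - lam @ C) ** 2)
# At the optimum, E_min = E_f - sum_k lam_k <F, C_k>:
assert np.allclose(Emin, Ef - lam @ b)
eps = Emin / Ef
assert 0.0 <= eps <= 1.0
print(lam, eps)
```

The identity checked in the assertion follows from substituting Aλ = b into the quadratic expansion of the error, mirroring the derivation of E_min above.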
Figure 4.1: Examples of Gain Vector Approximation (frequency range [0, π] shown). Where: Row 1: a given frequency response F(e^{jω}); Rows 2 to 5: approximated frequency responses by overcomplete wavelet packet representations of levels 4, 5, 6, 7, overlaid with F(e^{jω}) (dotted curve), with ε = 39.4%, 18.5%, 4.5%, 0.02%, respectively.
One possible tradeoff between time and frequency can be accomplished by setting a tolerance on the approximation error ε. Under an error tolerance, the goal is to find a tree configuration with a minimum number of leaves. In the following, we present a greedy algorithm that proceeds in breadth-first order [45]. Notice that a tree node here represents a channel filter C_{l,m}(ω), and the relationship between child nodes and their parent node is characterized by (4.3).

Algorithm 4.2.2 (Minimum Tree) Given a frequency response F(ω) and an error tolerance ε_0, let the current leaves be L = {C_{0,0}(ω) = 1} and compute the current error ε_L using the set L.
Step 1. One by one, compute the error ε_k obtained by temporarily substituting a node in the current leaves L with its two children. Choose the pair with minimum ε_k to replace their parent in the set L, and update the current error ε_L = min{ε_k}.
Step 2. If ε_L ≤ ε_0, stop and use the leaves L as the representation; else, go to Step 1.

Examples of overcomplete wavelet packet approximation using the minimum tree algorithm are shown in Figure 4.2. The improvement can be appreciated by comparing with Figure 4.1. In Figure 4.1, the representation using the thirty-two channels of level 5 has approximation error ε = 18.5%. However, for the same frequency response, the minimum tree algorithm found a representation of only twenty-six channels and yet reduced the approximation error to ε = 6%. In the other case, the minimum tree algorithm found a representation with merely seven channels and achieved an approximation error of 0.6%.

4.3 Shrinking Operators

Wavelet shrinkage was introduced by Donoho and Johnstone [17] for signal denoising and linear inverse problems. The description of the methodology was given in [15, page 173], which we quote: "Wavelet
Figure 4.2: Examples of the Minimum Tree Approximation with a Maximum Depth of Seven Levels. Where: Col 1: A given frequency response F_1(e^{jω}) (row 1, the same as in Figure 4.1), the approximation result overlaid with F_1 (row 2) with ε = 6%, and the 26 selected channels (row 3); Col 2: A given frequency response F_2(e^{jω}) (row 1), the approximation result overlaid with F_2 (row 2) with ε = 0.6%, and the 7 selected channels (row 3). The overcomplete wavelet packet with Lemarié filter H_1(e^{jω}) was used in both cases, and the frequency range [0, π] is shown.
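The minimum-tree search of Algorithm 4.2.2 can be sketched as follows. This is our own illustrative implementation: it uses autocorrelation-type channels (for which the two children of a node sum exactly to their parent) and a least-squares gain fit on a discrete frequency grid, rather than the thesis's Lemarié filters:

```python
import numpy as np

w = np.linspace(-np.pi, np.pi, 2048, endpoint=False)
Ha = lambda v: np.cos(v / 2) ** 4        # autocorrelation prototype pair:
Ga = lambda v: 1 - np.cos(v / 2) ** 4    # Ha + Ga = 1, children sum to parent

def fit_error(F, leaves):
    """Normalized least-squares error of the best gain fit for `leaves`."""
    C = np.stack([c for _, c in leaves])
    lam, *_ = np.linalg.lstsq(C.T, F, rcond=None)
    return np.mean((F - lam @ C) ** 2) / np.mean(F ** 2)

def minimum_tree(F, eps0, max_level=6):
    leaves = [(0, np.ones_like(w))]      # (level, channel response): the root
    err = fit_error(F, leaves)
    while err > eps0:
        best = None
        for i, (l, c) in enumerate(leaves):
            if l >= max_level:
                continue                  # depth limit
            trial = (leaves[:i]
                     + [(l + 1, c * Ha(2 ** l * w)),
                        (l + 1, c * Ga(2 ** l * w))]
                     + leaves[i + 1:])
            e = fit_error(F, trial)
            if best is None or e < best[0]:
                best = (e, trial)
        if best is None or best[0] >= err:
            break                         # no split improves the error
        err, leaves = best
    return leaves, err
```

Each pass of the while loop performs one greedy split (Step 1), and the loop condition plays the role of Step 2.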
shrinkage refers to reconstructions obtained by wavelet transformation, followed by shrinking the empirical wavelet coefficients towards zero, followed by inverse transformation." For a thorough introduction, refer to [15]. We extended the idea to overcomplete wavelet representations on the grounds that they differ from orthogonal wavelet representations mainly in having extra correlated coefficients. A shrinking operator K_t can be written as

K_t[w] = w − t,  w > t,
         0,      |w| ≤ t,   (4.4)
         w + t,  w < −t,

where t ≥ 0 is a threshold. Notice that the shrinking operator is a nonlinear, memoryless operator, and operates in the spatial domain. The goal of a shrinking operation is to cut off the noise while keeping the useful signal. Obviously, this is possible only if the amplitude of the desirable components is stronger than that of the noise components. Therefore, the threshold value t depends on the signal-to-noise ratio. Examples of choosing appropriate thresholds are discussed in Chapter 6. To demonstrate the shrinking operation and the advantage of wavelet denoising, an example is included in Figure 4.3. The major advantage of wavelet shrinkage denoising, as compared with linear smoothing, is that it diminishes noise but does not smear sharp edges, as seen in the example. A comparison with a linear averaging filter h = [1, 1, 1, 1, 1, 1, 1]/7 is presented in Figure 4.4.
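The shrinking operator (4.4) is a single line in vectorized form; the function name is ours:

```python
import numpy as np

def shrink(w, t):
    """Soft-threshold operator K_t of (4.4): pull each coefficient toward
    zero by t, and zero out everything with |w| <= t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

print(shrink(np.array([-3.0, -0.5, 0.0, 0.4, 2.0]), 1.0))
```

Applied channel by channel to the overcomplete coefficients, this realizes the denoising step before the summation (or filtering) reconstruction.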
Figure 4.3: An Example of Overcomplete Wavelet Denoising. Where: Col 1: A perfect impulse contaminated by white Gaussian noise (row 1) and its level-2 overcomplete wavelet packet representation (rows 2-5); Col 2: Overcomplete wavelet packet representation after shrinking operations (rows 2-5) and the reconstructed signal (row 1).
Figure 4.4: Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row 1, col 1) of Figure 4.3. Where: Row 1: Overcomplete wavelet shrinkage (left, the same as row 1, col 2 of Figure 4.3) and linear smoothing (right); Row 2: Local zoom-in with overlay of the original noisy signal.

average normalized power over a period T is thus

W = (1/T) ∫_0^T i²(t) dt = A²/2.

Therefore, the average normalized power of a harmonic signal is proportional to its amplitude squared. For an arbitrary waveform i(t), one can decompose it into harmonic components using Fourier analysis:

i(t) = ∫_{−∞}^{∞} I(f) e^{j2πft} df.

By the same token, the energy density (per unit time and per unit frequency) at frequency ω is proportional to the amplitude squared of the component I(ω)e^{j2πft}, which is |I(ω)|². The function |I(ω)|² describes the power distribution in the frequency domain and is called the power density spectrum [49, page 66]. The above concepts can be extended to overcomplete wavelet representations by extending the concept of amplitude to "envelope" [26, chapter 4], or "local amplitude." An overcomplete wavelet representation consists of bandpass components.
Since any bandpass waveform may be represented by A(t) sin(ω_c t + α(t)) [26, page 229], where A(t) is the envelope and ω_c is the associated carrier frequency, an overcomplete wavelet representation {w_k(t) | 0 ≤ k ≤ M−1} may be characterized by the envelopes of its components, whose squares describe its power density distribution.

4.4.1 Envelope Detection by Hilbert Transform

The envelope of a bandpass signal x(t) may be obtained as the magnitude of its analytic signal x̂(t) = x(t) + j x_h(t), where x_h(t) denotes the Hilbert transform of x(t).
Figure 4.5: Two Examples of the Power Density Distribution of Overcomplete Wavelet Packet Representations. Legend: (a) A linear chirp signal, the same as in Figure 3.4(b); (b) A multi-segment pure tone signal with frequencies ω_0 = 0.08π, ω_1 = 0.25π, ω_2 = 0.15π and ω_3 = 0.55π, in order of occurrence. The overcomplete wavelet representation and visual arrangements are the same as in Figure 3.4.
Figure 4.6: A FIR Hilbert Transformer. Legend: (a) the imaginary part of the frequency response (the real part is zero); (b) the nonzero coefficients h(n), with magnitudes 6.286127232e-1, 1.889891117e-1, 9.117551207e-2, 4.525349255e-2 and 1.946367728e-2, each occurring in an antisymmetric pair.

Therefore, the Fourier transform of the analytic signal x̂(t) is

X̂(ω) = 2X(ω) for ω > 0,  X(0) for ω = 0,  and 0 for ω < 0.

For implementations in the discrete-time domain, approximate FIR Hilbert transformers may be designed by windowing the ideal frequency response [49]. Figure 4.6 shows an example. It is a type-III FIR Hilbert transformer designed with parameters M = 18 and β = 2.629 [49, page 680]. This Hilbert transformer is antisymmetric, of length 19, and has only 10 nonzero coefficients.

4.4.2 Envelope Detection by Zero Crossings

In this method, the maximum absolute value between two adjacent zero crossings is first found, and then assigned to each point within the interval.

Algorithm 4.4.1 (Envelope by Zero Crossings) For a given array x(1:N), start from the index i = 1; find the next index k which is either a zero-crossing point, a zero-valued point x(k) = 0, or the other end point k = N, whichever is first encountered; assign the maximum absolute value max{|x(n)|, i ≤ n ≤ k} to the envelope samples e(i:k); set i = k and repeat until the end of the array.
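Algorithm 4.4.1 fits in a few lines; this is our sketch (the function name is not from the thesis):

```python
import numpy as np

def envelope_zero_crossing(x):
    """Between consecutive zero crossings (or exact zeros, or the array
    ends), assign the maximum absolute value of the segment."""
    x = np.asarray(x, float)
    if len(x) < 2:
        return np.abs(x)
    e = np.empty_like(x)
    start = 0
    for k in range(1, len(x)):
        # segment boundary: sign change, an exact zero, or the last sample
        if x[k - 1] * x[k] < 0 or x[k] == 0 or k == len(x) - 1:
            e[start:k + 1] = np.max(np.abs(x[start:k + 1]))
            start = k
    return e

print(envelope_zero_crossing([0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -3.0, -1.0]))
# -> [2. 2. 2. 2. 3. 3. 3. 3.]
```

The piecewise-constant output is what gives the method its edge-preserving character noted in the comparison below.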
4.4.3 Comparison of the Two Envelope Detectors

We compared the two envelope detectors on pure-tone impulses; both performed well in the midband (ω ∈ [0.1π, 0.7π]). While the Hilbert transform was unacceptable at the low end (ω < 0.1π), the zero-crossing approach did poorly at the high end (ω > 0.7π). The test cases are shown in rows 1 and 2 of Figure 4.7. We constructed a noisy impulse by introducing a random frequency perturbation,

s(n) = sin((ω_0 + 0.05 · RANDN(n)) n),

where RANDN (a MATLAB function) is random noise with normal distribution, and filtered it with a Lemarié filter H_1(ω). This case is included in row 3 of Figure 4.7. We found that the zero-crossing method exhibited edge-preserving smoothing characteristics and was more robust to wideband noise.

4.4.4 Comparison with Other Energy Operators

A useful and simple energy operator was analyzed in [46]. The operator Ψ for a discrete signal x(n) was given as

Ψ[x(n)] = x²(n) − x(n+1)x(n−1).

However, this operator differs from envelope detectors in that its value on a pure tone signal is not only proportional to the amplitude squared but is also a function of frequency [46]:

Ψ[A sin(ω_0 n + α)] = A² sin²(ω_0).

4.4.5 Two-dimensional Envelope Detection

Next we extended the 1D envelope detection algorithms to the analysis of two-dimensional image signals. In the frequency domain, a two-dimensional analytic
Figure 4.7: Comparison of the Two Envelope Detectors. Where: Col 1: Input signals (top row, ω_0 = 0.1π, ω_1 = 0.5π; center row, ω_0 = 0.05π, ω_1 = 0.75π; bottom row, noisy signal with ω_0 = 0.1π). Col 2: Envelopes detected via the 19-tap Hilbert transformer. Col 3: Envelopes detected via the zero-crossing method.

signal may be obtained by setting an appropriate half plane to zero, based on its orientation. That is, for a 2D signal f(x,y), the Fourier transform of an analytic signal f̂(x,y) is either

F̂(ω_x,ω_y) = 2F(ω_x,ω_y) for ω_x > 0, and 0 otherwise,

or

F̂(ω_x,ω_y) = 2F(ω_x,ω_y) for ω_y ≥ 0, and 0 otherwise.

For the 2D filters constructed by the tensor product method with halfband filters, the equivalent complex quadrature filters exhibit the frequency responses shown in Figure 4.8. Notice that the lowpass and diagonal components may have an alternate arrangement obtained by zeroing out the left or bottom half plane. This separable property allows one to compute the envelope of a 2D signal using the 1D algorithms described earlier in a straightforward manner, as described in the following.
Figure 4.8: Frequency Responses of the Equivalent Complex Quadrature Filters (level 1). Legend: (a) lowpass channel; (b) vertical channel; (c) horizontal channel; (d) diagonal channel. The diagonally shadowed areas identify the zeroed half planes.

Algorithm 4.4.2 (2D Envelope Detection) For a given 2D array w(m,n) and its orientation: if the orientation is horizontal, apply the 1D envelope detector columnwise; if the orientation is vertical, apply the 1D envelope detector rowwise; otherwise, apply the 1D envelope detector columnwise.
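The pure-tone identity of the energy operator Ψ from Section 4.4.4 can be verified numerically; the sketch and its names are ours:

```python
import numpy as np

def teager(x):
    """Discrete energy operator of Section 4.4.4:
    Psi[x](n) = x(n)^2 - x(n+1) * x(n-1)."""
    x = np.asarray(x, float)
    return x[1:-1] ** 2 - x[2:] * x[:-2]

# On a pure tone A*sin(w0*n + a), the output is the constant A^2 sin^2(w0),
# i.e. it depends on the frequency as well as the amplitude.
A, w0 = 2.0, 0.3
x = A * np.sin(w0 * np.arange(200) + 0.7)
psi = teager(x)
assert np.allclose(psi, A ** 2 * np.sin(w0) ** 2)
```

The exactness of the constant output follows from the product-to-sum identity sin(θ+ω)sin(θ−ω) = sin²θ − sin²ω, which is what distinguishes this operator from a true envelope detector.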
CHAPTER 5
APPLICATION I: TEXTURE SEGMENTATION

5.1 Introduction

In this chapter, we shall apply overcomplete wavelet representations to the problem of segmentation of textured images. However, it seems that we are back at the beginning. It may sound odd, but it is true that there is not even a precise definition of "texture," let alone of "texture segmentation" [61]. In spite of this difficulty, there is a general consensus that texture is a property of area. We feel that the concept of texture is parallel to the concept of "local frequency" for a nonstationary signal. The concept of local frequency is easy to understand: it describes the rate of change occurring in a window around a particular time. Yet there is no unique definition of local frequency. Therefore, our treatment of texture segmentation is to embed it in the paradigm of time-frequency analysis. There has been much research on texture segmentation using the concept of spatial-frequency representation. The analyzing functions used by researchers include:
• Laws microtexture masks (filters) [24]
• Complex prolate spheroidal sequences [64]
• Complex Gabor functions [6, 19]
• Real Gabor functions [30]
• The Wigner distribution [54]
More recently, wavelet and wavelet packet representations have been added to the list [34, 35, 36, 62, 9, 37]. A segmentation process usually consists of two distinct phases: feature extraction and clustering. Features for texture representation are of crucial importance
for accomplishing segmentation [27]. In this chapter, we demonstrate that overcomplete wavelet packet representation together with envelope detection makes a good feature extraction scheme on a variety of natural and synthetic textured images.

5.2 Texture Feature Extraction

The texture features we chose are the envelopes of overcomplete wavelet representations. The interpretation of these features is that their squares correspond to a spatial-frequency power density distribution, as pointed out in the last chapter. Feature extraction thus consists of two stages: 1) overcomplete wavelet packet decomposition and 2) envelope detection.

For 2D envelope detection on overcomplete wavelet packet representations, we classified each node in the decomposition tree into one of four orientation categories, as follows:
• The root node is omnidirectional.
• A node last filtered by G_1(ω_x)H_1(ω_y) corresponds to vertical orientation. (Highpass filter G_1 is applied rowwise and lowpass filter H_1 columnwise.)
• A node last filtered by H_1(ω_x)G_1(ω_y) corresponds to horizontal orientation. (Lowpass filter H_1 is applied rowwise and highpass filter G_1 columnwise.)
• A node last filtered by G_1(ω_x)G_1(ω_y) corresponds to diagonal orientation. (Highpass filter G_1 is applied both rowwise and columnwise.)
• A node last filtered by H_1(ω_x)H_1(ω_y) has the same orientation as its parent. (Lowpass filter H_1 is applied both rowwise and columnwise.)

At the end of feature extraction, each sample of a signal is attached to a feature vector. In our case, such a feature vector consists of envelopes of the overcomplete wavelet packet representation. It is possible to apply a monotonic function to the
feature space to transform it into another feature space. For example, the square function x² transforms the envelope space into a space of power density distributions, and the log function log(x) transforms it into a space of log spectra.

5.3 Considerations on Filter Selection

We raised the issue of filter selection in Chapter 3 on overcomplete wavelet representations. For the application of segmentation, we argue that symmetry, frequency response, and boundary accuracy are important factors in the selection of filters for feature extraction. In the following, we discuss these constraints in terms of overall performance.

• Symmetry. For accomplishing texture segmentation, accuracy in texture boundary detection is crucial for reliable performance. In this application, filters with symmetry or antisymmetry are clearly favored. Such filters have a linear phase response, where the delay (shift) is predictable. In contrast, filters with nonlinear phase may introduce complex distortion. Moreover, symmetric or antisymmetric filters are also advantageous in alleviating boundary effects through simple methods of mirror extension (see Appendix D).

• Optimal Frequency Response. In order to derive an ideal filter frequency response for the chosen feature, we considered a two-band filter bank with input signals of infinite length consisting of two segments with distinct pure tones:

    x(n) = A_1 cos(ω_1 n + θ_1),  n < 0,
    x(n) = A_2 cos(ω_2 n + θ_2),  n ≥ 0,

where A_1 > 0 and A_2 > 0. Denote a feature vector by F = (e_H, e_G). Except at the boundary (n = 0), the feature vectors (envelopes of the channel outputs) are

    f_left  = (A_1 |H(ω_1)|, A_1 |G(ω_1)|),  n < 0,
    f_right = (A_2 |H(ω_2)|, A_2 |G(ω_2)|),  n ≥ 0.
The angle θ between the vectors f_left and f_right is

    cos θ = ( |H(ω_1)H(ω_2)| + |G(ω_1)G(ω_2)| ) / ( sqrt(|H(ω_1)|² + |G(ω_1)|²) · sqrt(|H(ω_2)|² + |G(ω_2)|²) )

and is bounded by 0 ≤ θ ≤ π/2. The distance between the two classes in the feature space is then

    D² = ||f_left||² + ||f_right||² − 2 ||f_left|| ||f_right|| cos θ.

For θ = π/2, the vectors f_left and f_right are orthogonal, and the distance D reaches its maximum. Clearly, maximizing the distance in feature space between any two classes is optimal for segmentation and classification, in the sense that the classes are better separated and more robust to noise perturbations. Notice that for H(ω_1) ≠ 0 and G(ω_2) ≠ 0, cos θ = 0 if and only if G(ω_1) = 0 and H(ω_2) = 0. This means that optimal filter banks should have no overlap in the frequency domain, and thus a filter H(ω) with optimal frequency response must be a perfect half-band filter. Indeed, [50] derived a similar optimal filter for image coding applications. Of course, such ideal filters cannot be realized in practice. Based on the preceding discussion, filter bands overlapping in the frequency domain tend to reduce class distance in feature space. Therefore, a filter H(ω) with large stopband attenuation and a flat passband frequency response is desirable.

• Spatial Localization. There are two types of boundaries for signals of finite length: 1) boundaries of regions exhibiting distinct characteristics, and 2) the physical boundaries of a data segment. Feature vectors close to the boundaries will be affected. The size of the affected region depends on the length of the channel filters and the distribution of filter coefficients. Therefore, filters of short length and fast decay are better suited for boundary detection.
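The class-separation quantities derived above are easy to evaluate numerically. This sketch takes the scalar magnitudes |H(ω_i)|, |G(ω_i)| as inputs; the values used below are illustrative, not those of a particular filter bank.

```python
import math

def feature_angle_cos(H1, G1, H2, G2):
    """cos(theta) between f_left = (A1|H(w1)|, A1|G(w1)|) and
    f_right = (A2|H(w2)|, A2|G(w2)|); the amplitudes A1, A2 cancel."""
    num = abs(H1 * H2) + abs(G1 * G2)
    den = math.hypot(abs(H1), abs(G1)) * math.hypot(abs(H2), abs(G2))
    return num / den

def class_distance(f_left, f_right):
    """D from D^2 = |f_l|^2 + |f_r|^2 - 2 |f_l||f_r| cos(theta);
    since |f_l||f_r|cos(theta) is the dot product, this is the
    Euclidean distance between the two feature vectors."""
    nl = math.hypot(*f_left)
    nr = math.hypot(*f_right)
    dot = f_left[0] * f_right[0] + f_left[1] * f_right[1]
    return math.sqrt(nl * nl + nr * nr - 2.0 * dot)
```

An ideal non-overlapping half-band pair gives cos θ = 0 (orthogonal feature classes); overlapping responses give cos θ > 0 and shrink the class distance.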
Unfortunately, the preceding three criteria cannot be satisfied simultaneously. As previously pointed out, quadrature mirror filters (QMF) with compact support cannot be symmetric or antisymmetric. This means that a symmetry constraint on any QMF will conflict with the spatial localization goal. Moreover, large attenuation in the stopband requires a longer filter, which in turn degrades the filter's spatial localization. Again, we face the limitation of the uncertainty principle, and the previous discussion about the uncertainty factor of a filter bank applies here. In this study, we chose the Lemarie filters for their symmetry and good frequency characteristics.

5.4 The Basic Isodata Clustering Algorithm

Segmentation algorithms accept a set of features as input and output a consistent labeling for each pixel. Basically, this is a multidimensional data clustering problem with no general algorithm available [23]. Clustering algorithms previously used for texture segmentation can be divided into two categories: 1) supervised segmentation and 2) unsupervised segmentation. In practice, unsupervised segmentation is often desirable and easy to validate. It is particularly useful in those cases where training samples are difficult to prepare (making supervised segmentation infeasible). Thus, we used the Basic Isodata clustering algorithm [18, page 201].

Algorithm 5.4.1 (Basic Isodata) Given a 2D image array x whose elements contain feature vectors and label fields, and the number of classes N_c:

Step 1. Scan the representation matrix x in raster order. For every pixel encountered, randomly pick a label from the set {0, ..., N_c − 1} and assign it to the pixel.

Step 2. Compute the class centers {C_k, 0 ≤ k < N_c}, where C_k is the mean of the feature vectors of all pixels currently labeled k.
Step 3. Rescan the whole representation matrix x, and assign each pixel to the class k whose center C_k is closest to the pixel's feature vector in Euclidean distance.

Step 4. If no pixel changed its class in Step 3, stop; else, go to Step 2.

This algorithm differs from the K-means algorithm [29] in only one aspect: Basic Isodata updates the class centers after a complete scan of the input feature set, while K-means updates them after every reassignment. In our experiments, Basic Isodata outperformed K-means in almost all cases.

5.5 Postprocessing

The Basic Isodata algorithm labels each pixel independently and does not take into account the high correlation between neighboring pixels. A postprocessing stage, such as relaxation labeling [25], can be used to incorporate a neighborhood constraint into the segmentation process. For simplicity, we used median filtering as our postprocessing to approximate the benefit of a local constraint. In particular, a 5x5 median filter was applied repeatedly to an initial segmentation until no change of labeling occurred.

5.6 Experimental Results

Our segmentation algorithm was tested on both one-dimensional signals and two-dimensional textured images. Our test images included samples of two distinct families of textures. In all examples shown, straightforward envelopes of the overcomplete wavelet packet representation were used as texture features.

5.6.1 One-dimensional Signals

Figure 5.1 shows the segmentation of a signal consisting of a sinusoid segment and a triangular segment with the same period (sixteen samples) and amplitude.
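The loop of Algorithm 5.4.1 can be sketched in a few lines. Here the 2D raster and label fields are simplified to a flat list of feature vectors, and the random initial labeling uses a seeded generator; both are illustrative choices, and an empty class keeps a zero center as a simplification.

```python
import random

def basic_isodata(features, n_classes, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Step 1: random initial labeling
    labels = [rng.randrange(n_classes) for _ in features]
    dim = len(features[0])
    for _ in range(max_iter):
        # Step 2: class centers = mean feature vector of each class
        sums = [[0.0] * dim for _ in range(n_classes)]
        counts = [0] * n_classes
        for f, c in zip(features, labels):
            counts[c] += 1
            for d in range(dim):
                sums[c][d] += f[d]
        centers = [[s / counts[c] if counts[c] else 0.0 for s in sums[c]]
                   for c in range(n_classes)]
        # Step 3: reassign each sample to the nearest center
        changed = False
        for i, f in enumerate(features):
            best = min(range(n_classes),
                       key=lambda c: sum((f[d] - centers[c][d]) ** 2
                                         for d in range(dim)))
            if best != labels[i]:
                labels[i] = best
                changed = True
        # Step 4: stop when no label changed
        if not changed:
            break
    return labels
```

Note the single difference from K-means: the centers are recomputed only once per full pass, not after every reassignment.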
Figure 5.1: Segmentation of a 1D Signal Consisting of a Triangular Waveform and a Sinusoid. Row 1: the original signal and the segmentation result; Rows 2-5: the overcomplete wavelet packet representation and envelopes.

With envelopes of the four channels of the Level 2 representation, a perfect result was achieved. Figure 5.2 is a segmentation example of a noisy signal consisting of two sinusoid segments (ω = 0.3π and ω = 0.5π) contaminated by white Gaussian noise. Envelopes of the eight channels of the Level 3 representation were able to achieve a perfect result.

5.6.2 Natural Textures

Here we used textures obtained from the Brodatz album [7] and a public archive. Each test sample was first histogram-equalized so that a segmentation based only on first-order statistics was not possible. Experimental results are displayed in Figure 5.3. Experimentally, we observed that the lower-order Lemarie-Battle filter (p=1) performed well in boundary detection (Figure 5.3 (b)), while the higher-order Lemarie-Battle filter (p=2) did a better job within non-boundary (internal) regions.
Figure 5.2: Segmentation of a Noisy 1D Signal Consisting of Two Pure-Tone Segments. Row 1: the original signal and the segmentation result; Rows 2-9: the overcomplete wavelet packet representation and envelopes.
Figure 5.3: Segmentation Results for an Image Consisting of Natural Textures. Row 1: Test image T1 (256x256, 8-bit) consisting of D17, herringbone weave, and bark (true boundary overlaid for display only). Row 2: Clustering results using features extracted from a Level 4 filter bank generated by (Col 1) the Lemarie-Battle filter of p = 1, (Col 2) the autocorrelation function of the Lemarie-Battle filter of p = 1, (Col 3) the Lemarie-Battle filter of p = 2. The zero-crossing algorithm was used for envelope detection. Row 3: Final segmentations after postprocessing.
5.6.3 Synthetic Textures

We tested the performance of our algorithm on several texture images synthesized from random noise, following [27].

• Gaussian Lowpass and Bandpass Textures. The images were generated by filtering white Gaussian noise with mean zero and standard deviation 30 with a Gaussian filter of frequency response (in polar coordinates):

    G(r, θ) = e^{−(θ−θ_0)²/(2 B_θ²)} e^{−(r−F_c)²/(2 S_r²)}.

• Filtered Impulse Noise (FIN). The images were generated by filtering a random impulse image I(m,n), with an impulse placed wherever RAN_G(0.0, 1.0), a random number with uniform distribution, falls below a threshold T, with the Gaussian impulse response:

    g(m, n) = e^{−m²/(2 S_x²)} e^{−n²/(2 S_y²)}.

In both cases, images were linearly scaled into the range [0, 255]. Figure 5.4 shows a segmentation result on a Gaussian lowpass texture image, and Figure 5.5 shows a segmentation result on a filtered impulse noise (FIN) texture image. For the difficult test image T3, the algorithm achieved outstanding performance. We also tested our algorithm on a texture image containing regular and sparse elements; Figure 5.6 demonstrates the accurate segmentation result.

5.7 Summary and Discussion

A quantitative comparison presenting the accuracy of our segmentation results on the textured images is summarized in Table 5.1. This performance is consistent
Figure 5.4: Segmentation of a Synthetic Gaussian Lowpass Texture Image. Legend: (a) Test image T2 (256 x 256, 8-bit): Gaussian LP; left: isotropic, F_c = 0, S_r = 60; right: non-isotropic, F_c = 0, S_r = 60, θ_0 = 0, B_θ = 0.175. (b) Initial segmentation (Lemarie-Battle filter of p = 1, Level 4, and zero-crossing envelope detection). (c) Final segmentation after postprocessing.

Figure 5.5: Segmentation of a Synthetic FIN Texture Image. Legend: (a) Test image T3 (256 x 256, 8-bit): filtered impulse noise; left: non-isotropic, T = 0.15, S_x = 1.0, S_y = 1.5; right: non-isotropic, T = 0.15, S_x = 2.0, S_y = 1.0. (b) Initial segmentation (Lemarie-Battle filter of p = 1, Level 4, and zero-crossing envelope detection). (c) Final segmentation after postprocessing.
Figure 5.6: Segmentation of a Synthetic Texture Image with Line Patterns. Legend: (a) Test image T4 (256 x 256, 8-bit). (b) Final segmentation (Lemarie-Battle filter of p = 1, Level 3, and zero-crossing based envelope detection). (c) Detected boundary overlaid on the original image.

Table 5.1: Boundary Accuracy of the Segmentation Results.

    Test image   Maximum ABE   Average ABE   ± σ
    T1           11.0          3.2           2.6
    T2           15.0          2.9           2.4
    T3           11.0          2.9           2.6

    ABE: Absolute Boundary Error (in pixels).

with the difficulty of segmentation perceived by human observers. We observed that boundary errors are shape-dependent; i.e., complex boundaries yielded higher variance.

Overcomplete wavelet packet representations and envelope features provide a versatile and flexible framework, and the examples showed that satisfactory results can be achieved. In our experiments, the low-order Lemarie filter (p = 1) achieved the best results. However, we are aware that several important parameters had to be determined manually for the segmentation algorithm to work. These include the configuration of the representation and the number of classes. In the absence of a precise definition of the problem, we believe that no mathematical solution
is possible, and these parameters can only be designed case by case for particular applications.

Compared with Gabor filters, overcomplete wavelet packet representations possess several potential advantages for further exploration:
• channel filters cover the frequency domain exactly (providing a mathematically complete representation) without significant overlap, thus greatly reducing correlations between features extracted from distinct channels;
• adaptive pruning of a decomposition tree makes it possible to reduce both the computational complexity and the length of the feature vectors;
• by moving up the decomposition tree, feature vectors at distinct resolution levels can be obtained to increase the accuracy of boundary detection.
CHAPTER 6
APPLICATION II: IMAGE DEBLURRING

6.1 Introduction

Image deblurring is a special case of signal restoration. The goal is to recover an image that has been blurred and usually also degraded by noise. Over the past two decades, a large amount of research has been devoted to this area [33, 2, 28]. Unlike texture segmentation, image deblurring is a well-defined problem. The mathematical model commonly used to describe the degradation process is a convolution plus additive noise:

    y(k_1, k_2) = d(k_1, k_2) * x(k_1, k_2) + n(k_1, k_2),

where x(k_1, k_2) is the original image, y(k_1, k_2) is the observed image, d(k_1, k_2) is the impulse response of the blurring operator, n(k_1, k_2) is the noise, and the symbol * stands for convolution. The deblurring problem is to restore the original image x(k_1, k_2) from the degraded image y(k_1, k_2). This is thus a linear inverse problem.

At first glance, the problem seems solvable. Assuming a known impulse response of the blurring process and the absence of noise, one may formally derive the solution in the frequency domain as:

    X(ω_1, ω_2) = Y(ω_1, ω_2) / D(ω_1, ω_2).

This is the simplest, most intuitive approach and is called inverse filtering [2, 28, 22]. However, the frequency response D(ω_1, ω_2) associated with a blurring process usually has zeros in the frequency plane, and its reciprocal is thus unbounded. From another point of view, blurring generally causes irreversible information loss even in the absence of noise. Taking noise into account, the inverse filtering approach
would produce a restoration:

    X'(ω_1, ω_2) = Y(ω_1, ω_2) / D(ω_1, ω_2) = X(ω_1, ω_2) + N(ω_1, ω_2) / D(ω_1, ω_2).

In this case, the closeness of the restored image X' to the original image X hinges on the strength of the noise N. Notice that the precise waveform of the noise is unknown and thus cannot be completely subtracted from the observed image Y. An inverse problem associated with an unbounded inverse operator is said to be ill-posed [33, 14]. The traditional mathematical tool for dealing with such problems is regularization [60]. In this chapter, we shall apply overcomplete wavelet packet representations to the deblurring problem. Our approach explicitly suppresses noise using the nonlinear shrink operator and approximates an inverse filter using the gain operator.

6.2 Review of Some Deblurring Techniques

In this section, we shall review modified inverse filters, Wiener filters, and the more recent theory of wavelet-vaguelette inversion.

6.2.1 Modified Inverse Filters

For blurs having zeros in the frequency plane, there are two modifications available to obtain a bounded inverse filter.

1. Gain-limited Inverse Filter. The idea is simply to set an upper bound on the magnitude of the frequency response:

    F(ω_x, ω_y) = 1/R,             0 ≤ D(ω_x, ω_y) < R,
                  −1/R,            −R < D(ω_x, ω_y) < 0,
                  1/D(ω_x, ω_y),   |D(ω_x, ω_y)| ≥ R,        (6.1)

where R is a positive real value and D(ω_x, ω_y) is a real filter.

2. Pseudo-inverse Filter [28, page 276]. This is the same as the gain-limited inverse filter except that the constant-gain region is set to zero:

    F(ω_x, ω_y) = 0,               |D(ω_x, ω_y)| < R,
                  1/D(ω_x, ω_y),   |D(ω_x, ω_y)| ≥ R,        (6.2)
where R is a positive real value.

Both approaches have their pros and cons. The advantage of the gain-limited inverse filter is that it is continuous in the frequency domain; the disadvantage is that the constant-gain band |D(ω_x, ω_y)| < R may significantly amplify wideband noise. Conversely, the pseudo-inverse filter completely wipes out everything outside its region of inversion, at the price of sacrificing continuity.

6.2.2 Wiener Filters

The idea of the Wiener filter is to choose a deblurring filter f that minimizes the mean-square restoration error [18, 2, 22]:

    ε = E{ [x'(k_1, k_2) − x(k_1, k_2)]² },

where x' = f * (d * x + n) and E denotes mathematical expectation [49]. Assuming that the noise n is a zero-mean random field independent of x, the cross term vanishes and ε can be evaluated as:

    ε = E[ (f*d*x − x)² − 2 (f*d*x − x)(f*n) + (f*n)² ]
      = E[ ((f*d − δ) * x)² ] + E[ (f*n)² ],

which in the frequency domain is an integral of

    |F D − 1|² Φ_xx + |F|² Φ_nn,

where F, D, Φ_xx, and Φ_nn are all functions of (ω_1, ω_2), and Φ_xx and Φ_nn are the power density spectra of the signal and noise [49, Section 2.10], respectively. Furthermore, the integrand can be factored as:

    |F D − 1|² Φ_xx + |F|² Φ_nn
      = | F − D* Φ_xx / (|D|² Φ_xx + Φ_nn) |² (|D|² Φ_xx + Φ_nn)
        + Φ_xx Φ_nn / (|D|² Φ_xx + Φ_nn).
Since the first term is greater than or equal to zero (and the last term does not depend on F), the optimal deblurring filter (Wiener filter) is thus

    F(ω_1, ω_2) = D*(ω_1, ω_2) Φ_xx(ω_1, ω_2) / ( |D(ω_1, ω_2)|² Φ_xx(ω_1, ω_2) + Φ_nn(ω_1, ω_2) ),   (6.3)

which can be rewritten in terms of the signal-to-noise power ratio as

    F(ω_1, ω_2) = D*(ω_1, ω_2) / ( |D(ω_1, ω_2)|² + Φ_nn(ω_1, ω_2)/Φ_xx(ω_1, ω_2) ).

For white Gaussian noise, Φ_nn(ω_1, ω_2) = N_0. The power spectrum Φ_xx(ω_1, ω_2) characterizes the correlation property of the signal x. As the correlation usually falls off as the sample distance increases, Φ_xx(ω_1, ω_2) is usually a lowpass function. If the signal has no correlation among its adjacent samples, Φ_xx(ω_1, ω_2) = S_0, and the signal-to-noise power ratio equals a constant S_0/N_0 = 1/R. In this case, the Wiener filter reduces to

    F(ω_1, ω_2) = D*(ω_1, ω_2) / ( |D(ω_1, ω_2)|² + R ).   (6.4)

Compared with the gain-limited inverse filter and the pseudo-inverse filter, the Wiener filter of (6.4) possesses both merits: continuity and noise suppression.

6.2.3 Wavelet-Vaguelette Inversion

The combination of wavelet-vaguelette decomposition (WVD) and nonlinear shrinkage of WVD coefficients was proposed by Donoho [14] for linear inversion problems involving dilation-homogeneous operators. An operator T with this property commutes with a dilation operator D_a up to a scalar, T D_a = a^α D_a T, where α is a constant and D_a is the dilation operator defined by D_a[f(t)] = f(at).

The WVD involves three sets of functions: an orthogonal wavelet basis {ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k)} and two near-orthogonal sets {u_{j,k}(t) = 2^{j/2} u(2^j t − k)} and
{v_{j,k}(t) = 2^{j/2} v(2^j t − k)}. The three sets are related by the quasi-singular value relations

    T ψ_{j,k} = γ_j u_{j,k},    (6.5)
    T* v_{j,k} = γ_j ψ_{j,k},   (6.6)

where the quasi-singular values γ_j depend on the resolution index j but not on the spatial index k. Moreover, the two sets {u_{j,k}(t)} and {v_{j,k}(t)} satisfy the biorthogonality relation

    ∫_{−∞}^{∞} u_{j,k}(t) v*_{m,n}(t) dt = δ_{j,m} δ_{k,n}

and are near-orthogonal: linear combinations of each set have norms equivalent to those of their coefficient sequences. Linking everything together, the reproducing formula of the WVD is

    x(t) = Σ_{j,k} [T x, v_{j,k}] γ_j^{−1} ψ_{j,k}(t),

where [x, y] = ∫_{−∞}^{∞} x(t) y*(t) dt is the inner product of x and y. Finally, the WVD inversion of a white-Gaussian-contaminated measurement y(t) = T x(t) + n(t) may be achieved via

    x'(t) = Σ_{j,k} K_{t_j}{ [y, v_{j,k}] γ_j^{−1} } ψ_{j,k}(t),    (6.7)

where K_{t_j} is the nonlinear shrinking operator defined by (4.4). Notice that the threshold t_j depends only on the resolution j.

6.2.4 Discussion

The Wiener filter provides a powerful tool for linear inversion problems. Given the statistics of both the image and noise processes, one can design an optimal filter. However, the Wiener filter belongs to the linear filtering paradigm and works in the
frequency domain. Its linearity means that it must trade off smoothing out noise against sharpening up edges. By contrast, the wavelet-vaguelette approach deploys nonlinear shrinkage to combat noise in the time domain. It is characterized by three components: 1) dyadic analyzing and synthesizing functions determined by the blurring operator T; 2) level-dependent gain factors γ_j^{−1}; and 3) nonlinear shrinkage with level-dependent thresholds t_j. The nonlinearity enables it to achieve a better compromise between denoising and sharpening. However, it is limited to dilation-homogeneous operators only, excluding the most commonly encountered Gaussian and uniform blurs [14]. Moreover, the theory was developed in the continuous-time domain.

The limitation of the WVD inversion lies in its insistence on the quasi-singular value relations (Eqs. (6.5) and (6.6)). For a convolution operator T with kernel h(t), in the frequency domain (6.5) is equivalent to

    H(ω) Ψ(2^{−j} ω) = γ_j U(2^{−j} ω).    (6.8)

Substituting ω → 2^k ω in (6.8), we have

    H(2^k ω) Ψ(2^{−(j−k)} ω) = γ_j U(2^{−(j−k)} ω).    (6.9)

On the other hand, directly substituting the level index j − k into (6.8) produces

    H(ω) Ψ(2^{−(j−k)} ω) = γ_{j−k} U(2^{−(j−k)} ω).    (6.10)

By comparing (6.9) and (6.10), we conclude that it must be true that

    H(2^k ω) / H(ω) = γ_j / γ_{j−k}   (independent of ω),

which is the characteristic of dilation-homogeneous operators in the frequency domain.
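The ω-independence of H(2^k ω)/H(ω) can be checked numerically. This sketch assumes two illustrative blur models: a power-law response (dilation-homogeneous, as in fractional integration) and a Gaussian response, which fails the test.

```python
import math

def ratio(H, k, omegas):
    """Sample H(2^k w) / H(w) at several frequencies; for a
    dilation-homogeneous kernel the samples are all equal."""
    return [H(2 ** k * w) / H(w) for w in omegas]

def power_law(w):
    # |w|^(-1.5): dilation-homogeneous, H(2w)/H(w) = 2^(-1.5) for all w
    return abs(w) ** -1.5

def gaussian_response(w):
    # Gaussian blur: H(2w)/H(w) = exp(-3 w^2) varies with w
    return math.exp(-w * w)

omegas = [0.25, 0.5, 1.0, 2.0]
r_pow = ratio(power_law, 1, omegas)        # constant across frequencies
r_gau = ratio(gaussian_response, 1, omegas)  # frequency-dependent
```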
6.3 Discrete-time Overcomplete Wavelet Packet Inversion

6.3.1 One-dimensional Formulation

There were two major considerations in our search for a deblurring algorithm. First, it should be available in the discrete-time domain. Second, it should be applicable to inhomogeneous blurring operators. The result is the discrete-time overcomplete wavelet packet representation equipped with a nonlinear shrinking operator and a gain operator. For a one-dimensional observation y(k) = d(k) * x(k) + n(k), our deblurring algorithm can be expressed as:

    w_j(k) = Σ_{m=−∞}^{∞} v_j(m) y(k − m),    (6.11)

    x'(k) = Σ_{j=0}^{M−1} Σ_{m=−∞}^{∞} K_{t_j}[w_j(m)] λ_j u_j(k − m),    (6.12)

where {λ_j, 0 ≤ j < M} is the gain vector associated with the M channels.
To complete the framework, we still need to resolve two issues. First, for a given blur with frequency response D(ω), how do we determine the channel configuration (analyzing and synthesizing filters) as well as the gain vector? Second, how should we choose the thresholds t_j for denoising?

We propose the following procedure for the channel structure and the gain vector: 1) Choose a gain-limited inverse filter or a Wiener filter to approximate. 2) Choose the prototype filters H(ω), H̃(ω), G(ω), and G̃(ω); the channel filters of Level 1 are then determined by C_{1,0}(ω) = H(ω)H̃(ω) and C_{1,1}(ω) = G(ω)G̃(ω). 3) Call Algorithm 4.2.2 (Minimum Tree) to determine both the tree configuration and the gain vector.

However, the issue of determining the denoising thresholds t_j is much more difficult. Thresholding schemes for several particular applications were suggested by Donoho [15]. Generally speaking, knowledge about the signal and noise is needed to devise a thresholding scheme. By treating signal and noise as two random processes, their correlation properties may be used to characterize them, as in the case of Wiener filtering. For wide-sense stationary white Gaussian noise and colored Gaussian signals, the autocorrelation functions φ and power spectral densities Φ may be written as in (6.13). Note that Φ_xx(ω) is generally different from the power density spectrum |X(ω)|² of a particular realization x of the signal process. Under these assumptions, a possible thresholding scheme is given by (6.14),
where B_j is the bandwidth of channel j and ω_j is the center frequency of the channel.

6.3.2 Two-dimensional Extension

The one-dimensional overcomplete wavelet packet inversion algorithm needs to be extended to two dimensions in order to deal with images degraded by symmetric 2D blurs. Formally, the 2D version may be written as

    x'(k_1, k_2) = Σ_{c_1=0}^{M−1} Σ_{c_2=0}^{M−1} Σ_{m_1} Σ_{m_2} K_{t_{c_1,c_2}}[w_{c_1,c_2}(m_1, m_2)] λ_{c_1,c_2} u_{c_1,c_2}(k_1 − m_1, k_2 − m_2).

However, an efficient 2D overcomplete wavelet packet representation is much more difficult to find than in the 1D case. In general, 2D separable overcomplete wavelet packets may not be an efficient representation. This is due to the fact that both 2D modified inverse filters and Wiener filters are nonseparable even for separable symmetric 2D blurs. For example, for a separable Gaussian blur

    D(ω_x, ω_y) = A e^{−(ω_x² + ω_y²)/σ²},

the pseudo-inverse filter is

    F(ω_x, ω_y) = (1/A) e^{(ω_x² + ω_y²)/σ²},  sqrt(ω_x² + ω_y²) ≤ σ sqrt(ln(A/R)),
                  0,                            otherwise,    (6.15)

and the Wiener filter of (6.4) is F(ω_x, ω_y) = D(ω_x, ω_y) / (D²(ω_x, ω_y) + R). Both filters are isotropic and can be efficiently represented by donut-shaped channel filters. In practice, a reasonable approximation may be achieved by applying the 1D algorithm separately to rows and columns. For the separable Gaussian blur, a separable pseudo-inverse filter may be constructed as:

    F(ω_x, ω_y) = F_1(ω_x) F_1(ω_y),
where the 1D filter F_1(ω) is

    F_1(ω) = sqrt(1/A) e^{ω²/σ²},  |ω| ≤ σ sqrt(ln(A/R)),
             0,                     otherwise.

It is clear that the rectangular support region (|ω_x| ≤ σ sqrt(ln(A/R)), |ω_y| ≤ σ sqrt(ln(A/R))) of this 2D filter includes the disk sqrt(ω_x² + ω_y²) ≤ σ sqrt(ln(A/R)) of (6.15). By choosing an appropriate parameter R, acceptable results may also be achieved by approximating 2D Wiener filters with 1D Wiener filters. For uniform blurs, gain-limited inverse filters are similar to Wiener filters. The major problem with such separable approximations is that they may boost diagonal frequency components significantly. Figures 6.1 and 6.2 show examples of both 2D nonseparable and separable filters. The emphasis of the diagonal frequency region is particularly clear in the case of the Gaussian blur.

6.4 Experimental Results

To compare deblurring results quantitatively, we used an objective measure, the improvement in signal-to-noise ratio (ISNR), defined as [3]:

    ISNR_p = 20 log10( ||y − x||_p / ||x' − x||_p ),  p = 1, 2,

where ||v||_p is the L_p norm [56]:

    ||v||_p = ( Σ_{n=0}^{N−1} |v(n)|^p )^{1/p},

and x, y, and x' are the original, degraded, and deblurred signals, respectively. Moreover, a performance upper bound is useful as a gauge for comparison. The performance of an oracle Wiener filter may be considered the upper bound for Wiener filters. An oracle Wiener filter is a Wiener filter with Φ_nn(ω) and Φ_xx(ω) replaced by the power density spectra |N(ω)|² and |X(ω)|² of the particular realizations of the noise and the original signal. The filter may be written as:

    F(ω) = D*(ω) / ( |D(ω)|² + |N(ω)|²/|X(ω)|² ).
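The ISNR measure and L_p norm defined above translate directly into code; this is a plain-Python sketch.

```python
import math

def lp_norm(v, p):
    """||v||_p = (sum |v(n)|^p)^(1/p)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def isnr(original, degraded, deblurred, p=2):
    """Improvement in SNR (dB): 20*log10(||y - x||_p / ||x' - x||_p).
    Positive values mean the deblurred signal is closer to the
    original than the degraded one was."""
    num = lp_norm([y - x for y, x in zip(degraded, original)], p)
    den = lp_norm([z - x for z, x in zip(deblurred, original)], p)
    return 20.0 * math.log10(num / den)
```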
Figure 6.1: Log Frequency Responses of Filters for a Gaussian Blur. Row 1: blurring filter. Row 2: 2D nonseparable filters; gain-limited, pseudo-inverse, and Wiener filter from left to right. Row 3: 2D separable filters; gain-limited, pseudo-inverse, and Wiener filter from left to right.
Figure 6.2: Log Frequency Responses of Filters for a Uniform Blur. Row 1: blurring filter. Row 2: 2D nonseparable filters; gain-limited, pseudo-inverse, and Wiener filter from left to right. Row 3: 2D separable filters; gain-limited, pseudo-inverse, and Wiener filter from left to right.
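For a real blur response D(ω), the three families of deblurring filters reviewed in this chapter (Eqs. (6.1), (6.2), and the flat-spectrum Wiener filter (6.4)) can be sketched pointwise as follows.

```python
def gain_limited_inverse(D, R):
    """Eq. (6.1): invert where |D| >= R, otherwise clamp the gain to
    magnitude 1/R (sign-preserving), keeping the response continuous."""
    if abs(D) >= R:
        return 1.0 / D
    return 1.0 / R if D >= 0 else -1.0 / R

def pseudo_inverse(D, R):
    """Eq. (6.2): same as above, but zero out the band where |D| < R."""
    return 1.0 / D if abs(D) >= R else 0.0

def wiener_flat(D, R):
    """Eq. (6.4) for a real D and flat signal/noise spectra
    (signal-to-noise power ratio 1/R): continuous and bounded."""
    return D / (D * D + R)
```

Evaluating these over a sampled frequency grid reproduces the qualitative behavior in Figures 6.1 and 6.2: the gain-limited filter plateaus near the zeros of D, the pseudo-inverse cuts them out entirely, and the Wiener filter rolls off smoothly.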
However, it is not necessarily the upper bound for the overcomplete wavelet packet inversion.

6.4.1 One-dimensional Signals

One-dimensional deblurring examples are shown in Figures 6.3 and 6.4. The ideal signal in both cases is a perfect multilevel blocky function, which is very challenging and revealing for any deblurring method. For the overcomplete wavelet packet representations used in both examples, the Lemarie filter of p = 1 was used. In Figure 6.3, the impulse response of the Gaussian blur was

    A exp[−(k/(σN/2))²],  0 ≤ k < N.
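The 1D degradation model y = d * x + n with a Gaussian kernel of the quoted form can be sketched as follows. The centering shift, the unit-DC normalization (our choice of the constant A), and the use of circular convolution are all illustrative assumptions.

```python
import math
import random

def gaussian_blur_kernel(N, sigma):
    """Impulse response exp[-(k/(sigma*N/2))^2], shifted to be centered
    in the N-sample window and normalized to unit DC gain."""
    half = N // 2
    h = [math.exp(-((k - half) / (sigma * N / 2.0)) ** 2) for k in range(N)]
    s = sum(h)
    return [v / s for v in h]

def degrade(x, h, noise_sigma=0.0, seed=0):
    """y = d * x + n, with circular convolution and optional
    white Gaussian noise."""
    rng = random.Random(seed)
    N = len(x)
    y = []
    for k in range(N):
        acc = sum(h[m] * x[(k - m) % N] for m in range(len(h)))
        y.append(acc + rng.gauss(0.0, noise_sigma))
    return y
```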
Figure 6.3: One-dimensional Deblurring of a Gaussian Blur. Row 1: the original signal. Row 2: the blur filter and the degraded signal. Row 3: the oracle Wiener filter and the deblurred signal. Row 4: the Wiener filter and the deblurred signal. Row 5: the gain-limited inverse filter and the deblurred signal. Row 6: the overall frequency response of the overcomplete wavelet packet inversion and the deblurred signal. Row 7: channels of the overcomplete wavelet packet representation. Note: the left column shows frequency responses; the central column shows full-length signals; the right column zooms in on the feature region, with deblurring results overlaid on the degraded signal (dotted line).
Figure 6.4: One-dimensional Deblurring of a Uniform Blur. Row 1: the original signal. Row 2: the blur filter and the degraded signal. Row 3: the oracle Wiener filter and the deblurred signal. Row 4: the Wiener filter and the deblurred signal. Row 5: the gain-limited inverse filter and the deblurred signal. Row 6: the overall frequency response of the overcomplete wavelet packet inversion and the deblurred signal. Row 7: channels of the overcomplete wavelet packet representation. Note: the left column shows frequency responses; the central column shows full-length signals; the right column zooms in on the feature region, with deblurring results overlaid on the degraded signal (dotted line).
Table 6.1: Performance of 1D Deblurring Examples.

    Blur type    ISNR_2 (dB)                   ISNR_1 (dB)
                 OWF    WF     GLIF    OWPI    OWF    WF     GLIF     OWPI
    Gaussian     2.37   1.88   9.04    1.90    0.75   1.45   15.87    1.18
    Uniform      5.17   3.90   4.32    3.62    1.00   2.40   11.80    0.00

    Note: OWF = oracle Wiener filter; WF = Wiener filter; GLIF = gain-limited inverse filter; OWPI = overcomplete wavelet packet inversion.

6.4.2 Two-dimensional Images

Examples of image deblurring using the standard Lena image are included in Figures 6.5 and 6.6. Separable blurring sources were used: the same 1D blurring filters used in the 1D examples were applied rowwise and columnwise to the original Lena image. A statistical signal model of (6.13) with a = 3.5 and σ_x = 0.008 was used for both Wiener filtering and the overcomplete wavelet packet inversion. In both cases, white Gaussian noise with σ_n² = 0.5 was added, such that SNR = 44.3 dB. The thresholding scheme of (6.14) with a = 25.2 was used in both cases, and the shrinking operator was applied to the 2D components. The gain-limited inverse filter (6.1) with R = 0.08 was used both for comparison and in the overcomplete wavelet packet inversion. A separable overcomplete wavelet packet inversion algorithm was used to approximate a 2D overcomplete wavelet packet inversion. For the Gaussian blur, the approximation used 11x11 channels with ε = 0.005% (1D). For the uniform blur, the approximation used 25x25 channels with ε = 3.9% (1D). The deblurred images were all linearly converted to the intensity range of the original image for the calculation of the ISNR indexes.

Table 6.2 compares deblurring performance in terms of ISNR. For the two particular blurring sources and the Lena image, it shows that the overall performance of
Table 6.2: Performance of 2D Deblurring Examples

                     ISNR_2 (dB)                ISNR_1 (dB)
    Blur Type   OWF    WF    GLIF   OWPI   OWF    WF    GLIF    OWPI
    Gaussian    3.20   3.17  14.04  2.80   2.96   3.25  16.82   2.69
    Uniform     3.55   1.78  14.78  1.76   1.46   0.60  18.05   0.64

    Note: OWF = oracle Wiener filter; WF = Wiener filter; GLIF = gain-limited inverse filter; OWPI = overcomplete wavelet packet inversion.

the overcomplete wavelet packet inversion is about the same as that of Wiener filtering by both measures, ISNR_2 and ISNR_1.

6.5 Summary and Discussion

We have demonstrated that the deblurring problem can be handled using an overcomplete wavelet packet representation. We showed that inverse filtering in the frequency domain and denoising in the time domain can be done in a single step; in other words, for this application the gain and shrinking operators can be combined into a single operation on the spatial-frequency representation. Moreover, our approach may be seen as an extension of the wavelet-vaguelette inversion into both the discrete-time and broader problem domains.

Although the one-step inversion and denoising may be conceptually appealing, its limitation is also apparent. Since nonlinear shrinkage works only within areas where the magnitude of the signal components is stronger than that of the noise components, the best representation for denoising is the one that maximizes the local signal-to-noise ratio. For example, if a sharp edge is decomposed into many channels, its components in some channels may be indistinguishable from noise. Therefore, the best representations for approximating the inverse filter and for denoising are generally different. Having only one choice of representation, the approach has to limit its application to certain signals or blurring sources.
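For reference, the two frequency-domain baselines compared in this section can be sketched as follows. This is a simplified illustration with our own function names; R plays the role of the gain limit in (6.1), and the signal and noise spectra stand in for the statistical model of (6.13):

```python
import numpy as np

def gain_limited_inverse(H, R):
    """1/H where |H| >= R; elsewhere the gain is capped at 1/R (phase of 1/H kept)."""
    H = np.asarray(H, dtype=complex)
    mag = np.maximum(np.abs(H), 1e-30)
    return np.where(mag >= R,
                    1.0 / np.where(mag >= R, H, 1.0),   # exact inverse
                    np.conj(H) / (mag * R))             # capped gain

def wiener_filter(H, S_x, S_n):
    """Classical Wiener deconvolution filter for Y = H X + N."""
    return np.conj(H) * S_x / (np.abs(H) ** 2 * S_x + S_n)

# Deblurring in the frequency domain: x_hat = ifft(G * fft(y)).
H = np.fft.fft(np.ones(8) / 8.0, 64)           # a length-8 uniform blur
G = gain_limited_inverse(H, R=0.08)
ok = np.abs(H) >= 0.08
assert np.allclose((G * H)[ok], 1.0)           # exact inversion in strong bands
assert np.all(np.abs(G) <= 1.0 / 0.08 + 1e-9)  # gain never exceeds 1/R
```

The overcomplete wavelet packet inversion described in the text replaces the single gain vector by per-channel gains followed by shrinkage.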
Figure 6.5: Deblurring a Gaussian-blurred Lena image. Legend: (a) the original 256 x 256 image; (b) the degraded image with SNR = 44.3 dB; (c) deblurred by the oracle Wiener filter; (d) deblurred by a Wiener filter; (e) deblurred by a gain-limited inverse filter; (f) deblurred by a separable overcomplete wavelet packet inversion.
Figure 6.6: Deblurring a uniform-blurred Lena image. Legend: (a) the original 256 x 256 image; (b) the degraded image with SNR = 44.3 dB; (c) deblurred by the oracle Wiener filter; (d) deblurred by a Wiener filter; (e) deblurred by a gain-limited inverse filter; (f) deblurred by a separable overcomplete wavelet packet inversion.
This limitation may be the reason that our preliminary experimental results showed roughly the same performance as the much simpler approach of Wiener filtering. For the Gaussian and uniform blurring sources, some channels with very narrow bandwidth are needed to achieve a satisfactory approximation using the gain vector. For those channels, the shrinking operation is very close to a multiplication.
CHAPTER 7
CONCLUSION

We examined both the Fourier-based and the wavelet-based classes of representations. We showed that orthogonal wavelet transforms lack translation invariance and cannot efficiently represent convolution operators. In particular, we demonstrated that aliasing distortion due to critical subsampling limits their application in image enhancement and restoration. Overcomplete wavelet representations, on the other hand, although not as efficient as nonredundant representations, avoid those shortcomings. Moreover, overcomplete wavelet representations can be conveniently analyzed using linear time-invariant system theory. In this thesis, we evaluated their capability of time-frequency localization using the uncertainty factor. We also identified gain, shrinking, and envelope operators to approximate convolution operators, suppress noise, and extract power density distributions. We investigated the criterion of minimum mean square error and designed an optimization algorithm. We extended wavelet shrinkage to overcomplete representations. We also presented two envelope detection algorithms and compared their performance.

The framework of the overcomplete wavelet representation was applied to two difficult problems: segmentation of textured images and image deblurring. We demonstrated that channel envelopes used as feature vectors produced satisfactory results on both natural and synthetic textures. We showed that the gain and shrinking operators may be used for image deblurring and pointed out the limitations of this approach.

The issues of time-frequency representation, pattern recognition, and noise suppression are fundamental ones, and our knowledge about them will continue to accumulate and evolve. This thesis increases our level of understanding and points
to many opportunities for future research. The issue of time-frequency localization was touched upon but not fully explored. Although the uncertainty factor was used for the evaluation of individual channels, we lack a criterion of optimality for a representation as a whole. Once such a criterion is found, we may be able to search for the optimal representation of textured images, which was not explored here. Denoising is certainly another front: the optimal representation and the adaptive selection of thresholds are going to be intertwined and challenging problems.
APPENDIX A
PROOFS RELATED TO DISCRETE-TIME WAVELETS

Some basic proofs related to (bi)orthogonal discrete-time wavelets are included here in order to make this thesis self-contained.

Proposition A.0.1 The necessary condition for a pair of discrete sequences to be orthogonal (as defined by (A.1)) is (A.2):

    \sum_{m=-\infty}^{\infty} a(2n - m)\, b(m - 2l) = \lambda\, \delta(n - l)    (A.1)

    A(e^{j\omega}) B(e^{j\omega}) + A(e^{j(\omega+\pi)}) B(e^{j(\omega+\pi)}) = 2\lambda,    (A.2)

where \lambda is a constant, and A(e^{j\omega}) and B(e^{j\omega}) are the discrete-time Fourier transforms of a(n) and b(n), respectively.

Proof: By the change of variable k = m - 2l, the left side of (A.1) becomes \sum_{k=-\infty}^{\infty} a[2(n - l) - k]\, b(k). It is therefore equivalent to prove that

    \sum_{k=-\infty}^{\infty} a(2l - k)\, b(k) = \lambda\, \delta(l).

Notice that the left side of the above is a 2-fold decimation of the series

    p(l) = \sum_{k=-\infty}^{\infty} a(l - k)\, b(k).

Since p(l) is the convolution of a(l) and b(l), its discrete-time Fourier transform is

    P(e^{j\omega}) = A(e^{j\omega}) B(e^{j\omega}).
Using the result on the M-fold decimator proved in [63], the Fourier transform of p(2l) is equal to

    \frac{1}{2}\left[ P(e^{j\omega/2}) + P(e^{j(\omega+2\pi)/2}) \right],

and therefore the Fourier transform of (A.1) is

    P(e^{j\omega/2}) + P(e^{j(\omega+2\pi)/2}) = 2\lambda.

Finally, by substituting P(e^{j\omega}) = A(e^{j\omega}) B(e^{j\omega}) back, we have proved

    A(e^{j\omega/2}) B(e^{j\omega/2}) + A(e^{j(\omega+2\pi)/2}) B(e^{j(\omega+2\pi)/2}) = 2\lambda.

Proposition A.0.2 The necessary condition for (A.3) is (A.4):

    \sum_{l=-\infty}^{\infty} a(2l - m)\, b(n - 2l) + \sum_{l=-\infty}^{\infty} c(2l - m)\, d(n - 2l) = \delta(m - n)    (A.3)

    A(e^{j\omega}) B(e^{j\omega}) + C(e^{j\omega}) D(e^{j\omega}) = 2,
    A(e^{j\omega}) B(e^{j(\omega+\pi)}) + C(e^{j\omega}) D(e^{j(\omega+\pi)}) = 0.    (A.4)

Proof: Consider both sides of (A.3) as two-dimensional series in the index (m, n), and apply the two-dimensional Fourier transform to both sides. Let p(m, n) = \sum_{l=-\infty}^{\infty} a(2l - m)\, b(n - 2l); then

    P(e^{j\omega_x}, e^{j\omega_y}) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} p(m, n)\, e^{-jm\omega_x} e^{-jn\omega_y}
                                    = \sum_{l=-\infty}^{\infty} \left[ \sum_{n=-\infty}^{\infty} b(n - 2l)\, e^{-jn\omega_y} \right] \left[ \sum_{m=-\infty}^{\infty} a(2l - m)\, e^{-jm\omega_x} \right]
                                    = A(e^{j\omega_x}) B(e^{j\omega_y}) \sum_{l=-\infty}^{\infty} e^{-jl(2\omega_x + 2\omega_y)}
                                    = \sum_{k=-\infty}^{\infty} 2\pi\, \delta(2\omega_x + 2\omega_y + 2k\pi)\, A(e^{j\omega_x}) B(e^{j\omega_y}).
Therefore, the two-dimensional Fourier transform of the right side of (A.3) is

    \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \delta(m - n)\, e^{-jm\omega_x} e^{-jn\omega_y} = \sum_{k=-\infty}^{\infty} 2\pi\, \delta(\omega_x + \omega_y + 2k\pi).

Thus, the two-dimensional Fourier transform of (A.3) gives

    \sum_{k=-\infty}^{\infty} \delta(\omega_x + \omega_y + k\pi) \left[ A(e^{j\omega_x}) B(e^{j\omega_y}) + C(e^{j\omega_x}) D(e^{j\omega_y}) \right] = \sum_{k=-\infty}^{\infty} 2\, \delta(\omega_x + \omega_y + 2k\pi).

By combining the terms with even and odd index k, the above equation can be rewritten as

    \sum_{k=-\infty}^{\infty} \delta(\omega_x + \omega_y + 2k\pi) \left[ X(e^{j\omega_x}, e^{j\omega_y}) - 2 \right] + \sum_{k=-\infty}^{\infty} \delta(\omega_x + \omega_y + (2k+1)\pi)\, X(e^{j\omega_x}, e^{j\omega_y}) = 0,

where X(e^{j\omega_x}, e^{j\omega_y}) = A(e^{j\omega_x}) B(e^{j\omega_y}) + C(e^{j\omega_x}) D(e^{j\omega_y}). Using the properties of the \delta function, the coefficients of the above \delta series must satisfy

    X(e^{j(\omega+2k\pi)}, e^{j\omega}) = X(e^{j\omega}, e^{j\omega}) = 2,
    X(e^{j(\omega+(2k+1)\pi)}, e^{j\omega}) = X(e^{j(\omega+\pi)}, e^{j\omega}) = 0.

Finally, by substituting X(e^{j\omega_x}, e^{j\omega_y}) back, we have proved

    A(e^{j\omega}) B(e^{j\omega}) + C(e^{j\omega}) D(e^{j\omega}) = 2,
    A(e^{j\omega}) B(e^{j(\omega+\pi)}) + C(e^{j\omega}) D(e^{j(\omega+\pi)}) = 0.
Figure A.1: Equivalent structures for (a) convolution and decimation (two stages of filtering and 2-fold decimation equal filtering by A(\omega) B(2\omega) followed by 4-fold decimation) and (b) expansion and convolution (two stages of 2-fold expansion and filtering equal 4-fold expansion followed by filtering by C(2\omega) D(\omega)).

Proposition A.0.3 Two cascaded stages of convolution and decimation (expansion) are equivalent to a single stage of convolution and decimation (expansion), as illustrated in Figure A.1.

Proof: Assume input x(n) and output y(n), and let a(n) and b(n) be the inverse transforms of A(\omega) and B(\omega). For (a),

    y(n) = \sum_{m=-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} x(k)\, a(2m - k) \right] b(2n - m)
         = \sum_{k=-\infty}^{\infty} x(k) \left[ \sum_{m=-\infty}^{\infty} a(2m - k)\, b(2n - m) \right]
         = \sum_{k=-\infty}^{\infty} x(k) \left[ \sum_{l=-\infty}^{\infty} a(4n - k - 2l)\, b(l) \right]
         = \sum_{k=-\infty}^{\infty} x(k)\, p(4n - k),

where p(n) = \sum_{l=-\infty}^{\infty} a(n - 2l)\, b(l), or P(e^{j\omega}) = A(e^{j\omega}) B(e^{j2\omega}). Similarly, for (b),

    y(n) = \sum_{m=-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} x(k)\, c(m - 2k) \right] d(n - 2m)
         = \sum_{k=-\infty}^{\infty} x(k) \left[ \sum_{m=-\infty}^{\infty} c(m - 2k)\, d(n - 2m) \right]
         = \sum_{k=-\infty}^{\infty} x(k) \left[ \sum_{l=-\infty}^{\infty} c(l)\, d(n - 4k - 2l) \right]
         = \sum_{k=-\infty}^{\infty} x(k)\, q(n - 4k),

where q(n) = \sum_{l=-\infty}^{\infty} c(l)\, d(n - 2l), or Q(e^{j\omega}) = C(e^{j2\omega}) D(e^{j\omega}).
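Proposition A.0.3 is easy to verify numerically for finite sequences; the following sketch (with arbitrary random test filters of our choosing) checks the identity for case (a):

```python
import numpy as np

def upsample(h, M):
    """Insert M-1 zeros between the samples of h."""
    u = np.zeros(M * len(h) - (M - 1))
    u[::M] = h
    return u

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
a = rng.standard_normal(5)
b = rng.standard_normal(5)

# Two cascaded stages: (convolve with a, keep every 2nd sample), then the same with b.
two_stage = np.convolve(np.convolve(x, a)[::2], b)[::2]

# Single equivalent stage: convolve with p(n) = sum_l a(n - 2l) b(l),
# i.e. p = a * (b upsampled by 2), then keep every 4th sample.
p = np.convolve(a, upsample(b, 2))
one_stage = np.convolve(x, p)[::4]

n = min(len(two_stage), len(one_stage))
assert np.allclose(two_stage[:n], one_stage[:n])
```

In the frequency domain this is exactly P(e^{j\omega}) = A(e^{j\omega}) B(e^{j2\omega}), as stated in the proposition.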
Figure A.2: Illustration of adding one more channel by splitting a channel into two.

Proposition A.0.4 The basis functions for a binary-tree-structured filter bank can be expressed as:

    V_0(e^{j\omega}) = G(e^{j\omega}),  U_0(e^{j\omega}) = \tilde{G}(e^{j\omega}),    k = 0,

    V_k(e^{j\omega}) = \prod_{l=0}^{k-1} H(e^{j2^l\omega})\, G(e^{j2^k\omega}),  U_k(e^{j\omega}) = \prod_{l=0}^{k-1} \tilde{H}(e^{j2^l\omega})\, \tilde{G}(e^{j2^k\omega}),    1 \le k \le M - 2,

    V_{M-1}(e^{j\omega}) = \prod_{l=0}^{M-1} H(e^{j2^l\omega}),  U_{M-1}(e^{j\omega}) = \prod_{l=0}^{M-1} \tilde{H}(e^{j2^l\omega}),    k = M - 1,

where M \ge 2. For an orthogonal wavelet basis satisfying condition (2.11), we have

    U_k(e^{j\omega}) = V_k^*(e^{j\omega}),  or  u_k(n) = v_k^*(-n).

Proof: By direct application of Proposition A.0.3.

Proposition A.0.5 Basis functions generated by a general binary tree with filters h_i(n) and g_i(n) satisfying (2.9) satisfy the following biorthogonality:

    \sum_{m=-\infty}^{\infty} u_c(m - n_c l)\, v_k(n_k n - m) = \delta(c - k)\, \delta(l - n)    (A.5)

Proof: By induction. For a filter bank of 2 channels, u_0 = \tilde{g}_0(n), u_1 = \tilde{h}_0(n), v_0 = g_0(n) and v_1 = h_0(n). In this case, the orthonormality of the basis functions is the given property. Now, assume (A.5) is true for up to M channels. To construct an (M+1)-channel filter bank, we split an arbitrary channel c into two, as illustrated in Figure A.2.
Since the transform coefficient w_c(n) before splitting is

    w_c(n) = \sum_{m=-\infty}^{\infty} x(m)\, v_c(n_c n - m),

the new coefficients w_{c1}(n) and w_{c2}(n) are

    w_{c1}(n) = \sum_{l=-\infty}^{\infty} w_c(l)\, g_c(2n - l)
              = \sum_{m=-\infty}^{\infty} x(m) \left[ \sum_{l=-\infty}^{\infty} v_c(n_c l - m)\, g_c(2n - l) \right]
              = \sum_{m=-\infty}^{\infty} x(m)\, v_{c1}(2 n_c n - m),

where v_{c1}(n) = \sum_{l=-\infty}^{\infty} v_c(n - n_c l)\, g_c(l). Similarly,

    v_{c2}(n) = \sum_{l=-\infty}^{\infty} v_c(n - n_c l)\, h_c(l),
    u_{c1}(n) = \sum_{l=-\infty}^{\infty} u_c(n - n_c l)\, \tilde{g}_c(l),
    u_{c2}(n) = \sum_{l=-\infty}^{\infty} u_c(n - n_c l)\, \tilde{h}_c(l),

with n_{c1} = n_{c2} = 2 n_c. Using the induction hypothesis together with condition (2.9), a direct computation shows that

    \sum_{m=-\infty}^{\infty} u_{c1}(m - n_{c1} i)\, v_{c1}(n_{c1} n - m) = \delta(n - i).
Similarly, we can prove

    \sum_{m=-\infty}^{\infty} u_{c2}(m - n_{c2} i)\, v_{c2}(n_{c2} n - m) = \delta(n - i),
    \sum_{m=-\infty}^{\infty} u_{c1}(m - n_{c1} i)\, v_{c2}(n_{c1} n - m) = 0,
    \sum_{m=-\infty}^{\infty} u_{c2}(m - n_{c1} i)\, v_{c1}(n_{c1} n - m) = 0.

We have thus proved that the basis functions of the new filter bank constructed in this way are also orthonormal. Notice that orthonormality holds under quite general conditions: 1) the filters g_c, h_c, \tilde{g}_c and \tilde{h}_c may differ from those used in other channels, as long as they satisfy condition (2.9); 2) the splitting can be done arbitrarily.
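As a sanity check of Proposition A.0.2, condition (A.4) can be verified numerically for a known orthogonal pair. Here we use the Haar filters, with the synthesis responses taken as the conjugates of the analysis responses; this specific mapping of a, b, c, d onto the Haar bank is our illustrative choice:

```python
import numpy as np

w = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
z = np.exp(-1j * w)
zp = np.exp(-1j * (w + np.pi))

# Haar analysis lowpass/highpass responses: A = H, C = G.
H, G = (1 + z) / np.sqrt(2), (1 - z) / np.sqrt(2)
Hp, Gp = (1 + zp) / np.sqrt(2), (1 - zp) / np.sqrt(2)

# First line of (A.4): A(w) B(w) + C(w) D(w) = 2, with B = A*, D = C*.
assert np.allclose(H * np.conj(H) + G * np.conj(G), 2.0)

# Second line: A(w) B(w + pi) + C(w) D(w + pi) = 0.
assert np.allclose(H * np.conj(Hp) + G * np.conj(Gp), 0.0)
```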
APPENDIX B
NUMERICAL COMPUTATION OF THE BATTLE-LEMARIE WAVELET

Battle-Lemarie wavelets are built from polynomial splines of order 2p + 1 [42]. Although they are not compactly supported, they do possess two important properties: symmetry and orthogonality. The symmetry property is especially desirable for some image processing applications such as edge detection, character recognition [39], and texture segmentation [62], but it is mutually exclusive with the compact support property [13]. In this appendix, we present an algorithm to compute the scaling function, the wavelet, and the associated quadrature mirror filters of Battle-Lemarie wavelets.

B.1 Functions in Finite Terms

The Fourier transforms of the scaling function \phi, the wavelet \psi, and the associated quadrature mirror filter h may be written as [42]:

    \hat{\phi}_p(\omega) = \frac{1}{\omega^{2p+2} \sqrt{\Sigma_{4p+4}(\omega)}},    (B.1)

    \hat{\psi}_p(\omega) = \frac{e^{-j\omega/2}}{\omega^{2p+2}} \sqrt{\frac{\Sigma_{4p+4}(\omega/2 + \pi)}{\Sigma_{4p+4}(\omega)\, \Sigma_{4p+4}(\omega/2)}},    (B.2)

    H_p(\omega) = \sqrt{\frac{\Sigma_{4p+4}(\omega)}{2^{4p+4}\, \Sigma_{4p+4}(2\omega)}},    (B.3)

where p is a positive integer and

    \Sigma_n(\omega) = \sum_{k=-\infty}^{+\infty} \frac{1}{(\omega + 2k\pi)^n}.
Therefore, the computation of the function \Sigma_n(\omega) is the key. Fortunately, we have a closed form for n = 2 [42],

    \Sigma_2(\omega) = \sum_{k=-\infty}^{+\infty} \frac{1}{(\omega + 2k\pi)^2} = \frac{1}{4 \sin^2(\omega/2)},

and a closed-form expression of the function \Sigma_n(\omega) based on \Sigma_2(\omega):

    \Sigma_n(\omega) = \frac{(-1)^n}{(n-1)!} \frac{d^{n-2}}{d\omega^{n-2}} \Sigma_2(\omega).

Thus, all we need is an algorithm to compute derivatives of the function \Sigma_2(\omega), which we derive in the following.

Proposition B.1.1 For any integer n \ge 1, the n-th order derivative of the function \Sigma_2(\omega) has the form

    \frac{d^n}{d\omega^n} \Sigma_2(\omega) = \frac{g_n(\cos(\omega/2))}{\sin^{n+2}(\omega/2)},    (B.4)

where g_n(x) is a polynomial of order n that satisfies the following recursive relation:

    g_{n+1}(x) = -\frac{1}{2} \left[ (1 - x^2)\, \frac{d g_n(x)}{dx} + (n + 2)\, x\, g_n(x) \right].    (B.5)

Proof: By mathematical induction. For n = 1,

    \frac{d}{d\omega} \Sigma_2(\omega) = \frac{-0.25 \cos(\omega/2)}{\sin^3(\omega/2)},

and thus g_1(\cos(\omega/2)) = -0.25 \cos(\omega/2). Assume that (B.4) holds for some n \ge 1 with g_n(x) a polynomial of order n. Then, for n + 1,

    \frac{d^{n+1}}{d\omega^{n+1}} \Sigma_2(\omega) = \frac{d}{d\omega} \left( \frac{g_n(\cos(\omega/2))}{\sin^{n+2}(\omega/2)} \right)
        = \frac{ -\frac{1}{2} \sin(\omega/2)\, g_n'(\cos(\omega/2))\, \sin^{n+2}(\omega/2) - g_n(\cos(\omega/2))\,(n+2)\sin^{n+1}(\omega/2)\,\frac{1}{2}\cos(\omega/2) }{ \sin^{2n+4}(\omega/2) }
        = \frac{ -\frac{1}{2} \left[ (1 - \cos^2(\omega/2))\, g_n'(\cos(\omega/2)) + (n+2) \cos(\omega/2)\, g_n(\cos(\omega/2)) \right] }{ \sin^{(n+1)+2}(\omega/2) }
        = \frac{g_{n+1}(\cos(\omega/2))}{\sin^{(n+1)+2}(\omega/2)},
where g_{n+1}(x) = -\frac{1}{2}\left[ (1 - x^2)\, \frac{d g_n(x)}{dx} + (n + 2)\, x\, g_n(x) \right], which is clearly a polynomial of order n + 1.

Proposition B.1.2 The coefficients \{a_{n,k}\} of the polynomial g_n(x) satisfy the following recursive relation:

    a_{0,0} = 0.25,    a_{0,1} = 0,
    a_{n,0} = -0.5\, a_{n-1,1},
    a_{n,k} = 0.5 \left[ (k - n - 2)\, a_{n-1,k-1} - (k + 1)\, a_{n-1,k+1} \right],    1 \le k \le n - 1,
    a_{n,n} = -a_{n-1,n-1}.    (B.6)

Proof: Write g_n(x) = \sum_{k=0}^{n} a_{n,k}\, x^k (B.7) and substitute it into (B.5); collecting the coefficients of like powers of x yields an expansion of g_{n+1}(x) (B.8). Comparing (B.8) with (B.7), we obtain (B.6).

Proposition B.1.3 For even order n of the polynomial g_n(x), only the terms with an even index are nonzero. For odd order n of the polynomial g_n(x), only the odd-index
terms are nonzero. (This property halves the number of calculations needed for the coefficients.)

Proof: By induction. For n = 1, a_{1,0} = -0.5\, a_{0,1} = 0 and a_{1,1} = -a_{0,0} = -0.25. For n = 2, a_{2,0} = -0.5\, a_{1,1} = 0.125, a_{2,1} = -1.5\, a_{1,0} = 0, and a_{2,2} = -a_{1,1} = 0.25. Assume the assertion is valid for n - 1. For even n, let n = 2m; then

    a_{2m,0} = -0.5\, a_{2m-1,1},
    a_{2m,k} = 0.5 \left[ (k - 2m - 2)\, a_{2m-1,k-1} - (k + 1)\, a_{2m-1,k+1} \right],
    a_{2m,2m} = -a_{2m-1,2m-1}.

Since 2m is even, 2m - 1 is odd, and for odd l, l - 1 is even. Using the induction hypothesis, all the odd-index terms are zero. A similar proof applies to the odd-order polynomials.

Based on the above properties, we can now write \Sigma_n(\omega) as

    \Sigma_n(\omega) = \frac{(-1)^n}{(n-1)!} \frac{g_{n-2}(\cos(\omega/2))}{\sin^n(\omega/2)}.    (B.9)

By combining (B.9) and (B.3), we obtain

    H_p(\omega) = \cos^{2p+2}(\omega/2) \sqrt{\frac{g_{4p+2}(\cos(\omega/2))}{g_{4p+2}(\cos\omega)}}.

From Eqs. (B.9), (B.1) and (B.2), we obtain

    \hat{\phi}_p(\omega) = \left[ \frac{\sin(\omega/2)}{\omega/2} \right]^{2p+2} \sqrt{\frac{(4p+3)!}{2^{4p+4}\, g_{4p+2}(\cos(\omega/2))}},

and therefore

    \hat{\phi}_p(0) = \sqrt{\frac{(4p+3)!}{2^{4p+4}\, g_{4p+2}(1)}},    \hat{\psi}_p(0) = 0.
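The recursion (B.6) is straightforward to implement; a sketch in Python (ours, paralleling the MATLAB gcoef listed in Section B.2 but without the overflow-prevention multiplier):

```python
def g_coeffs(n):
    """Coefficients a[k] of g_n(x) = sum_k a[k] x^k via the recursion (B.6)."""
    a = [0.25]                              # g_0(x) = 1/4
    for i in range(1, n + 1):
        prev = a + [0.0, 0.0]               # pad so prev[k + 1] is always valid
        nxt = [0.0] * (i + 1)
        nxt[0] = -0.5 * prev[1]
        for k in range(1, i):
            nxt[k] = 0.5 * ((k - i - 2) * prev[k - 1] - (k + 1) * prev[k + 1])
        nxt[i] = -prev[i - 1]
        a = nxt
    return a

# Matches the values worked out in the proof of Proposition B.1.3:
assert g_coeffs(1)[1] == -0.25              # a_{1,1}
assert g_coeffs(2)[0] == 0.125              # a_{2,0}
assert g_coeffs(2)[2] == 0.25               # a_{2,2}
# Parity property (Proposition B.1.3): odd n has zero even-index terms.
assert all(c == 0.0 for c in g_coeffs(7)[::2])
```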
Having derived these analytical expressions, we are able to calculate the three functions of any order at any frequency point.

B.2 Numerical Computation Using FFT

The three functions investigated in the last section are continuous functions in the frequency domain. To compute their waveforms in the time domain, we used the FFT (Fast Fourier Transform) approach.

H_p(\omega) is a 2\pi-periodic function; it is the discrete-time Fourier transform of the quadrature mirror filter h(n), which is a discrete function in the time domain. Recall that sampling on the unit circle in the frequency domain results in a periodic extension of h(n) in the time domain. However, if the sampling period is small enough, the aliasing in the time domain is negligible and the h(n) obtained will be accurate. The sample values of H_p(\omega) at \omega_k = 2\pi k/N follow directly from the closed form derived above.

\hat{\phi}(\omega) and \hat{\psi}(\omega) are the Fourier transforms of the scaling function \phi(t) and the wavelet \psi(t), respectively. Notice that \phi(t) and \psi(t) are continuous functions in the time domain; by a numerical method, we can only obtain their sample values. Recall that the DFT (Discrete Fourier Transform) of the sample values \phi(nT) and \psi(nT) is related to the Fourier transform of the continuous functions by a periodization (aliasing) relation.
Table B.1: Coefficients of g_24(x) (a) and truncated impulse response of h_1(n) (b).

    (a)                                  (b)
    k     a_{24,k}                       n     h_1(n)
    0     3679416778537.75               0     0.54173576
    2     475582801692177.0              1     0.30682964
    4     8011347151626016.5             2     0.03549798
    6     40907559988277171.5625         3     0.07780792
    8     81471913154443867.5            4     0.02268462
    10    69736261315895408.25           5     0.02974682
    12    26157606919852821.0            6     0.01214549
    14    4123375115615239.5             7     0.01271542
    16    243456149817686.25             8     0.00614143
    18    4245937488920.0                9     0.00579932
    20    13213718716.5                  10    0.00307863
    22    2097148.875                    11    0.00274529
    24    0.25                           12    0.00154624

    Note: h_p(-n) = h_p(n).

The FFT is then applied to the sampled \hat{\phi}(\omega_k) and \hat{\psi}(\omega_k) to obtain \phi(nT) and \psi(nT), respectively. On the issue of implementation, we notice that the coefficients of the polynomial g_n(x) grow very fast. As an example, the coefficients of g_24(x) are shown in Table B.1. In our numerical calculations, 64-bit precision was used. Notice that g_n(x) appears in both the numerator and the denominator of the formula for H_p(\omega). This suggests a way of preventing overflow in the calculation of H_p(\omega): use k(n)\, g_n(x), where k(n) is an appropriate multiplier embedded into the iterative process of (B.6).
Figure B.1: Transition band of the filters H_p(\omega) with p = 1, 3, 9. As the order p of the filter increases, the filter H_p(\omega) becomes closer to the ideal half-band filter.

Figure B.1 shows the transition bands of the filters with p = 1, 3, 9. Two MATLAB m-files for calculating the QMF of the Battle-Lemarie wavelets are listed in the following.

    function [H,G] = Lfilter(p,N)
    % This function builds a quadrature mirror filter pair H, G
    % H: lowpass filter; G: highpass filter
    % p: 2p+1 is the order of the Lemarie wavelet
    % N: length of the filters
    L = 4*p + 2;
    C = gcoef(L);                 % length of C is L+1
    H = zeros(1,N);
    G = zeros(1,N);
    H(1) = sqrt(2);               % at w = 0, H = sqrt(2)
    H(N/2+1) = 0.0;               % at w = pi, H = 0
    dw = 2*pi/N;
    for i = 1:N/2-1
        omega = i*dw;
        u = 0; v = 0;
        for k = 1:2:L+1
            v = v + C(k)*cos(omega/2)^(k-1);
            u = u + C(k)*cos(omega)^(k-1);
        end;
        H(i+1) = sqrt(2)*cos(omega/2)^(2*p+2)*sqrt(v/u);
        H(N-i+1) = H(i+1);        % symmetry of the lowpass response
    end;
    G = H([N/2+1:N, 1:N/2]);      % quadrature mirror: G(w) = H(w+pi)
    function [c] = gcoef(n)
    c = zeros(1,n+1);
    t = zeros(1,n+1);
    c(1) = 0.25;
    c(2) = 0;
    if (n > 8)
        k = n/4;
    else
        k = n;
    end;
    for i = 1:1:n
        for j = 0:1:i
            if j == 0
                t(1) = -0.5*c(2);
            elseif (j > 0 & (j <= i-2))
                t(j+1) = 0.5*((j-i-2)*c(j) - (j+1)*c(j+2));
            elseif j == (i-1)
                t(j+1) = -1.5*c(i-1);
            else
                t(j+1) = -c(i);
            end;
        end;
        if (i > k)                % apply multiplier to prevent overflow
            c(1:i+1) = t(1:i+1)/(i-k);
        else
            c(1:i+1) = t(1:i+1);
        end;
    end;
APPENDIX C
NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR

Eq. (3.7) defines the uncertainty factor of a lowpass filter in the continuous frequency domain. In order to compute the uncertainty factors of the filters in a filter bank, two issues need to be resolved. First, the definition for a lowpass filter has to be extended to bandpass and highpass filters. Second, the formulas have to be modified for computation in the discrete frequency domain.

The general definition of the variance of the frequency distribution, including bandpass filters, adopted by Liu and Akansu [40] is:

    \sigma_\omega^2 = \frac{\int_{-\pi}^{\pi} (\omega - \bar{\omega})^2\, |F(e^{j\omega})|^2\, d\omega}{\int_{-\pi}^{\pi} |F(e^{j\omega})|^2\, d\omega},  where  \bar{\omega} = \frac{\int_{-\pi}^{\pi} \omega\, |F(e^{j\omega})|^2\, d\omega}{\int_{-\pi}^{\pi} |F(e^{j\omega})|^2\, d\omega}.    (C.1)

The drawback of applying this definition to bandpass filters is that for real filters |F(e^{j\omega})| = |F(e^{-j\omega})|, and thus \bar{\omega} = 0, which is not the central frequency of the passband. Therefore, this is not a fair comparison between lowpass and bandpass filters. Moreover, the variance will also depend on the location of the passband due to the factor (\omega - \bar{\omega})^2 = \omega^2, and thus is not fair even among bandpass filters. To overcome this problem, we notice that the idea behind the variance \sigma_\omega is the energy distribution within a complete passband. For bandpass filters, a single passband on the positive frequency side should be used. We thus adopted the following definition:
Figure C.1: Channel bandwidths of overcomplete wavelet representations.

    \sigma_\omega^2 = \frac{\int_{\omega_l}^{\omega_h} (\omega - \bar{\omega})^2\, |F(e^{j\omega})|^2\, d\omega}{\int_{\omega_l}^{\omega_h} |F(e^{j\omega})|^2\, d\omega},  with  \bar{\omega} = \frac{\int_{\omega_l}^{\omega_h} \omega\, |F(e^{j\omega})|^2\, d\omega}{\int_{\omega_l}^{\omega_h} |F(e^{j\omega})|^2\, d\omega},    (C.2)

where

    (\omega_l, \omega_h) = (-\pi, \pi) for lowpass filters, (0, \pi) for bandpass filters, and (0, 2\pi) for highpass filters.

Notice that this definition is good only for filters whose passband is completely contained in the range [\omega_l, \omega_h], although sidebands are allowed within this range. Also notice that the bandwidth of the lowpass band is twice the width of the bandpass bands by the construction of (3.3), as illustrated in Figure C.1.

In the discrete frequency domain, Eqs. (C.1) and (C.2) should be modified as:

    \sigma_\omega^2 = \frac{\sum_{k=k_l}^{k_h} (k - \bar{k})^2\, |F(k)|^2}{\sum_{k=k_l}^{k_h} |F(k)|^2} \left( \frac{2\pi}{N} \right)^2,

where N is the number of sampling points in [0, 2\pi], and

    (k_l, k_h) = (-N/2, N/2) for lowpass filters, (0, N/2) for bandpass filters, and (0, N) for highpass filters.

Notice that |F(k)| is N-periodic such that |F(k)| = |F(k + N)|, and therefore |F(-k)| = |F(N - k)|. Similarly, due to the periodic nature of the sequence f(n) in the time domain, the variance \sigma_n^2 should be modified as

    \sigma_n^2 = \min_{0 \le d < N} \frac{\sum_{n=0}^{N-1} (n - \bar{n})^2\, |f(n + d)|^2}{\sum_{n=0}^{N-1} |f(n + d)|^2},
where f(n + d) is the original f(n) circularly shifted by d.

A MATLAB function ufactor() for computing the uncertainty factor is included in the following. Also included is a MATLAB function lwuf() for building the analyzing filters of an overcomplete wavelet packet from a prototype Lemarie filter and computing their uncertainty factors, where the function Lfilter() is given in Appendix B.

    function [u] = ufactor(H)
    % H is the Discrete Fourier Transform of a discrete filter h(n)
    N = length(H);
    W = H.*conj(H);
    vmax = max(W);
    if W(1) > 0.95*vmax                  % lowpass filter
        F = [W(N/2+1:N), W(1:N/2)];
    elseif W(N/2+1) > 0.95*vmax          % highpass filter
        F = W;
    else                                 % bandpass filter
        F = W(1:N/2);
    end;
    L = length(F);
    fc = [0:L-1]*F'/sum(F);
    sf = (([0:L-1] - fc).^2)*F'/sum(F)*(2*pi/N)^2;
    % transform to time domain
    h = real(ifft(H));
    st = 1e10;
    for d = 1:N                          % circular shift to find the minimum
        x = [h(d:N), h(1:d-1)];
        W = x.^2;
        tc = [0:N-1]*W'/sum(W);
        m = (([0:N-1] - tc).^2)*W'/sum(W);
        if (m < st)
            st = m;
        end;
    end;
    u = st*sf;

    function [] = lwuf(shell,p,nlevel)
    % shell : =1, autocorrelation shell; =0, regular
    % p     : 2p+1 is the order of the Lemarie wavelet
    % nlevel: number of levels
    N = 2^nlevel;
    SIZE = 256;
    omega = 2*pi*[0:SIZE-1]/SIZE;
    [H,G] = Lfilter(p,SIZE);
    H = H/sqrt(2);
    G = G/sqrt(2);
    if shell == 1
        H = H.*conj(H);
        G = G.*conj(G);
    end;
    W = ones(N,SIZE);
    HH = ones(1,SIZE);
    GG = ones(1,SIZE);
    d = 1;
    for n = 1:nlevel
        HH = H(rem([0:SIZE-1]*d, SIZE) + 1);
        GG = G(rem([0:SIZE-1]*d, SIZE) + 1);
        for j = 2^n:-2:1
            pt = ceil(j/2);
            if rem(pt,2) == 0
                W(j,:)   = W(pt,:).*GG;
                W(j-1,:) = W(pt,:).*HH;
            else
                W(j,:)   = W(pt,:).*HH;
                W(j-1,:) = W(pt,:).*GG;
            end;
        end;
        d = d*2;
    end;
    clg;
    mx = 1.1*max(max(abs(W(:,:))));
    for n = 1:N
        f = abs(W(n,:));
        if n <= N/2
            plot(omega, f);
        else
            plot(omega, f, ':');
        end;
        uf = ufactor(W(n,:));
        str = sprintf('%.2f', uf);
        text(((n-1) + 0.1)*pi/N, 0.95*mx, str);
        if n == 1
            axis([0 pi 0 mx]);
            hold;
        end;
    end;
    hold off;
    of = sprintf('p%dl%ds%d.eps', p, nlevel, shell);
    eval(['print -deps ' of]);
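The MATLAB routine ufactor above translates directly into other environments; a small Python sketch of the two variance computations (the helper names are ours, and the band choice is passed in explicitly rather than detected):

```python
import numpy as np

def freq_variance(F, k_lo, k_hi):
    """Discrete frequency variance of |F(k)|^2 over the band k_lo <= k < k_hi."""
    N = len(F)
    k = np.arange(k_lo, k_hi)
    W = np.abs(np.asarray(F)[k % N]) ** 2   # |F(k)| is N-periodic
    kc = np.sum(k * W) / np.sum(W)
    return np.sum((k - kc) ** 2 * W) / np.sum(W) * (2.0 * np.pi / N) ** 2

def time_variance(h):
    """Minimum time variance of |h(n)|^2 over all circular shifts d."""
    N = len(h)
    n = np.arange(N)
    best = np.inf
    for d in range(N):
        W = np.abs(np.roll(h, -d)) ** 2
        nc = np.sum(n * W) / np.sum(W)
        best = min(best, np.sum((n - nc) ** 2 * W) / np.sum(W))
    return best

# Uncertainty factor of a crude lowpass filter, using the lowpass band [-N/2, N/2).
N = 64
h = np.zeros(N); h[0] = 0.5; h[1] = 0.5
u = time_variance(h) * freq_variance(np.fft.fft(h), -N // 2, N // 2)
assert u > 0.0
```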
APPENDIX D
BOUNDARY TREATMENTS OF FINITE-LENGTH SEQUENCES

D.1 Periodic Extensions

For convolutions involving finite-length discrete sequences, we face the boundary problem. One solution is to construct an infinite-length signal from the finite-length sequence. For a finite-length sequence s(n) of length N, two constructions are commonly used:

1) N-periodic extension. An infinite-length signal \tilde{s}(n) of period N is constructed as \tilde{s}(n) = s(n) for 0 \le n < N, with \tilde{s}(n + N) = \tilde{s}(n).

2) 2N-periodic mirror extension. An infinite-length signal \tilde{s}(n) of period 2N is constructed as \tilde{s}(n) = s(n) for 0 \le n < N and \tilde{s}(n) = s(2N - 1 - n) for N \le n < 2N, with \tilde{s}(n + 2N) = \tilde{s}(n).
D.2 Symmetry Sequences

For a discrete sequence s(n), define its m-fold expansion s_{\uparrow m}(n) by inserting m - 1 zeros between neighboring samples, and denote \uparrow_m(s(n)) = s_{\uparrow m}(n). Equivalently, we may write:

    s_{\uparrow m}(n) = \sum_{k=-\infty}^{\infty} s(k)\, \delta(n - km).    (D.1)

We then define symmetric/antisymmetric sequences as:

Definition D.2.1 For a discrete sequence s(n), if there exist integers c and m such that s_{\uparrow m}(c - n) = s_{\uparrow m}(c + n) (s_{\uparrow m}(c - n) = -s_{\uparrow m}(c + n)), the sequence is said to be symmetric (antisymmetric) at c/m, and C = c/m is said to be the symmetry center.

For symmetry sequences, we may define a symmetry index S: S = 1 for symmetric sequences, and S = -1 for antisymmetric sequences. Symmetry sequences possess some special properties; in particular, we will prove that they are closed under convolution. We first prove a proposition.

Proposition D.2.2 The expansion operation is distributive over convolution:

    \uparrow_m(s * f)(n) = \uparrow_m(s(n)) * \uparrow_m(f(n)).

Proof: According to (s * f)(n) = \sum_{k=-\infty}^{\infty} s(k)\, f(n - k) and (D.1), we have for the left side:

    \uparrow_m(s * f)(n) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} s(k)\, f(l - k)\, \delta(n - lm)
                         = \sum_{k=-\infty}^{\infty} s(k) \sum_{l=-\infty}^{\infty} f(l - k)\, \delta(n - lm)
                         = \sum_{k=-\infty}^{\infty} s(k) \sum_{i=-\infty}^{\infty} f(i)\, \delta(n - (i + k)m),

and for the right side:

    \uparrow_m(s(n)) * \uparrow_m(f(n)) = \sum_{k=-\infty}^{\infty} \left[ \sum_{l=-\infty}^{\infty} s(l)\, \delta(k - lm) \right] \left[ \sum_{i=-\infty}^{\infty} f(i)\, \delta(n - k - im) \right]
                                        = \sum_{l=-\infty}^{\infty} s(l) \sum_{i=-\infty}^{\infty} f(i) \sum_{k=-\infty}^{\infty} \delta(k - lm)\, \delta(n - k - im)
                                        = \sum_{l=-\infty}^{\infty} s(l) \sum_{i=-\infty}^{\infty} f(i)\, \delta(n - lm - im).

The two sides are identical, which proves the proposition. We now prove the closure property under convolution.
Theorem D.2.3 The convolution of two symmetry sequences x(n) and f(n) results in a symmetry sequence y(n) with symmetry index S_y = S_x S_f and symmetry center C_y = C_x + C_f.

Proof: Based on the given conditions, we have:

    x_{\uparrow m}(c_x - n) = S_x\, x_{\uparrow m}(c_x + n),
    f_{\uparrow m}(c_f - n) = S_f\, f_{\uparrow m}(c_f + n).

Applying Proposition D.2.2, we have:

    y_{\uparrow m}(c_x + c_f - n) = \sum_{k=-\infty}^{\infty} x_{\uparrow m}(k)\, f_{\uparrow m}(c_x + c_f - n - k)
                                  = \sum_{l=-\infty}^{\infty} x_{\uparrow m}(c_x - l)\, f_{\uparrow m}(c_f - n + l)
                                  = \sum_{l=-\infty}^{\infty} S_x\, x_{\uparrow m}(c_x + l)\, S_f\, f_{\uparrow m}(c_f + n - l)
                                  = S_x S_f \sum_{k=-\infty}^{\infty} x_{\uparrow m}(k)\, f_{\uparrow m}(c_x + c_f + n - k)
                                  = S_x S_f\, y_{\uparrow m}(c_x + c_f + n).

Since a translation of the index n does not actually alter a discrete sequence, symmetry sequences may, without loss of generality, be classified according to their symmetry index S and symmetry center C as below:

• Type-I symmetric: s(n) = s(-n). The symmetry center is C = 0.

• Type-II symmetric: s(n) = s(-n - 1). The symmetry center is C = -1/2 (c = -1, m = 2). The 2N-periodic mirror-extended sequences introduced previously belong to this class.

• Type-III symmetric: s(n) = s(-n + 1). The symmetry center is C = 1/2 (c = 1, m = 2).
• Type-I antisymmetric: s(n) = -s(-n). The symmetry center is C = 0. Notice that s(0) = 0 is a constraint.

• Type-II antisymmetric: s(n) = -s(-n - 1). The symmetry center is C = -1/2 (c = -1, m = 2).

• Type-III antisymmetric: s(n) = -s(-n + 1). The symmetry center is C = 1/2 (c = 1, m = 2).

The closure property of symmetric sequences may reduce the computation burden significantly. For symmetric filters and mirror extension, we only need to carry out the convolution on at most N + 1 sample points instead of 2N.
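The mirror extension and the closure property of Theorem D.2.3 are easy to check numerically; a sketch (helper names are ours):

```python
import numpy as np

def mirror_extend(s):
    """One period of the 2N-periodic mirror extension: s(0..N-1), then reversed."""
    s = np.asarray(s)
    return np.concatenate([s, s[::-1]])

assert list(mirror_extend([1.0, 2.0, 3.0])) == [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]

# Theorem D.2.3: the convolution of two symmetric (palindromic) sequences is
# again symmetric, with the symmetry centers adding up.
x = np.array([3.0, 2.0, 1.0, 1.0, 2.0, 3.0])   # palindrome (mirror-extended form)
f = np.array([1.0, 4.0, 1.0])                   # palindromic symmetric filter
y = np.convolve(x, f)
assert np.allclose(y, y[::-1])                  # the result is again symmetric
```

With symmetric filters and mirror extension, only about N + 1 of the 2N output samples need to be computed, as noted above.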
REFERENCES

[1] Akram Aldroubi and Michael Unser, editors. Wavelets in Medicine and Biology. CRC Press, Boca Raton, Florida, 1996.

[2] H. C. Andrews and B. R. Hunt. Digital Image Restoration. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[3] M. R. Banham, N. P. Galatsanos, H. L. Gonzalez, and A. K. Katsaggelos. Multichannel restoration of single channel images using a wavelet-based subband decomposition. IEEE Transactions on Image Processing, 3:821-833, 1994.

[4] M. G. Bello. A combined Markov random field and wave-packet transform-based approach for image segmentation. IEEE Transactions on Image Processing, 3(6):834-846, 1994.

[5] Z. Berman and J. S. Baras. Properties of the multiscale maxima and zero-crossings representations. IEEE Transactions on Signal Processing, 41(12):3216-3231, 1993.

[6] A. C. Bovik. Analysis of multichannel narrow-band filters for image texture segmentation. IEEE Transactions on Signal Processing, 39:2025-2043, 1991.

[7] P. Brodatz. Textures: A Photographic Album for Artists and Designers. Dover Publications, New York, 1966.

[8] T. Chang and C.-C. J. Kuo. Texture analysis and classification with tree-structured wavelet transform. IEEE Transactions on Image Processing, 2(4):429-441, 1993.

[9] J. Chen and A. Kundu. Rotation and gray scale transform invariant texture identification using wavelet decomposition and hidden Markov model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(2):208-214, 1993.

[10] R. R. Coifman and Y. Meyer. Orthonormal wavelet packet bases. Preprint, Department of Mathematics, Yale University, 1990.

[11] I. Daubechies. Orthogonal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, XLI:909-996, 1988.

[12] I. Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, 36(5):961-1005, 1990.

[13] I. Daubechies. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1992.
[14] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Technical report, Department of Statistics, Stanford University, 1992.

[15] D. L. Donoho. Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data. In Proceedings of Symposia in Applied Mathematics, volume 00, pages 173-205, 1993.

[16] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Applied and Computational Harmonic Analysis, 2:101-126, 1995.

[17] D. L. Donoho and I. M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Technical report, Department of Statistics, Stanford University, 1992.

[18] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley & Sons, New York, 1973.

[19] D. Dunn, W. E. Higgins, and J. Wakeley. Texture segmentation using 2-D Gabor elementary functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(2):130-149, 1994.

[20] J. Fan and A. Laine. Multiscale contrast enhancement and denoising in digital radiographs. In Akram Aldroubi and Michael Unser, editors, Wavelets in Medicine and Biology, chapter 7, pages 163-189. CRC Press, 1996.

[21] D. Gabor. Theory of communication. Journal of the IEE, 93:429-457, 1946.

[22] R. C. Gonzalez and P. Wintz. Digital Image Processing. Addison-Wesley Publishing Company, Reading, Massachusetts, second edition, 1987.

[23] R. M. Haralick and L. G. Shapiro. Image segmentation techniques. Computer Vision, Graphics, and Image Processing, 29:100-132, 1985.

[24] J. Y. Hsiao and A. A. Sawchuk. Supervised textured image segmentation using feature smoothing and probabilistic relaxation techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:1279-1292, 1989.

[25] R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(3):267-287, 1983.

[26] L. W. Couch II.
Digital and Analog Communication Systems. Macmillan Publishing Company, New York, third edition, 1990.

[27] J. M. H. du Buf, M. Kardan, and M. Spann. Texture feature performance for image segmentation. Pattern Recognition, 23:291-309, 1990.

[28] A. K. Jain. Fundamentals of Digital Image Processing. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[29] A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, New Jersey, 1988.

[30] A. K. Jain and F. Farrokhnia. Unsupervised texture segmentation using Gabor filters. Pattern Recognition, 24(12):1167-1186, 1991.
[31] B. Jawerth, M. L. Hilton, and T. L. Huntsberger. Local enhancement of compressed images. Journal of Mathematical Imaging and Vision, 3:39-49, 1993.

[32] B. Jawerth and W. Sweldens. An overview of wavelet based multiresolution analyses. SIAM Review, 36(3):377-412, 1994.

[33] A. K. Katsaggelos, editor. Digital Image Restoration. Springer-Verlag, New York, 1991.

[34] A. Kundu and J. Chen. Texture classification using QMF bank-based subband decomposition. CVGIP: Graphical Models and Image Processing, 54(5):369-384, 1992.

[35] A. Laine and J. Fan. An adaptive approach for texture segmentation by multi-channel wavelet frames. In SPIE Proceedings on Mathematical Imaging: Wavelet Applications in Signal and Image Processing, volume 2034, pages 288-299, 1993.

[36] A. Laine and J. Fan. Texture classification by wavelet packet signatures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1186-1191, 1993.

[37] A. Laine and J. Fan. Frame representations for texture segmentation. IEEE Transactions on Image Processing, 5(5):771-780, 1996.

[38] A. Laine, S. Schuler, J. Fan, and W. Huda. Mammographic feature enhancement by multiscale analysis. IEEE Transactions on Medical Imaging, 13(4):725-740, 1994.

[39] A. Laine, S. Schuler, and V. Girish. Orthonormal wavelet representations for recognizing complex annotations. Machine Vision and Applications, 6:110-123, 1993.

[40] Y. Liu and A. N. Akansu. An evaluation of time-frequency localization in transforms and filter banks. In IEEE International Conference on Acoustics, Speech and Signal Processing, pages 261-263, 1993.

[41] S. Mallat. Multifrequency channel decompositions of images and wavelet models. IEEE Transactions on Acoustics, Speech and Signal Processing, 37:2091-2110, 1989.

[42] S. Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:674-693, 1989.

[43] S. Mallat and W. L. Hwang.
Singularity detection and processing with wavelets. IEEE Transactions on Information Theory, 38(2):617–643, 1992.
[44] S. Mallat and S. Zhong. Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(7):710–732, 1992.
[45] U. Manber. Introduction to Algorithms: A Creative Approach. Addison-Wesley Publishing Company, Reading, Massachusetts, 1989.
[46] P. Maragos, J. F. Kaiser, and T. F. Quatieri. On amplitude and frequency demodulation using energy operators. IEEE Transactions on Signal Processing, 41(4):1532–1550, 1993.
[47] D. Marr. Vision. W. H. Freeman and Company, New York, 1982.
[48] P. Moulin. Wavelet thresholding techniques for power spectrum estimation. IEEE Transactions on Signal Processing, 42(11):3126–3136, 1994.
[49] A. V. Oppenheim and R. W. Schafer. Discrete-Time Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
[50] P. Desarte, B. Macq, and D. T. Slock. Signal-adapted multiresolution transform for image coding. IEEE Transactions on Information Theory, 38(2):897–904, 1992.
[51] A. Papoulis. Signal Analysis. McGraw-Hill Book Company, New York, 1977.
[52] J.-C. Pesquet, H. Krim, and H. Carfantan. Time-invariant orthonormal wavelet representations. IEEE Transactions on Signal Processing, 44:1964–1970, 1996.
[53] L. R. Rabiner and R. W. Schafer. Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, New Jersey, 1978.
[54] T. R. Reed and H. Wechsler. Segmentation of textured images and Gestalt organization using spatial/spatial-frequency representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:1–12, 1990.
[55] K. H. Rosen. Discrete Mathematics and Its Applications. Random House, New York, 1988.
[56] S. A. Ruzinsky and E. T. Olsen. L1 and L∞ minimization via a variant of Karmarkar's algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37:245–253, 1989.
[57] N. Saito. Local Feature Extraction and Its Applications Using a Library of Bases. PhD thesis, Yale University, 1994.
[58] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger. Shiftable multiscale transforms. IEEE Transactions on Information Theory, 38(2):587–607, 1992.
[59] G. Strang. Wavelets and dilation equations: A brief introduction. SIAM Review, 31(4):614–627, 1989.
[60] A. N. Tikhonov and V. Y. Arsenin. Solution of Ill-Posed Problems. V. H. Winston & Sons, Washington, D.C., 1977.
[61] M. Tuceryan and A. K. Jain. Texture analysis. In C. H. Chen, L. F. Pau, and P. S. P.
Wang, editors, The Handbook of Pattern Recognition and Computer Vision, pages 235–276. World Scientific Publishing Company, Singapore, 1993.
[62] M. Unser. Texture classification and segmentation using wavelet frames. IEEE Transactions on Image Processing, 4(11):1549–1560, 1995.
[63] P. P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice-Hall, Englewood Cliffs, New Jersey, 1993.
[64] R. Wilson and M. Spann. Finite prolate spheroidal sequences and their applications II: Image feature description and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10:193–203, 1988.
BIOGRAPHICAL SKETCH

Jian Fan was born in Fuzhou City, Fujian Province, the People's Republic of China, on April 15, 1954. In 1978, he began his studies at Xiamen University, Xiamen, PRC, where he received a B.S. degree in radio physics (1982) and an M.S. degree in marine physics (1985). From 1985 to 1989, he served on the faculty of the Oceanography Department of Xiamen University. He arrived in the United States on January 13, 1989, to pursue graduate studies. He continued his studies at the University of Florida, Gainesville, where he received M.S. degrees in electrical engineering (1990) and in computer and information sciences (1992). He joined the Hewlett-Packard Company as a software design engineer in 1995.
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Andrew F. Laine, Chairman
Associate Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Gerhard Ritter
Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Sartaj Sahni
Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Dave Wilson
Professor of Mathematics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

John G. Harris
Professor of Electrical and Computer Engineering
This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August 1997

Winfred M. Phillips
Dean, College of Engineering

Karen A. Holbrook
Dean, Graduate School

