Citation
Overcomplete wavelet representations with applications in image processing

Material Information

Title:
Overcomplete wavelet representations with applications in image processing
Creator:
Fan, Jian, 1954-
Publication Date:
Language:
English
Physical Description:
xii, 115 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Approximation ( jstor )
Dyadics ( jstor )
Fourier transformations ( jstor )
Frequency response ( jstor )
Image filters ( jstor )
Image processing ( jstor )
Mathematical vectors ( jstor )
Signals ( jstor )
Symmetry ( jstor )
Wavelet analysis ( jstor )
Computer and Information Science Engineering thesis, Ph. D ( lcsh )
Dissertations, Academic -- Computer and Information Science and Engineering -- UF ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1997.
Bibliography:
Includes bibliographical references (leaves 111-114).
Additional Physical Form:
Also available online.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Jian Fan.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
028019760 ( ALEPH )
37851420 ( OCLC )

Downloads

This item has the following downloads:


Full Text










OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING












By

JIAN FAN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA


1997





























Copyright 1997



by



Jian Fan














ACKNOWLEDGEMENTS


First, I would like to express my deep gratitude to my advisor, Andrew Laine, for all his support and encouragement. I thank him for introducing me to the field of wavelets, and I appreciate the opportunity to work with him.

I would like to thank Dr. Jerry C. L. Chen and Mrs. Shirley H. W. Chen of Palo Alto, California, for their help and support. Without their support, I may not have the opportunity to pursue my graduate studies in the United States.

I would like to thank Drs. Gerhard Ritter, Sartaj Sahni, Dave Wilson and John G. Harris for being on my thesis committee, and Dr. Baba C. Vemuri for attending my defense. Their time and thoughtful recommendations were greatly appreciated.
I would also like to thank Dr. David Burchfield of the Department of Pediatrics, University of Florida, for providing me a graduate research assistantship from January 1991 through December 1991.
During my early years at the University of Florida, Ming Jiang and Min Shi gave me much help. I also benefited greatly from discussions with my follow graduate students, Shuwu Song, Iztok Koren, Sergio Schuler, Wuhai Yang, Chun-ming Chang and Hongchi Shi.
In addition, I would like to thank my manager Catherine Hunt of Hewlett Packard Company, San Diego, for her support. I have also benefited from HewlettPackard Company's educational assistance program. I appreciate the efforts of John Shumate who helped polish the writing of this dissertation.








Finally, I would like to thank my mother Yuzhen Chen and my late father Laikun Fan for their inspiration and encouragement, as well as my wife \eeihua Liu and my son Xiying Fan for their understanding, love, and patience.

This research was supported in part by the Whitaker Foundation, and the U. S. Army Medical Research and Development Command, Grand number DAMD1793-J-3003.














TABLE OF CONTENTS




ACKNOWLEDGEMENTS ............................ iii

LIST OF TABLES ............................. .... viii

LIST OF FIGURES ................................ ix

A B ST R A C T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

CHAPTERS

1 INTRODUCTION ............................. 1

1.1 Motivations of the Research ..................... 1
1.2 Review of Related W ork ....................... 2
1.3 Overview of the Thesis ....................... 4

2 SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS 5

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Fourier Transforms .......................... 6
2.2.1 Fourier Transform (FT) ......... .......... .. 7
2.2.2 Discrete-time Fourier Transform (DTFT) ......... 7
2.2.3 Short-time Fourier Transform (STFT) .............. 8
2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT) 8
2.2.5 Connection Between Short-time Fourier Transform and
Filter Banks ............................ 9
2.2.6 Generalized Short-time Fourier Transform ............ 10
2.3 Wavelet Transforms .............. .......... 10
2.3.1 Continuous Wavelet Transform (CWT) .......... 11
2.3.2 Discrete Wavelet Transform (DWT) ................ 12
2.3.3 Discrete-time Wavelet Transform (DTWT) ........... 13
2.4 Generalized Discrete-time Filter Bank Transforms ........... 15
2.4.1 Convolution Operator on Complete Representations . . . 17 2.4.2 Convolution Operator on Overcomplete Representations 19
2.4.3 Approximation of Convolution Operators on Overcomplete
Representations ............................ 19
2.4.4 Aliasing Enhancement on Complete Representations . . . 20
2.5 Summary and Discussion ...... ...................... 22








3 OVERCOMPLETE WAVELET REPRESENTATIONS .......... 23

3.1 Introduction . . . . . . . . . . . . . . . .. . . . . . 23
3.2 Overcomplete \Wavelet Transforms and Filters ............... 23
3.2.1 Overcomplete Wavelet Packet Representations ........ 23 3.2.2 Dvadic Wavelet Representations .................. 25
3.2.3 Filters of Autocorrelation Functions of Wavelets ....... 28
3.3 Connection to Complete Wavelet Representations ............ 29
3.4 Time-frequency Interpretation and the Uncertainty Principle 29 3.5 Two-dimensional Extensions ...... .................... 34

4 OPERATIONS ON OVERCOMPLETE WAVELET
REPRESENTATIONS ............................... 37

4.1 Introduction ....... ............................. 37
4.2 Gain Operators ............................... 37
4.2.1 Mininmn Mean Square Error (MMSE) Approximation. 38 4.2.2 Time-frequency Trade-off and a Greedy Algorithm . ... 40
4.3 Shrinking Operators ...... ......................... 42
4.4 Envelope Detectors ............................. 44
4.4.1 Envelope Detection by Hilbert Transform ........... 47
4.4.2 Envelope Detection by Zero Crossings .............. 49
4.4.3 Comparison Between the Two Detectors ............. 50
4.4.4 Comparison with Other Energy Operators ............ 50
4.4.5 Two Dimensional Envelope Detection .............. 50

5 APPLICATION I: TEXTURE SEGMENTATION .............. 53

5.1 Introduction ....... ............................. 53
5.2 Texture Feature Extraction 4.....................34
5.3 Considerations on Filter Selection .................... 55
5.4 The Basic Isodata Clustering Algorithm .................. 57
5.5 Postprocessing ... ............................ 58
5.6 Experimental Results ..... ........................ 58
5.6.1 One-dimensional Signals ....................... 58
5.6.2 Natural Textures ........................... 59
5.6.3 Synthetic Textures ............................ 62
5.7 Summary and Discussion ...... ...................... 62

6 APPLICATION II: IMAGE DEBLURRING .... .............. 66

6.1 Introduction ....... ............................. 66
6.2 Review of Some Deblurring Techniques .... .............. 67
6.2.1 Modified Inverse Filters ....................... 67
6.2.2 Wiener Filters ...... ........................ 68
6.2.3 Wavelet-Vaguelette Inversion ..................... 69
6.2.4 Discussion ....... .......................... 70
6.3 Discrete-time Overcomplete Wavelet Packet Inversion ........ 72
6.3.1 One-dimensional Formulation ..................... 72
6.3.2 Two-dimensional Extension ..................... 74
6.4 Experimental Results ..... ........................ 75
6.4.1 One-dimensional Signals ....................... 78
6.4.2 Two-dimensional Images ..... .................. 81
6.5 Summary and Discussion ...... ...................... 82








7 CONCLUSION .............................. 86


APPENDIXES

A PROOFS RELATED TO DISCRETE-TIME WAVELETS ......... 88 B NUMERICAL COMPUTATION OF BATTLE-LEMARIt WVAVELET 95

B.1 Functions in Finite Terms ..................... 95
B.2 Numerical Computation Using FFT ................ 99

C NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR 103 D BOUNDARY TREATMENTS OF FINITE-LENGTH SEQUENCES . 107

D.1 Periodic Extensions .......................... 107
D.2 Closure of Symmetry Under Convolution ............. 107

REFERENCES . ....... . .... .. ...... .... .... .. . .. . 111

BIOGRAPHICAL SKETCH ............................ 115













LIST OF TABLES


3.1 Uncertainty Factors of Analyzing Filters ..................... 34

5.1 Boundary Accuracy of the Segmentation Results ................ 64

6.1 Performance of ID Deblurring Examples ..................... 81

6.2 Performance of 2D Deblurring Examples ..................... 82

B.1 Coefficients of g24(x) (a) and Truncated Impulse Response of hi(n) (b). 100













LIST OF FIGURES


2.1 A Filter Bank Implementation of STFT ..................... 10

2.2 Fast Discrete-time Wavelet Transform-First Two Levels ........... 13

2.3 Generalized Discrete-time Filter Bank Transform ................ 15

2.4 Example of General Binary-tree-structured Filter Bank of Five Channels. 16 2.5 The Overcomplete Wavelet Transform for the Example of Figure 2.4. 17 2.6 Example of the Aliasing-enhancemnent Anomaly ................. 21

3.1 Binary Tree for Overcomplete Wavelet Packet Decomposition ..... . 24 3.2 Binary Tree for Dyadic Wavelet Decomposition ................ 26

3.3 Examples of the Fumction &)a,b(w) ...... ..................... .....27

3.4 Two Examples of the Overcomplete Wavelet Representation ...... . 30

3.5 Examples of Time-frequency Localization by Overcomplete Wavelet
Packet Representations ............................... 32

3.6 Uncertainty factors (drawn on top of each band) of Channel Filters. 33 4.1 Examples of Gain Vector Approximation ..................... 41

4.2 Examples of the Minimum Tree Approximation with the Maximum
Depth of Seven Levels ...... ......................... 43

4.3 An Example of Overcomplete Wavelet Denoising ................ 45

4.4 Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear
Smoothing Using the Signal of (row 1,col 1) of Figure 4.3 ...... .46

4.5 Two Examples of Power Density Distribution of Overcomplete Wavelet
Packet Representations ............................... 48

4.6 A FIR Hilbert Transformer .............................. 49

4.7 Comparison of the Two Envelope Detectors ................... 51

4.8 Frequency Response of Equivalent Complex Quadrature Filters (level 1). 52








5.1 Segmentation of a ID Signal Consists of Triangular \Vaveforrn and
Sinusoid .......... ................................. 59

5.2 Segmentation of a Noisy ID Signal Consists of Two Pure Tone Segments 60 5.3 Segmentation Results of a inage Consist of Natural Textures..... 61 5.4 Segmentation of a Synthetic Gaussian Lowpassed Texture inage . . 63 5.5 Segmentation of a Synthetic FIN Texture Image ............... 63

5.6 Segmentation of a Synthetic Texture inage with Line Patterns. . . 64 6.1 Log Frequency Responses of Filters for a Gaussian Blur ........... 76

6.2 Log Frequency Responses of Filters for an uniform Blur ........... 77

6.3 One-dimensional Deblurrings of a Gaussian Blur ............... 79

6.4 One-dimensional Deblurrings of an Uniform Blur ................ 80

6.5 Deblurring a Gaussian-blurred Lena Image .................... 83

6.6 Deblurring an Uniform-blurred Lena inage ................... 84

A.1 Equivalent Structures for (a) Convolution and Decimation and
(b) Expansion and Convolution ......................... 91

A.2 Illustration of Adding One More Channel by Splitting a Channel into
Two ........ ................................... 92

B.1 Transition Band of Filters H,(w) with p = 1, 3,9 ............... 101

C.1 Channel Bandwidths of Overcomplete \Vavelet Representations. . . . 104














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



OVERCOMPLETE WAVELET REPRESENTATIONS
WITH APPLICATIONS IN IMAGE PROCESSING By

Jian Fan

August 1997

Chairman: Dr. Andrew F. Laine
Major Department: Computer and Information Science and Engineering

Orthogonal wavelet transforms have been applied to the field of signal and image processing with promising results in compression and denoising. Coefficients of such a transform constitute a complete representation of a signal without redundancy. However, there are applications where complete represent at ions are disadvantageous. In this thesis, we examine classes of Fourier transforms and wavelet transforms in terms of their efficacy of representing convolution operators. Ve have identified two shortcomings associated with complete representations of the discrete-time domain:

(1) time lack of translation invariance and the (2) a possible anomaly of aliasingenhancement.

On the other hand, our analysis showed that overcomplete wavelet representations do not bear those shortcomings of their non-redundant counterparts. Our framework of overcomplete wavelet representat ions include construction algorithms








and prototype filters, spatial-frequency interpretation and three operations. Capabilities of spatial-frequency localization were quantitatively evaluated using uncertainty factors. Associated with gain, shrinking and envelo)e operators, algorithms for convolution, denoising and analysis of power density distribution were presented and analyzed.

The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image del)lurring. NVe deinonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. We showed that gain and shrinking operators may be used for image deblurring and discuss limitations of the mothodology.














CHAPTER 1
INTRODUCTION

1.1 Motivations of the Research

In the past decade, wavelets have stimulated great interests in many fields of applied mathematics and engineering, generating a significant amount of research activities. Within these fields, traditional theories and techniques have been reevaluated to explore opportunities for wavelet applications.

Although continuous time wavelets possess mathematical elegance, discretetime wavelet transforms are of special importance for practical applications. Among them, discrete-time orthogonal wavelet transforms have gained popularity. Coefficients of such a transform constitute a complete representation of a signal without redundancy. Complete representations have produced promising results in image compression and denoising. However, there are applications where complete representations are disadvantageous. Understanding the strength and weakness of various representations and the relationships among them is essential for satisfying intellectual inquires and developing new methodologies. In this thesis, we considered both classes of Fourier-based representations and wavelet-based representations with emphasis in the discrete-time domain. We found that the overcomplete wavelet representations served to bridge the two classes.

The merit of a particular representation should be judged by its capability to facilitate solving problems. Two important but difficult problems, segmentation of textured images and image deblurring, were selected as benchmarks. Accordingly, we value a representation by its capability to extract certain features from signals, efficacy of representing an operation and invariance under certain operations.








With the understanding of various representations and the problems in hand,. we developed a framework of discrete-time overcomplete wavelet representation. The framework included construction algorithms and prototype filters, time-frequency interpretation and three operations. Algorithms associated with gain operators. shrinking operators and envelope detectors were developed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. This demonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. For the deblurring application, our formulation extended the wavelet-vaguelette inversion into discrete-time domain and was applicable to more blurring sources.

1.2 Review of Related Work

On the general topic of wavelet theory, a brief history and comprehensive review were included in Jawerth and Sweldens' survey [32]. The papers by Daubechies [11] and Mallat [42. 41] were considered influential for the recent developments. The paper by another wavelet pioneer Strang [59] was also explanatory. The orthonormal wave packet bases is a generalization of orthogonal wavelets. was introduced by Coifman and Meyer [10]. The book by Daubechies [13] included in-depth mathematical discussions and real design examples. The book by Vaidyanathan [63], which was written for signal processing audience, provided much insight into the relationship between wavelets and multirate filter banks.

The lack of translation invariance of discrete-time orthogonal wavelets was recognized as a major shortcoming for pattern recognition by many researchers [41. 59]. Various proposals were made to address the issue. Multiscale edge representations were investigated by Mallat and Zhong [44], Mallat and Hwang [43], and Berman and Baras [5]. Steerable filters with shiftabilities were studied by Simoncelli et al. [58]. A orthogonal representation with invariance of the subband structure was proposed by Pesquet et al. [52].








Using energy distribution in subbands of wavelet packet decomposition as signatures for texture classification was presented by Laine and Fan [36]. A similar approach was also reported by Chang and Kuo [8]. Earlier results using envelopes of overcoinplete wavelet packet representation for texture segmentation was reported by Laine and Fan [35]. A simplified version of Chapter 5 of this thesis was later published [37]. The overcomplete wavelet representation was also used by User [62] with an ad hoc feature extraction algorithm. A methodology combining Markov random field and orthogonal wavelet packet transform was described by Bello [4].

The nonlinear wavelet shrinkage for denoising was due to Donoho and Johnstone [17]. The approach was applied to power spectrum estimation [48]. and to medical image processing [1]. Nonlinear denoising based on wavelet maxima representation by Mallat and Hwang shared some similar ideas [43], although tracing singularities through the scale space turned out to be much more difficult, especially in a two-dimensional space.

Manipulating wavelet representations to enhance certain features is obviously a good idea. Jawerth et al. [31] reported image enhancement using weight factors to modify coefficients of orthogonal wavelet transforms. Laine et al. [38] demonstrated various approaches for enhancement of digital mammographic images. Fan and Laine [20] pointed out the connection between linear gain enhancement using overcoiplete dyadic wavelet representations and the unsharp masking, and introduced a nonlinear function for contrast enhancement. However, due to the lack of an objective definition of image quality, those enhancement algorithms were mostly ad hoc and hard to evaluate. On the other hand, image deblurring is considered to be a well defined problem. The approach adapted by Banhamn et al. [3] is to decompose the linear minimum mean square error (LMMSE) filters into the orthogonal transform domain with advantage of reduced alliance on the assumption of global stationarity. However, their proposal was basically the linear filtering and no optimal subband selection was








considered. The mathematical framework of the wavelet-vaguelette inversion was due to Donoho [14. 16]. The limitations were that it was derived on the continuous time domain and inapplicable to some common sources such as Gaussian and uniform blurs.

1.3 Overview of the Thesis

This thesis is organized as follows:

Chapter 2 reviews signal representations derived from the Fourier transform and the wavelet transform in the context of being operated upon by a convolution operator. The scrutiny reveals that complete representations suffer from not being translation invariant and the existence of the aliasing-enhancement anomaly. and are thus inappropriate for certain applications.

Chapter 3 presents the framework of the overcomplete wavelet representation with two particular building algorithm and filters. The time-frequency localization property of overcomnplete wavelet representations is examined under the uncertaintv principle.

Chapter 4 introduces gain, shrinking and envelope operators applicable upon anr overcomplete wavelet representation. Algorithms for determining specific parameters are detailed with examples and comparisons shown. Both Chapter 5 and 6 are applications of the theory of overcomplete wavelet representations developed in previous chapters.

Chapter 5 deals with the topic of image texture segmentation. The segmentation problem is formulated in the paradigm of spatial-frequency analysis, and a texture feature based on the concept of power density distribution is proposed.

Chapter 6 addresses the problem of image deblurring using the gain operator for deconvolution and the shrinking operator for nonlinear denoising.

Chapter 7 concludes the research and proposes future directions.














CHAPTER 2
SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS

2.1 Introduction

For a given signal x, it is often beneficial to transform (map) x into another form (domain), denoted x-44 X. X is called the representation of x in the transform domain. Instead of directly dealing with the original signal x. we may deal with its representation X. By doing so we usually transform a particular problem into a different form and make it easier to solve. Clearly, different applications demand different representations.

The applications considered in this thesis are texture image segmentation and image deblurring. Accordingly, the time-invariability and the efficacy of representations are the most important factors. Both properties may be revealed by their behavior under a convolution operator. With these in mind, we shall review Fourier transforms and wavelet transforms of one dimensional signals. To facilitate our discussions, we used the following basic definitions:


Definition 2.1.1 (Translation Operator) For a continuous function x(t), a translation operator is defined as T, [x(t)] = x(t - s); For a discrete function x(n), a translation operator is defined as Td [x(n)] = x(n - d).


Definition 2.1.2 (Convolution Operator) For a continuous function f(t), a convolution operator is defined as Tf [x(t)] = x(t) * f(t) = ff .X(T)f(t - 'r)dT; For a discrete function .r(n), a convolution operator is defined as Tf [x(n)] = x(n) * f(n) = S-x(m)f(n m). The functions f(t) and f(n) are called the kernel. Notice that a translation operator is a special convolution operator with kernel f(t) = 6(t - 7)








(f(n) = 6(n, - d)). Therefore, we will use T for both translation and convolution operators.


Definition 2.1.3 (Eigenfunction and Eigenvalue) For a linear operator T. if there exists a function e, such that T [e] = Ae, where A is a constant, the function (' is called the eigenfunction of T, and the value A is called the eigenvalue.


Definition 2.1.4 (Translation invariant) For a given signal x. a translation operator T and a mapping operator A4, if the two operators are commutable such that


A4 [T(,r)] = T [M(Ax)]I

the mapping is called translation invariant, and the representation X is said to be a translation invariant representation.

2.2 Fourier Transforms

Frequency-domain representation by Fourier transform played a major role in signal analysis and linear system theory [31. 49, 63]. The efficacy of Fourier representation is due to the fact that its basis functions of complex exponentials are eigenfunctions of a convolution operator [49, page 39]. For a linear convolution operator T, the eigenequation is:


Tf [ei't1 = F(w)ejwt,

where eigenvalue F(,,) is the Fourier transform of f(t). An almost identical form can be derived for the discrete-time Fourier transform. Fourier representations are particularly powerful for signal deblurring since it transforms a deconvoltuion problem into the simple form of divisions.

However, Fourier representations may not be good at capturing frequency evolution over time. As time variable is added to Fourier transforms., basis functions of short-time Fourier transforms are no longer eigenfunctions of a convolution operator.








2.2.1 Fourier Transform (FT)

The Fourier transform pair may be written as: X(wz) =J x(t)e-Jw"dt (FT). x(t) = J j X(a)YtdL, (Inverse FT). A linear convolution operator Tf can be characterized by


y(t) = Tf [x(t) =J X(Lz)Tf [ejwtl d L X( )F(L)eJtd,
or.
I~w X y )Fy)

This shows that Fourier representation of the convolution operator T is the Fourier transform of its kernel. The beauty of this representation is that on the Fourier domain, a convolution (deconvolution) is simply multiplications (divisions) of the both representations.

For a translation operator with kernel f(t) = 6(t-s), F(w) = e-j'. and therefore JY(w')I = JX(L,)J. which means that magnitude of Fourier transform is translation invariant.


2.2.2 Discrete-time Fourier Transform (DTFT)

The discrete-time Fourier transform may be written as [49]:

X(c) L 3 x(n)e-j"' (DTFT),

X(n) = J X(eJ')eJndaw (Inverse DTFT).

Similar to its continuous counterpart,

q(n)- Tf [.x(n)] 7r X ((j') F (eij'eifdL,, 2rr
0O'.
Y(eFlat X(ewi F(ew)= For a translation operator with f (n) 6(,n, d), F(eJa;)

I e= I X (ejw) I.


C -j d and thus








2.2.3 Short-time Fourier Transform (STFT)

In order to capture frequency evolution of a non-stationary signal, a window w(t) is introduced into the Fourier transform. The short-time Fourier transforn may be written as:

X( t) J x(T)UI(t- T)eCjwrdT (STFT).,

M -0 X(, t)eJWtdw (Inverse STFT). where we assumed w(0) # 0.

For a convolution operator TJ,

Tf [W(t T)ejt F(w, t -T


where F(w, t) is the STFT representation of the kernel f(t). It means that in general the windowed complex exponentials are no longer eigenfunctions of a convolution operator. Consequently, STFT representation of y(t) = Tf [x(t)] is another convolution of either

or.
)Y(w, ) J X(, t - T).f(T) C-J-dT.
-00
For a translation operator with f(t) = 6(t - s), Y(&, t) X(w t - s)e-jL', and thus 't)l = X(,, t -s)1. Therefore., magnitude of STFT transform is a time-invariant representation.

2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT)

The discrete-time short-time Fourier transform may be written as [49. 53]
OC
X(e, i) 3 x(k)w(n - k)e-jwk, (DTSTFT) (2.1) k= c

,()- 27;w(0) J- (,, n)eJ, dw. (Inverse DTSTFT) (2.2)








For a discrete convolution operator Tf and y(n) =-Tf [x(n)]7 we have
0
Y(eJ', n) = X(eL',n - k)f(k)e-J-k k=-c
0o'.
0
Y(eJ3,n) = F(cY ,- k)x(k)e -j k.
k=-oc
For a translation operator Tf with f(n) = 6(n-k), '(ejw, n)I =X(jn Thus, the magnitude of DTSTFT is also a translation-invariant representation.

2.2.5 Connection Between Short-time Fourier Transform and Filter Banks

The discrete-time short-time Fourier transform X(ejw, it) is a two-dimensional function of a continuous variable w and a integer variale tt. Is it possible to have only finite samples X(c-', n) in the frequency domain, and yet still able to recover the original signal x(n)? It turns out to be possible by carefully designing channel filters [63. 53].

For a given frequency a'k, STFT coefficients of (2.1) can be rewritten as

X(ej* ) -= 7 i"A Z r(in)'( - ,)(ejWk(n-m') = J kfl [r(") * ,'k(1)],


where Vk(n) = w(n)ejL" '. This means that a frequency point of STFT can be computed using a filter with frequency response 1"(w) = IV(w- W:k) and a complex multiplier e-( '"'. Note that IX(eJwk, n) X * Vk(n).

For a filter bank consists of M - 1 such channels satisfying MI t A 1
E 1,(w) = E R'( - W 1, (2.3) k=O k=O
the original function x(n) can be recovered by replacing the integral of (2.2) with the following summation:
Al- ' Al- 1l- 1- (U , n)e 'kf ~>: [X(n) * Vk(n)] = (n) '(I) - x(n).
k=O k=O Ls
Figure 2.1 showed a filter bank structure for the analys is-synthesis process of


STFT.













v~ (n) A X(ed-nx I)




e n"'M T 'N - I n


Figure 2.1: A Filter Bank Imtplementation of STFT.

2.2.6 Generalized Short-time Fourier Transform

We already pointed out that STFT magnitude can be computed by a filter bank without modulators and denodulators (see Figure 2.1). We can further relax the constraint that channel filters are frequency-shifted of a single lowpass filter v(n), and define a generalized short-time Fourier transform pair as
DC3
Vk (fl) Z X(M)Vk(n i n). 0 <~ A- < Al - 1.
Al- I
=~) Z Uk(11)
k=O

2.3 Wavelet Transforms

Vavelet transforms have become a powerful tool for analyzing non-stationary signals in recent years. It has several major advantages over the short-time Fourier transform.

First, wavelet transforms choose to fix the ratio of bandwidt h/center-frequency (called constant-Q) of channel frequency response instead of the frequency resolution (determined by the bandwidth of the window) as in STFT. In other words, wavelet transforms use a narrow bandwidth window for low frequency signals and a wide








bandwidth window for high frequency signals. The idea behind this is that low frequency components represent slow changes in the time domain for which time resolution is not critical while higher resolution in the frequency domain is more desirable. In the opposite case, a wide bandwidth window for high frequency components means better resolution in the time domain to capture abrupt changes.

Second., recent development of wavelet packet transforms and other signaladaptive wavelet-type transforms are much more flexible and powerful in exploiting time-frequency concept for wide-ranging applications. 2.3.1 Continuous Wavelet Transform (CWT)

For a continuous time signal x(t), a continuous wavelet transform pair may be written as [41, 12]:

X(a, ) J X(u)y!(u - T)du (CVT), (2.4) oo d2
X (t) = -CIX(,T'),=t0T(7 Ivcs 1 ) 25


where x* denotes complex conjugate of x, V'a(t) = , (t) is called the basic wavelet. Parameter T is the translation factor and a is the dilation factor. Informally speaking. when a increases, function Ya(t- T) expands and takes long-time behavior into account. In the opposite, when a decreases, function '( (t - T) contracts and only focuses on the short-time behavior. In order to recover the original signal from its CWT, the wavelet 'y(t) must satisfy the admissibility condition: = p P)I'du < +oc.

However, a wavelet function Vp0(t) is generally not an eigenfumction of a convolution operator T, as:

Tf aV'(t)] f f( )V/;a(t--)d


does not equal to A.C,(t).








Since inverse CWVT of (2.5) can be rewritten as:
1 f(= da M [X(a, t) * (1,(I] a where � denotes convolution on variable t, we have:

Tf [x(t)] - [((t) * X(a. t)) 2, (t)]d2 Ga2

If we denote W42 as the operator of continuous wavelet transform, the above result showed:

/VTf [x(t)]- f(t) * X(a, t) - J f( )X(a, T - = TfW [x(t)

Therefore. CWT is a translation invariant representation.

2.3.2 Discrete Wavelet Transform (DWT)

Discrete wavelet transform pair on real orthogonal wavelet bases may be written as [63]:

pj~) J70 x(t)2 -ji'(22t - n)dt (DIT).
+oc +occ

x(t) = E E Ij(n)2 -J/2y/(2-r - n) (In vers( D11T).
j=0 n=---0

where functions {2 -Ji/2',(2-it- n)} consist an orthonornal basis.

Notice that DWT maps a continuous-time function x(t) into a discrete-time sequence pj(n). Clearly, it is not a translation invariant representation.

As in the case of CWT. basis functions 2 J/%2V (2-Jt - n) generally is not a eigenfunction of convolution operators. However, T [2 -ji/2(2-t- n)] may be decomposed in DWT space:
+co +oc

T [2 -/ (2 -i t - P)] v=, i(n, m-)2 -/2 (2- 't- ,,)
1=0 rrt=-oc
where the transform coefficients may be calculated by:

v2,1(n, i) Tf [2 j2 (2-(t -)] 2 1/2 (2- - iii)dt

Jf f(&)2 ij/2y (2 2(t - n)- ) d&] 2 -1 /2 '(2-'t - nd

-2 -(jl)/2 J f()J Q (2 -i(t - ) ' y(2 -1t nr) dt <.

















Figure 2.2: Fast Discrete-time Vavelet Transform-First Two Levels.
Note: Down arrow = decimation and up arrow = expansion.

We can derive DWT representation of Tf [x(t)] to be


+o -Oo ++CC
T- > >3(t) = >3 > 1,())T [2,-j/1 2''(2-jt -'t
j=0 =O

= E Z pj(,,)v,,,.(,,,,) 2 -1/2 02-1t - nt).
1= = L=0 n=-oc

Therefore, {uj(n, m)} can be seen as representation of the operator Tf under the basis {2 -Ji2? '(2-Jt - n)}.

2.3.3 Discrete-time Wavelet Transform (DTWT)

The discrete-time wavelet transform (DTWT) cannot be obtained by sampling the continuous-time wavelet transforms nor by simply mimicking the continuous ones. Assuming a discrete "mother wavelet" v(n), a discrete-time version of dilation v(2kn - m) does not produce a frequency response - I (e 2',). Moreover, a dilation I (e2kW) in frequency domain is generally not a band-pass function but a multi-band function. The fundamental difference lies on both the 27r-periodic characteristic of frequency response and the indexing constraint of discrete-time series.

A fast algorithm based on the concept of multiresolution was found by S. Mallat [42] and is illustrated in Figure 2.2.

If we consider only one level, we have


do(,n) = E'i-= x(,,)g(2n - mi), (2.6) co(n) = Em_0 x(n)h(2n - m), (2.7)








x(n) = _ co(1)h(, 21) + E- d(l)0(,i 21). (2.8)

Therefore, the decomposition filters h(n) and g(n) and the reconstruction filters h n) and 4(n) must satisfy:
.. g(21,( - 21) = 6(,, ), 1)
Z, h(2n - in)h(m - 21) = 6(n - 1),
E, g(2n1 - m)h.(n - 21) = 0, (2.9)
E711 h(2n, - - )(;n 21) = 0,
El h(21 - r)i)h(n- 2/) + Yt g(21 - t)4(n - 21) = a(mi - n)
It can be proved (see Appendix A) that in frequency domain those conditions are equivalent to:
G(c G(elw) + G(cj(w+r)),(J(&e+")) = 2 H(e J H(cJ) + H(ci('+'))H(e*('+')) 2 G(cl H(eJ) + G(eJ(+'))R(j('+')) 0 H (eJ G (el ) + H (ej(W+ ))G(ej(W+ )) 0 H(c")H(2') + G(e")G(e.') 2 H(eJ')H(e�(-+-)) + G(eJw)G(ej(W+7)) = 0. One possible choice is to choose h(n), g(n), hi(n) and 4(n) by:
g(,i) =(-1)"h(1 - n) 4:: G(ej ) =-e-j 'H*(eJ( +7),
(n) = h*(-n) � H(&ei) H*(eJ(,),/ (2.11) 0(, = f-,,)** (ej') G*(cj"),

In this case, the resulting bases is called orthogonal bases. Otherwise, the resulting bases is called biorthogonal bases.
It can be proved (see Appendix A) that the structure of Figure 2.2 is equivalent to a filter bank with the transform written as [63]:

dk(n) Z X(1i)ik(2k+in - in). 0 < k < M- 2,
0=- (D TTVT )
CM-1(7) Z X(il)'AI-1(2 A-111- T),
M- 2 c
x(n) = dk(,n) k( - 2k+l in) -+ E C -1(iTn)uit-I(n - 2 M- it) (Inverse DTTI'T).
Particularly. for orthonomnal basis, it can be proved (see Appendix A) that ILA(f) - r(( n),
Y>14n(11 -nkl)nU*(m nan) = 6(k - c)6(l - n). (2.12)
M =-c













x (n) x (n)







Figure 2.3: Generalized Discrete-time Filter Bank Transform.

Since DTWT can be included into a general framework to be presented next,. we leave more in depth discussions to the coming section.

2.4 Generalized Discrete-time Filter Bank Transforms

Previous sections showed that both STFT and DTWT can be implemented by a filter bank structure. If we ignore the modulator (e-jwA f) and the demodulator (e ' IL) of STFT, the discrete-time version of the two can be included in a generalized form:


Wk(n) 7 1i=- x(m)vk(nkn - in), 0 < k < Al - 1, (GDTFBT) (2.13) x(n) -,A -1 0c wk(nm)uk(n nnm) (Inverse GDTFBT). (2.14)


It was called the generalized filter bank transform [63]. Figure 2.3 illustrated the structure of GDTFBT.

Depending on nk, Vk(n) and uk(n). GDTFBT includes the following special cases.


1) Orthonomal wavelet representations. 1k = 2k for 0 < k < - 2 and n,1-=

11AI-2. In this case, basis functions possess orthonomnality of (2.12).




















Figure 2.4: Example of General Binary-tree-structured Filter Bank of Five Channels.

2) Orthonomal wavelet packet representations. This is a generalization on the orthonomal wavelet by applying the recursive decomposition-reconstruction structure to all branches. Therefore, Ilk = 2L for 0 < k < 2L - 1.

3) Biorthogonal wavelet (packet) representations. The same as the orthonornal

counterpart except that the constraints (2.11) are lifted.

4) Generalized binary tree structured filter bank representations. This is a further

generalization upon the (bi)orthogonal wavelet (packets). Every channel may be split into two with filters satisfy (2.9). Therefore, every nk is power of 2.

Moreover, filters used in one channel (node) may be different with another.

Figure 2.4 showed an example. It was proved in Appendix A that the basis

functions constructed this way satisfy the biorthonomality expressed by:


Sa,,(im - nCl)vk(nkn - In) = 6(c - k)6(l - n). (2.15)




5) Overcomplete wavelet representations. In this case. no down-sampling and upsampling are used, which means nk =1 for all channels in the Figure 2.3. As an example, the overcomnplete wavelet transform corresponding to Figure 2.4 is

shown in Figure 2.5.





















Figure 2.5: The Overcomplete \Vavelet Transform for the Example of Figure 2.4.

Representations 1-4 above are called "complete". The reason of calling the representations of the category 5 "overcomplete" or "redundant' is that they contain more analyzing functions ({uk(n)}) than a basis. Since these analyzing and synthesizing functions must satisfy
At- c A- 1
Y, 1: '1(0 - 1)Vk(n- I) = 6(1- 7), or E ["k(cj)1k(Cj') 1.
k=0 in=- k=0
they must be linear dependent. In the other word, some functions must be expressible by others.

Complete and overcomplete representations behave rather differently under a convolution operator.

2.4.1 Convolution Operator on Complete Representations

In general. a basis function uk(n-nkm) is not an eigenfunction of a convolution operator. Using the same idea as in the case of DWT, a convolution operator Tf can be represented by basis functions {Uk(n - nkm)}: Al- I oc
if fUk(P -kn~ 1100 = E 'Ik(mi, b) ui,(n - nab)
c=0 b=-oc

Nrxk(nI, b) =- f .f(i)7k(l - I k- 'IkM)V.(n,b- 1). (2.16) l=-oc i=-Oc
It can be proved that uk(n - nkm) is an eigenfunction of the operator Tf if and only if 1k,c (m, b) = A6(k - c)6(in - b).








Proof: If %lA((m. b) A6(k - c)6(rm - b), then

T [Uk( - 1101)] 'o E=-z A6(k - ()6(I - b)u,(n- inb) Allk( l-lkIII).

In the other direction, if Tf [uk(u,- 7,km)] = AUk(n- 10km). then
11k,,(in,,b) = E' oT [(,- nk1)] c(n~b - it)


= E cc Au11k(n - nkm)v (nb - n) = A6(k - c)6(m - b) where we utilized the property (2.15). m

The set of coefficients {T/k,,(tn, b)} is the representation of the operator Tf in the basis and is independent of input signal. Since a convolution operation on a sequence X(n) can be derived as:

Tf [x(11)] > 3 wk(nt)Tf [Uk(" - ,,k in)]
k=O m= - 00
A1] oc Al- 1 SE E Ul'k (nl) Y 1: %x" C(i , b ) 'ac(n - nc b) k=0 m =- c=O b=-oc
E= 1L k (n 1)1,k,,(in , b) .( r - :) ,
c=0 b=--c l k=O r=-Oc

the representation of T [x(n)] is thus A/-I oc
at(b)= 1: E Wk(m)7k,c(rnb). (2.17)
k=O M=-00

For A-periodic sequences x(n), Uk(U) and Vk(n), l/k.(m, b) is --periodic in
Ilk
in index and X--periodic in b index, and a,(b) is , periodic in b index.
ncc
A representation in matrix form was derived by Banham et al [3].

For a translation operator of f(i) = 6(i - d),

,lk..(nt, b) = E'_, U( - d - nkm) ,(nb - 1),

and generally is not equal to A(k - c)6(,m - b). Therefore, complete representations are not translation invariant.








2.4.2 Convolution Operator on Overcomplete Representations

The overcomplete representation of a convolution operator T" can be obtained from (2.16) with n, 1 and Ik = 1:


llk,c(in, b) = Z f(i)uk(l- i- rn)v,(b- 1) = (f * Uk * c) (b - in). (2.18)

Accordingly, the overcomplete representation of T [x(n)] can be obtained from (2.17):
I-1 xc
a, (10 = Z Y wk()rk,,(11- rM) = f(n) * u?,,(n), (2.19)
k=O ?'n=-oc
which showed that discrete overcomplete wavelet transform operator and convolution operator are commutable, and thus overcomplete wavelet representations are translation-invariant. To verify this, assume an overcomplete wavelet transform operator of W/V and a translation operator of Tf with f(n) = 6(n - d). overcomplete wavelet representation of Tf [x(n)] is:

WTf [x(n)] = TfW [x(n)] = TJ [{wk(n)10
which means that the representation of a time-shifted signal is simply the translated version of the original representation with the same amount of time-shift.

2.4.3 Approximation of Convolution Operators on Overcomplete Representations

In frequency domain, (2.18) can be written as:

7lk.Cde') = F(edw)uk(eJ )1 .(e'),


where ik,j(e') is the discrete-tiie Fourier transform of llkx((f).

If the passbands of the filters Uk((eJ) and I .(eiJ") have little overlap, the following approximation is justified:


, (d ) ;.( ') 6(k - ct:d' ;d'








Moreover, if the function F(cj ') is real and the l)assbands of channel c's are narrow enough., we may further approximate k,c(eC) as: Ik,,cC ) A,6a(k - cU (e.-1 I (el )
or.
TI .c(fl) - Aj6(k - c)u,(n) * j,,(n)

where A, is a real number. In this case, (2.19) may be well approximated by: Al- 1
(,(n) - A(k - C)I,,,k(,) * U(,) * V,(,) = A,(,,n). (2.20) k=O

This means that a convolution operator with real frequency response F(e>j) may be approximated by a set of real multipliers {k}. This is another advantage of overcomplete representation since we found that the approximation is generally not extendible to complete representations.

2.4.4 Aliasing Enhancement on Complete Representations

For complete representations, subsampling usually does not meet the Shannon sampling theorem, and thus produces aliasing distortions. Those aliasing components get canceled only by specially designed filters at the reconstruction stage. If we manipulate the representation in a way like (2.20). aliasing distortions may not be canceled. For a concrete example, we consider a wavelet transform of Figure 2.2 with only one level. If we multiply representations do(n) and co(n) by kd and k, respectively, we can rewrite Equations (2.6-2.8) in frequency domain:


D' [X(eJi2)G(eJ/2) + X(J(+2-)/2)G(eJ("+27)i2)1 C1(e 2k [)cJ)H(ew/)H

X- (') - [kdG(eJ')G(eJ') + kcH(ej-)(eii] X(ej-) +
2
2[kdG(ej(-+�-))G(ej-) + kc!H(e j(W�T) )(ej1 I (A:7 The second term is the aliasing component. Unless kd = k , the second term is generally not equal to zero even though filters meet the perfect reconstruction








conditions (2.10). In this case, the system is not a linear time invariant (LTI) system, but rather a linear periodically time varying (LPTV) system [63]. It is easy to recognize that the system can not be characterized by the familiar form of Y(JW) = I(&jX(eJ ). Figure 2.6 shows an example of such aliasing-enhancement effect.


80:1
70


40

20,
100 50 100 150 200 250 0


2 25 3


Ahasng component


50 100 150 200 250 0 05 1 1 5 2 21
(c) (()

Figure 2.6: Example of the Aliasing-enhancement Anomaly. Where:
(a) The original sinusoid signal;
(b) Magnitude of DFT of the signal (a);
(c) Reconstructed signal with orthogonal Haar filte


(H(eJw) = 2- 1/2(1 +- e-J), G(ej-) = 2- 1/2(1 _ e- )
one level decomposition and kd = 6, k. 1;
(d) Magnitude of DFT of the signal (c).


5 3


rs








2.5 Summary and Discussion

We have reviewed signal representations based ol Fourier and wavelet transformations and included them in a general framework called generalized filter bank transform. \We studied their behavior under a convolution operator. It revealed that complex sinusoid functions of Fourier transform are the only basis functions in this class which are eigenfunctions of the operator. In this case, a convolution operator is equivalent to multiplications in the representation (frequency) domain and the inverse operator can be simply expressed as divisions. For overcomplete representations, a convolution operator may be approximated by multiplications of coefficients by real factors. However, such an approximation can not be extended to complete representations due to aliasing distortions. Moreover, we realized that discrete complete representations also lack traislation-invariant. Based oil the characteristics of these representations, we may find the best applications for them.


1) Complete rel)resentat ions may be good for compression, progressive transinssion. No redundancy alone would be a major advantage for those applications.

Throwing away some coefficients may be seen as a zeroing operation, which does not enhance aliasing components. After all. we are prepared to accept

some distortions for those applications.

2) Overcomplete represent at ions possess many desirable properties for restoration.

enhancement, feature extraction and time delay estimation [52]. First, they are translation invariant which is critical for pattern recognition. Second. they do not have the aliasing distortions. Third, they allow much higher degree of

flexibility in filter design.














CHAPTER 3
OVERCOMPLETE WkAVELET REPRESENTATIONS

3.1 Introduction

The paradigm of overcornplete wavelet representations is a versatile and powerful tool shown to be advantageous for certain applications. In this chapter. we shall study overcomplete wavelet representations in detail, including issues of decomposition and reconstruction algorithm, filter selection and time-frequency interpretation.

3.2 Overcomplete Wavelet Transforms and Filters

In general, a discrete-time overcomplete wavelet transform (DTOWT) pair can be obtained by setting nk to 1 in the Equations (2.13-2.14), as rewritten in the following:


Uk(f) - =_V uk(M)x(n - in), 0 < k < M- 1, (DTOIT-T) (3.1) A(n) = ' - O w-(7)uk(n n) (Inverse DTOI VT). (3.2)
x~n) -k=O -M=-'1qk~)U(T


More importantly, we need an algorithm to generate analyzing and synthesizing filters Vk(n) and uk(n). In this thesis, we restrict ourselves to the binary tree algorithm as illustrated in the previous chapter, and consider two particular classes: overcomplete wavelet packets and dyadic wavelets.

3.2.1 Overcomplete Wavelet Packet Representations

The construction algorithm of overcomplete wavelet packet representations corresponds to a complete binary tree, as shown in Figure 3.1. Though each tree node represents an analyzing filter, only a subset of all tree nodes is needed to achieve a perfect reconstruction. Notice that the position and the width of each rectangular









11(w)

No,o I N ,
1() GNw ((w H(2,,)

h1(2w) (;(2w) /\1(2w ) :
G(2we)
N2.0 N2,1 IA2.2 IN2,3
... ... ... ... 0 X 27r
(a) (b) Figure 3.1: Binary Tree for Overcomplete Wavelet Packet Decomposition:
(a) the tree; (b) illustration of frequency scaling of the prototype filters. node also represents the channel's position and the band width in the frequency domain.

In general, prototype filters H(w) and G( ) in each level can be distinct, as proved in Appendix A, as long as they satisfy the condition H(w)H(,)+G(w)G(') The frequency resl)onse of each node can be calculated recursively:

Po(Lw) = H(wc). P(a) = G();

No,o(w) 1 N,+,,m(w)= P,,(2'W)NI\,L[,,,2j(L'), 1 > 0. (3.3) where the subscript c of the filter P,(,) follows the periodic pattern of 0. 1. 1. 0. and can be calculated by: c = (m + [m/2J) mod 2. The purpose of such ordering is to make the index m, of the node N,, proportional to its central frequency.

The reconstruction from two child nodes to their parent node can be expressed by the following formula:

"VlV+l,2k(W)Pkmod2(21W) + AI +1,2k+1(W')P(k+i)mod2(2'LW) ( where we used the property of: Pkmod2=(W)PkO(+ 2(W) + P(k�1)rn.d2(W)P(k+1)d2(W) H H(w)HR(w) + G (L,))G(w) = 1.





25


There are many possible choices of filters vk(11) from the tree to achieve the perfect reconstruction. For example, for the tree shown in the Figure 3.1 with the height of two [55, page 449], five sets are available: {N,0}, {N1,o, Nl, I}, {N,o, 2, 2,3}, {N2,0, N2,. ,N1,1} and {N2,0. N2,A12N2,2, N2,3}. In general, let T(h) be the number of selections from a complete binary decomposition tree of height h. Since the tree may be split into two subtrees of height h- 1 each in turn have T(h - 1) selections, we have the following recursive formula:

T(h) = [T(h -1)]2 + 1, h > 1, and T(0)= 1.

The first five numbers were calculated as T(1)=2. T(2)=5. T(3)=26, T(4)=677 and T(5)=458330.

All quadrature mirror filters (QMF), good for orthogonal wavelet transform, are also good for overcomuplete wavelet packet transforms. However, for real QMF except Haar wavelet, I. Daubechies proved that linear phase and finite impulse respouse (FIR) properties are mutual exclusive [13]. For example, Daubechies wavelet filters [11] are FIR but not linear phase. In the other hand. Battle-Lemari6 wavelet filters are symmetric but not FIR, which may be written as [42]: Hp() = - 4p+4(CO) (3.4) 24p+4-4p+4 (2w)

where p is a positive integer and


Ekp (w�+2kir)"'

Note: An algorithm for numerical computation of Battle-Lemari6 wavelets is included in Appendix B.

3.2.2 Dyadic Wavelet Representations

The building structure of a dyadic wavelet transform is a particular subset of the overcomplete wavelet packet transform. It decomposes low-frequency branch only (assume H(w) is lowpass), as shown in Figure 3.2.



















Figure 3.2: Binary Tree for Dyadic Wavelet Decomposition.

Obviously, all filters good for overcomplete wavelet packet transforms are good for dyadic wavelet transforms. However, one may want to exploit the extra opportunitv created by the more relaxed constraint on filters. The following filter classes [44, 20] possess distinct property in the time domain:

H() CJp',/2 cos2n+P(/2),
H(w) - * ) 35
G( ) = (-j)'Yejpw/2 snP(/2), (3.5) G -(-J)Pc- ill (,;/2) 0(') -(J)PJPW!2sinP(c'/2) _n=0 cos2 (L'i2) where n, > 0 is an integer and p E {0, 1}.

In fact. for: p = 0 filter g(-1) = 0.25, g(0) -0.5 and g(1) = 0.23 is proportional to a discrete Laplacian operator and for: p - 1, g(0) = 0.5 and g(1) = -0.5 is proportional to a discrete gradient operator.

Notice that filters within this class are linear phase and FIR, but not orthogonal. For p = 0, filters H, H, G and C and their frequency scalings (e.g. H(2,..u)) are all symmetric. For p = 1. different prototype filters may be used for frequency scaling to minimize spatial shifting caused by the phase factor. If we denote H(w).

-Hj(w), G1(w) and GI(w) for filters used in level I > 1, we find the following filters are either symmetric or antisymmetric as well as FIR:

H,(w) = cos2n 1(2 -2w), H11(L) = HI(w),

Gi(w) = j sin(2' -2L,), ,() = -G(L,) E cos2'"(21-'2)� Mn=O









0,9, a=l, b=4 08 a=2 b=4 07, a=3 b=4 06
0,5
04/
03
0 2'
01
0
3 - 2 - 1 0 1 2 3 Figure 3.3: Examples of the Function -),ob(W).

In both cases, frequency response of tree node Nl,k (I > 0 and A E {0, 1}) can be derived as:

'O,( .) =(CjP'w2 JJCos 2 fl�(21n- 'w) =J)~/ si(2 1.) w+
Sl- 21 sin(/2) j "r/,l11=0 -(- p -jp,;i2 [2 - Sl('/ )2-p [ Sill (2' -2a;) I] 2n+2

i -(j )P - 1[2' "1 sin( /2)l j [ 1 sin( /2 i)

For convenience, we defined a function -a,b(w ) as:

e[.(,) = Lsin/2j


For b > 4. the function O,b,) is approximately Gaussian as plotted in Figure 3.3.

Similarly, we can define a reconstruction tree with nodes: A--,(w) H.=j Hrn(u), N,,1(w) N l- ,o(w)Gj().

A dyadic wavelet representation consists of leaf nodes. For example., a threelevel representation consists of nodes {N,0, N3.1 N,,, Njj}. In terms of filter bank, they can be viewed as multiple channels. Channel frequency responses can be derived as:

1
Co(,) =A ,o()N,0()= I !-,o(w) )H,(w)= ,4n+2p(L )








1- 1 jl(w+)N/ -,0(W)[1i Hi(w)i(w)1



The channel frequency responses for a three-level representation are {C3,o, Ca,i,C2,1 ,C.1}. Notice that the channel frequency responses C.1(LL) are an approximate difference of two Gaussians (DOG) [47, page 63]. Two examples of channel filters are shown in Figure 3.6.

3.2.3 Filters of Autocorrelation Functions of Wavelets

From a filtering point of view, the idea is very simple. Since there is no downsampling-upsampling taking place in overcomplete wavelet transforms, it seems that we can simply combine decomposition and reconstruction into a single filtering step. To achieve this, one only need to change prototype filters into: H1(w) H(w)H(w), (3.6)


The perfect reconstruction condition now takes the form of: H, (L,) + G,,,(Lo) = 1.


As H(w) is connected to a wavelet, Ha(w) is connected to an autocorrelation function of wavelet. For rigorous mathematical treatment, refer to Saito's Ph.D thesis [57].

In this case. a reconstruction step is simply a summation of all coefficients as (3.2) degenerates to:



which takes the same form as the generalized STFT.

The most important property is that filters H(w) and G,,()) are always symmetric or antisvmmnetric, and thus providing a simple way to construct symmetry and FIR filters for certain applications.








3.3 Connection to Complete Wavelet Representations

In Appendix A, Proposition A.0.3 built the bridge between an overcomiplete wavelet representation and a complete wavelet representation. If both representations used the exactly same prototype filters H(s), G(O), H(,) and ((), the only difference lies on the decimation. In this case, the complete wavelet representation is a subset of the corresponding overcomplete wavelet representation due to decimation. If the )rototy)e filters used by an overcont)lete wavelet representation does not satis y (2.10), there is no corresponding complete wavelet representation.

3.4 Time-frequency Interpretation and the Uncertainty Principle

The set of transform coefficients { Wk(n)10
Figure 3.4 shows two overcomplete wavelet packet representations using symmetric Lemari6 filter of p = 1. It is apparent that overcomplete wavelet representations include both frequency and temporal information of the signals. The frequency resolution or capability to separate different frequency component of an overcomplete wavelet representation is determined by I 'k(w)'s. The narrower the passband of time 1'(4w) is, the higher it is tihe frequency resolution. On the other hand. tme time resolution is determined by the distribution (or window size) of the impulse responses Vk(1)'s. The shorter the distribution is, the higher the time resolution it can achieve. Unfortunately, high resolution ill time and frequency domains are conflict goals, as demonstrated in tihe Figure 3.5. In this example, we saw that as passband narrowed, impulse responses spread and consequently the boundary between











































Figure 3.4: Two Examples of the Overcomplete Wavelet Representation.
Legend:
(a) An ideal square waveform;
(b) A linear chirp signal s(n) = 5cos(O.027 + 0.00lrn,2).
In each image, top half is the signal waveform, and the bottom half is (0-255) scaled overcomplete wavelet representation (coefficients) of 64 leaf nodes of the level six. For the bottom half,
horizontal axis is time. vertical axis is frequency.








two impulses blurred. This phenomenon reflected a fundamental principle called the uncertainty principle [21, 51].

Time and frequency localization of a discrete lowpass function f(n) can be mathematically described by two quantities, (2= 1 Y) 2
n EZ(n -) If(n).

r7r
a2 2irE I~w

where E 2 1 7f(n)12 f , IF(ei") Vdw and Hi = E Zn "If(n)(

The product of o, and a2, is defined as the unccrtainty factor:

U = 7o- (3.7) Liu and Akansu [40] proved that U > [E - F(eJ )12] /(4E2), and thus for filters with F(ej') = 0, U > 0.25. That the uncertainty factor has a lower bound is significant and is called the uncertainty principle. It basically states that we cannot have filters with arbitrarily narrow bandwidth in the frequency domain and arbitrarily short duration in the time domain. As a result, we cannot achieve arbitrarily precise time and frequency localization simultaneously. (Numerical computation of uncertainty factors for filter banks is discussed in Appendix C.)

Figure 3.6 shows examples of analyzing filters of the overcomplete wavelet transform and their uncertainty factors. Uncertainty factors of the analyzing filters of the dyadic wavelet showed that they are indeed very close to Gaussian functions. In general, the results showed that the uncertainty factor varied from channel to channel and from level to level. As a statistical measure, we used the maximum, minimum and mean to describe higher levels. Such statistics for higher-level overcomplete wavelet packets are presented in Table 3.1, where Regular stands for the regular Lemarié filters of (3.4) and Autocorrelation stands for the autocorrelation functions of the Lemarié filters.

We may observe some trends from the data on Lemarié-filter-spanned analyzing filters of overcomplete wavelet packet representations:









Figure 3.5: Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations.
Where:
Row 1: A signal consisting of two pure-tone impulses with frequencies ω₀ = 0.3π and ω₁ = 0.32π: the spectrum (Col 1) and the waveform (Col 3).
Rows 2-5: level 2 representation.
Rows 6-13: level 3 representation.
For rows 2-13: column 1, frequency responses {V_{l,k} | 0 ≤ k < 2^l}; column 2, impulse responses of column 1 (dotted lines approximate the envelopes); column 3, overcomplete wavelet representation. The prototype filter used was the Lemarié filter of p = 1.





















Figure 3.6: Uncertainty Factors (drawn on top of each band) of Channel Filters.
Where:
Row 1: Overcomplete wavelet packet of level 2 using the Lemarié filter of p = 1 (Col 1) and p = 9 (Col 2);
Row 2: Overcomplete wavelet packet of level 3 using the Lemarié filter of p = 1 (Col 1) and p = 9 (Col 2);
Row 3: Channel filters {C_{4,0}, C_{4,1}, C_3, C_2, C_1} of the dyadic wavelet using the filter class of (3.5) with n = 1 (Col 1) and p = 1, n = 1 (Col 2).











Table 3.1: Uncertainty Factors of Analyzing Filters

                 Regular                      Autocorrelation
p   level   Umax     Umin    Umean       Umax     Umin    Umean
1     4      1.618   0.350   0.701        0.853   0.367   0.497
1     5      5.581   0.342   1.705        2.053   0.318   0.579
1     6     16.721   0.318   4.535        5.693   0.262   0.980
2     4      1.011   0.437   0.628        1.178   0.470   0.584
2     5      3.464   0.397   0.954        0.871   0.333   0.432
2     6      7.984   0.320   1.809        2.736   0.260   0.475
3     4      1.283   0.529   0.674        1.349   0.496   0.646
3     5      2.386   0.418   0.704        0.721   0.329   0.398
3     6      5.244   0.318   1.005        1.605   0.261   0.346


1) Joint time-frequency localization of channels using low-order filters tends to deteriorate quickly as the level increases. This is due to the existence of small sidelobes.

2) Using filters built from the autocorrelation function is more effective than using a higher-order regular filter, if the no-reconstruction structure is acceptable for the application.

However, one should be very careful in using uncertainty factors as an optimality measure, for two main reasons. First, how can we come up with a single sensible number for a filter bank? Minimum U_max, or minimum U_mean, or something else? Second, more fundamentally, the uncertainty factor does not take the signal into account. For configurations with similar uncertainty factors, as in the case of Gaussian-like filters such as (3.5), the measure of the uncertainty factor alone would not tell us which one is optimal.

3.5 Two-dimensional Extensions

So far we have only discussed one-dimensional representations. In order to apply overcomplete wavelet transforms to two-dimensional images, we need to extend the transform to two dimensions. The following two separable extensions can be implemented as separate 1D filtering of rows and columns.






1) Tensor product extension [42]. This extension was originally applied to the orthogonal wavelet transform. Only the 1D filters H(ω) and G(ω) (and their reconstruction counterparts H̃(ω) and G̃(ω)) are used. It directly exploits the perfect reconstruction property of quadrature mirror filters by extending

H(ω)H̃(ω) + G(ω)G̃(ω) = 1

into

[H(ωx)H̃(ωx) + G(ωx)G̃(ωx)] [H(ωy)H̃(ωy) + G(ωy)G̃(ωy)] = 1.

The above 2D extension can be rewritten as

HH(ωx,ωy) H̃H̃(ωx,ωy) + HG(ωx,ωy) H̃G̃(ωx,ωy) + GH(ωx,ωy) G̃H̃(ωx,ωy) + GG(ωx,ωy) G̃G̃(ωx,ωy) = 1,

where

HH(ωx,ωy) = H(ωx)H(ωy),   H̃H̃(ωx,ωy) = H̃(ωx)H̃(ωy),
HG(ωx,ωy) = H(ωx)G(ωy),   H̃G̃(ωx,ωy) = H̃(ωx)G̃(ωy),
GH(ωx,ωy) = G(ωx)H(ωy),   G̃H̃(ωx,ωy) = G̃(ωx)H̃(ωy),
GG(ωx,ωy) = G(ωx)G(ωy),   G̃G̃(ωx,ωy) = G̃(ωx)G̃(ωy).

Filters {HH, HG, GH, GG} are used for decomposition while filters {H̃H̃, H̃G̃, G̃H̃, G̃G̃} are used for reconstruction. The same construction method is applied to frequency scaling of higher levels, with each node decomposed into four child nodes.
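To make the separable implementation concrete, the sketch below (not from the thesis: it assumes circular convolution via the FFT and takes the four prototype tap sequences h, g and their duals as plain arrays) performs one undecimated level of the tensor-product decomposition and the corresponding reconstruction by the dual filters.

```python
import numpy as np

def filt_rows_cols(img, row_taps, col_taps):
    # Circular (undecimated) convolution of every row with row_taps and every
    # column with col_taps, implemented in the frequency domain.
    R = np.fft.fft(row_taps, img.shape[1])
    C = np.fft.fft(col_taps, img.shape[0])
    out = np.real(np.fft.ifft(np.fft.fft(img, axis=1) * R[None, :], axis=1))
    out = np.real(np.fft.ifft(np.fft.fft(out, axis=0) * C[:, None], axis=0))
    return out

def tensor_decompose(img, h, g):
    # One undecimated level: the HH, HG, GH, GG subbands, same size as the input.
    pairs = {'HH': (h, h), 'HG': (h, g), 'GH': (g, h), 'GG': (g, g)}
    return {k: filt_rows_cols(img, a, b) for k, (a, b) in pairs.items()}

def tensor_reconstruct(bands, ht, gt):
    # Reconstruction by the dual filters; exact (up to the filters' delay) only
    # when the 1D prototypes satisfy H(w)H~(w) + G(w)G~(w) = 1.
    pairs = {'HH': (ht, ht), 'HG': (ht, gt), 'GH': (gt, ht), 'GG': (gt, gt)}
    return sum(filt_rows_cols(bands[k], a, b) for k, (a, b) in pairs.items())
```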

2) Two-orientational extension. For the particular dyadic wavelet filters (3.5), with emphasis on spatial operations (gradient and Laplacian), the component {GG} of the tensor product method may not be useful. Therefore, the following 2D extension was proposed by Mallat [44]:

HH(ωx,ωy) = H(ωx)H(ωy),   H̃H̃(ωx,ωy) = H̃(ωx)H̃(ωy),
GX(ωx,ωy) = G(ωx),        G̃X(ωx,ωy) = G̃(ωx)L(ωy),
GY(ωx,ωy) = G(ωy),        G̃Y(ωx,ωy) = L(ωx)G̃(ωy).


This 2D extension has only three components:








* a horizontal component GX
* a vertical component GY
* a lower resolution component HH


However, it introduces a new filter L(ω), which is determined by the condition

HH(ωx,ωy) H̃H̃(ωx,ωy) + GX(ωx,ωy) G̃X(ωx,ωy) + GY(ωx,ωy) G̃Y(ωx,ωy) = 1.

It turns out that the formula for L(ω) is

L(ω) = [1 + H(ω)H̃(ω)] / 2.

For the filters of (3.5), the resulting L(ω) is a symmetric FIR filter.














CHAPTER 4
OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS

4.1 Introduction

In most cases, a suitable signal representation is not the end of a problem-solving process. On the contrary, how to manipulate it to achieve a desired effect is the more challenging part.

For overcomplete wavelet representations, we have identified three operations: gain, shrinkage and envelope operators, for our applications in image deblurring and segmentation. We demonstrated that a linear convolution may be approximated by a gain operator, and devised a tree pruning algorithm based on a tolerance on the approximation error. We extended the wavelet shrinkage proposed by Donoho and Johnstone [17] to overcomplete wavelet representations for denoising. We built the connection between the power density distribution of an overcomplete wavelet representation and its envelopes, and presented two envelope detection algorithms.

4.2 Gain Operators

In Section 2.4.3, we determined that a convolution operator can be approximated by multiplication of the channel outputs by real factors {A_k | 0 ≤ k < M}.







The reconstructed signal x'(n) in the frequency domain is simply

X'(ω) = Σ_{k=0}^{M−1} A_k C_k(ω) X(ω),

and thus the equivalent filter is

Q(ω) = [A₀ A₁ ... A_{M−1}] [C₀(ω) C₁(ω) ... C_{M−1}(ω)]^t,

where [A₀ A₁ ... A_{M−1}] is the gain vector, C_k(ω) = V_k(ω)U_k(ω) are the channel filters, and M^t denotes the transpose of the matrix M.

Two fundamental issues immediately arise:

1) What are the characteristics of the set of filters expressible by a gain operator? In other words, what kind of filters may be approximated by a gain operator? A broad answer is that any gain operator can only represent a symmetric filter. This is because the channel filters {C_k(ω) | 0 ≤ k < M} are real functions, so any linear combination of them with real gains must also be a real function.

2) For a given frequency response F(ω), how can we determine a tree configuration and a gain vector as the best approximation? Note that this question really asks for a criterion of "best" and an algorithm to find it.

4.2.1 Minimum Mean Square Error (MMSE) Approximation

The closeness of the approximation, in terms of frequency response, by a wavelet representation with channel frequency responses {C_k(ω) | 0 ≤ k < M} may be measured by the mean square error

ε = (1/2π) ∫_{−π}^{π} [F(ω) − Σ_{k=0}^{M−1} A_k C_k(ω)]² dω.   (4.1)








It is well known that the optimal gain vector corresponding to the minimum mean square error may be found by setting all partial derivatives ∂ε/∂A_q to zero:

∂ε/∂A_q = −(1/π) ∫_{−π}^{π} [F(ω) − Σ_{k=0}^{M−1} A_k C_k(ω)] C_q(ω) dω = 0,   0 ≤ q ≤ M−1,

which is a set of linear equations:

Σ_{k=0}^{M−1} [ ∫_{−π}^{π} C_k(ω) C_q(ω) dω ] A_k = ∫_{−π}^{π} F(ω) C_q(ω) dω,   0 ≤ q ≤ M−1.   (4.2)

The minimum mean square error can be derived as

ε_min = (1/2π) ∫_{−π}^{π} F²(ω) dω − (1/2π) Σ_{k=0}^{M−1} A_k^o ∫_{−π}^{π} F(ω) C_k(ω) dω,

where {A_k^o | 0 ≤ k < M} is the optimal gain vector.
We may define a normalized minimum mean square error (NMMSE) as

ε̄ = ε_min / E_f,

where E_f = (1/2π) ∫_{−π}^{π} F²(ω) dω is the energy of the given frequency response. Since ε_min ≥ 0, it must be true that

E_f ≥ (1/2π) Σ_{k=0}^{M−1} A_k^o ∫_{−π}^{π} F(ω) C_k(ω) dω,

and therefore 0 ≤ ε̄ ≤ 1.
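To illustrate how (4.2) can be solved numerically, here is a minimal NumPy sketch (not the thesis implementation: the channel responses are sampled on a uniform frequency grid and the integrals are replaced by sums, which cancels the 1/2π factors in the NMMSE ratio).

```python
import numpy as np

def optimal_gains(F, C):
    """F: target frequency response sampled on a grid over [-pi, pi), shape (N,).
    C: channel frequency responses on the same grid, shape (M, N).
    Returns (optimal gain vector A, normalized minimum mean square error)."""
    F = np.asarray(F, float)
    C = np.asarray(C, float)
    G = C @ C.T                      # Gram matrix: integrals of C_k * C_q
    b = C @ F                        # right-hand side: integrals of F * C_q
    A = np.linalg.solve(G, b)        # normal equations (4.2)
    Ef = np.sum(F ** 2)              # energy of the target response
    e_min = Ef - A @ b               # minimum mean square error (same scale as Ef)
    return A, e_min / Ef             # NMMSE is scale-invariant

# Toy example: two ideal half-band "channels" approximating a tilted response
w = np.linspace(-np.pi, np.pi, 1024, endpoint=False)
C = np.vstack([(np.abs(w) < np.pi / 2).astype(float),
               (np.abs(w) >= np.pi / 2).astype(float)])
A, nmmse = optimal_gains(np.abs(w) / np.pi, C)
print(A, nmmse)
```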

The criterion of mean square error provides a way to determine the optimal gain vector for a given frequency response F(ω) and a given configuration. However, the criterion alone suggests that the bottom-most nodes are the best representation, for which ε̄ may reach its minimum among all possible configurations. This can be formally proved by the following proposition.

Proposition 4.2.1 Splitting a channel into two will not increase the error measure ε̄.








Proof: By contradiction. Assume that for the current configuration {C_k(ω) | 0 ≤ k ≤ M−1}, the optimal gain vector is {A_k^o} and the NMMSE is ε̄. We further assume that the channel q is split into two channels with indexes q₁ and q₂, with the optimal gain vector {A_k^o'} for the new configuration and the new NMMSE ε̄' > ε̄.

Without loss of generality, we can write

C_{q1}(ω) = C_q(ω) H(2^l ω) H̃(2^l ω),   (4.3)
C_{q2}(ω) = C_q(ω) G(2^l ω) G̃(2^l ω).

Since the prototype filters satisfy H(ω)H̃(ω) + G(ω)G̃(ω) = 1, we have C_{q1}(ω) + C_{q2}(ω) = C_q(ω). For the new configuration, we can find a gain vector {A'_k = A_k^o, k ≠ q₁, k ≠ q₂; A'_{q1} = A'_{q2} = A_q^o} with error ε̄'' = ε̄. This contradicts the assumption that ε̄' > ε̄, and therefore ε̄' ≤ ε̄.

Examples showing that a higher level corresponds to a better approximation can be visualized in Figure 4.1. Overcomplete wavelet packet representations of leaf nodes at Levels 4, 5, 6, 7 were selected, corresponding to 16, 32, 64, 128 channels, respectively. The Lemarié filter of p = 3 was used.

4.2.2 Time-frequency Trade-off and a Greedy Algorithm

Proposition 4.2.1 clearly states that the approximation error ε̄ of the frequency response decreases as the channel bandwidth decreases, as demonstrated by the examples in Figure 4.1. Therefore, by this criterion alone, the Fourier transform would be the best representation since its approximation error is zero. This is certainly not the answer we are looking for.

What we have lost sight of is the other part of the picture, in the time domain. As dictated by the uncertainty principle, time resolution decreases as the frequency resolution increases. An overcomplete wavelet packet representation by the deepest leaves would be close to a Fourier representation and would thus lose its resolution in the time domain. Moreover, from a computational point of view, such a down-to-the-bottom decomposition is rather inefficient.















Figure 4.1: Examples of Gain Vector Approximation (frequency range [0, π] shown).
Where:
Row 1: a given frequency response F(e^{jω});
Rows 2 to 5: approximated frequency responses by overcomplete wavelet packet representations of level 4, 5, 6, 7, overlaid with F(e^{jω}) (dotted curve), with ε̄ = 39.4%, 18.5%, 4.5%, 0.02%, respectively.








One possible trade-off between time and frequency can be accomplished by setting a tolerance on the approximation error ε̄. Under an error tolerance, the goal could be to find a tree configuration with the minimum number of leaves. In the following, we present a greedy algorithm in the order of breadth-first search [45]. Notice that a tree node here represents a channel filter C_{l,k}(ω), and the relationship between children nodes and their parent node is characterized by (4.3).

Algorithm 4.2.2 (Minimum Tree) Given a frequency response F(ω) and an error tolerance ε̄₀, let the current leaves be C = {C_{0,0}(ω) = 1} and compute the current error ε̄_c using the set C.

Step 1. One by one, compute the error ε̄_k obtained by temporarily substituting a node in the current leaves C with its two children. Choose the pair with minimum ε̄_k to replace their parent in the set C, and update the current error ε̄_c = min{ε̄_k}.

Step 2. If ε̄_c ≤ ε̄₀, stop and use the leaves C as the representation; else, go to Step 1.
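A compact sketch of the greedy search is given below (illustrative only: F, H and G are real responses sampled at ω_n = 2πn/N, children are generated per (4.3) with the synthesis factors folded into H and G, and the configuration error is the NMMSE computed as in the previous sketch).

```python
import numpy as np

def config_error(F, leaves):
    """NMMSE of representing F with the leaf channel responses."""
    C = np.vstack(leaves)
    A = np.linalg.solve(C @ C.T, C @ F)
    return 1.0 - (A @ (C @ F)) / np.sum(F ** 2)

def minimum_tree(F, H, G, tol, max_level=7):
    """Greedy breadth-first splitting (Algorithm 4.2.2 sketch).
    F, H, G: responses sampled at w_n = 2*pi*n/N, n = 0..N-1.
    A leaf C_{l,k}(w) may be split into C_{l,k}(w)H(2^l w) and C_{l,k}(w)G(2^l w)."""
    N = F.size
    leaves = [(0, np.ones(N))]                    # (level, channel response)
    err = config_error(F, [c for _, c in leaves])
    while err > tol:
        best = None
        for i, (lvl, c) in enumerate(leaves):
            if lvl >= max_level:
                continue
            idx = (np.arange(N) * 2 ** lvl) % N   # samples of H/G at 2^l w
            trial = leaves[:i] + [(lvl + 1, c * H[idx]),
                                  (lvl + 1, c * G[idx])] + leaves[i + 1:]
            e = config_error(F, [ch for _, ch in trial])
            if best is None or e < best[0]:
                best = (e, trial)
        if best is None:                          # nothing left to split
            break
        err, leaves = best
    return leaves, err
```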


Examples of overcomplete wavelet packet approximation using the minimum tree algorithm are shown in Figure 4.2. The improvement can be appreciated by comparing it with Figure 4.1. In Figure 4.1, the representation using the thirty-two channels of Level 5 has the approximation error ε̄ = 18.5%. However, for the same frequency response, the minimum tree algorithm was able to find a representation of only twenty-six channels and yet reduced the approximation error to ε̄ = 6%. In the other case, the minimum tree algorithm found a representation with merely seven channels and achieved an approximation error of 0.6%.

4.3 Shrinking Operators

Wavelet shrinkage was introduced by Donoho and Johnstone [17] for signal denoising and linear inverse problems. For a thorough introduction, refer to [15]. The description of the methodology was given in [15, page 173]; we quote: "Wavelet


















Figure 4.2: Examples of the Minimum Tree Approximation with the Maximum Depth of Seven Levels.
Where:
Col 1: A given frequency response F₁(e^{jω}) (row 1, same as in Figure 4.1); approximation result overlaid with F₁ (row 2) with ε̄ = 6% and the selected 26 channels (row 3);
Col 2: A given frequency response F₂(e^{jω}) (row 1); approximation result overlaid with F₂ (row 2) with ε̄ = 0.6% and the selected 7 channels (row 3).
The overcomplete wavelet packet with the Lemarié filter H₁(e^{jω}) was used in both cases, and the frequency range [0, π] is shown.








shrinkage refers to reconstructions obtained by wavelet transformation, followed by shrinking the empirical wavelet coefficients towards zero, followed by inverse transformation."

We extended the idea to overcomplete wavelet representations on the grounds that they differ from orthogonal wavelet representations mainly in having extra correlated coefficients.

A shrinking operator K_t can be written as

K_t[w] = w − t,   w > t,
       = 0,       |w| ≤ t,         (4.4)
       = w + t,   w < −t,

where t > 0 is a threshold. Notice that the shrinking operator is a nonlinear and memoryless operator, and is meant to work in the spatial domain.
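In code, (4.4) is the familiar soft-thresholding rule; a minimal NumPy sketch is:

```python
import numpy as np

def shrink(w, t):
    """Soft-threshold shrinking operator K_t of (4.4): zero everything with
    magnitude <= t and pull larger coefficients toward zero by t."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

# Example: small coefficients (|w| <= 0.5) are removed, large ones shrink by 0.5
print(shrink([-2.0, -0.3, 0.1, 1.2], t=0.5))   # -> [-1.5, -0.,  0.,  0.7]
```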

The goal of a shrinking operation is to cut off the noise while keeping useful signals. Obviously, this is possible only if the amplitude of the desirable components is stronger than that of the noise components. Therefore, the threshold value t is dependent on the signal-to-noise ratio. Examples of choosing appropriate thresholds are discussed in Chapter 6.

To demonstrate the shrinking operation and the advantage of wavelet denoising, an example is included in Figure 4.3. The major advantage of wavelet shrinkage denoising, as compared with linear smoothing, is that it diminishes noise but does not smear sharp edges, as seen in the example. A comparison with a linear averaging filter with h = [1, 1, 1, 1, 1, 1, 1]/7 is presented in Figure 4.4.

4.4 Envelope Detectors

Energy distribution is a very important concept in the physical world. It is an indispensable tool for signal analysis, too.

For a harmonic current i(t) = A sin(ω₀t + α) passing through a one-ohm pure resistance, the instantaneous power (energy per unit time) at time t is i²(t). The














Figure 4.3: An Example of Overcomplete Wavelet Denoising.
Where:
Col 1: A perfect impulse contaminated by white Gaussian noise (row 1) and its level 2 overcomplete wavelet packet representation (rows 2-5);
Col 2: Overcomplete wavelet packet representation after shrinking operations (rows 2-5) and the reconstructed signal (row 1).










Figure 4.4: Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row 1, col 1) of Figure 4.3.
Where:
Row 1: Overcomplete wavelet shrinkage (left, the same as row 1, col 2 of Figure 4.3) and linear smoothing (right);
Row 2: Local zoom-in with overlay of the original noisy signal.


average normalized power over a period T is thus

P = (1/T) ∫₀^T i²(t) dt = A²/2.

Therefore, the average normalized power of a harmonic signal is proportional to its amplitude squared.

For an arbitrary waveform i(t), one can decompose it into harmonic components using Fourier analysis:

i(t) = (1/2π) ∫ I(ω) e^{jωt} dω.

By the same token, the energy density (per unit time and per unit frequency) at frequency ω is proportional to the amplitude squared of the component I(ω)e^{jωt}, which is |I(ω)|². The function |I(ω)|² describes the power distribution in the frequency domain and is called the power density spectrum [49, page 66].

The above concepts can be extended to overcomplete wavelet representations by extending the concept of amplitude to "envelope" [26, chapter 4], or "local amplitude." An overcomplete wavelet representation consists of bandpass components.








Since any bandpass waveform may be represented by A(t) sin(ω_c t + α(t)) [26, page 229], where A(t) is the envelope and ω_c is the associated carrier frequency, an overcomplete wavelet representation {w_k(t) | 0 ≤ k < M} may be associated with a set of envelopes {A(ω_k, t)}.

Thus, the squared envelopes {A²(ω_k, t)} shall describe the power density distribution of the overcomplete wavelet representation in the time-frequency plane (also called "phase space" in [12]). Figure 4.5 shows examples of such a distribution.

For time-frequency representations using complex Gabor filters [6], envelopes may be extracted by simply performing a modulus operation on the two quadrature components. However, for representations with real analyzing filters, such as overcomplete wavelet representations, more sophisticated envelope detection algorithms are needed. We present two envelope detection algorithms next and investigate their performance.

4.4.1 Envelope Detection by Hilbert Transform

The envelope of a narrowband bandpass signal can be computed from a corresponding analytic signal [51]. For a signal x(t), the analytic signal is defined by

x̂(t) = x(t) + j x̃(t),

where x̃(t) is the Hilbert transform of x(t),

x̃(t) = (1/π) ∫ x(τ)/(t − τ) dτ.

The envelope of the original signal x(t) is then simply the modulus of the analytic signal x̂(t):

A(t) = |x̂(t)|.

The frequency characteristics of the Hilbert transform can be expressed by

H(ω) = −j,  ω ≥ 0;   j,  otherwise.










































Figure 4.5: Two Examples of Power Density Distribution of Overcomplete Wavelet Packet Representations.
Legend:
(a) A linear chirp signal, the same as in Figure 3.4(b);
(b) A multi-segment pure tone signal with frequencies ω₀ = 0.08π, ω₁ = 0.25π, ω₂ = 0.10π and ω₃ = 0.55π, in the order of occurrence. The overcomplete wavelet representation and visual arrangements are the same as in Figure 3.4.









Figure 4.6: A FIR Hilbert Transformer.
Legend:
(a) the imaginary part of the frequency response (the real part is zero);
(b) nonzero coefficients h(n):

  n    h(n)
 -9   -1.946367728e-2
 -7   -4.525349255e-2
 -5   -9.117551207e-2
 -3   -1.889891117e-1
 -1   -6.286127232e-1
  1    6.286127232e-1
  3    1.889891117e-1
  5    9.117551207e-2
  7    4.525349255e-2
  9    1.946367728e-2


Therefore, the Fourier transform of the analytic signal x̂(t) is

X̂(ω) = 2X(ω),  ω ≥ 0;   0,  otherwise.

For implementations in the discrete-time domain, approximate FIR Hilbert transformers may be designed by windowing the ideal frequency response [49]. Figure 4.6 shows an example. It is a type-III FIR Hilbert transformer designed with a Kaiser window of parameters M = 18 and β = 2.629 [49, page 680]. This Hilbert transformer is antisymmetric, of length 19, and has only 10 nonzero coefficients.
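As a sketch of envelope detection via the analytic signal (using an FFT-based construction rather than the windowed FIR transformer of Figure 4.6):

```python
import numpy as np

def envelope_hilbert(x):
    """Envelope |x + j*Hilbert(x)| via the frequency-domain definition of the
    analytic signal: double the positive frequencies, zero the negative ones."""
    x = np.asarray(x, dtype=float)
    N = x.size
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0                       # DC kept once
    if N % 2 == 0:
        h[N // 2] = 1.0              # Nyquist bin kept once
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(X * h))

# Example: an amplitude-modulated tone; the envelope follows 1 + 0.5*cos(...)
n = np.arange(1024)
x = (1 + 0.5 * np.cos(2 * np.pi * n / 256)) * np.cos(0.3 * np.pi * n)
print(envelope_hilbert(x)[:5])
```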

4.4.2 Envelope Detection by Zero Crossings

In this method, the maximum absolute value between two adjacent zero-crossings is first found, and then assigned to each point within the interval.

Algorithm 4.4.1 (Envelope by Zero Crossings) For a given array x(1 : N), start from the index i = 1; find the next index k which is either a zero-crossing point, a zero-valued point x(k) = 0, or the other end point k = N, whichever is first encountered; assign the maximum absolute value A_{i,k} = max {|x(n)| : i ≤ n ≤ k} to all points of







{x(n) | i ≤ n ≤ k}; then set i = k and repeat the procedure until the end of the array is reached.
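A direct rendering of Algorithm 4.4.1 (a sketch, assuming 0-based array indexing) is:

```python
import numpy as np

def envelope_zero_crossing(x):
    """Assign to each sample the maximum absolute value of the segment bounded
    by the zero crossings (or array ends) that enclose it."""
    x = np.asarray(x, dtype=float)
    env = np.empty_like(x)
    i = 0
    for k in range(1, x.size + 1):
        end = (k == x.size)
        crossed = (not end) and (x[k] == 0.0 or x[k] * x[k - 1] < 0.0)
        if crossed or end:
            env[i:k] = np.max(np.abs(x[i:k]))
            i = k
    return env

# Example: a pure tone; the detected envelope approximates its constant amplitude
n = np.arange(256)
print(envelope_zero_crossing(0.8 * np.sin(0.3 * np.pi * n))[:8])
```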
4.4.3 Comparison Between the Two Detectors

We compared the two envelope detectors by their working frequency range and robustness under noise perturbations. For pure-tone impulse signals, both did well in the middle range (ω ∈ [0.1π, 0.7π]). While the Hilbert transform was unacceptable at the low end (ω < 0.1π), the zero-crossing approach did poorly at the high end (ω > 0.7π). The test cases are shown in rows 1 and 2 of Figure 4.7. We constructed a noisy impulse by introducing a random frequency perturbation, s(n) = sin((ω₀ + 0.05·RANDN(n))n), where RANDN (a MATLAB function) generates random noise with normal distribution, and filtered it with the Lemarié filter H₁(ω). This case is included in row 3 of Figure 4.7. We found that the zero-crossing method exhibited edge-preserving smoothing characteristics and was more robust to wide-band noise.

4.4.4 Comparison with Other Energy Operators

A useful and simple energy operator was analyzed in [46]. The operator Ψ for a discrete signal x(n) was given as

Ψ[x(n)] = x²(n) − x(n + 1) x(n − 1).

However, this operator differs from envelope detectors in that its value on a pure tone signal is not only proportional to the amplitude squared but is also a function of frequency [46]:

Ψ[A sin(ω₀n + α)] = A² sin²(ω₀).

4.4.5 Two Dimensional Envelope Detection

Next we extended the 1D envelope detection algorithms for the analysis of two-dimensional image signals. In the frequency domain, a two-dimensional analytic














Figure 4.7: Comparison of the Two Envelope Detectors.
Where:
Col 1: Input signals (top row, ω₀ = 0.1π, ω₁ = 0.5π; center row, ω₀ = 0.05π, ω₁ = 0.75π; bottom row, noisy signal with ω₀ = 0.1π).
Col 2: Envelopes detected via the 19-tap Hilbert transformer.
Col 3: Envelopes detected via the zero-crossing method.


signal may be obtained by setting an appropriate half plane to zero, based on its orientation. That is, for a 2D signal f(x, y), the Fourier transform of an analytic signal f̂(x, y) is either

F̂_x(ωx, ωy) = 2F(ωx, ωy),  ωx ≥ 0;   0,  otherwise,

or

F̂_y(ωx, ωy) = 2F(ωx, ωy),  ωy ≥ 0;   0,  otherwise.

For the 2D filters constructed by the tensor product method with halfband filters, the equivalent complex quadrature filters exhibit the frequency responses shown in Figure 4.8. Notice that the lowpass and diagonal components may have an alternate arrangement by zeroing out the left or bottom half plane.

This separable property allowed one to compute the envelope of a 2D signal using the ID algorithms described earlier in a straightforward manner, described in the following.








Figure 4.8: Frequency Response of Equivalent Complex Quadrature Filters (level 1).
Legend:
(a) lowpass channel
(b) vertical channel
(c) horizontal channel
(d) diagonal channel
The diagonally shadowed areas identify zeroed half planes.

Algorithm 4.4.2 (2D Envelope Detection) For a given 2D array u(m, n) and its orientation: if the orientation is horizontal, apply the 1D envelope detector column-wise; if the orientation is vertical, apply the 1D envelope detector row-wise; otherwise, apply the 1D envelope detector column-wise.














CHAPTER 5
APPLICATION I: TEXTURE SEGMENTATION

5.1 Introduction

In this chapter, we shall apply overcomplete wavelet representations to the problem of segmentation of textured images.

However, it seems that we are back at the beginning. It may sound odd, but it is true that there is not even a precise definition of "texture," let alone "texture segmentation" [61]. In spite of the difficulty, there is a general consensus that texture is a property of area. We feel that the concept of texture is parallel to the concept of "local frequency" for a non-stationary signal. The concept of local frequency is easy to understand: it describes the rate of change that occurs in a window around a particular time. Yet there is not a unique definition of local frequency. Therefore, our treatment of texture segmentation is to embrace it into the paradigm of time-frequency analysis.

There have been many research works on texture segmentation using the concept of spatial-frequency representation. The analyzing functions used by researchers include:
" Laws microtexture masks (filters) [24]
" Complex prolate spheroidal sequences [64]
* Complex Gabor functions [6, 19]
* Real Gabor functions [30]
" The Wigner distribution [54]

More recently, wavelet and wavelet packet representations have been added to the list [34, 35, 36, 62, 9, 37].

A segmentation process usually consists of two distinct phases: feature extraction and clustering. Features for texture representation are of crucial importance








for accomplishing segmentation [27]. In this chapter, we demonstrated that overcomplete wavelet packet representation and envelope detection make a good feature extraction scheme on a variety of both natural and synthetic textured images.

5.2 Texture Feature Extraction

The texture features we chose are the envelopes of overcomplete wavelet representations. The interpretation of the features is that their squares correspond to the spatial-frequency power density distribution, as pointed out in the last chapter. The feature extraction thus consists of two stages: 1) overcomplete wavelet packet decomposition and 2) envelope detection.

For 2D envelope detection on overcomplete wavelet packet representations, we classified each node in the decomposition tree into four possible categories, taking into account orientation, as follows:

* The root node is omnidirectional.
* The node last filtered by G₁(ωx)H₁(ωy) corresponds to vertical orientation. (Highpass filter G₁ is applied row-wise and lowpass filter H₁ column-wise.)
* The node last filtered by H₁(ωx)G₁(ωy) corresponds to horizontal orientation. (Lowpass filter H₁ is applied row-wise and highpass filter G₁ column-wise.)
* The node last filtered by G₁(ωx)G₁(ωy) corresponds to diagonal orientation. (Highpass filter G₁ is applied row-wise and highpass filter G₁ column-wise.)
* The node last filtered by H₁(ωx)H₁(ωy) has the same orientation as its parent. (Lowpass filter H₁ is applied row-wise and lowpass filter H₁ column-wise.)

At the end of feature extraction, each sample of a signal is attached to a feature vector. In our case, such a feature vector consists of envelopes of the overcomplete wavelet packet representation. It is possible to apply a monotonic function to the







feature space to transform it into another feature space. For example, a square function x² will transform the envelope space into a space of power density distribution, and a log function log(x) will transform the envelope space into a space of log spectra.

5.3 Considerations on Filter Selection

We raised the issue of filter selection in Chapter 3 on overcomplete wavelet representations. For the application of segmentation, we argued that symmetry, frequency response, and boundary accuracy are important factors in the selection of filters for feature extraction. In the following, we discuss these constraints in terms of overall performance.


" Symmetry. For accomplishing texture segmentation, accuracy in texture

boundary detection is crucial for reliable performance. In this application, filters with symmetry or antisymmuetry are clearly favored. Such filters have a linear phase response. where the delay (shift) is predictable. Alternatively.

filters with nonlinear phase may introduce complex distortion. Moreover. symmetric or anti-symmetric filters are also advantageous in alleviating boundary

effects through simple methods of nirror extension (see Appendix D).

* Optimal Frequency Response. In order to derive an ideal filter frequency response for the chosen feature, we considered a two-band filter bank with input signals of infinite length consisting of two segments with distinct pure tones. The input signals can be written as

s(n) = A₁ cos(ω₁n + α₁),  n < 0;   A₂ cos(ω₂n + α₂),  n ≥ 0,

where A₁ > 0 and A₂ > 0. Except for the boundary (n = 0), we derived the feature vectors (envelopes of the channel outputs) as

T_left  = (A₁|H(ω₁)|, A₁|G(ω₁)|),  n < 0,
T_right = (A₂|H(ω₂)|, A₂|G(ω₂)|),  n ≥ 0.








The angle θ between the vectors T_left and T_right is

θ = cos⁻¹( (|H(ω₁)H(ω₂)| + |G(ω₁)G(ω₂)|) / (√(|H(ω₁)|² + |G(ω₁)|²) · √(|H(ω₂)|² + |G(ω₂)|²)) )

and is bounded by 0 ≤ θ ≤ π/2. The distance between the two classes in the feature space is then

D = √( ‖T_left‖² + ‖T_right‖² − 2 ‖T_left‖ ‖T_right‖ cos(θ) ).

For θ = π/2, the vectors T_left and T_right are orthogonal, and the distance D reaches its maximum. Clearly, the maximum distance in feature space between any two classes is optimal for segmentation and classification applications, in the sense that the classes are better separated and more robust to noise perturbations.

Notice that for H(ω₁) ≠ 0 and G(ω₂) ≠ 0, cos(θ) = 0 if and only if G(ω₁) = 0 and H(ω₂) = 0. This means that optimal filter banks should have no overlap in the frequency domain, and thus a filter H(ω) with optimal frequency response must be a perfect half-band filter. Indeed, [50] derived a similar optimal filter for image coding applications.

Of course such ideal filters cannot be realized in practice. Based on the preceding discussion, filter bands overlapping in the frequency domain tend to reduce class distance in feature space. Therefore, a filter H(ω) with large stop-band attenuation and a flat pass-band frequency response is desirable.

* Spatial Localization. There are two types of boundaries for signals of finite length: 1) boundaries of regions exhibiting distinct characteristics, and 2) the physical boundaries of a data segment. Feature vectors close to the boundaries will be affected. The size of the affected region depends on the length of the channel filters and the distribution of filter coefficients. Therefore, filters of short length and fast decay shall be better suited for boundary detection.







Unfortunately, the preceding three criteria cannot be satisfied simultaneously. As previously pointed out, quadrature mirror filters (QMF) with compact support cannot be symmetric or anti-symmetric. This means that a symmetry constraint on any QMF will be in conflict with spatial localization goals. Moreover, large attenuation in the stop band requires a longer filter, which in turn degrades the filter's spatial localization. Again, we face the limitation of the uncertainty principle, and the previous discussion about the uncertainty factor of a filter bank is applicable here.

In this study, we chose the Lemarié filters for their symmetry and good frequency characteristics.

5.4 The Basic Isodata Clustering Algorithm

Segmentation algorithms accept a set of features as input and output a consistent labeling for each pixel. Basically, this is a multi-dimensional data clustering problem with no general algorithm available [23]. Clustering algorithms that have been previously used for texture segmentation can be divided into two categories: 1) supervised segmentation and 2) unsupervised segmentation.

In practice, unsupervised segmentation is often desirable and easy to validate. It is particularly useful in those cases where training samples are difficult to prepare (making supervised segmentation infeasible). Thus, we used the Basic Isodata clustering algorithm [18, page 201].


Algorithm 5.4.1 (Basic Isodata) Given a 2D image array x of structures containing feature vectors and label fields, and the number of classes N_c:

Step 1. Scan the representation matrix x in raster order. For every pixel encountered, randomly pick a label from the set {0, ..., N_c − 1} and assign it to the pixel.

Step 2. Compute the class centers {C̄_k | 0 ≤ k < N_c} by averaging the feature vectors of all pixels currently labeled k.

Step 3. Rescan the whole representation matrix x, and assign pixel (i, j) to class k if the Euclidean distance between the feature vector of the pixel and the class center C̄_k is the smallest.

Step 4. If no pixel changes its class in Step 3, stop; else, go to Step 2.
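A minimal NumPy sketch of Algorithm 5.4.1 is shown below (illustrative only; `features` is assumed to be an H x W x d array of feature vectors, and empty classes are re-seeded with a random sample, a detail not specified in the algorithm above).

```python
import numpy as np

def basic_isodata(features, n_classes, max_iter=100, seed=0):
    """Basic Isodata: alternate between recomputing class centers from the
    current labeling (Step 2) and relabeling every pixel by the nearest
    center (Step 3) until no label changes (Step 4)."""
    H, W, d = features.shape
    X = features.reshape(-1, d)
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, n_classes, size=X.shape[0])          # Step 1
    for _ in range(max_iter):
        centers = np.vstack([X[labels == k].mean(axis=0) if np.any(labels == k)
                             else X[rng.integers(X.shape[0])]     # re-seed empty class
                             for k in range(n_classes)])          # Step 2
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)                          # Step 3
        if np.array_equal(new_labels, labels):                    # Step 4
            break
        labels = new_labels
    return labels.reshape(H, W)
```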


This algorithm differs from a K-means algorithm [29] in only one aspect: Basic Isodata updates class centers after a complete scan of an input feature set while K-means updates for every reassignment. In our experiments, Basic Isodata outperformed K-means for almost all cases.

5.5 Postprocessing

The Basic Isodata algorithm labels each pixel independently and does not take into account the high correlation between neighboring pixels. A postprocessing stage, such as the relaxation labeling [25], can be used to incorporate some neighborhood constraint into the segmentation process.

For simplicity, we used median filtering as our postprocessing to simulate the benefit of a local constraint. In particular, a 5 x 5 median filtering was repeatedly applied to an initial segmentation until no change of labeling occurred.

5.6 Experimental Results

Our segmentation algorithm was tested on both one-dimensional signals and two-dimensional textured images. Our test images included samples of two distinct families of textures. In all examples shown, straightforward envelopes of the overcomplete wavelet packet representation were used as texture features.

5.6.1 One-dimensional Signals

Figure 5.1 shows the segmentation of a signal consisting of a sinusoid segment and a triangular segment with the same period (sixteen samples) and amplitude.









Figure 5.1: Segmentation of a 1D Signal Consisting of a Triangular Waveform and a Sinusoid.
Where:
Row 1: the original signal and the segmentation result;
Rows 2-5: the overcomplete wavelet packet representation and envelopes.


With the envelopes of the four channels of the Level 2 representation, the perfect result was achieved. Figure 5.2 is a segmentation example of a noisy signal consisting of two sinusoid segments (ω₀ = 0.3π and ω₁ = 0.5π) contaminated by white Gaussian noise. Envelopes of the eight channels of the Level 3 representation were able to achieve the perfect result.

5.6.2 Natural Textures

Here we used textures obtained from the Brodatz album [1] and a public archive. Each testing sample was first histogram-equalized so that a segmentation result based only on first order statistics was not possible. Experimental results are displayed in Figure 5.3. Experimentally, we observed that a lower order Lemarié-Battle filter (p = 1) performed well in boundary detection (Figure 5.3 (b)), while the higher order Lemarié-Battle filter (p = 2) did a better job within non-boundary (internal) regions.

















Figure 5.2: Segmentation of a Noisy 1D Signal Consisting of Two Pure Tone Segments.
Where:
Row 1: the original signal and the segmentation result;
Rows 2-9: the overcomplete wavelet packet representation and envelopes.


















Figure 5.3: Segmentation Results of an Image Consisting of Natural Textures.
Where:
Row 1: Test image T1 (256 x 256, 8-bit) consists of D17, herringbone weave and bark (true boundary overlaid for display only).
Row 2: Clustering results using features extracted from a Level 4 filter bank generated by (Col 1) the Lemarié-Battle filter of p = 1, (Col 2) the autocorrelation function of the Lemarié-Battle filter of p = 1, (Col 3) the Lemarié-Battle filter of p = 2. The zero-crossing algorithm was used for envelope detection.
Row 3: Final segmentations after postprocessing.








5.6.3 Synthetic Textures

We tested the performance of our algorithm on several texture images synthesized from random noise, based on [27].


" Gaussian Lowpass and Bandpass Textures. The images were generated

by filtering a white Gaussian noise with mean of zero and standard deviation

of 30 with a Gaussian filter of frequency response (in polar coordinate): G(r, 0) = e-�-�0i e- -F /(').



" Filtered Impulse Noise (FIN). The images were generated by filtering a

random impulse image I(m, n) generated by:

I (m, n)= 1.0. if RAN > T.
{ 0.0. if RAN < T.

where RAN E (0.0, 1.0) is a random number with uniform distribution, with

Gaussian impulse response:

g (t , '/'I) = C-m2-,2/(2 .52) 2/(2.,


In both cases. images were linearly-scaled into the range of [0, 255].

Figure 5.4 shows a segmentation result on a Gaussian lowpass texture image, and Figure 5.5 shows a segmentation result on a filtered impulse noise (FIN) texture image. For this difficult test image T3, the algorithm achieved outstanding performance.

We also tested our algorithm on a texture image containing regular and sparse elements. Figure 5.6 demonstrates the accurate segmentation result.

5.7 Summary and Discussion

A quantitative comparison presenting the accuracy of our segmentation results on textured images is summarized in Table 5.1. This performance is consistent



















Figure 5.4: Segmentation of a Synthetic Gaussian Lowpass Texture Image.
Legend:
(a) Test image T2 (256 x 256, 8-bit): Gaussian LP; left: isotropic F₀ = 0, S = 60; right: non-isotropic F₀ = 0, S = 60, θ₀ = 0, B₀ = 0.175.
(b) Initial segmentation (Lemarié-Battle filter of p = 1, Level 4 and zero-crossing envelope detection).
(c) Final segmentation after postprocessing.


Figure 5.5: Segmentation of a Synthetic FIN Texture Image.
Legend:
(a) Test image T3 (256 x 256, 8-bit): Filtered impulse noise; left: non-isotropic T = 0.15, Sx = 1.0, Sy = 1.5; right: non-isotropic T = 0.15, Sx = 2.0, Sy = 1.0.
(b) Initial segmentation (Lemarié-Battle filter of p = 1, Level 4 and zero-crossing envelope detection).
(c) Final segmentation after postprocessing.

















Figure 5.6: Segmentation of a Synthetic Texture Image with Line Patterns.
Legend:
(a) Test image T4 (256 x 256, 8-bit).
(b) Final segmentation (Lemarié-Battle filter of p = 1, Level 3 and zero-crossing based envelope detection).
(c) Detected boundary overlaid with the original image.


Table 5.1: Boundary Accuracy of the Segmentation Results.

Test image   Maximum ABE   Average ABE   σ
T1           11.0          3.2           2.6
T2           15.0          2.9           2.4
T3           11.0          2.9           2.6

ABE: Absolute Boundary Error (in pixels).


with the difficulty of segmentation perceived by human observers. We observed that boundary errors are dependent on shape; i.e., complex boundaries yielded higher variance.

Overcomplete wavelet packet representations and envelope features provide a versatile and flexible framework. The examples showed that satisfactory results can be achieved. Based on our experiments, the low-order Lemarié filter (p = 1) achieved the best results. However, we are aware that there were several important parameters to be determined manually for the segmentation algorithm to work. These include the configuration of the representation and the number of classes. In the absence of a precise definition of the problem, we believed that no mathematical solution








is possible, and these parameters can only be designed case-by-case for particular applications.

Compared with Gabor filters, overcomplete wavelet packet representations possess several potential advantages for further exploration:

" channel filters cover exactly the frequency domain (provide a mathematically

complete representation) without significant overlap, thus greatly reducing correlations between features extracted from distinct channels

" adaptive pruning of a decomposition tree makes possible the reduction of computational complexity and tle length of feature vectors

" by moving up the decomposition tree, feature vectors at distinct resolution

levels can be obtained to increase the accuracy of boundary detection














CHAPTER 6
APPLICATION II: IMAGE DEBLURRING

6.1 Introduction

Image deblurring is a special case of signal restoration. The goal is to recover an image which has been blurred and usually degraded by noise. During the past two decades, a large amount of research work has been done in this area [33, 2, 28].

Unlike texture segmentation, image deblurring is a well-defined problem. The mathematical model commonly used to describe the degradation process is a convolution plus additive noise, written as

y(k₁, k₂) = d(k₁, k₂) * x(k₁, k₂) + n(k₁, k₂),

where x(k₁, k₂) is the original image, y(k₁, k₂) is the observed image, d(k₁, k₂) is the impulse response of the blurring operator, n(k₁, k₂) is the noise, and the symbol * stands for convolution. The deblurring problem is to restore the original image x(k₁, k₂) from the degraded image y(k₁, k₂). This is thus a linear inverse problem.

At first glance, the problem seems solvable. Assuming a known impulse response of the blurring process and the absence of noise, one may formally derive the solution in the frequency domain as

X(ω₁, ω₂) = Y(ω₁, ω₂) / D(ω₁, ω₂).

This is the simplest, most intuitive approach and is called inverse filtering [2, 28, 22]. However, the frequency response D(ω₁, ω₂) associated with a blurring process usually has zeros in the frequency plane, and its reciprocal is thus unbounded. From another point of view, blurring generally causes irreversible information loss even without the existence of noise. Taking noise into account, the inverse filtering approach








would produce a restoration

X'(ω₁, ω₂) = Y(ω₁, ω₂)/D(ω₁, ω₂) = X(ω₁, ω₂) + N(ω₁, ω₂)/D(ω₁, ω₂).

In this case, the closeness of the restored image X' to the original image X hinges on the strength of the noise N. Notice that the precise waveform of the noise is unknown and thus cannot be completely subtracted from the observed image Y.

Inverse problems associated with an unbounded inverse operator are said to be ill-posed [33, 14]. The traditional mathematical tools for dealing with such problems are regularization methods [60].

In this chapter, we shall apply overcomplete wavelet packet representations to the deblurring problem. Our approach explicitly suppresses noise using the nonlinear shrinking operator and approximates an inverse filter using the gain operator.

6.2 Review of Some Deblurring Techniques

In this section, we shall review modified inverse filters, Wiener filters and the more recent theory of wavelet-vaguelette inversion.

6.2.1 Modified Inverse Filters

For blurs having zeros in the frequency plane, there are two modifications available to make a bounded inverse filter.

1. Gain-limited Inverse Filter. The idea is simply to set an upper bound on the magnitude of the frequency response, as described below:

F(ωx, ωy) = 1/R,             0 ≤ D(ωx, ωy) < R,
          = 1/D(ωx, ωy),     |D(ωx, ωy)| ≥ R,        (6.1)
          = −1/R,            −R < D(ωx, ωy) < 0,

where R is a positive real value and D(ωx, ωy) is a real filter.

2. Pseudo-inverse Filter [28, page 276]. This is the same as the gain-limited inverse filter except that the constant-gain region is set to zero:

F(ωx, ωy) = 0,               |D(ωx, ωy)| < R,        (6.2)
          = 1/D(ωx, ωy),     |D(ωx, ωy)| ≥ R,








where R is a positive real value.

Both approaches have their pros and cons. The advantage of the gain-limited inverse filter is that it is continuous in the frequency domain. The disadvantage is that the passband where |D(ωx, ωy)| < R may significantly enhance wideband noise. In contrast, the pseudo-inverse filter completely wipes out anything outside its region of inversion, sacrificing continuity.
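For reference, a small NumPy sketch of (6.1) and (6.2) for a real blur response D sampled on a frequency grid (R is the scalar limit; treating D = 0 as the positive branch is an implementation assumption, not part of the definitions above):

```python
import numpy as np

def gain_limited_inverse(D, R):
    """Gain-limited inverse filter (6.1): 1/D where |D| >= R,
    clipped to +1/R (D >= 0) or -1/R (D < 0) elsewhere."""
    D = np.asarray(D, dtype=float)
    safe = np.where(np.abs(D) >= R, D, R)            # avoid division by ~0
    return np.where(np.abs(D) >= R, 1.0 / safe,
                    np.where(D >= 0.0, 1.0, -1.0) / R)

def pseudo_inverse(D, R):
    """Pseudo-inverse filter (6.2): 1/D where |D| >= R, zero elsewhere."""
    D = np.asarray(D, dtype=float)
    safe = np.where(np.abs(D) >= R, D, 1.0)
    return np.where(np.abs(D) >= R, 1.0 / safe, 0.0)
```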

6.2.2 Wiener Filters

The idea of the Wiener filter is to choose a deblurring filter f to minimize the mean square restoration error defined as [18, 2, 22]

ε = E{ [x'(k₁, k₂) − x(k₁, k₂)]² },


where x' = f * (d * x + n) and E denotes mathematical expectation [49]. Assuming that the noise n is a random field with zero mean and is independent of x, the expression for ε can be evaluated as

ε = E[ ((f*d − δ)*x)² ] + E[ (f*n)² ]
  = (1/4π²) ∫∫ [ |F D − 1|² Φxx + |F|² Φnn ] dω₁ dω₂,

where F, D, Φxx and Φnn are all functions of (ω₁, ω₂), and Φxx and Φnn are the power density spectra of the signal and noise [49, Section 2.10], respectively. Furthermore, the integrand can be factored as

|FD − 1|² Φxx + |F|² Φnn
= (F − 1/D)(F* − 1/D*) |D|² Φxx + FF* Φnn
= (FF* − F/D* − F*/D + 1/|D|²) |D|² Φxx + FF* Φnn
= FF* (|D|² Φxx + Φnn) − (F/D* + F*/D) |D|² Φxx + Φxx
= |F − D*Φxx/(|D|²Φxx + Φnn)|² (|D|²Φxx + Φnn) + Φxx Φnn/(|D|²Φxx + Φnn).








Since the first term is greater than or equal to zero, the optimal deblurring filter (Wiener filter) is thus

F(ω₁, ω₂) = D*(ω₁, ω₂) Φxx(ω₁, ω₂) / ( |D(ω₁, ω₂)|² Φxx(ω₁, ω₂) + Φnn(ω₁, ω₂) ),   (6.3)

which can be rewritten in terms of the signal-to-noise power ratio as

F(ω₁, ω₂) = D*(ω₁, ω₂) / ( |D(ω₁, ω₂)|² + Φnn(ω₁, ω₂)/Φxx(ω₁, ω₂) ).

For white Gaussian noise, Φnn(ω₁, ω₂) = N₀. The power spectrum Φxx(ω₁, ω₂) characterizes the correlation property of the signal x. As the correlation usually falls off as the sample distance increases, Φxx(ω₁, ω₂) is usually a lowpass function. If the signal has no correlation among its adjacent samples, Φxx(ω₁, ω₂) = S₀, and the signal-to-noise power ratio equals a constant S₀/N₀ = 1/R. In this case, the Wiener filter reduces to

F(ω₁, ω₂) = D*(ω₁, ω₂) / ( |D(ω₁, ω₂)|² + R ).   (6.4)

Compared with the gain-limited inverse filter and the pseudo-inverse filter, the Wiener filter of (6.4) possesses both merits of continuity and noise suppression.
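A one-line sketch of (6.3) and (6.4) in NumPy (D may be complex; R plays the role of the constant noise-to-signal power ratio):

```python
import numpy as np

def wiener_filter(D, R):
    """Wiener deblurring filter (6.4): conj(D) / (|D|^2 + R)."""
    D = np.asarray(D, dtype=complex)
    return np.conj(D) / (np.abs(D) ** 2 + R)

def wiener_filter_full(D, Pxx, Pnn):
    """General form (6.3) with signal and noise power spectra Pxx, Pnn."""
    D = np.asarray(D, dtype=complex)
    return np.conj(D) * Pxx / (np.abs(D) ** 2 * Pxx + Pnn)
```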

6.2.3 Wavelet-Vaguelette Inversion

The combination of wavelet-vaguelette decomposition (WVD) and nonlinear shrinkage of WVD coefficients was proposed by Donoho [14] for linear inversion problems involving dilation-homogeneous operators. An operator F with such a property commutes with a dilation operator D_a up to a constant, such that

F D_a = a D_a F,

where a is a constant and D_a is the dilation operator defined as D_a[f(t)] = f(at).

The WVD involves three sets of functions: an orthogonal wavelet basis {ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k)} and two near-orthogonal sets {u_{j,k}(t) = 2^{j/2} u(2^j t − k)} and {v_{j,k}(t) = 2^{j/2} v(2^j t − k)}. The three sets are related by the quasi-singular value relations

F ψ_{j,k} = κ_j v_{j,k},   (6.5)
F* u_{j,k} = κ_j ψ_{j,k},   (6.6)

where the quasi-singular values κ_j depend on the resolution index j but not on the spatial index k. Moreover, the two sets {u_{j,k}(t)} and {v_{j,k}(t)} satisfy biorthogonality,

∫ u_{j,k}(t) v*_{j',k'}(t) dt = δ_{j,j'} δ_{k,k'},

and near-orthogonality,

∫ ( Σ_{j,k} a_{j,k} u_{j,k}(t) )² dt ≍ Σ_{j,k} a_{j,k}².

Linking everything together, the reproducing formula of the WVD is expressed by

x(t) = Σ_{j,k} [Fx, u_{j,k}] κ_j^{−1} ψ_{j,k}(t),

where [x, y] = ∫ x(t) y*(t) dt is the inner product of x and y.

Finally, the WVD inversion of a white-Gaussian-contaminated measurement y(t) = Fx(t) + n(t) may be achieved via

x'(t) = Σ_{j,k} K_{t_j}{ [y, u_{j,k}] κ_j^{−1} } ψ_{j,k}(t),   (6.7)

where K_{t_j} is the nonlinear shrinking operator defined by (4.4). Notice that the threshold t_j depends only on the resolution j.

6.2.4 Discussion

The Wiener filter provides a powerful tool for linear inversion problems. Given the statistics of both the image and noise processes, one can design an optimal filter. However, the Wiener filter belongs to the linear filtering paradigm and works in the








frequency domain. Its linearity forces it to trade off between smoothing out noise and sharpening up edges.

In contrast, the wavelet-vaguelette approach deploys nonlinear shrinkage to combat noise in the time domain. It is characterized by three components: 1) dyadic analyzing and synthesizing functions determined by the blurring operator F; 2) level-dependent gain factors κ_j^{−1}; and 3) nonlinear shrinkage with level-dependent thresholds t_j. The nonlinearity enables it to achieve a better compromise between denoising and sharpening. However, it is limited to dilation-homogeneous operators only, excluding the most commonly encountered Gaussian and uniform blurs [14]. Moreover, the theory was developed in the continuous-time domain.

The limitation of the WVD inversion lies in its insistence on the quasi-singular value relations (Eqs. (6.5) and (6.6)). For a convolution operator F with kernel h(t), in the frequency domain (6.5) is equal to

H(ω) Ψ(2^{−j} ω) = κ_j V(2^{−j} ω).   (6.8)

Let ω' = 2^k ω and plug it into (6.8); we have

H(2^k ω) Ψ(2^{−(j−k)} ω) = κ_j V(2^{−(j−k)} ω).   (6.9)

On the other hand, directly plugging the level index j − k into (6.8) produces

H(ω) Ψ(2^{−(j−k)} ω) = κ_{j−k} V(2^{−(j−k)} ω).   (6.10)

By comparing (6.9) and (6.10), we conclude that it must be true that

H(2^k ω) / H(ω) = κ_j / κ_{j−k}   (independent of ω),

which is the characteristic of dilation-homogeneous operators in the frequency domain.








6.3 Discrete-time Overcomplete Wavelet Packet Inversion
6.3.1 One-dimensional Formulation

There were two major considerations in our search for a deblurring algorithm. First, it should be available in the discrete-time domain. Second, it should be applicable to inhomogeneous blurring operators. The result is the discrete-time overcomplete wavelet packet representation equipped with a nonlinear shrinking operator and a gain operator. For a one-dimensional observation y(k) = d(k) * x(k) + n(k), our deblurring algorithm can be expressed by

w_j(k) = Σ_n v_j(n) y(k − n),   (6.11)

x'(k) = Σ_{j=0}^{M−1} Σ_m K_{t_j}[w_j(m)] A_j u_j(k − m),   (6.12)

where {A_j | 0 ≤ j < M} is the gain vector and {t_j} are channel-dependent thresholds. Note that A_j K_{t_j}[w] = K_{A_j t_j}[A_j w],
which means that the order of the gain and shrinking operators can be exchanged. If the shrinking threshold is adaptive to the magnitude of the coefficients, such a change of order does not alter the result. We prefer the order of "denoising" and then "inversing," as expressed in (6.12), over the order of "inversing" and then "denoising" of (6.7), since we believe the former is more logical.

The idea behind our method is the same as in the WVD inversion (6.7): cut off noise by nonlinear means and then sharpen the surviving coefficients. Although formula (6.12) looks virtually the same as the WVD inversion, it has several major differences. First, it uses a discrete-time overcomplete wavelet packet representation, which guarantees it will not suffer from the anomaly of aliasing enhancement. Second, its analyzing and synthesizing functions are not subject to the dyadic bandwidth constraint, and do not necessarily satisfy the quasi-singular value relations (Eqs. (6.5) and (6.6)).
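Putting (6.11) and (6.12) together, a minimal frequency-domain sketch of the pipeline is shown below (illustrative assumptions: circular convolution via the FFT, channel analysis and synthesis responses V_j and U_j supplied on the same DFT grid as the observation, and the soft-shrinking operator of (4.4) re-included for self-containment).

```python
import numpy as np

def shrink(w, t):
    # soft-threshold shrinking operator (4.4)
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def owp_deblur(y, V, U, gains, thresholds):
    """Overcomplete wavelet packet inversion, (6.11)-(6.12).
    V[j], U[j]: analysis and synthesis frequency responses of channel j,
    sampled on the N-point DFT grid of y; gains[j], thresholds[j]: A_j, t_j."""
    Y = np.fft.fft(y)
    x_hat = np.zeros(len(y))
    for Vj, Uj, Aj, tj in zip(V, U, gains, thresholds):
        wj = np.real(np.fft.ifft(Y * Vj))                          # (6.11)
        wj = shrink(wj, tj)                                        # denoise per channel
        x_hat += Aj * np.real(np.fft.ifft(np.fft.fft(wj) * Uj))    # (6.12)
    return x_hat
```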








To complete the framework, we still need to resolve two issues. First, for a given blur with frequency response D(ω), how can we determine the channel configuration (analyzing and synthesizing filters) as well as the gain vector? Second, how should we choose the thresholds t_j for denoising?

We proposed the following algorithm for the channel structure and the gain vector:

1) Choose a gain-limited inverse filter or a Wiener filter to approximate.

2) Choose the prototype filters H(ω), H̃(ω), G(ω) and G̃(ω); the channel filters of Level 1 are then determined by C_{1,0}(ω) = H(ω)H̃(ω) and C_{1,1}(ω) = G(ω)G̃(ω).

3) Call Algorithm 4.2.2 (Minimum Tree) to determine both the tree configuration and the gain vector.

However, the issue of determining the thresholds t_j for denoising is much more difficult. Thresholding schemes for several particular applications were suggested by Donoho [15]. Generally speaking, knowledge about the signal and the noise is needed to devise a thresholding scheme. By treating the signal and the noise as two random processes, their correlation properties may be used to characterize them, as in the case of Wiener filtering. For wide-sense stationary white Gaussian noise and a colored Gaussian signal, the autocorrelation functions φ and power spectral densities Φ may be written as

φ_nn(k) = σ_n² δ(k),   φ_xx(k) = a e^{−σ_x |k|}.   (6.13)

Note that Φxx(ω) is generally different from the power density spectrum |X(ω)|² of a particular realization x of the signal process. Under these assumptions, a possible thresholding scheme is

t_j = a B_j / Φxx(ω_j),   (6.14)

where a is a constant, B_j is the bandwidth of channel j, and ω_j is the center frequency of the channel.

6.3.2 Two-dimensional Extension

The one-dimensional algorithm of the overcomplete wavelet packet inversion needs to be extended to two dimensions in order to deal with images degraded by symmetric 2D blurs. Formally, the 2D version may be written as

x'(k₁, k₂) = Σ_{c1=0}^{M−1} Σ_{c2=0}^{M−1} Σ_{m1} Σ_{m2} K_{t_{c1,c2}}[w_{c1,c2}(m₁, m₂)] A_{c1,c2} u_{c1,c2}(k₁ − m₁, k₂ − m₂).

However, an efficient 2D overcomplete wavelet packet representation is much more difficult to obtain than in the 1D case. In general, the 2D separable overcomplete wavelet packets may not be the most efficient representation. This is due to the fact that both 2D modified inverse filters and Wiener filters are non-separable even for separable symmetric 2D blurs. For example, for a separable Gaussian blur

D(ωx, ωy) = A e^{−(ωx² + ωy²)/(2σ²)},

the pseudo-inverse filter and the Wiener filter are

F(ωx, ωy) = (1/A) e^{(ωx² + ωy²)/(2σ²)},   ωx² + ωy² ≤ 2σ² ln(A/R),
          = 0,   otherwise,                                          (6.15)

and

F(ωx, ωy) = 1 / ( A e^{−(ωx² + ωy²)/(2σ²)} + (R/A) e^{(ωx² + ωy²)/(2σ²)} ).

Both filters are isotropic and can be efficiently represented by donut-shaped channel filters.

In practice, a reasonable approximation may be achieved by applying the 1D algorithm separately to rows and columns. For the separable Gaussian blur, a separable pseudo-inverse filter may be constructed as

F(ωx, ωy) = F₁(ωx) F₁(ωy),

where the 1D filter F₁(ω) is

F₁(ω) = (1/√A) e^{ω²/(2σ²)},   |ω| ≤ σ √(2 ln(A/R)),
      = 0,   otherwise.

It is clear that the rectangular region (|ωx| ≤ σ√(2 ln(A/R)), |ωy| ≤ σ√(2 ln(A/R))) of the above 2D filter includes the disk ωx² + ωy² ≤ 2σ² ln(A/R) of (6.15). By choosing an appropriate parameter R, acceptable results may also be achieved by approximating 2D Wiener filters with 1D Wiener filters. For uniform blurs, gain-limited inverse filters are similar to Wiener filters.

The major problem with such separable approximations is that they may boost diagonal frequency components significantly. Figures 6.1 and 6.2 show examples of both 2D non-separable and separable filters. The emphasis of the diagonal frequency region is particularly clear in the case of the Gaussian blur.

6.4 Experimental Results

To quantitatively compare deblurring results, we used an objective measure, the improvement in signal-to-noise ratio (ISNR), defined as [3]

ISNR_p = 20 log₁₀ ( ‖x − y‖_p / ‖x − x'‖_p ),   p = 1, 2,

where ‖x‖_p is the L_p norm defined as [56]

‖x‖_p = ( Σ_{n=0}^{N−1} |x(n)|^p )^{1/p},

and x, y and x' are the original, degraded and deblurred signals, respectively.
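The ISNR measure is straightforward to compute; a small NumPy sketch for p = 1 or 2:

```python
import numpy as np

def isnr(x, y, x_restored, p=2):
    """Improvement in SNR: 20*log10(||x - y||_p / ||x - x'||_p)."""
    x, y, x_restored = (np.asarray(a, dtype=float).ravel()
                        for a in (x, y, x_restored))
    num = np.linalg.norm(x - y, ord=p)
    den = np.linalg.norm(x - x_restored, ord=p)
    return 20.0 * np.log10(num / den)
```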

Moreover, a performance upper bound is useful in providing a gauge for comparison. The performance of an oracle Wiener filter may be considered the upper bound for Wiener filters. An oracle Wiener filter is a Wiener filter with Φnn(ω) and Φxx(ω) replaced by the power density spectra |N(ω)|² and |X(ω)|² of the particular realizations of the noise and the original signal. The filter may be written as

F(ω) = D*(ω) / ( |D(ω)|² + |N(ω)|² / |X(ω)|² ).



















Figure 6.1: Log Frequency Responses of Filters for a Gaussian Blur.
Where:
Row 1: Blurring filter
Row 2: 2D non-separable filters: gain-limited, pseudo-inverse and Wiener filter, from left to right
Row 3: 2D separable filters: gain-limited, pseudo-inverse and Wiener filter, from left to right






















Figure 6.2: Log Frequency Responses of Filters for a Uniform Blur.
Where:
Row 1: Blurring filter
Row 2: 2D non-separable filters: gain-limited, pseudo-inverse and Wiener filter, from left to right
Row 3: 2D separable filters: gain-limited, pseudo-inverse and Wiener filter, from left to right








However, it is not necessarily the upper bound of the overcomplete wavelet packet inversion.

6.4.1 One-dimensional Signals

One-dimensional deblurring examples are shown in Figures 6.3 and 6.4. The ideal signal in both cases is a perfect multi-level blocky function, which is very challenging and revealing for any deblurring method. For the overcomplete wavelet packet representations used in both examples, the Lemarié filter of p = 1 was used. In Figure 6.3, the impulse response of the Gaussian blur was

f(k) = A exp[ −(k/(αN/2))² ],          0 ≤ k < N/2,
f(k) = A exp[ −((N − k)/(αN/2))² ],    N/2 ≤ k < N,

with α = 0.03 and A = 1.0 / Σ_k f(k). The impulse response of the uniform blur was f = [1, 1, 1, 1, 1, 1, 1]/7. In both cases, white noise with σ² = 0.01 was added such that the ratio of peak signal power to average noise power was SNR = 10 log₁₀(max_k x²(k)/σ²) = 21.6 dB. To make fair comparisons, the same Gaussian signal model of (6.13) with a = 1.0 and σ_x = 0.007 was used for both Wiener filtering and the overcomplete wavelet packet inversion. The thresholding scheme of (6.14) with a = 0.25 and a = 0.18 was used for the Gaussian and uniform blur, respectively. The gain-limited inverse filter (6.1) with R = 0.04 and R = 0.05 was used for the Gaussian and uniform blur, respectively. For the minimum tree algorithm, the maximum depth was set to Level 6. The algorithm found a configuration of twenty-five channels with ε̄ = 5.7% in the case of the uniform blur, and twelve channels with ε̄ = 0.004% in the case of the Gaussian blur.

Table 6.1 compares deblurring performance in terms of ISNR. For the two particular blurring sources and the signal, it shows that the overall performance of the overcomplete wavelet packet inversion is about the same as Wiener filtering by the measure of ISNR₂, and it is better than Wiener filtering by ISNR₁.











Figure 6.3: One-dimensional Deblurrings of a Gaussian Blur.
Where:
Row 1: The original signal
Row 2: The blur filter and the degraded signal
Row 3: The oracle Wiener filter and the deblurred signal
Row 4: The Wiener filter and the deblurred signal
Row 5: The gain-limited inverse filter and the deblurred signal
Row 6: The overall frequency response of the overcomplete wavelet
packet inversion and the deblurred signal
Row 7: Channels of the overcomplete wavelet packet representation
Note: The left column shows frequency responses. The central column shows the full-length signals. The right column shows a zoom-in of the feature region, where each deblurring result is overlaid on the degraded signal (dotted line).











Figure 6.4: One-dimensional Deblurrings of a Uniform Blur.
Where:
Row 1: The original signal
Row 2: The blur filter and the degraded signal
Row 3: The oracle Wiener filter and the deblurred signal
Row 4: The Wiener filter and the deblurred signal
Row 5: The gain-limited inverse filter and the deblurred signal
Row 6: The overall frequency response of the overcomplete wavelet
packet inversion and the deblurred signal
Row 7: Channels of the overcomplete wavelet packet representation
Note: The left column shows frequency responses. The central column shows the full-length signals. The right column shows a zoom-in of the feature region, where each deblurring result is overlaid on the degraded signal (dotted line).









Table 6.1: Performance of 1D Deblurring Examples

                        ISNR2 (dB)                        ISNR1 (dB)
Blur Type      OWF     WF      GLIF     OWPI      OWF     WF      GLIF      OWPI
Gaussian       2.37    1.88    -9.04    1.90     -0.75   -1.45    -15.87    1.18
Uniform        5.17    3.90    -4.32    3.62     -1.00   -2.40    -11.80    0.00

Note:
OWF = Oracle Wiener Filter
WF = Wiener Filter
GLIF = Gain-limited Inverse Filter
OWPI = Overcomplete Wavelet Packet Inversion


6.4.2 Two-dimensional Images

Examples of image deblurring using the standard Lena image are included in Figures 6.5 and 6.6. Separable blurring sources were used: the same 1D blurring filters used in the 1D examples were applied row-wise and column-wise to the original Lena image. A statistical signal model of (6.13) with a = 3.5 and $\sigma_x$ = 0.008 was used for both Wiener filtering and the overcomplete wavelet packet inversion. In both cases, white Gaussian noise with $\sigma_n^2$ = 0.5 was added such that SNR = 44.3 dB. The thresholding scheme of (6.14) with a = 25.2 was used in both cases, and the shrinking operator was applied to the 2D components. The gain-limited inverse filter (6.1) with R = 0.08 was used both for comparison and within the overcomplete wavelet packet inversion. A separable overcomplete wavelet packet inversion algorithm was used to approximate a 2D overcomplete wavelet packet inversion. For the case of the Gaussian blur, the approximation used 11 x 11 channels with $\epsilon$ = 0.005% (1D). For the case of the uniform blur, the approximation used 25 x 25 channels with $\epsilon$ = 3.9% (1D). The deblurred images were all linearly converted to the intensity range of the original image before the ISNR indices were calculated.
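The separable degradation (and, in the same spirit, any separable row/column processing such as the separable inversion mentioned above) amounts to applying a 1D filter along rows and then along columns. A minimal sketch with periodic boundaries, assuming img is a 2D float array and f a 1D kernel (our own illustration, not the dissertation's code):

```python
import numpy as np

def separable_blur(img, f):
    """Apply a 1D kernel along rows, then along columns (periodic boundaries)."""
    F = np.fft.fft(f, img.shape[1])
    rows = np.real(np.fft.ifft(np.fft.fft(img, axis=1) * F, axis=1))
    F = np.fft.fft(f, img.shape[0])
    return np.real(np.fft.ifft(np.fft.fft(rows, axis=0) * F[:, None], axis=0))
```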

Table 6.2 compares deblurring performance in terms of ISNR. For the two particular blurring sources and the Lena image, it shows that the overall performance of the overcomplete wavelet packet inversion is about the same as that of Wiener filtering by both measures, ISNR2 and ISNR1.

Table 6.2: Performance of 2D Deblurring Examples

                        ISNR2 (dB)                         ISNR1 (dB)
Blur Type      OWF     WF       GLIF      OWPI      OWF     WF       GLIF      OWPI
Gaussian       3.20    13.17    -14.04    2.80      2.96    3.25     -16.82    2.69
Uniform        3.55    1.78     -14.78    1.76      1.46    -0.60    -18.05    -0.64

Note:
OWF = Oracle Wiener Filter
WF = Wiener Filter
GLIF = Gain-limited Inverse Filter
OWPI = Overcomplete Wavelet Packet Inversion

6.5 Summary and Discussion

We have demonstrated that the deblurring problem can be handled using an overcomplete wavelet packet representation. We showed that inverse filtering in the frequency domain and denoising in the time domain can be done in a single step. In other words, for this application the gain and shrinking operators can be combined into a single operation on the spatial-frequency representation. Moreover, our approach may be seen as an extension of the wavelet-vaguelette inversion into both the discrete-time domain and a broader class of problems.

Although the one-step inversion and denoising may be conceptually appealing, its limitation is also apparent. Since nonlinear shrinkage works only in areas where the magnitude of the signal components is stronger than that of the noise components, the best representation for denoising is one that maximizes the local signal-to-noise ratio. For example, if a sharp edge is decomposed into many channels, its components in some channels may be indistinguishable from noise. Therefore, the best representations for approximating the inverse filter and for denoising are generally different. Having only one choice of representation, the approach has to limit its application to certain signals or blurring sources.





















Figure 6.5: Deblurring a Gaussian-blurred Lena Image. Legend:
(a) The original 256 x 256 image
(b) Degraded image with SNR = 44.3 dB
(c) Deblurred by the oracle Wiener filter
(d) Deblurred by a Wiener filter
(e) Deblurred by a gain-limited inverse filter
(f) Deblurred by a separable overcomplete wavelet packet inversion






















Figure 6.6: Deblurring a Uniform-blurred Lena Image. Legend:
(a) The original 256 x 256 image
(b) Degraded image with SNR = 44.3 dB
(c) Deblurred by the oracle Wiener filter
(d) Deblurred by a Wiener filter
(e) Deblurred by a gain-limited inverse filter
(f) Deblurred by a separable overcomplete wavelet packet inversion







This limitation may be the reason that our preliminary experimental results showed roughly the same performance as the much simpler approach of Wiener filtering. For the Gaussian and uniform blurring sources, some channels with very narrow bandwidth are needed to achieve a satisfactory approximation using the gain vector. For those channels, the shrinking operation is very close to a multiplication.














CHAPTER 7
CONCLUSION

We examined both classes of Fourier-based and wavelet-based representations. We showed that orthogonal wavelet transforms lack translation invariance and the efficacy of representing convolution operators. In particular, we demonstrated that aliasing distortion due to critical subsampling limits their applications in image enhancement and restoration. On the other hand, overcomplete wavelet representations, although not as efficient as non-redundant representations, avoid those shortcomings. Moreover, overcomplete wavelet representations can be conveniently analyzed using linear time-invariant system theory. In this thesis, we evaluated their capability of time-frequency localization using the uncertainty factor. We also identified gain, shrinking and envelope operators to approximate convolution operators, suppress noise and extract power density distributions. We investigated the criterion of minimum mean square error and designed an optimization algorithm. We extended wavelet shrinkage to overcomplete representations. We also presented two envelope detection algorithms and compared their performance.

The framework of the overcomplete wavelet representation was applied to two difficult problems: segmentation of textured images and image deblurring. We demonstrated that channel envelopes used as feature vectors produced satisfactory results on both natural and synthetic textures. We showed that the gain and shrinking operators may be used for image deblurring and pointed out the limitations of this approach.

The issues of time-frequency representation, pattern recognition and noise suppression are fundamental ones. Our knowledge about them will continue to accumulate and evolve. This thesis increases our level of understanding and points to many opportunities for future research. The issue of time-frequency localization was touched upon but not fully explored. Although the uncertainty factor was used to evaluate individual channels, we lack a criterion of optimality for a representation as a whole. Once such a criterion is found, we may be able to search for the optimal representation of textured images, which was not explored here. Denoising is certainly another front. The optimal representation and the adaptive selection of thresholds are going to be intertwined and challenging.














APPENDIX A
PROOFS RELATED TO DISCRETE-TIME WAVELETS

Some basic proofs related to (bi)orthogonal discrete-time wavelets are included here in order to make this thesis self-contained.


Proposition A.0.1 The necessary condition for a pair of discrete sequences a(n) and b(n) to be orthogonal, in the sense of (A.1),

$$\sum_{m=-\infty}^{\infty} a(2n - m)\, b(m - 2l) = A\,\delta(n - l), \qquad (A.1)$$

is (A.2):

$$A(e^{j\omega})\, B(e^{j\omega}) + A(e^{j(\omega+\pi)})\, B(e^{j(\omega+\pi)}) = 2A, \qquad (A.2)$$

where A is a constant, and $A(e^{j\omega})$ and $B(e^{j\omega})$ are the discrete-time Fourier transforms of a(n) and b(n), respectively.

Proof. By the change of variable k = m - 2l, the left side of (A.1) becomes $\sum_k a[2(n - l) - k]\, b(k)$. It is therefore equivalent to prove that

$$\sum_{k=-\infty}^{\infty} a(2l - k)\, b(k) = A\,\delta(l).$$

Notice that the left side of the above is a 2-fold decimation of the series

$$p(l) = \sum_{k=-\infty}^{\infty} a(l - k)\, b(k).$$

Since p(l) is the convolution of a(l) and b(l), its discrete-time Fourier transform is known as

$$P(e^{j\omega}) = A(e^{j\omega})\, B(e^{j\omega}).$$
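As a quick numerical sanity check of the proposition (ours, not part of the original appendix), both sides can be evaluated for a known orthogonal pair, for example b(n) = h(n) and a(n) = h(-n) with h the Daubechies-4 lowpass filter, for which the constant is A = 1:

```python
import numpy as np

# Numerical check (ours) of Proposition A.0.1 for an orthogonal pair:
# b(n) = h(n), a(n) = h(-n), with h the Daubechies-4 lowpass filter (A = 1).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))

# Time-domain side: p(l) = (a * b)(l) is the autocorrelation of h; its 2-fold
# decimation should be A*delta(l).
p = np.correlate(h, h, mode="full")               # lags -3 .. 3, centre at index 3
print("p at even lags:", np.round(p[1::2], 12))   # lags -2, 0, 2  ->  [0, 1, 0]

# Frequency-domain side (A.2): A(e^jw)B(e^jw) + A(e^j(w+pi))B(e^j(w+pi)) = 2A.
def H(w):
    return np.sum(h * np.exp(-1j * np.outer(w, np.arange(len(h)))), axis=1)

w = np.linspace(0.0, np.pi, 7)
lhs = np.conj(H(w)) * H(w) + np.conj(H(w + np.pi)) * H(w + np.pi)
print("(A.2) left side:", np.round(lhs.real, 12))  # ~2.0 at every w
```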




Full Text

PAGE 1

OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING By JIAN FAN A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1997

PAGE 2

Copyright 1997 by Jian Fan

PAGE 3

ACKNOWLEDGEMENTS First, I would like to express my deep gratitude to my advisor, Andrew Laine, for all his support and encouragement. I thank him for introducing me to the field of wavelets, and I appreciate the opportunity to work with him. I would like to thank Dr. Jerry C. L. Chen and Mrs. Shirley H. W. Chen of Palo Alto, California, for their help and support. Without their support, I may not have the opportunity to pursue my graduate studies in the United States. I would like to thank Drs. Gerhard Ritter, Sartaj Sahni, Dave Wilson and John G. Harris for being on my thesis committee, and Dr. Baba C. Vemuri for attending my defense. Their time and thoughtful recommendations were greatly appreciated. I would also like to thank Dr. David Burchfield of the Department of Pediatrics, University of Florida, for providing me a graduate research assistantship from January 1991 through December 1991. During my early years at the University of Florida, Ming Jiang and Min Shi gave me much help. I also benefited greatly from discussions with my follow graduate students, Shuwu Song, Iztok Koren, Sergio Schuler, Wuhai Yang, Chun-ming Chang and Hongchi Shi. In addition, I would like to thank my manager Catherine Hunt of Hewlett Packard Company, San Diego, for her support. I have also benefited from HewlettPackard Company's educational assistance program. I appreciate the efforts of John Shumate who helped polish the writing of this dissertation. iii

PAGE 4

Finally, I would like to thank my mother Yuzhen Chen and my late father Laikun Fan for their inspiration and encouragement, as well as my wife Weihua Liu and my son Xiying Fan for their understanding, love, and patience. This research was supported in part by the Whitaker Foundation, and the U. S. Army Medical Research and Development Command, Grand number DAMD1793-J-3003. iv

PAGE 5

TABLE OF CONTENTS ACKNOWLEDGEMENTS iii LIST OF TABLES viii LIST OF FIGURES ix ABSTRACT xi CHAPTERS 1 INTRODUCTION 1 1.1 Motivations of the Research 1 1.2 Review of Related Work 2 1.3 Overview of the Thesis 4 2 SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS 5 2.1 Introduction 5 2.2 Fourier Transforms 6 2.2.1 Fourier Transform (FT) 7 2.2.2 Discrete-time Fourier Transform (DTFT) 7 2.2.3 Short-time Fourier Transform (STFT) 8 2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT) . 8 2.2.5 Connection Between Short-time Fourier Transform and Filter Banks 9 2.2.6 Generalized Short-time Fourier Transform 10 2.3 Wavelet Transforms 10 2.3.1 Continuous Wavelet Transform (CWT) 11 2.3.2 Discrete Wavelet Transform (DWT) 12 2.3.3 Discrete-time Wavelet Transform (DTWT) 13 2.4 Generalized Discrete-time Filter Bank Transforms 15 2.4.1 Convolution Operator on Complete Representations ... 17 2.4.2 Convolution Operator on Overcomplete Representations 19 2.4.3 Approximation of Convolution Operators on Overcomplete Representations 19 2.4.4 Aliasing Enhancement on Complete Representations ... 20 2.5 Summary and Discussion 22 v

PAGE 6

3 OVERCOMPLETE WAVELET REPRESENTATIONS 23 3.1 Introduction 23 3.2 Overcomplete Wavelet Transforms and Filters 23 3.2.1 Overcomplete Wavelet Packet Representations 23 3.2.2 Dyadic Wavelet Representations 25 3.2.3 Filters of Autocorrelation Functions of Wavelets 28 3.3 Connection to Complete Wavelet Representations 29 3.4 Time-frequency Interpretation and the Uncertainty Principle . . 29 3.5 Two-dimensional Extensions 34 4 OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS 37 4.1 Introduction 37 4.2 Gain Operators 37 4.2.1 Minimum Mean Square Error (MMSE) Approximation . . 38 4.2.2 Time-frequency Trade-off and a Greedy Algorithm .... 40 4.3 Shrinking Operators 42 4.4 Envelope Detectors 44 4.4.1 Envelope Detection by Hilbert Transform 47 4.4.2 Envelope Detection by Zero Crossings 49 4.4.3 Comparison Between the Two Detectors 50 4.4.4 Comparison with Other Energy Operators 50 4.4.5 Two Dimensional Envelope Detection 50 5 APPLICATION I: TEXTURE SEGMENTATION 53 5.1 Introduction 53 5.2 Texture Feature Extraction 54 5.3 Considerations on Filter Selection 55 5.4 The Basic Isodata Clustering Algorithm 57 5.5 Postprocessing 58 5.6 Experimental Results 58 5.6.1 One-dimensional Signals 58 5.6.2 Natural Textures 59 5.6.3 Synthetic Textures 62 5.7 Summary and Discussion 62 6 APPLICATION II: IMAGE DEBLURRING 66 6.1 Introduction 66 6.2 Review of Some Deblurring Techniques 67 6.2.1 Modified Inverse Filters 67 6.2.2 Wiener Filters 68 6.2.3 WaveletVaguelette Inversion 69 6.2.4 Discussion 70 6.3 Discrete-time Overcomplete Wavelet Packet Inversion 72 6.3.1 One-dimensional Formulation 72 6.3.2 Two-dimensional Extension 74 6.4 Experimental Results 75 6.4.1 One-dimensional Signals 78 6.4.2 Two-dimensional Images 81 6.5 Summary and Discussion 82 vi

PAGE 7

7 CONCLUSION 86 APPENDIXES A PROOFS RELATED TO DISCRETE-TIME WAVELETS 88 B NUMERICAL COMPUTATION OF BATTLE-LEMARIE WAVELET 95 B.l Functions in Finite Terms 95 B.2 Numerical Computation Using FFT 99 C NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR 103 D BOUNDARY TREATMENTS OF FINITE-LENGTH SEQUENCES . 107 D.l Periodic Extensions 107 D.2 Closure of Symmetry Under Convolution 107 REFERENCES Ill BIOGRAPHICAL SKETCH 115 vii

PAGE 8

LIST OF TABLES 3.1 Uncertainty Factors of Analyzing Filters 34 5.1 Boundary Accuracy of the Segmentation Results 64 6.1 Performance of ID Deblurring Examples 81 6.2 Performance of 2D Deblurring Examples 82 B.l Coefficients of g 2 4 (x) (a) and Truncated Impulse Response of hi(n) (b). 100 viii

PAGE 9

LIST OF FIGURES 2.1 A Filter Bank Implementation of STFT 10 2.2 Fast Discrete-time Wavelet Transform-First Two Levels 13 2.3 Generalized Discrete-time Filter Bank Transform 15 2.4 Example of General Binary-tree-structured Filter Bank of Five Channels. 16 2.5 The Overcomplete Wavelet Transform for the Example of Figure 2.4. 17 2.6 Example of the Aliasing-enhancement Anomaly 21 3.1 Binary Tree for Overcomplete Wavelet Packet Decomposition 24 3.2 Binary Tree for Dyadic Wavelet Decomposition 26 3.3 Examples of the Function O a ,6(<^) 27 3.4 Two Examples of the Overcomplete Wavelet Representation 30 3.5 Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations 32 3.6 Uncertainty factors (drawn on top of each band) of Channel Filters. . 33 4.1 Examples of Gain Vector Approximation 41 4.2 Examples of the Minimum Tree Approximation with the Maximum Depth of Seven Levels 43 4.3 An Example of Overcomplete Wavelet Denoising 45 4.4 Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row l,col 1) of Figure 4.3 46 4.5 Two Examples of Power Density Distribution of Overcomplete Wavelet Packet Representations 48 4.6 A FIR Hilbert Transformer 49 4.7 Comparison of the Two Envelope Detectors 51 4.8 Frequency Response of Equivalent Complex Quadrature Filters (level 1). 52 ix

PAGE 10

5.1 Segmentation of a ID Signal Consists of Triangular Waveform and Sinusoid 59 5.2 Segmentation of a Noisy ID Signal Consists of Two Pure Tone Segments 60 5.3 Segmentation Results of a Image Consist of Natural Textures 61 5.4 Segmentation of a Synthetic Gaussian Lowpassed Texture Image ... 63 5.5 Segmentation of a Synthetic FIN Texture Image 63 5.6 Segmentation of a Synthetic Texture Image with Line Patterns. ... 64 6.1 Log Frequency Responses of Filters for a Gaussian Blur 76 6.2 Log Frequency Responses of Filters for an uniform Blur 77 6.3 One-dimensional Deblurrings of a Gaussian Blur 79 6.4 One-dimensional Deblurrings of an Uniform Blur 80 6.5 Deblurring a Gaussian-blurred Lena Image 83 6.6 Deblurring an Uniform-blurred Lena Image 84 A.l Equivalent Structures for (a) Convolution and Decimation and (b) Expansion and Convolution 91 A. 2 Illustration of Adding One More Channel bv Splitting a Channel into Two 92 B. l Transition Band of Filters H p (uj) with p = 1, 3, 9 101 C. l Channel Bandwidths of Overcomplete Wavelet Representations. . . . 104

PAGE 11

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy OVERCOMPLETE WAVELET REPRESENTATIONS WITH APPLICATIONS IN IMAGE PROCESSING By Jian Fan August 1997 Chairman: Dr. Andrew F. Laine Major Department: Computer and Information Science and Engineering Orthogonal wavelet transforms have been applied to the field of signal and image processing with promising results in compression and denoising. Coefficients of such a transform constitute a complete representation of a signal without redundancy. However, there are applications where complete representations are disadvantageous. In this thesis, we examine classes of Fourier transforms and wavelet transforms in terms of their efficacy of representing convolution operators. We have identified two shortcomings associated with complete representations of the discrete-time domain: (1) the lack of translation invariance and the (2) a possible anomaly of aliasingenhancement. On the other hand, our analysis showed that overcomplete wavelet representations do not bear those shortcomings of their non-redundant counterparts. Our framework of overcomplete wavelet representations include construction algorithms xi

PAGE 12

and prototype filters, spatial-frequency interpretation and three operations. Capabilities of spatial-frequency localization were quantitatively evaluated using uncertainty factors. Associated with gain, shrinking and envelope operators, algorithms for convolution, denoising and analysis of power density distribution were presented and analyzed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. We demonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. We showed that gain and shrinking operators may be used for image deblurring and discuss limitations of the mothodology. xii

PAGE 13

CHAPTER 1 INTRODUCTION 1.1 Motivations of the Research In the past decade, wavelets have stimulated great interests in many fields of applied mathematics and engineering, generating a significant amount of research activities. Within these fields, traditional theories and techniques have been reevaluated to explore opportunities for wavelet applications. Although continuous time wavelets possess mathematical elegance, discretetime wavelet transforms are of special importance for practical applications. Among them, discrete-time orthogonal wavelet transforms have gained popularity. Coefficients of such a transform constitute a complete representation of a signal without redundancy. Complete representations have produced promising results in image compression and denoising. However, there are applications where complete representations are disadvantageous. Understanding the strength and weakness of various representations and the relationships among them is essential for satisfying intellectual inquires and developing new methodologies. In this thesis, we considered both classes of Fourier-based representations and wavelet-based representations with emphasis in the discrete-time domain. We found that the overcomplete wavelet representations served to bridge the two classes. The merit of a particular representation should be judged by its capability to facilitate solving problems. Two important but difficult problems, segmentation of textured images and image deblurring, were selected as benchmarks. Accordingly, we value a representation by its capability to extract certain features from signals, efficacy of representing an operation and invariance under certain operations. 1

PAGE 14

2 With the understanding of various representations and the problems in hand, we developed a framework of discrete-time overcomplete wavelet representation. The framework included construction algorithms and prototype filters, time-frequency interpretation and three operations. Algorithms associated with gain operators, shrinking operators and envelope detectors were developed. The framework of overcomplete wavelet representations was then applied to segmentation of textured images and image deblurring. This demonstrated that envelopes as feature vectors performed well in segmenting both natural and synthetic textures. For the deblurring application, our formulation extended the waveletvaguelette inversion into discrete-time domain and was applicable to more blurring sources. 1.2 Review of Related Work On the general topic of wavelet theory, a brief history and comprehensive review were included in Jawerth and Sweldens' survey [32]. The papers by Daubechies [11] and Mallat [42, 41] were considered influential for the recent developments. The paper by another wavelet pioneer Strang [59] was also explanatory. The orthonormal wave packet bases is a generalization of orthogonal wavelets, was introduced by Coifman and Meyer [10]. The book by Daubechies [13] included in-depth mathematical discussions and real design examples. The book by Vaidyanathan [63], which was written for signal processing audience, provided much insight into the relationship between wavelets and multirate filter banks. The lack of translation invariance of discrete-time orthogonal wavelets was recognized as a major shortcoming for pattern recognition by many researchers [41, 59]. Various proposals were made to address the issue. Multiscale edge representations were investigated by Mallat and Zhong [44], Mallat and Hwang [43], and Berman and Baras [5]. Steerable filters with shiftabilities were studied by Simoncelli et al. [58]. A orthogonal representation with invariance of the subband structure was proposed by Pesquet et al. [52].

PAGE 15

3 Using energy distribution in subbands of wavelet packet decomposition as signatures for texture classification was presented by Laine and Fan [36]. A similar approach was also reported by Chang and Kuo [8]. Earlier results using envelopes of overcomplete wavelet packet representation for texture segmentation was reported by Laine and Fan [35]. A simplified version of Chapter 5 of this thesis was later published [37]. The overcomplete wavelet representation was also used by User [62] with an ad hoc feature extraction algorithm. A methodology combining Markov random field and orthogonal wavelet packet transform was described by Bello [4]. The nonlinear wavelet shrinkage for denoising was due to Donoho and Johnstone [17]. The approach was applied to power spectrum estimation [48], and to medical image processing [1]. Nonlinear denoising based on wavelet maxima representation by Mallat and Hwang shared some similar ideas [43], although tracing singularities through the scale space turned out to be much more difficult, especially in a two-dimensional space. Manipulating wavelet representations to enhance certain features is obviously a good idea. Jawerth et al. [31] reported image enhancement using weight factors to modify coefficients of orthogonal wavelet transforms. Laine et al. [38] demonstrated various approaches for enhancement of digital mammographic images. Fan and Laine [20] pointed out the connection between linear gain enhancement using overcomplete dyadic wavelet representations and the unsharp masking, and introduced a nonlinear function for contrast enhancement. However, due to the lack of an objective definition of image quality, those enhancement algorithms were mostly ad hoc and hard to evaluate. On the other hand, image deblurring is considered to be a well defined problem. The approach adapted by Banham et al. [3] is to decompose the linear minimum mean square error (LMMSE) filters into the orthogonal transform domain with advantage of reduced alliance on the assumption of global stationarity. However, their proposal was basically the linear filtering and no optimal subband selection was

PAGE 16

4 considered. The mathematical framework of the waveletvaguelette inversion was due to Donoho [14, 16]. The limitations were that it was derived on the continuous time domain and inapplicable to some common sources such as Gaussian and uniform blurs. 1.3 Overview of the Thesis This thesis is organized as follows: Chapter 2 reviews signal representations derived from the Fourier transform and the wavelet transform in the context of being operated upon by a convolution operator. The scrutiny reveals that complete representations suffer from not being translation invariant and the existence of the aliasing-enhancement anomaly, and are thus inappropriate for certain applications. Chapter 3 presents the framework of the overcomplete wavelet representation with two particular building algorithm and filters. The time-frequency localization property of overcomplete wavelet representations is examined under the uncertainty principle. Chapter 4 introduces gain, shrinking and envelope operators applicable upon an overcomplete wavelet representation. Algorithms for determining specific parameters are detailed with examples and comparisons shown. Both Chapter 5 and 6 are applications of the theory of overcomplete wavelet representations developed in previous chapters. Chapter 5 deals with the topic of image texture segmentation. The segmentation problem is formulated in the paradigm of spatial-frequency analysis, and a texture feature based on the concept of power density distribution is proposed. Chapter 6 addresses the problem of image deblurring using the gain operator for deconvolution and the shrinking operator for nonlinear denoising. Chapter 7 concludes the research and proposes future directions.

PAGE 17

CHAPTER 2 SIGNAL REPRESENTATIONS AND CONVOLUTION OPERATORS 2.1 Introduction For a given signal x, it is often beneficial to transform (map) x into another form (domain), denoted x X. X is called the representation of x in the transform domain. Instead of directly dealing with the original signal x, we may deal with its representation X. By doing so we usually transform a particular problem into a different form and make it easier to solve. Clearly, different applications demand different representations. The applications considered in this thesis are texture image segmentation and image deblurring. Accordingly, the time-invariability and the efficacy of representations are the most important factors. Both properties may be revealed by their behavior under a convolution operator. With these in mind, we shall review Fourier transforms and wavelet transforms of one dimensional signals. To facilitate our discussions, we used the following basic definitions: Definition 2.1.1 (Translation Operator) For a continuous function x(t), a translation operator is defined as T s [x(t)] — x(t — s); For a discrete function x(n), a translation operator is defined as T d [x(n)] = x{n — d). Definition 2.1.2 (Convolution Operator) For a continuous function f(t), a convolution operator is defined as T f [x(t)] = x(t) * f(t) = P^:c(t)/(£ r)rfr; For a discrete function x(n), a convolution operator is defined as Tf [x(n)] = x(n) * f(n) = T,m=-oo x ( m )f( n ~ rn )The functions f (t) and f{n) are called the kernel. Notice that a translation operator is a special convolution operator with kernel f(t) = 5(t r) 5

PAGE 18

6 (f(n) = S(n d)). Therefore, we will use T for both translation and convolution operators. Definition 2.1.3 (Eigenfunction and Eigenvalue) For a linear operator T , if there exists a function e, such that T [e] = Xe, where A is a constant, the function e is called the eigenfunction of T, and the value A is called the eigenvalue. Definition 2.1.4 (Translation invariant) For a given signal x, a translation operator T and a mapping operator M., if the two operators are commutable such that M[T(x)] = T[M(x)], the mapping is called translation invariant, and the representation X is said to be a translation invariant representation. 2.2 Fourier Transforms Frequency-domain representation by Fourier transform played a major role in signal analysis and linear system theory [51, 49, 63]. The efficacy of Fourier representation is due to the fact that its basis functions of complex exponentials are eigenfunctions of a convolution operator [49, page 39]. For a linear convolution operator T, the eigenequation is: T f [e*"] = F(w)e*", where eigenvalue F(u) is the Fourier transform of f(t). An almost identical form can be derived for the discrete-time Fourier transform. Fourier representations are particularly powerful for signal deblurring since it transforms a deconvoltuion problem into the simple form of divisions. However, Fourier representations may not be good at capturing frequency evolution over time. As time variable is added to Fourier transforms, basis functions of short-time Fourier transforms are no longer eigenfunctions of a convolution operator.

PAGE 19

2.2.1 Fourier Transform (FT) The Fourier transform pair may be written as: /oo x(t)e~ ]u,t dt (FT), -oo /•oo x(t) = — / X(u)e ]u3t du (Inverse FT). A linear convolution operator Tf can be characterized by /oo r , roo X(u)T f e juJt doo= X(u)F(u))e jult duj, -00 J — oc or, Y(u>) = X{w)F(u>). This shows that Fourier representation of the convolution operator T is the Fourier transform of its kernel. The beauty of this representation is that on the Fourier domain, a convolution (deconvolution) is simply multiplications (divisions) of the both representations. For a translation operator with kernel f(t) = 5(t — s), F(u) = e~i ws , and therefore |F(w)| = |A r (o;)|, which means that magnitude of Fourier transform is translation invariant. 2.2.2 Discrete-time Fourier Transform (DTFT) The discrete-time Fourier transform may be written as [49]: oo X(e? w ) = Yl x(n)e]u,n (DTFT), n= — oo x(n) = — P X(j w )e? M dw (Inverse DTFT). Similar to its continuous counterpart, y(n) = T f [x(n)} = if X{^)F{e? u )^du), or, Y(e ju} ) = X(e juJ )F(e Jul ). For a translation operator with f(n) = 6(n d). F(e juJ ) = e _jW and thus \Y(e^)\ = \X(e^)\.

PAGE 20

8 2.2.3 Short-time Fourier Transform (STFT) In order to capture frequency evolution of a non-stationary signal, a window w(t) is introduced into the Fourier transform. The short-time Fourier transform may be written as: /OO x(r)w(t T)e~ ]UJT dT (STFT). -oc roc x(t) = — — / X(u, t)e JU ' t du) (Inverse STFT). 2irw(0) J-oc where we assumed w(0) ^ 0. For a convolution operator Tj, T f [w(t r)e^'] = F(u, t r)e jut , where F(u;,t) is the STFT representation of the kernel f(t). It means that in general the windowed complex exponentials are no longer eigenfunctions of a convolution operator. Consequently, STFT representation of y(t) = Tf [x(t)] is another convolution of either or. /oo F(uJ -T)x(r)e-^ T dT, -oo /oo X(u,t-T)f(r)e-^ T dT. -oo For a translation operator with f(t) = S(t — s), Y(u,t) = X(u,t — s)e~^ s , and thus |F(u;.t)| = \X(u,t — s)\. Therefore, magnitude of STFT transform is a time-invariant representation. 2.2.4 Discrete-time Short-time Fourier Transform (DTSTFT) The discrete-time short-time Fourier transform may be written as [49, 53] oo X(e jw ,n)= J2 x(k)w(nk)eJujk , (DTSTFT) (2.1) k=— oo x(n) = — F X(e juJ ,n)e^ n du). (Inverse DTSTFT) (2.2) 2irw(Q) J-TT

PAGE 21

For a discrete convolution operator T) and y(n) = Tj [x(n)], we have oo Y{e? u ,n)= £ X(e^,n-k)f(k)e-^ k , k= — oo or, 00 Y(e juJ ,n) = Y, F{e ]UJ ,n-k)x{k)eju}k . k= — oo For a translation operator 7) with f(n) = 8(n-k), \Y(e juJ ,n)\ = |A"(e Ja, ,n /c)|. Thus, the magnitude of DTSTFT is also a translation-invariant representation. 2.2.5 Connection Between Short-time Fourier Transform and Filter Banks The discrete-time short-time Fourier transform X(e^, n) is a two-dimensional function of a continuous variable uj and a integer variable n. Is it possible to have only finite samples X{e? Uk ,n) in the frequency domain, and yet still able to recover the original signal x{n)1 It turns out to be possible by carefully designing channel filters [63, 53]. For a given frequency u! k , STFT coefficients of (2.1) can be rewritten as oc X{e jUk , n) = eiu " n £ x{m)w(n m)e^ ( "m) = e~ iu " n [x{n) * v k (n)} , m=— oo where Vk(n) = w(n)e juJkTl . This means that a frequency point of STFT can be computed using a filter with frequency response V k {uo) = W(u — Uk) and a complex multiplier e ju>kU . Note that \X(e jUk ,n)\ = \x(n) * v k (n)\. For a filter bank consists of M — 1 such channels satisfying M-l M-l £ V k (uj)= Y W(u-u k ) = l, (2.3) k=0 k=0 the original function x(n) can be recovered by replacing the integral of (2.2) with the following summation: M-l M-l Y X(e j " k ,n)e ju,kn = £ [x{n) * v k (n)} = x(n) * M-l Y v k(n) (2.3) . . = x{n). k=0 k-0 ik=Q Figure 2.1 showed a filter bank structure for the analysis-synthesis process of STFT.

PAGE 22

10 vi(n) x(n) r "o(n) fJW-l(n) X"( e J-o.") x) • ( x A'(e^in ) X) IX x(n) Figure 2.1: A Filter Bank Implementation of STFT. 2.2.6 Generalized Short-time Fourier Transform We already pointed out that STFT magnitude can be computed by a filter bank without modulators and demodulators (see Figure 2.1). We can further relax the constraint that channel filters are frequency-shifted of a single lowpass filter v(n), and define a generalized short-time Fourier transform pair as 00 w k{n) — ^2 x { m ) v k{n — m )i 0 < < M — 1, m= — oo M-l x ( n ) = Yl u 'fc( n ) • 2.3 Wavelet Transforms Wavelet transforms have become a powerful tool for analyzing non-stationary signals in recent years. It has several major advantages over the short-time Fourier transform. First, wavelet transforms choose to fix the ratio of bandwidth/center-frequency (called constantQ) of channel frequency response instead of the frequency resolution (determined by the bandwidth of the window) as in STFT. In other words, wavelet transforms use a narrow bandwidth window for low frequency signals and a wide

PAGE 23

11 bandwidth window for high frequency signals. The idea behind this is that low frequency components represent slow changes in the time domain for which time resolution is not critical while higher resolution in the frequency domain is more desirable. In the opposite case, a wide bandwidth window for high frequency components means better resolution in the time domain to capture abrupt changes. Second, recent development of wavelet packet transforms and other signaladaptive wavelet-type transforms are much more flexible and powerful in exploiting time-frequency concept for wide-ranging applications. 2.3.1 Continuous Wavelet Transform (CWT) For a continuous time signal x(t), a continuous wavelet transform pair may be written as [41, 12]: wavelet. Parameter r is the translation factor and a is the dilation factor. Informally speaking, when a increases, function ip a (t — r) expands and takes long-time behavior into account. In the opposite, when a decreases, function ip a (t — r) contracts and only focuses on the short-time behavior. In order to recover the original signal from its CWT, the wavelet ip{t) must satisfy the admissibility condition: However, a wavelet function tl> a {t) is generally not an eigenfunction of a convolution operator T, as: where x* denotes complex conjugate of x, ip a (t) V>(a)^) * s caue d the basic does not equal to Xtp a (t).

PAGE 24

12 Since inverse CWT of (2.5) can be rewritten as: /•O=0O (Iq x(t) = ?r / [X(a,t)*i> a (t)}, Ja=0 cr where * denotes convolution on variable t, we have: 1 r°° da Tf[x(t)} = — [(f(t)*X(a,t))*rp a (t)]-f O^, Jo tr If we denote W as the operator of continuous wavelet transform, the above result showed: /oo f(S)X(a,T -£)<% = T f W[x(t)]. -oo Therefore, CWT is a translation invariant representation. 2.3.2 Discrete Wavelet Transform (DWT) Discrete wavelet transform pair on real orthogonal wavelet bases may be written as [63]: /oo x{i)2jl2 xl>(2-n n)dt {DWT), -oo +00 +oo x{t) = Y. E Vj{n)2-> /2 ip(2J t-n) {Inverse DWT). j=0 n=-oo where functions ^2~^ 2 ip{2~H — n)| consist an orthonomal basis. Notice that DWT maps a continuous-time function x{t) into a discrete-time sequence /ij(n). Clearly, it is not a translation invariant representation. As in the case of CWT, basis functions 2~^ 2 ^{2~H — n) generally is not a eigenfunction of convolution operators. However, Ty 2 ^ 2 ip{2 H — n) decomposed in DWT space: mav be T, 2-^(2~H n)\ = Y, £ m)2, /V(2"'t m), (=0 m=— oo where the transform coefficients may be calculated by: /oo r . T f 2]l2 4){2~H n)\ 2ll2 i>{2~h m)dt -oo L * = [°° IT /(02" J/2 ^ (2~ j {t 0 n) del 2 -// V(2-'i m)dt J-oo U— oo ' J j -co yj —oo v y

PAGE 25

13 x(n) 9( rio(n) -®-f$(n7 |/i(n) | — (T2)c 0 (n) L|fc(n)[-(|2)i -@-[^) cow i(n) Figure 2.2: Fast Discrete-time Wavelet Transform-First Two Levels. Note: Down arrow = decimation and up arrow = expansion. We can derive DWT representation of Tj [x(t)\ to be +00 +00 T f [x(t)] = E E H(n)T f [2-"V(2-'« n) j=0 n=-oo +00 +00 = I E /=0 m=— 00 +c» +00 E E N( n ) v iA n ,™) j=0 n=— oo Therefore, {^(n, m)} can be seen as representation of the operator Tf under the basis {2-^ 2 ^{2~H, n)}. 2.3.3 Discrete-time Wavelet Transform (DTWT) The discrete-time wavelet transform (DTWT) cannot be obtained by sampling the continuous-time wavelet transforms nor by simply mimicking the continuous ones. Assuming a discrete "mother wavelet" u(n), a discrete-time version of dilation v(2 k n — m) does not produce a frequency response ~ V(e 2 u ). Moreover, a dilation V{e 2ku) ) in frequency domain is generally not a band-pass function but a multi-band function. The fundamental difference lies on both the 27r-periodic characteristic of frequency response and the indexing constraint of discrete-time series. A fast algorithm based on the concept of multiresolution was found by S. Mallat [42] and is illustrated in Figure 2.2. If we consider only one level, we have M n ) = Em=_oo x(m)g(2n m), (2.6) (2.7)

PAGE 26

14 x(n) = ££_oo co(l)h(n 2/) + ££_«, MW" ~ 2/). (2.8) Therefore, the decomposition filters h(n) and g(n) and the reconstruction filters h(n) and g(n) must satisfy: Em #(2« m)g(m 21) =6(n-l), E m h{2n m)h(m 21) = 5(n I), Zm9(2n-m)h{m-2l)=0, (2.9) £ m h(2n m)g(m 21) = 0, £, /i(2/ m)ft(n 2/) + £, (7(2/ m)g{n 21) = 5(m n) It can be proved (see Appendix A) that in frequency domain those conditions are equivalent to: Gie^Gi^) + G(^ u+1() )G{S u+ir) ) = 2 H(e? w )H(e? u ) + H(e j ^ +n) )H(e j{u}+7[) ) = 2 G{e? u )H{j") + G{e j{uJ+ir) )H(e^ +w) ) = 0 #(e^)G(e Ju; ) + H{e?(" + * ) )G(e>( u+ * ) ) = 0 H{^)H[e ju ) + G{e juJ )G(e Jul ) = 2 H(e jul )H(e j{uJ+w) ) + G(e JUJ )G(e j{uJ+ir) ) = 0. One possible choice is to choose h(n), g(n), h(n) and g(n) by: g (n) = {-l) n h{l n) G(e>' w ) = -e~^H*{e^ + ^), h(n) = h*(-n) H(e?») = i/*(e JW ), g(n)=g*(-n) # G(^) = G*(e^l In this case, the resulting bases is called orthogonal bases. Otherwise, the resulting (2.10) (2.11) bases is called biorthogonal bases. It can be proved (see Appendix A) that the structure of Figure 2.2 is equivalent to a filter bank with the transform written as [63]: d k (n) = ]T x(m)v k {2 k+1 n m), 0 < k < M m=— 00 00 CM-i(n) = Y. x{m)v M -\{2 M ~ l n m), m=— 00 M-2 00 (DTWT) x ( n ) = X! d k (m)u k (n 2 k+l m) + ^ c A /-i(m)u A/ -i(n 2 M 1 m) fc=0 m=— 00 m=— 00 (/river se D7WT). Particularly, for orthonomal basis, it can be proved (see Appendix A) that u k (n) = v* k (-n), u k {m — n k l)u* c (m — n c n) — S(k — c)S(l — n). (2-12)

PAGE 27

l.J x(n) 1 /^~\ Do(n) | vo(n) | ~(t"oJ ' (Wj«o(i) — | ui(n) !>*( ^)_-@ — — (K)-_M (n) x(n) Figure 2.3: Generalized Discrete-time Filter Bank Transform. Since DTWT can be included into a general framework to be presented next, we leave more in depth discussions to the coming section. 2.4 Generalized Discrete-time Filter Bank Transforms Previous sections showed that both STFT and DTWT can be implemented by a filter bank structure. If we ignore the modulator (e~ julkri ) and the demodulator (e^ h11 ) of STFT, the discrete-time version of the two can be included in a generalized form: Mn) = E~=-oo x{m)v k {n k n m), 0 < k < M 1, (GDTFBT) (2.13) x(n) = Efclo 1 E^ = -oc Mm)u k (n n k m) {Inverse GDTFBT). (2.14) It was called the generalized filter bank transform [63]. Figure 2.3 illustrated the structure of GDTFBT. Depending on n k , v k (n) and u k (n), GDTFBT includes the following special cases. 1) Orthonomal wavelet representations. n k = 2 fc for 0 < k < M — 2 and Um_i = um-2In this case, basis functions possess orthonomality of (2.12).

PAGE 28

16 pj/Htn)^)|[9o(n)[ -^2)H^W|-(|2)-(t^ Hgo(")h ffi(n)L|/io(n)[-(|2)921 /»3(n)[-^2)^2) ^3(n)| -] g 3 (n)[-(j2)-{|2)-|33(n) wp(n) (n) i(n) Figure 2.4: Example of General Binary-tree-structured Filter Bank of Five Channels. 2) Orthonomal wavelet packet representations. This is a generalization on the orthonomal wavelet by applying the recursive decomposition-reconstruction structure to all branches. Therefore, = 2 L for 0 < k < 2 L — 1. 3) Biorthogonal wavelet (packet) representations. The same as the orthonomal counterpart except that the constraints (2.11) are lifted. 4) Generalized binary tree structured filter bank representations. This is a further generalization upon the (bi)orthogonal wavelet (packets). Every channel may be split into two with filters satisfy (2.9). Therefore, every is power of 2. Moreover, filters used in one channel (node) may be different with another. Figure 2.4 showed an example. It was proved in Appendix A that the basis functions constructed this way satisfy the biorthonomality expressed by: u c (m — n c l)vki;nf.n — m) = 5(c — k)S(l — n). (2.15) 5) Overcomplete wavelet representations. In this case, no down-sampling and upsampling are used, which means = 1 for all channels in the Figure 2.3. As an example, the overcomplete wavelet transform corresponding to Figure 2.4 is shown in Figure 2.5.

PAGE 29

17 x(n) r Go(u;) Hl(2w) Hi(2u;) 103(71) Gi(2w) Gi(2w) Go(uj) ff 3 (4w) «'_.(;/ ) //.-.< «G 2 (2oj)G 3 ( 1-,'; u>l(n) G 3 (4u;) -G 2 (2o;)i wo(n) H 2 (2ui) x(n) Figure 2.5: The Overcomplete Wavelet Transform for the Example of Figure 2.4. Representations 1-4 above are called "complete. The reason of calling the representations of the category 5 "overcomplete" or "redundant" is that they contain more analyzing functions ({ujt(n)}) than a basis. Since these analyzing and synthesizing functions must satisfy A/-1 00 A/-1 £ £ u k (m-l)v k (n-m)=8(l-n), or £ U k {e?«)V k {e?") = 1, A;=0 m=— 00 k=0 they must be linear dependent. In the other word, some functions must be expressible by others. Complete and overcomplete representations behave rather differently under a convolution operator. 2.4.1 Convolution Operator on Complete Representations In general, a basis function Uk{n— n^m) is not an eigenfunction of a convolution operator. Using the same idea as in the case of DWT, a convolution operator Tj can be represented by basis functions {u^n — n^rn)}: M-l 00 T f [u k {n n k m)) = J2 £ %,c( m > b)u c (n n c b) , c=0 b= -00 00 00 VkA m ' b ) = £ £ /(')***(* ~ * ~ n kTn)v c (n c b I). (2.16) /=— 00 i=— cx> It can be proved that ujt( n — is an eigenfunction of the operator Tj if and only if /7/t, c (m, 6) = XS(k c)5{m — b).

PAGE 30

18 Proof: If T]k,c( m ^ b) = \8(k — c)5(m — b), then Tj [Uk(n n k m)} = EcL~ 0 l EZ-oo ^(k c)S(m b)u c (n n c b) = \u k (nn k m). In the other direction, if Tj [u k (n — n k m)\ = Xuk(n — n k m), then VkA m M = En=-oo T f[ u k( n n km)]v c (n c b-n) = En=-oc ^ u k( n n k m)v c (n c b n) = X5(k c)8(m b) , where we utilized the property (2.15). The set of coefficients {rj ktC (m,b)} is the representation of the operator Tj in the basis and is independent of input signal. Since a convolution operation on a sequence x(n) can be derived as: A/-1 oo T f [x{n)} = Y Y w k (m)T f [u k (n-n k m)] k=0 m=-oo A/-1 oo M-l oo = Y Y w k(m) Y Y %,c( TO > b)u c (n n c b) k—0 m= — oo c—0 b=—oo M-l oo = E E c=0 b=— oo M-l oo 53 Y w k(m)ri ktC (m,b) u c {n n c b) , . k =0 m=-oo the representation of T [x(n)] is thus A/-1 oo a ci b )=Y Y Wk(m)r} k
PAGE 31

1!) 2.4.2 Convolution Operator on Overcomplete Representations The overcomplete representation of a convolution operator Tj can be obtained from (2.16) with n c = 1 and n k = 1: oo oo ilk,c{m,b)= Y, f{i)M l i m ) v c(b-l) = {f*Uk*v c )(b-m). (2.18) /=— oo := — oo Accordingly, the overcomplete representation of 7/ [x(n)] can be obtained from (2.17): M-l oc «c(n)=S Y w k{m)r]k,c(n-7n) = f(n)*w c {n), (2.19) k—O m=-oo which showed that discrete overcomplete wavelet transform operator and convolution operator are commutable, and thus overcomplete wavelet representations are translation-invariant. To verify this, assume an overcomplete wavelet transform operator of W and a translation operator of Tj with f(n) = S(n d), overcomplete wavelet representation of T) [x(n)] is: WTj [x{n)\ = T f W[x(n)} = T f [{tu fc (n)|o<*c (e>») = F{e?")U k {e>«)V c {e?") , where %,c(e Ju; ) is the discrete-time Fourier transform of i]k, c (. n )If the passbands of the filters C4(e7 ' w ) and V c (e^ w ) have little overlap, the following approximation is justified: U k (enV c (en « 5(k c)U c (e^ w )V c (eP u ) .

PAGE 32

20 Moreover, if the function F(e JW ) is real and the passbands of channel c's are narrow enough, we may further approximate fjk^ie^) as: VkAen « A C 5(A: c)^ e (c> tf )V c (e> w ), or, %,c(«) ~ A c <$(fc ~ c)u c {n) * v c (n) , where A c is a real number. In this case, (2.19) may be well approximated by: M-l <*c(n) ~ KHk c)w k {n) * u c (n) * v c (n) = A c tu c (n) . (2.20) k=0 This means that a convolution operator with real frequency response F(e* w ) may be approximated by a set of real multipliers {A^}. This is another advantage of overcomplete representation since we found that the approximation is generally not extendible to complete representations. 2.4.4 Aliasing Enhancement on Complete Representations For complete representations, subsampling usually does not meet the Shannon sampling theorem, and thus produces aliasing distortions. Those aliasing components get canceled only by specially designed filters at the reconstruction stage. If we manipulate the representation in a way like (2.20), aliasing distortions may not be canceled. For a concrete example, we consider a wavelet transform of Figure 2.2 with only one level. If we multiply representations d 0 (n) and c 0 (n) by k d and k c respectively, we can rewrite Equations (2.6-2.8) in frequency domain: D' 0 (e? u ) = h d [X(e^ /2 )G(e j ^ 2 ) + X{e^ +2 * )/2 )G(e j ^ +2 * )/2 )} . Cj(e*") = ^k c [X{e^' 2 )H{e^l 2 ) + X{e^ +2 ^ )l2 )H{e j{M)l2 )\ , X'{e> u ) = l [k d G{e juJ )G(e juJ ) + k e H{e^)H(e iu )] X{e> w ) + i [k d Q{eP {u+v) )G(e iw ) + k c H{e j{uJ+7r) )H{e jiJ )] X(e j(w+,r) ). The second term is the aliasing component. Unless k d = k c , the second term is generally not equal to zero even though filters meet the perfect reconstruction

PAGE 33

21 conditions (2.10). In this case, the system is not a linear time invariant (LTI) system, but rather a linear periodically time varying (LPTV) system [63]. It is easy to recognize that the system can not be characterized by the familiar form of Y{e^) = M{e^ UJ )X{e^ ul ). Figure 2.6 shows an example of such aliasing-enhancement effect. 50 100 150 200 250 0 0 5 1 1 5 2 2.5 3 (a) (b) 50 100 1 50 200 250 0 0 5 1 1-5 2 2.5 3 (c) (d) Figure 2.6: Example of the Aliasing-enhancement Anomaly. Where: (a) The original sinusoid signal; (b) Magnitude of DFT of the signal (a); (c) Reconstructed signal with orthogonal Haar filters {H{e? u ) = 2-'/ 2 (l + e-' u ), G(e>' w ) = 2" 1 / 2 (1 e"^)), one level decomposition and kj = 6, k c = 1; (d) Magnitude of DFT of the signal (c).

PAGE 34

22 2.5 Summary and Discussion We have reviewed signal representations based on Fourier and wavelet transformations and included them in a general framework called generalized filter bank transform. We studied their behavior under a convolution operator. It revealed that complex sinusoid functions of Fourier transform are the only basis functions in this class which are eigenfunctions of the operator. In this case, a convolution operator is equivalent to multiplications in the representation (frequency) domain and the inverse operator can be simply expressed as divisions. For overcomplete representations, a convolution operator may be approximated by multiplications of coefficients by real factors. However, such an approximation can not be extended to complete representations due to aliasing distortions. Moreover, we realized that discrete complete representations also lack translation-invariant. Based on the characteristics of these representations, we may find the best applications for them. 1) Complete representations may be good for compression, progressive transmission. No redundancy alone would be a major advantage for those applications. Throwing away some coefficients may be seen as a zeroing operation, which does not enhance aliasing components. After all, we are prepared to accept some distortions for those applications. 2) Overcomplete representations possess many desirable properties for restoration, enhancement, feature extraction and time delay estimation [52]. First, they are translation invariant which is critical for pattern recognition. Second, they do not have the aliasing distortions. Third, they allow much higher degree of flexibility in filter design.

PAGE 35

CHAPTER 3 OVERCOMPLETE WAVELET REPRESENTATIONS 3.1 Introduction The paradigm of overcomplete wavelet representations is a versatile and powerful tool shown to be advantageous for certain applications. In this chapter, we shall study overcomplete wavelet representations in detail, including issues of decomposition and reconstruction algorithm, filter selection and time-frequency interpretation. 3.2 Overcomplete Wavelet Transforms and Filters In general, a discrete-time overcomplete wavelet transform (DTOWT) pair can be obtained by setting n k to 1 in the Equations (2.13-2.14), as rewritten in the following: w k (n) = i>*(ro)a:(n m), 0 < k < M1, (DTOWT) (3.1) x(n) = Efcio 1 E~=-oo Mm)u k (n m) {Inverse DTOWT). (3.2) More importantly, we need an algorithm to generate analyzing and synthesizing filters Vk(n) and u k (n). In this thesis, we restrict ourselves to the binary tree algorithm as illustrated in the previous chapter, and consider two particular classes: overcomplete wavelet packets and dyadic wavelets. 3.2.1 Overcomplete Wavelet Packet Representations The construction algorithm of overcomplete wavelet packet representations corresponds to a complete binary tree, as shown in Figure 3.1. Though each tree node represents an analyzing filter, only a subset of all tree nodes is needed to achieve a perfect reconstruction. Notice that the position and the width of each rectangular 23

PAGE 36

24 ff(w) CM H{2uj) i/ /'\^G{2uj) // /\h(2u j) D H{2ui) JJ #2,0 1*2,1 N-2,2 #2,3 G{2u)) 2tt (a) (6) Figure 3.1: Binary Tree for Overcomplete Wavelet Packet Decomposition: (a) the tree; (b) illustration of frequency scaling of the prototype filters. node also represents the channel's position and the band width in the frequency domain. In general, prototype filters H(to) and G(u) in each level can be distinct, as proved in Appendix A, as long as they satisfy the condition H(lu)H(cu)+G(uj)G(lo) = 1. The frequency response of each node can be calculated recursively: P 0 (u) = H{u>), P 1 (u) = G{u)] jV 0l oM = 1; N l+l , m (u) = P c @u)N lAm/21 (u), I > 0. (3.3) where the subscript c of the filter P c (u) follows the periodic pattern of 0, 1, 1, 0, and can be calculated by: c = (m + [m/2\) mod 2. The purpose of such ordering is to make the index m of the node N^ m proportional to its central frequency. The reconstruction from two child nodes to their parent node can be expressed by the following formula: where we used the property of: Pkmod2{^>) Pkmod2(^) + P(k+l)mod2(u) P(k+l)mod2(u) = H{lo)H{u) + G{lo)G{oj) = 1.

PAGE 37

25 There are many possible choices of filters Vk{n) from the tree to achieve the perfect reconstruction. For example, for the tree shown in the Figure 3.1 with the height of two [55, page 449], five sets are available: {A^o}, {Ni,o, iV"i,i}, {^1,0,^2,2,^2,3}, {^2,0,^2,1,^1,1} and {N 2 ,o, #2,1' N 2t2 ,N 2 , 3 }. In general, let T(h) be the number of selections from a complete binary decomposition tree of height h. Since the tree may be split into two subtrees of height h-l each in turn have T(h-l) selections, we have the following recursive formula: T(h) = [T{h l)] 2 + 1, h>l, and T(0) = 1. The first five numbers were calculated as T(l)=2, T(2)=5, T(3)=26, T(4)=677 and T(5)=458330. All quadrature mirror filters (QMF), good for orthogonal wavelet transform, are also good for overcomplete wavelet packet transforms. However, for real QMF except Haar wavelet, I. Daubechies proved that linear phase and finite impulse response (FIR) properties are mutual exclusive [13]. For example, Daubechies wavelet filters [11] are FIR but not linear phase. In the other hand. Battle-Lemarie wavelet filters are symmetric but not FIR, which may be written as [42]: H p (u) where p is a positive integer and (3.4) +00 j Note: An algorithm for numerical computation of Battle-Lemarie wavelets is included in Appendix B. 3.2.2 Dyadic Wavelet Representations The building structure of a dyadic wavelet transform is a particular subset of the overcomplete wavelet packet transform. It decomposes low-frequency branch only (assume H{uj) is lowpass), as shown in Figure 3.2.

PAGE 38

26 Figure 3.2: Binary Tree for Dyadic Wavelet Decomposition. Obviously, all filters good for overcomplete wavelet packet transforms are good for dyadic wavelet transforms. However, one may want to exploit the extra opportunity created by the more relaxed constraint on filters. The following filter classes [44, 20] possess distinct property in the time domain: H(u) = e jpu/2 cos 2n+p {uj/2), H{J) = H*(u), G(u) = -(-j)Pe-^/ 2 sin 2 p (u;/2), G{u) = -UY^smnul2)Y^T l ca^{ul2) (3.5) where n > 0 is an integer and p £ {0, 1}. In fact, for: p = 0 filter g(-l) = 0.25, g(0) = -0.5 and g(l) = 0.25 is proportional to a discrete Laplacian operator and for: p = 1, g(0) = 0.5 and #(1) = -0.5 is proportional to a discrete gradient operator. Notice that filters within this class are linear phase and FIR, but not orthogonal. For p = 0, filters H, H, G and G and their frequency scalings {e.g. H(2 l u)) are all symmetric. For p = 1, different prototype filters may be used for frequency scaling to minimize spatial shifting caused by the phase factor. If we denote Hi(u), Hi(u>), G[(lu) and Gi(to) for filters used in level / > 1, we find the following filters are either symmetric or antisymmetric as well as FIR: H t (u) = cos 2 " +1 (2'2 o;), Hfa) = H t {u), 2n G t (u) = j sin(2'2 u;), Gi(u) = -G t {u) £ cos 2m (2'2 o;). m=0

PAGE 39

27 Figure 3.3: Examples of the Function Q a ,b( u )In both cases, frequency response of tree node N^ k (I > 0 and k G {0, 1}) can be derived as: l-i N lfi {u) = e^' 2 J] cos 2n+p (2 m 1 u;) = e jpu/2 Nu{u) = -(-j) p e 771 = 0 2'1 sin(u;/2) 2-P sin(2'1 u;) 2' sin(w/2) sin(2'2 o;) 2n+p 2n+2 2'" 1 sin(u;/2) For convenience, we defined a function Q a< b(u)) as: ©o,ft(w) sin(2 Q 1 tj) 2 a sin(w/2)_ For 6 > 4, the function QaA 00 ) ls approximately Gaussian as plotted in Figure 3.3. Similarly, we can define a reconstruction tree with nodes: Ni f i(u) = iV/_i,o(w)G,(w). A dyadic wavelet representation consists of leaf nodes. For example, a threelevel representation consists of nodes {N 3fi , jV 3)1 , AT 2) i, Ni,i}. In terms of filter bank, they can be viewed as multiple channels. Channel frequency responses can be derived as: C lfi (u)) = N lfi {iO)N lfi {uj) = Yl H m (u)H m (u) = 9l,4n+2p(u), 771=1 C M (w) = JV u (w)iV, il (a;) = iV / _ 1 ,oMN i _ 1 ,o(u;)G,(a;)G / (a;)

PAGE 40

28 = iV^oMA^oM [l H^H^uj) = @i-l,47i+2p(^') — ©/,4n+2p(^)The channel frequency responses for a three-level representation are {C3fl,C3,i,C2,i,C\ % i}. Notice that the channel frequency responses Ct t i(u) are an approximate difference of two Gaussians (DOG) [47, page 63]. Two examples of channel filters are shown in Figure 3.6. 3.2.3 Filters of Autocorrelation Functions of Wavelets From a filtering point of view, the idea is very simple. Since there is no downsampling-upsampling taking place in overcomplete wavelet transforms, it seems that we can simply combine decomposition and reconstruction into a single filtering step. To achieve this, one only need to change prototype filters into: H a {u) = H{u)H(u), (36) G a (co) = G(u)G{u). The perfect reconstruction condition now takes the form of: H a (u) + G a (u) = l. As H(u) is connected to a wavelet, H a {uj) is connected to an autocorrelation function of wavelet. For rigorous mathematical treatment, refer to Saito's Ph.D thesis [57]. In this case, a reconstruction step is simply a summation of all coefficients as (3.2) degenerates to: x(n) = ZkJo 1 ufc(n), which takes the same form as the generalized STFT. The most important property is that filters H a (u) and G a (uj) are always symmetric or antisymmetric, and thus providing a simple way to construct symmetry and FIR filters for certain applications.

PAGE 41

29 3.3 Connection to Complete Wavelet Representations In Appendix A, Proposition A. 0.3 built the bridge between an overcomplete wavelet representation and a complete wavelet representation. If both representations used the exactly same prototype filters H(u), G(u), H(uj) and G(u), the only difference lies on the decimation. In this case, the complete wavelet representation is a subset of the corresponding overcomplete wavelet representation due to decimation. If the prototype filters used by an overcomplete wavelet representation does not satisfy (2.10), there is no corresponding complete wavelet representation. 3.4 Time-frequency Interpretation and the Uncertainty Principle The set of transform coefficients {w k (n)\ 0fc(n)'s. The shorter the distribution is, the higher the time resolution it can achieve. Unfortunately, high resolution in time and frequency domains are conflict goals, as demonstrated in the Figure 3.5. In this example, we saw that as passband narrowed, impulse responses spread and consequently the boundary between

PAGE 42

30 Figure 3.4: Two Examples of the Overcomplete Wavelet Representation. Legend: (a) An ideal square waveform; (b) A linear chirp signal s(n) = 5cos(0.027rn + 0.00l7rn 2 ). In each image, top half is the signal waveform, and the bottom half is (0-255) scaled overcomplete wavelet representation (coefficients) of 64 leaf nodes of the level six. For the bottom half, horizontal axis is time, vertical axis is frequency.

PAGE 43

31 two impulses blurred. This phenomenon reflected a fundamental principle called the uncertainty principle [21, 51]. Time and frequency localization of a discrete lowpass function f(n) can be mathematically described by two quantities, ^ = 4£(n-n) 2 !/(n)| 2 , n where E = £„ |/(n)| 2 = £ £, |F(e*)| a [E \F(e jn )f] /(4£ 2 ), and thus for filters with F(e jn ) = 0, U > 0.25. That the uncertainty factor has a lower bound is significant and is called the uncertainty principle. It basically states that we cannot have filters with arbitrarily narrow bandwidth in the frequency domain and arbitrarily short duration in the time domain. As a result, we cannot achieve arbitrarily precise time and frequency localization simultaneously. (Numerical computation of uncertainty factors for filter banks is discussed in Appendix C.) Figure 3.6 shows examples of analyzing filters of overcomplete wavelet transform and their uncertainty factors. Uncertainty factors of the analyzing filters of the dyadic wavelet showed that they are indeed very close to Gaussian functions. In general, the results showed that uncertainty factor varied from channel to channel, and level to level. As a statistical measure, we used maximum, minimum and mean to describe higher levels. Such statistics for higher level overcomplete wavelet packets are presented in the Table 3.1, where Regular stands for regular Lemarie filters of (3.4) and Autocorrelation stands for autocorrelation functions of Lemarie filters. We may catch some trends from the data on Lemarie-filter-spanned analyzing filters of overcomplete wavelet packet representations:

PAGE 44

32 Col 1 Col 2 Col 3 Figure 3.5: Examples of Time-frequency Localization by Overcomplete Wavelet Packet Representations. Where: Row 1: A signal consists of two pure-tone impulses with frequency of u = 0.3n and u! = 0.327r: the spectral (Col 1) and the waveform (Col 3). Row 2-5: level 2 representation. Row 6-13: level 3 representation. For row 2-13: columnl, frequency responses {A^ ifc |0 < k < 2'}; column2, impulse responses of columnl (dot lines approximate envelopes). column3, overcomplete wavelet representation. The prototype filter used was Lemare filter of p = 1.

PAGE 45

33 Row 1 Col 1 Col 2 Figure 3.6: Uncertainty factors (drawn on top of each band) of Channel Filters. Where: Row 1: Overcomplete wavelet packet of level 2 using Lemarie filter of p = 1 (Col 1) andp = 9 (Col 2); Row 2: Overcomplete wavelet packet of level 3 using Lemarie filter of p = 1 (Col 1) and p = 9 (Col 2); Row 3: Channel filters {C 4j0 , C 4) i, C 3)i , €2,1, C^i} of dyadic wavelet using the filter class of (3.5) with n = 1 (Col 1) and p = 1, n = 1 (Col 2).

PAGE 46

34 Table 3.1: Uncertainty Factors of Analyzing Filters Regular Autocorrelation p level T J "max J j T T U ^max C-'min TT L 4 1 .618 0.350 0.701 0.8o3 c\ or™ 0.36 ( 0.49< 1 1 0 O.Ool U.04Z i 7At; 1. 1 uo z.uoo U.OIO 6 16.721 0.318 4.535 5.693 0.262 0.980 4 1.011 0.437 0.628 1.178 0.470 0.584 2 5 3.464 0.397 0.954 0.871 0.333 0.432 6 7.984 0.320 1.809 2.736 0.260 0.475 4 1.283 0.529 0.674 1.349 0.496 0.646 3 5 2.386 0.418 0.704 0.721 0.329 0.398 6 5.244 0.318 1.005 1.605 0.261 0.346 1) Joint timefrequency localization of those using low order filters tends to deteriorate quickly as level increase. This is due to the existence of small sidelobes. 2) Using filters of autocorrelation function is more effective than using higher order regular filter, if the no-reconstruction structure is acceptable by applications. However, one should be very careful in using uncertainty factors as an optimality measure due to two main reasons. First, how can we come up a single sensible number for a filter bank? Minimum U max or minimum U or something else? Second, more fundamentally, the uncertainty factor does not take the signal into account. For configurations with similar uncertainty factors, as in the case of Gaussian-like filters such as (3.5), the measure of uncertainty factor alone would not tell us which one is optimal. 3.5 Two-dimensional Extensions So far we have only discussed one-dimensional representations. In order to apply overcomplete wavelet transforms to two-dimensional images, we need to extend the transform to two-dimension. The following two separable extensions can be implemented as separated ID filtering of rows and columns.

PAGE 47

35 1) Tensor product extension [42]. This extension is originally applied to orthogonal wavelet transform. Only ID filters H(u) and G(u>) are used. It directly exploit the perfect reconstruction property of quadrature mirror filters by extending: H{u))H{u) + G{u)G{u>) = 1 into: H(u x )H{u x ) + G(u x )G{u x )] [H{u y )H(Lu y ) + G(uj y )G(uj y ) The above 2D extension can be rewritten as HH(uj x , uj y )HH(uj x ,LUy) + HG{u x , u y )HG(u) x ,u} y ) + GH(u x ,ujy)GH(uj x ,ujy) + GG{uj x ,u)y)GG(u) x ,u}y) = 1. where HH(uj Xl Uy) = H{u x )H{ujy), HH{uj x ,ujy) = H(u) x )H{ujy), HG{lu x , ujy) = H(uj x )G(uj y ), HG(u; x , ujy) = H(uj x )G{u y ), GH{u x ,u y ) = G(uj x )H(ujy), GH{jj x ,uj y ) = G(u x )H(uj y ) : GG(uj x ,u y ) = G(u x )G{ujy), GG(uj x ,ujy) = G(uj x )G{ui y ). Filters {HH, HG, GH, GG} are used for decomposition while filters {HH, HG, GH, GG} are used for reconstruction. The same construction method is applied to frequency scaling of higher levels with each node decomposed into four child nodes. 2) Two-orientational extension. For the particular dyadic wavelet filters (3.5) with emphasis on spatial operation (gradient and Laplacian), the component {GG} of the tensor product method may not be useful. Therefore, the following 2D extension was proposed by Mallat [44]: HH(uj x ,ujy) = H{uj x )H{uj y ), HH(cu x ,u!y) = H(U X )H(iUy), GX(u x ,uj y ) = G(lu x ), GX(u) x ,u y ) = G(uJ x )L{u y ), GY(u x ,ujy) = G(u y ), GY(v x ,v y ) = L(u x )G(u y ). This 2D extension has only three components:

PAGE 48

36 • a horizontal component GX • a vertical component GY • a lower resolution component HH However, it introduced a new filter L(u), which is to be determined by the condition of: HH HH + GX GX + GY GY = 1. It turned out that the formula for L(uj) is L{u) l + H(u;)H{u) 2 For filters of (3.5), L(u;) = 0.5 * 1 + cos ,4n+2p (w/2)] , which is a symmetric and FIR filter.

PAGE 49

CHAPTER 4 OPERATIONS ON OVERCOMPLETE WAVELET REPRESENTATIONS 4.1 Introduction In most cases, a suitable signal representation is not the end of a problemsolving process. On the contrary, how to manipulate it to achieve a desired effect is the more challenging part. For overcomplete wavelet representations, we have identified three operations: gain, shrinkage and envelope operators for our applications in image deblurring and segmentation. We demonstrated that a linear convolution may be approximated by a gain operator, and devised a tree pruning algorithm based on a tolerant of approximation error. We extended the wavelet shrinkage proposed by Donoho and Johnstone [17] to overcomplete wavelet representations for denoising. We built the connection between the power density distribution of a overcomplete wavelet representation and its envelopes, and presented two envelope detection algorithms. 4.2 Gain Operators In the Section 2.4.3, we determined that a convolution operator can be approximated by multiplication of real factors {A fc | 0
PAGE 50

38 reconstructed signal x'(n) in the frequency domain is simply M-1 M-1 X'(u) = Y, hW k {u>)U k (u) = X(u) Y \ k V k (u)U k (u), k=0 k=0 and thus the equivalent filter is Q(w) = [X 0 Xi • A M -i] [C 0 {u)d(u) • C M -,(w)f , where [A 0 Ai • • • Am-i] is a gain vector, C k (u;) = V k (u)U k (u) is the channel filters, and M l denotes the matrix transpose of M. Two fundamental issues immediately arise: 1) What is the characteristics of the set of filters expressible by a gain operator? In the other words, what kind of filters may be approximated by a gain operator? A broad answer is that any gain operator can only represent a symmetric filter. This is because channel filters {C k (uj)\0
PAGE 51

39 It is well known that the optimal gain vector corresponding to the minimum mean square error may be found by setting all partial derivatives to zero: 35 1 r 1 r* 7T J-TT M-l k=0 C q {uo)duj = 0, 0 < q < M-l, which is a set of linear equations: M-l Y IT C k {u))C q (u)duj\ \ k = r F{Lu)C q (u>)duu, 0 < q )du £ A£— / F '{u)C k {u>) , Z7T J-tt ,_ n zTT J -w k=0 The minimum mean square error can be derived as «r M 1 pit k=0 where {A£|o< 9 0, it must be true that E f > E^Lo 1 £r F(u)C k (u)du, and therefore 0 < e < 1 . The criteria of mean square error provides a way to determine the optimal gain vector for a given frequency response F(u) and a given configuration. However, the criteria alone suggests that the bottom-most nodes are the best representation, for which e may reach its minimum among all possible configurations. This can be formally proved by the following proposition. Proposition 4.2.1 Splitting a channel into two will not increase the error measure e.

PAGE 52

40 Proof: By contradiction. Assume that for the current configuration {Cfc(w)| 0 e. Without loss of generality, we can write: C qi (u)=C q (u)H(2 l u,)H(2 l u>), C Q2 (u) = C q {uj)G{2 l u;)G{2 l uj). K > Since the prototype filters satisfy H(lo)H(u)) + G(w)G(uj) = 1, we have C qi (u)) + C q2 {u) = C q (uj). For the new configuration, we can find a gain vector {\' k = \° k , k / qi, k^q2\ \ qi = X' q2 = A°} with error e' = e. This contradicts the assumption that e 1 > e, and therefore e' < e Examples showing higher level corresponding to better approximation can be visualized in Figure 4.1. Overcomplete wavelet packet representations of leaf nodes at Level 4, 5, 6, 7 were selected, corresponding to 16, 32, 64, 128 channels, respectively. Lemarie filter of p = 3 were used. 4.2.2 Time-frequency Trade-off and a Greedy Algorithm Proposition 4.2.1 clearly states that approximation error e of frequency response decreases as the channel bandwidth decreases, as demonstrated by the examples in Figure 4.1. Therefore, by the criteria alone, the Fourier transform would be the best representation since its approximation error is zero. This is certainly not the answer we are looking for. What we lost in sight is the other part of the picture in the time domain. As dictated by the uncertainty principle, time resolution decreased as the frequency resolution increased. A overcomplete wavelet packet representation by the deepest leaves would be closed to a Fourier representation and thus lost its resolution in the time domain. Moreover, from computational point of view, such a down-to-thebottom decomposition is rather inefficient.

PAGE 53

4] Row 1 o o.b i is a 2 » BO i o 4 : 20 O OS 1 1.B a 2 5 3 Figure 4.1: Examples of Gain Vector Approximation (frequency range [0, it] shown) Where: Row 1: a given frequency response F(e J1J ); Row 2 to 5: approximated frequency responses by overcomplete wavelet packet representations of level 4, 5, 6, 7, overlaid with F(e^) (dotted curve), with 5=39.4%, 18.5%, 4.5%, 0.02%, respectively.

PAGE 54

42 One possible trade-off between time and frequency can be accomplished by setting up a tolerant on the approximation error e. Under an error tolerant, the goal could be to find a tree configuration with minimum number of leaves. In the following, we present a greedy algorithm in the order of breadth-first search [45]. Notice that a tree node here represents a channel filter C^ m (u) and the relationship between children nodes and their parent node is characterized by (4.3). Algorithm 4.2.2 (Minimum Tree) Given a frequency response F(u>) and an error tolerant £q, let current leaves C = {Co,o(w) = 1} and compute the current error £c using the set C. Step 1. One by one, compute error Ek by temporarily substituting a node in the current leaves C with its two children. Choose the pair with minimum Ek to replace their parent in the set £ and update the current error Ec = m'm{Ek}Step 2. If Ec < Eq, stop and use the leaves L as representation; else, go to step 1. Examples of overcomplete wavelet packet approximation using the minimum tree algorithm are shown in Figure 4.2. The improvement is appreciated by comparing it with Figure 4.1. In Figure 4.1, the representation using thirty-two channels of the Level 5 has the approximation error s — 18.5%. However, for the same frequency response, the minimum tree algorithm was able to find a representation of only twenty-six channels and yet reduced the approximation error to e = 6%. In the other case, the minimum tree algorithm found a representation with merely seven channels and achieved approximation error of 0.6%. 4.3 Shrinking Operators Wavelet shrinkage was introduced by Donoho and Johnstone [17] for signal denoising and linear inverse problems. For a thorough introduction, refer to [15]. The description of the methodology was given in [15, page 173], we quote, "Wavelet

PAGE 55

43 Row 1 Col 1 Col 2 Figure 4.2: Examples of the Minimum Tree Approximation with the Maximum Depth of Seven Levels. Where: Col 1: A given frequency response Fi(e juJ ) (rowl, same as in Figure 4.1), approximation result overlaid with F x (row2) with e = 6% and selected 26 channels (row3); Col 2: A given frequency response F 2 (e ju ') (rowl), approximation result overlaid with F 2 (row2) with e = 0.6% and selected 7 channels (row3). Overcomplete wavelet packet with Lemarie filter Hi(e juJ ) was used in both cases, and the frequency range [0, 7r] is shown.

PAGE 56

44 shrinkage refers to reconstructions obtained by wavelet transformation, followed by shrinking the empirical wavelet coefficients towards zero, followed by inverse transformation." For a thorough introduction, refer to [15]. We extended the idea to overcomplete wavelet representations on the ground that they differs with orthogonal wavelet representations mainly on having extra correlated coefficients. A shrinking operator K. t can be written as K t [w] w — t, w > t, 0, \w\ < t, (4.4) w +t, w < —t, where t > 0 is a threshold. Notice that the shrinking operator is a nonlinear and memoryless operator, and is supposed to work in spatial domain. The goal of a shrinking operation is to cut the noise off while keeping useful signals. Obviously, this is possible only if the amplitude of desirable components is stronger than the noise components. Therefore, the threshold value t is dependent on the signal-to-noise ratio. Examples of choosing appropriate thresholds are discussed in Chapter 6. To demonstrate the shrinking operation and the advantage of wavelet denoising, an example is included in Figure 4.3. The major advantage of wavelet shrinkage denoising, as compared with linear smoothing, is that it diminishes noise but does not smear the sharp edges, as seen in the example. Comparison with a linear averaging filter with h = [1, 1, 1, 1, 1, 1, l]/7 is presented in Figure 4.4. 4.4 Envelope Detectors Energy distribution is a very important concept in the physical world. It is a indispensable tool for signal analysis, too. For a harmonic current i(t) = As'm((jj 0 t + a) passing through a one-ohm pure resistant, the instantaneous power (energy per unit time) at time t is i 2 (t). The

PAGE 57

45 Row 1 90 100 >io TOO " "MO 1 Col 1 Col 2 Figure 4.3: An Example of Overcomplete Wavelet Denoising. Where: Col 1: An perfect impulse contaminated by white Gaussian noise (rowl) and its level 2 overcomplete wavelet packet representation (row2-5); Col 2: Overcomplete wavelet packet representation after shrinking operations (row2-5) and reconstructed signal (rowl).

PAGE 58

46 Row 1 •* O « O 3 2 :: Figure 4.4: Comparison of Overcomplete Wavelet Shrinkage Denoising with Linear Smoothing Using the Signal of (row l,col 1) of Figure 4.3. Where: Row 1: Overcomplete wavelet shrinkage (left, the same as row 1, col 2 of Figure 4.3) and linear smoothing (right); Row 2: Local zoom-in with overlay of the original noisy signal. average normalized power in a period T is thus W=\[i\t)dt=\A\ Therefore, the average normalized power of a harmonic signal is proportional to its amplitude scjuared. For an arbitrary waveform i(t), one can decompose it into harmonic components using Fourier analysis: /oo I(u)e> 2 * ft df. -oo By the same token, energy density (per unit time and per unit frequency) at frequency uj is proportional to the amplitude squared of the component 7(a;)e J ' 27r ^ t , which is |/(u;)| 2 . The function |/()| 2 describes power distribution in the frequency domain and is called power density spectrum [49, page 66]. The above concepts can be extended to overcomplete wavelet representations, by extending the concept of amplitude to "envelope" [26, chapter 4], or "local amplitude." An overcomplete wavelet representation is consist of bandpass components.

PAGE 59

47 Since any bandpass waveform may be represented by A(t) sin(u c f + a(t)) [26, page 229], where A(t) is the envelope and ui c is the associated carrier frequency, an overcomplete wavelet representation {wjfc(t)|o=0 otherwise.

PAGE 60

48 (b) Figure 4.5: Two Examples of Power Density Distribution of Overcomplete Wavelet Packet Representations. Legend: (a) A linear chirp signal, the same as in Figure 3.4(b); (b) A multi-segment pure tone signal with frequencies uj 0 = 0.087T, u>i = 0.257T, o>2 = 0.157T and u 3 = 0.557T, in the order of occurrence. The overcomplete wavelet representation and visual arrangements are the same as in Figure 3.4.

PAGE 61

49 -0.6 -0 4 -0.2 -0.8 02 04 06 0 8 -1 -!) -7 -5 -3 -1 1 3 5 7 9 n -9.117551207e-2 -1.889891117e-l -6.286127232e-l 6.286127232e-l 1.889891117e-l 9.117551207e-2 4.525349255e-2 1.946367728e-2 -1.946367728e-2 -4.525349255e-2 h(n) (a) (b) Figure 4.6: A FIR Hilbert Transformer. Legend: (a) the imaginary part of the frequency response (the real part is zero). (b) nonzero coefficients. Therefore, the Fourier transform of the analytical signal x(t) is: For implementations in discrete-time domain, approximate FIR Hilbert transformers may be designed by windowing the ideal frequency response [49]. Figure 4.6 shows an example. It is a type-Ill FIR Hilbert transformer designed with parameters M = 18, and (3 — 2.629 [49. page 680]. This Hilbert transformer is antisymmetric, of length 19 and has only 10 nonzero coefficients. 4.4.2 Envelope Detection by Zero Crossings In this method, the maximum absolute value between two adjacent zerocrossings is first found, and then assigned to each point within the interval. Algorithm 4.4.1 (Envelope by Zero Crossings) For a given array x(l : N), start from the index i = 1, find the next index k which is either a zero crossing point or a zero valued point x(k) = 0 or the other end point k = N, whoever is first encountered, assign the maximum absolute value = max {|x(n)|j<„=0 otherwise.

PAGE 62

50 {x(n)\ i € [0.17T, 0.77r]). While Hilbert transform was unacceptable at the low end (u < O.ln), the zero-crossing approach did poor at the high end (uj > 0.77r). The test cases were shown in the rows 1 and 2 of Figure 4.7. We constructed a noisy impulse by introducing a random frequency perturbation: s{n) = sin((w 0 + 0.05 * RANDN(n))n), where RANDN (a MATLAB function) is a random noise with normal distribution, and filtered it with a Lemarie filter Hi(u). This case was included in the row 3 of Figure 4.7. We found that the zero-crossing method exhibited edge-preserving smoothing characteristics and was more robust to wide-band noise. 4.4.4 Comparison with Other Energy Operators A useful and simple energy operator was analyzed on [46]. The operator \& for a discrete signal x(n) was given as * [x(n)\ = x 2 (n) x(n + l)x(n 1). However, this operator differs with envelope detectors in the way that its value on a pure tone signal is not only proportional to the amplitude squared but also a function of frequency [46] : * [A sin(u 0 n + a)} = A 2 sin 2 (u;o). 4.4.5 Two Dimensional Envelope Detection Next we extended the ID envelope detection algorithms for the analysis of two-dimensional image signals. In the frequency domain, a two-dimensional analytic

PAGE 63

51 Row 1 |WWW(WWW^ Col 1 Col 2 Col 3 Figure 4.7: Comparison of the Two Envelope Detectors. Where: Col 1: Input signals (top row, u 0 — 0.1ir, Wi = 0.57r; center row, a;o = 0.057r, a;i=0.757r; bottom row, noisy signal with a; 0 = 0.l7r). Col 2: Envelopes detected via the 19-tap Hilbert transformer. Col 3: Envelopes detected via the zero-crossing method. signal may be obtained by setting an appropriate half plane to zero, based on its orientation. That is, for a 2D signal f(x,y), the Fourier transform of an analytic signal f(x,y) is either: F(uJ X ,UJy) _ f 2F(u) x ,u y ) , uj x >0 I 0 , otherwise or, F(u> x ,u y ) = l 2F ^y) > u »>=° 0 , otherwise. For the 2D filters constructed by the tensor product method with halfband filters, the equivalent complex quadrature filters exhibited the frequency response shown in Figure 4.8. Notice that lowpass and diagonal components may have an alternate arrangement by zeroing out the left or bottom half plane. This separable property allowed one to compute the envelope of a 2D signal using the ID algorithms described earlier in a straightforward manner, described in the following.

PAGE 64

52 f. Figure 4.8: Frequency Response of Equivalent Complex Quadrature Filters (level 1). Legend: (a) lowpass channel (b) vertical channel (c) horizontal channel (d) diagonal channel The diagonal shadowed areas identify zeroed half planes. Algorithm 4.4.2 (2D Envelope Detection) For a given 2D array w(m,n) and its orientation. If its orientation is horizontal, apply the ID envelope detector columnwise; If its orientation is vertical, apply the ID envelope detector row-wise; Otherwise, apply the ID envelope detector column-wise.

PAGE 65

CHAPTER 5 APPLICATION I: TEXTURE SEGMENTATION 5.1 Introduction In this chapter, we shall apply overcomplete wavelet representations to the problem of segmentation of textured images. However, it seems that we were back to the beginning. It may sound odd, but it is true that there is not even a precise definition of "texture," let alone "texture segmentation" [61]. In spite of the difficulty, there is a general consensus that texture is a property of area. We felt that the concept of texture is parallel to the concept of "local frequency" for an non-stationary signal. The concept of local frequency is easy to understand. It describes the rate of change occurs in a window around a particular time. Yet there is not a unique definition of local frequency. Therefore, our treatment of texture segmentation is to embrace it into the paradigm of time-frequency analysis. There have been many research works on texture segmentation using the concept of spatial-frequency representation. The analyzing functions used by researchers include: • Laws microtexture masks (filters) [24] • Complex prolate spheroidal sequences [64] • Complex Gabor functions [6, 19] • Real Gabor functions [30] • The Wigner distribution [54] More recently, wavelet and wavelet packet representations have been added to the list [34, 35, 36, 62, 9, 37]. A segmentation process usually consists of two distinct phases: feature extraction and clustering. Features for texture representation are of crucial importance 53

PAGE 66

54 for accomplishing segmentation [27]. In this chapter, we demonstrated that overcomplete wavelet packet representation and envelope detection make a good feature extraction scheme on a variety of both natural and synthetic textured images. 5.2 Texture Feature Extraction The texture features we chose are the envelopes of overcomplete wavelet representations. The interpretation of the feature is that their squares corresponds to spatial-frequency power density distribution as pointed out in the last chapter. The feature extraction thus consists of two stages: 1) overcomplete wavelet packet decomposition and 2) envelope detection. For 2D envelope detection on overcomplete wavelet packet representations, we classified each node in the decomposition tree into four possible categories, taking into account orientation, as follows: • The root node is omnidirectional. • The node last filtered by Gi(u> x )Hi(u> y ) corresponds to verticalorientation. (Highpass filter G< is applied row-wise and lowpass filter Hi column-wise.) • The node last filtered by Hi(u x )Gi(u) y ) corresponds to horizontalorientation. (Lowpass filter Hi is applied rowwise and highpass filter Gi column-wise.) • The node last filtered by Gi(u x )Gi(u y ) corresponds to diagonalorientation. (Highpass filter G t is applied row-wise and highpass filter Gi column-wise) • The node last filtered by H t (uj x )H t (u y ) has the same orientation as its parent. (Lowpass filter H t is applied row-wise and lowpass filter H t column-wise) At the end of feature extraction, each sample of a signal was attached a feature vector. In our case, such a feature vector is consist of envelopes of overcomplete wavelet packet representation. It is possible to apply a monotonic function to the

PAGE 67

55 feature space to transform it into another feature space. For example, a square function x 2 will transform the envelope space into a space of power density distribution and log function log(x) will transform the envelope space into a space of log spectral. We raised the issue of filter selection in the Chapter 3 on overcomplete wavelet representations. For the application of segmentation, we argued that symmetry, frequency response, and boundary accuracy are important factors in the selection of filters for feature extraction. In the following, we discuss these constraints in terms of overall performance. • Symmetry. For accomplishing texture segmentation, accuracy in texture boundary detection is crucial for reliable performance. In this application, filters with symmetry or antisymmetry are clearly favored. Such filters have a linear phase response, where the delay (shift) is predictable. Alternatively, filters with nonlinear phase may introduce complex distortion. Moreover, symmetric or anti-symmetric filters are also advantageous in alleviating boundary effects through simple methods of mirror extension (see Appendix D). • Optimal Frequency Response. In order to derive an ideal filter frequency response for the chosen feature, we considered a two-band filter bank with input signals of infinite length consisting of two segments with distinct pure tones. The input signals can be written as: where .4i>0 and A 2 >0. Except for the boundary (n = 0), we derived the feature vectors (envelopes of channel outputs) as, 5.3 Considerations on Filter Selection A\ cos{u)in + Oil) ,n<0, A 2 cos(tu 2 n + 012) , n > 0, T = (e H ,e G ) f left =(A 1 \H(oj 1 )\,A 1 \G(u 1 )\) ,n<0, fright = (A 2 \HM\,A 2 \G(u 2 )\) ,n > 0.

PAGE 68

50 The angle 9 between vectors fi eft and T right is \HMH(u 2 )\ + \G{an)GM\ cos 1 Vl^OI 2 + \GM\ 2 y/\H(u2)\ 2 + \G(u2)f and is bound by O<0< f . The distance between the two classes in the feature space is then D Ti left + right it ft r, right cos{9) For 9 = | , vectors fi eft and f right are orthogonal, and the distance D reaches its maximum. Clearly, the maximum distance in feature space between any two classes is optimal for segmentation and classification applications, in the sense that the classes are better seperated and more robust to noise perturbations. Notice that for H{u> x ) + 0 and G{uj 2 ) ± 0, cos(0) = 0 if and only if G{u)i) = 0 and H(u 2 ) = 0. This means that optimal filter banks should have no overlap in the frequency domain, and thus filter H(u) with optimal frequency response must be a perfect half-band filter. Indeed, [50] derived a similar optimal filter for image coding applications. Of course such ideal filters cannot be realized in practice. Based on the preceding discussion, filter bands overlapping in the frequency domain tend to reduce class distance in feature space. Therefore, a filter H(uj) with a large stop-band attenuation and flat pass-band frequency response is desirable. Spatial Localization. There are two types of boundaries for signals of finite length: 1) boundaries of regions exhibiting distinct characteristics, and 2) the physical boundaries of a data segment. Feature vectors close to the boundaries will be affected. The size of the affected region depends on the length of the channel filters, and the distribution of filter coefficients. Therefore, filters of short length and fast decay shall be better suited for boundary detection.

PAGE 69

57 Unfortunately, the preceding three criteria cannot be satisfied simultaneously. As previously pointed out, quadrature mirror filters (QMF) with compact support cannot be symmetric or anti-symmetric. This means that a symmetric constraint on any QMF will be in conflict with spatial localization goals. Moreover, large attenuation in the stop band requires a longer filter, which in turn degrades the filter's spatial localization. Again, we faced the limitation of the uncertainty principle, and the previous discussions about the uncertainty factor of a filter bank is applicable here. In this study, we chose the Lemarie filters for its symmetry and good frequency characteristics. 5.4 The Basic Isodata Clustering Algorithm Segmentation algorithms accept a set of features as input and output a consistent labeling for each pixel. Basically, this is a multi-dimensional data clustering problem with no general algorithm available [23]. Clustering algorithms that have been previously used for texture segmentation can be divided into two categories: 1) supervised segmentation and 2) unsupervised segmentation. In practice, unsupervised segmentation is often desirable and easy to validate. It is particularly useful in those cases where testing samples are difficult to prepare (making supervised segmentation infeasible). Thus, we used a Baisc Isodata clustering algorithm [18, page 201]. Algorithm 5.4.1 (Basic Isodata) Given a 2D image array x of structure containing feature vectors and label fields, and the number of classes N c , Step 1. Scan the representation matrix x in raster order. For every pixel encountered, randomly pick a label from set {0, ...,N C — 1} and assign it to the pixel. Step 2. Compute the class center {C^ |o
PAGE 70

58 Step 3. Rescan the whole representation matrix x, and assign pixel to the class k if the Euclidean distance between the feature vector of the pixel and the class center Ck is the closest. Step 4. // no pixel changes its class in the Step 3, stop; else, go to the Step 2. This algorithm differs from a K-means algorithm [29] in only one aspect: Basic Isodata updates class centers after a complete scan of an input feature set while K-means updates for every reassignment. In our experiments, Basic Isodata outperformed K-means for almost all cases. 5.5 Postprocessing The Basic Isodata algorithm labels each pixel independently and does not take into account the high correlation between neighboring pixels. A postprocessing stage, such as the relaxation labeling [25], can be used to incorporate some neighborhood constraint into the segmentation process. For simplicity, we used median filtering as our postprocessing to simulate the benefit of a local constraint. In particular, a 5x5 median filtering was repeatedly applied to an initial segmentation until no change of labeling occurred. 5.6 Experimental Results Our segmentation algorithm was tested on both one-dimensional signals and two-dimensional textured images. Our test images included samples of two distinct families of textures. In all examples shown, straightforward envelopes of the overcomplete wavelet packet representation were used as texture feature. 5.6.1 One-dimensional Signals Figure 5.1 shows the segmentation of a signal consist of a sinusoid segment and a triangular segment with the same period (sixteen samples) and amplitude.

PAGE 71

59 Row 1 f\ A A 3 == -^WV\A/VNAA/WVVVvWV\/WWv Figure 5.1: Segmentation of a ID Signal Consists of Triangular Waveform and Sinusoid. Where: Row 1: the original signal and the segmentation result; Row 2-5: the overcomplete wavelet packet representation and envelopes. With envelopes of four channels of the Level 2 representation, the perfect result was achieved. Figure 5.2 is a segmentation example of a noisy signal consist of two sinusoid segments (u = 0.3n and u; = 0.57r) contaminated by white Gaussian noise. Envelopes of eight channels of the Level 3 representation was able to achieve the perfect result. 5.6.2 Natural Textures Here we used textures obtained from the Brodatz album [7] and public archive. Each testing sample was first histogram-equalized so that a segmentation result based only on first order statistics was not possible. Experimental results are displayed in Figure 5.3. Experimentally, we observed that a lower order Lemarie-Battle filter (p=l) performed well in boundary detection (Figure 5.3 (b)), while the higher order Lemarie-Battle filter (p=2) did a better job within non-boundary (internal) regions.

PAGE 72

60 Row 1 rv\/vsArWWWW\AAAAA/ Figure 5.2: Segmentation of a Noisy ID Signal Consists of Two Pure Tone Segments. Where: Row 1: the original signal and the segmentation result, Row 2-9: the overcomplete wavelet packet representation and envelopes.

PAGE 73

Col 1 Col 2 Col 3 Figure 5.3: Segmentation Results of a Image Consist of Natural Textures. Where: Row 1: Test image Tl (256x256, 8-bit) consists of D17, herringborn weave and bark (true boundary overlaid for display only). Row 2: Clustering results using features extracted from a Level 4 filter bank generated by (Col 1) Lemarie-Battle filter of p = 1, (Col 2) autocorrelation function of Lemarie-Battle filter of p=l, (Col 3) Lemarie-Battle filter of p = 2. The zero-crossing algorithm were used for envelope detection. Row 3: Final segmentations after postprocessing.

PAGE 74

62 5.6.3 Synthetic Textures We tested the performance of our algorithm on several texture images synthesized from random noise based on [27]. • Gaussian Lowpass and Bandpass Textures. The images were generated by filtering a white Gaussian noise with mean of zero and standard deviation of 30 with a Gaussian filter of frequency response (in polar coordinate): • Filtered Impulse Noise (FIN). The images were generated by filtering a random impulse image I(m, n) generated by: where RAN G (0.0,1.0) is a random number with uniform distribution, with Gaussian impulse response: In both cases, images were linearly-scaled into the range of [0, 255]. Figure 5.4 shows a segmentation result on a Gaussian lowpass texture image, and Figure 5.5 shows a segmentation result on a filtered impulse noise (FIN) texture image. For this difficult test image T3, the algorithm achieved outstanding performance. We also tested our algorithm on a texture image containing regular and sparse elements. Figure 5.6 demonstrates the accurate segmentation result. A quantitative comparison, presenting the accuracy of our segmentation results of textured images is summarized in Table 5.1. This performance is consistent G{r,e)=e -(6-e 0 ) 2 /(2^BZ) e -(r-F c ) 2 /(2S 2 r ) g(m,n) = e m 2 /(2S 2 ) e -n 2 /(2S 2 )_ 5.7 Summary and Discussion

PAGE 75

(a) (b) (c) Figure 5.4: Segmentation of a Synthetic Gaussian Lowpassed Texture Image. Legend: (a) Test image T2 (256 x 256, 8-bit): Gaussian LP, left: isotropic F c = 0, S r = 60; right: non-isotropic F c = 0. 5 r = 60, 9 0 = 0, B 0 = 0.175. (b) Initial segmentation (Lemarie-Battle filter of p = 1, Level 4 and zero-crossing envelope detection) (c) Final segmentation after postprocessing. (a) (b) (c) Figure 5.5: Segmentation of a Synthetic FIN Texture Image. Legend: (a) Test image T3 (256 x 256, 8-bit): Filtered impulse noise, left: non-isotropic T = 0.15, S x — 1.0. S y = 1.5; right: non-isotropic T = 0.15, S x = 2.0, S y = 1.0. (b) Initial segmentation (Lemarie-Battle filter of p = 1, Level 4 and zero-crossing envelope detection) (c) Final segmentation after postprocessing.

PAGE 76

64 (a) (b) (c) Figure 5.6: Segmentation of a Synthetic Texture Image with Line Patterns. Legend: (a) Test image T4 (256 x 256, 8-bit) (b) Final segmentation (Lemarie-Battle filter of p = 1, Level 3 and zero-crossing based envelope detection). (c) Detected boundary overlaid with the original image. Table 5.1: Boundary Accuracy of the Segmentation Results. Test image Maximum ABE Average ABE ±a Tl 11.0 3.2 2.6 T2 15.0 2.9 2.4 T3 1 1.0 2.9 2.6 ABE: Absolute Boundary Error (in pixels). with the difficulty of segmentation perceived by human observers. We observed that boundary errors are dependent on shape i.e., complex boundaries yielded higher variance. Overcomplete wavelet packet representations and envelope features provide a versatile and flexible framework. The examples showed that satisfactory results can be achieved. Based on our experiments, low-order Lemarie filter (p = 1) achieved the best results. However, we are aware that there were several important parameters to be determined manually for the segmentation algorithm to work. These include the configuration of the representation and the number of classes. On the face of lacking a precise definition of the problem, we believed that no mathematical solution

PAGE 77

65 is possible, and these parameters can only be designed case-by-case for particular applications. Comparing with Gabor filters, overcomplete wavelet packet representations possess several potential advantages for further exploration: • channel filters cover exactly the frequency domain (provide a mathematically complete representation) without significant overlap, thus greatly reducing correlations between features extracted from distinct channels • adaptive pruning of a decomposition tree makes possible the reduction of computational complexity and the length of feature vectors • by moving up the decomposition tree, feature vectors at distinct resolution levels can be obtained to increase the accuracy of boundary detection

PAGE 78

CHAPTER 6 APPLICATION II: IMAGE DEBLURRING 6.1 Introduction Image deblurring is a special case of signal restoration. The goal is to recover an image which is blurred and usually degraded by noise. During the past two decades, large amount of research works have been done in this area [33, 2, 28]. Unlike the texture segmentation, image deblurring is a well defined problem. The mathematical model commonly used to describe the degradation process is a convolution and an additive noise, written as: y(k u k 2 ) = d{k u k 2 ) * x(k u k 2 ) + n(k u k 2 ), where x(ki,k 2 ) is the original image, y(ki,k 2 ) is the observed image, d(ki,k 2 ) is the impulse response of the blurring operator, n(k x ,k 2 ) is the noise and the symbol * stands for convolution. The deblurring problem is to restore the original image x(ki,k 2 ) from the degraded image y(h, k 2 ). This is thus a linear inverse problem. At the first glance, the problem seems solvable. Assuming known impulse response of the blurring process and the absent of noise, one may formally derive the solution in the frequency domain as: X(ui,oj 2 ) = Y(uj u lo 2 )/D(uji,uj 2 ). This is the simplest, most intuitive approach and is called inverse filtering [2, 28, 22]. However, the frequency response D(u\,u) 2 ) associated with a blurring process usually has zeros in the frequency plane and its reciprocal is thus unbounded. In the other point of view, blurring generally causes irreversible information loss even without the existence of noise. By taking into account of noise, the inverse filtering approach 66

PAGE 79

G7 would produce a restoration: X'(ui,u} 2 ) = Y(ui,u}2)/D{ui,u 2 ) = X(ui,u 2 ) + N{u)i,u)2)/D(u)i,u)2)In this case, the closeness of the restored image X' to the original image X is hinged on the strength of the noise N. Notice that the precise waveform of the noise is unknown and thus cannot be completely subtracted from the observed image Y. Inverse problems associated with a unbounded inverse operator is said to be ill-posed [33, 14]. The traditional mathematical tool for dealing with such problems is regularization methods [60]. In this chapter, we shall apply overcomplete wavelet packet representations to the deblurring problem. Our approach explicitly suppresses noise using the nonlinear shrink operator and approximates an inverse filter using the gain operator. In this section, we shall review modified inverse filters, Wiener filters and the more recent theory of wavelet-vaguelette inversion. 6.2.1 Modified Inverse Filters For blurs having zeros in the frequency plane, there are two modifications available to make a bounded inverse filter. 1. Gain-limited Inverse Filter. The idea is simply setting a upper-bound for the magnitude of frequency response to grow, as described below: where R is a positive real value and D(u x ,ui y ) is a real filter. 2. Pseudo-inverse Filter[28, page 276]. This is the same as the gain-limited inverse filter except the constant gain region set to zero: 6.2 Review of Some Deblurring Techniques ' 1/R F{U X ,Uy) = < l/D{LU X ,UJy) 0 < D(u x ,u y ) < R, \D{W X ,Uy)\ > R. -R < D(u x ,u y ) < 0, (6.1) 0 \/D(UJ X ,U)y) \D(u) x ,u) y )\ < R, \D{u) x ,u y )\ > R, (6.2)

PAGE 80

68 where R is a positive real value. Both approaches have their pros and cons. The advantage of the gain-limited inverse filter is that it is continuous in the frequency domain. The disadvantage is that the passband of \D(u} x ,u y ) \ < R may significantly enhance wideband noise. In opposite, the pseudo-inverse filter will completely wipe out anything outside its region of inversion by sacrificing the continuity. 6.2.2 Wiener Filters The idea of the Wiener filter is to choose a deblurring filter / to minimize the mean square restoration error defined as [18, 2, 22]: £ = E{[x'(k u k 2 )-x(k u k 2 )} 2 }, where x' = /* (d*x + n) and E denotes mathematical expectation [49]. Assume that noise n is a random field with zero mean and is independent of x, the expression of £ can be evaluated as: £ = E[{f*d*x-x) 2 -2{f*d*x-x){f* n) + (/ * n) 2 ] = E [((/ * d 8) * xf] +E[(f* nf where F, D, <3> xx and $„„ are all functions of (u^,^), and $ xx and $„„ are the power density spectrum of the signal and noise [49, Section 2.10], respectively. Furthermore, the integrand can be factored as: \FD 1| 2 $ xx + |F| 2 $ nn = (F1/D) (F* 1/D*) \D\ 2 $ xx + FF* xx + nn ) P £>| 2 $ xx + $„„) + (|D| 2 $ xx + $ nn -I F D' XX \D\ 2 IX + r (\D\ 2 +$ 1 l D l 2 *" + $

PAGE 81

69 Since t he hist term is large than or equal to zero, the optimal deblurring filter (Wiener filter) is thus F(u x ,U2) = — 2 -, (6.3) \D{u)^UJ 2 )\ 9 xx (Ui,U2) + «„ n (Wi,W2) which can be rewritten in terms of the signal-to-noise power ratio D*{u u uj 2 ) F(u}\,U} 2 ) — 9~ \D[u) X ,U) 2 )\ + $nn(uJl,UJ 2 )/$ xx (uJ u U 2 ) For white Gaussian noise, $ nn (^i, w 2 ) = NoThe power spectrum $ xx {ui,u 2 ) characterizes correlation property of the signal x. As the correlation usually falls off as the sample distant increases, $ xx (ui, u 2 ) usually is a lowpass function. If the signal has no correlation among its adjacent samples, $ xx (u)i,uj 2 ) = S 0 , and the signal-tonoise power ratio equals to a constant S 0 /N 0 = 1/R. In this case, the Wiener filter is reduced to %U2 ) = JM . (6.4) Comparing with the gain-limited inverse filter and the pseudo-inverse filter, the Wiener filter of (6.4) possesses both merits of continuity and noise suppression. 6.2.3 WaveletVaguelette Inversion The combination of waveletvaguelette decomposition (WVD) and nonlinear shrinkage of WVD coefficients was proposed by Donoho [14] for linear inversion problems involving dilation-homogeneous operators. An operator T with such property is commutable with a dilation operator D a such that TD a = a a D a T , where a is a constant, and D a is the dilation operator defined as D a [/(*)] = f{at) . The WVD involves three sets of functions: an orthogonal wavelet basis {ipj,k{t) = 2 j/2 *P(2 j t k)} and two near-orthogonal sets {u jJe (t) = 2 j/2 u{2 j t k)} and

PAGE 82

70 {vj, k (t) = 2 j / 2 v(2H k)}. The three sets are related by the quasi-singular value relations r ^j,fc = IjVjJe, (6-5) TX* = 7A*> (6-6) where quasi-singular values jj depend on resolution index j but not spatial index A;. Moreover, the two sets {%*(<)} and {v jt k(t)} satisfy biothogonality /oo Mj,*(*)Wm,n(')^ = S J.mh,n, -oo and near-orthogonality 2 i,fcLinking everything together, the reproducing formula of the WVD is expressed by = S t rx ' 7TViJfc(*)i where [x, y] = ff^, x(i)y*(£)di is the inner product of x and y. Finally, the WVD inversion of a white Gaussian contaminated measurement y(t) = Tx(t) + n(t) may be achieved via x'(t) = E^{b.M77 1 }^(*)> ( 6 7 ) j,* where /C (j is the nonlinear shrinking operator defined by (4.4). Notice that the threshold tj depends only on resolution j. 6.2.4 Discussion The Wiener filter provided a powerful tool for linear inversion problems. Given the statistics of both image and noise processes, one can design a optimal filter. However, Wiener filter belongs to the linear filtering paradigm and works in the

PAGE 83

71 frequency domain. The linear property determines that it has to trade-off between smoothing out noise and sharpening up edges. In the opposite, the waveletvaguelette approach deployed nonlinear shrinkage to combat noise in the time domain. It is characterized by three components: 1) dyadic analyzing and synthesizing functions determined by the blurring operator T; 2) level-dependent gain factors 7" 1 ; and 3) nonlinear shrinkage with level-dependent thresholds tj. The nonlinearity enables it to achieve better compromise between denoising and sharpening. However, it is limited to dilation-homogeneous operators only, excluding the most commonly encountered Gaussian and uniform blurs [14]. Moreover, the theory was developed in the continuous-time domain. The limitation of the WVD inversion lies on its insistence of the quasi-singular value relations (Eqs. (6.5) and (6.6)). For a convolution operator T with kernel h(t), in the frequency domain (6.5) is equal to H(u)^(2j lo) = jjV{2J io). (6.8) Let u)' = 2 k uj and plug it into (6.8), we have H(2 k u)V{2u k) oj) = jjV(2{j k) u). (6.9) In the other hand, directly plugging level index j — k into (6.8) produces H{uj)y(2{j k) uj) = ^ k V(2~^k) u). (6.10) By comparing (6.9) and (6.10), we conclude that it must be true that H(2 k u) 7* , TT . , = — — (independent of ui) , H(u) 7j_ fc which is the characteristic of dilation-homogeneous operators in the frequency domain.

PAGE 84

72 6.3 Discrete-time Overcomplete Wavelet Packet Inversion 6.3.1 One-dimensional Formulation There were two major considerations in onr searching for a deblurring algorithm. First, it should be available in the discrete-time domain. Second, it should be applicable to inhomogeneous blurring operators. The result was the discrete-time overcomplete wavelet packet representations equipped with a nonlinear shrinking operator and a gain operator. For an one-dimensional observation y{k) = d(k) * x(k) + n(k), our deblurring algorithm can be expressed by: 00 Wj{k)= E Vj{m)y{k-m), (6.11) m=— oo A/-1 oo = E E K h K(™)] x J u j( k m )> ( 6 12 ) j'=0 m= — oo where {Aj| 0
PAGE 85

73 To complete the framework, we still need to resolve two issues. First, for a given blur with frequency response D(u>), how can we determine the channel configuration (analyzing and synthesizing filters) as well as the gain vector? Second, how should we choose thresholds tj for denoising? We proposed the following algorithm for the channel structure and the gain vector: 1) Choose a gain-limited inverse filter or a Wiener filter to approximate. 2) Choose the prototype filters H(u), H(u), G(u) and G{u), and then the channel filters of the Level 1 can be determined by Ci, 0 (^) = H(uj)H{uj) and Ci,i{u) = G(u)G{u). 3) Call Algorithm 4.2.2 (Minimum Tree) to determine both the tree configuration and gain vector. However, the issue of determining thresholds tj of denoising is much more difficult. Thresholding schemes for several particular applications were suggested by Donoho [15]. Generally speaking, knowledge about signals and noise is needed to devise a thresholding scheme. By treating signal and noise as two random processes, the correlation property may be used to characterize them, as in the case of Wiener filtering. For wide-sense stationary white Gaussian noise and colored Gaussian signals, their autocorrelation functions d> and power spectrum densities $ may be written as: Note that <3> XI (u;) is generally different from the power density spectrum |X(u>)| 2 of a particular realization x of the signal process. Under these assumptions, a possible thresholding scheme is: (6.13) (6.14)

PAGE 86

74 where Bj is the bandwidth of the channel j, and uij is the center frequency of the channel. 6.3.2 Two-dimensional Extension The one-dimensional algorithm of the overcomplete wavelet packet inversion need to be extended to two-dimension in order to deal with images degraded by symmetric 2D blurs. Formally, the 2D version may be written as A/-1 M-l x'{k u k 2 ) =J2 Y, 5m^c 1 , CJ [« ; ci ) c 2 (mi,m2)] A Cl)C2 u Cl)CJ (fci -m 1 ,k 2 -m 2 ). Cl =0 C2=0 mi m 2 However, an efficient 2D overcomplete wavelet packet representation is much more difficult to seek than in the ID cases. In general, the 2D separable overcomplete wavelet packets may not be the efficient representation. This is due to the fact that both 2D modified inverse filters and Wiener filters are non-separable even for separable symmetric 2D blurs. For example, for a separable Gaussian blur: D(u x ,u y ) = Ae-^' +u2 y^ 2 , the pseudo-inverse filter and the Wiener filter are l/AetA+tU* , < osjMA/R) (615) 0 , otherwise, and F(^u; y ) = l/(l + R/Ae^ + ^° 2 ). Both filters are isotropic and can be efficiently represented by donut-shaped channel filters. In practice, a reasonable approximation may be achieved by applying the ID algorithm separately to rows and columns. For the separable Gaussian blur, a separable pseudo-inverse filter may be constructed as: F(u x ,ujy) = Fi(u) x )Fi(ujy), F{U X ,LOy)

PAGE 87

75 where the ID filter Fi(lj) is F l {u) f Ji/Ae" 2 /* 2 ,| W | < a^HA/R) 1 0 , otherwise. It is clear that the rectangular region (\uj x \ < a^/ln(.4/i?), < CT^ln(A/i?)) of the above 2D filter includes the disk of yju^ + to* < o\J\n(A/R) of the (6.15). By choosing appropriate parameter R, acceptable results may also be achieved by approximating 2D Wiener filters with ID Wiener filters. For uniform blurs, gain-limited inverse filters are similar to Wiener filters. The major problem of such separable approximations is they may boost diagonal frequency components significantly. Figures 6.1 and 6.2 show examples of both 2D non-separable and separable filters. The emphasis of the diagonal frequency region is particularly clear in the case of Gaussian blur. 6.4 Experimental Results To quantitatively compare deblurring results, we used an objective measure, the improvement in signal-to-noise ratio (ISNR), defined as [3]: 75^ = 20 logjlti},p=l,2, where \\v\\ p is L p norm defined as [56]: /N-i \ l lv E Hn)\ P ,n=0 and x, y and x' are the original, degraded and deblurred signals, respectively. Moreover, a performance upper bound is useful in providing a gauge for comparison. The performance of an oracle Wiener filter may be considered the upper bound for Wiener filters. An oracle Wiener filter is a Wiener filter with $ nn (a>) and $^(0;) replaced by power density spectrums |iV(a>)| 2 and lA'Xw)! 2 of the particular realization of the noise and the original signal. The filter may be written as: D*(u) |D(u;)| 2 + |aW/|A>)

PAGE 88

7G Row 1 2 3 Figure 6.1: Log Frequency Responses of Filters for a Gaussian Blur. Where: Row 1: Blurring filter Row 2: 2D non-separable filters, gain-limited, pseudo-inverse and Wiener filter from left to right Row 3: 2D separable filters, gain-limited, pseudo-inverse and Wiener filter from left to right

PAGE 89

77 Figure 6.2: Log Frequency Responses of Filters for an Uniform Blur. Row 1: Blurring filter Row 2: 2D non-separable filters, gain-limited, pseudo-inverse and Wiener filter from left to right Row 3: 2D separable filters, gain-limited, pseudo-inverse and Wiener filter from left to right

PAGE 90

78 However, it is not necessarily the upper bound of the overcomplete wavelet packet inversion. 6.4.1 One-dimensional Signals One-dimensional deblurring examples are shown in Figure 6.3 and 6.4. The ideal signal in both cases is a perfect multi-level blocky function, which is very challenging and revealing for any deblurring method. For overcomplete wavelet packet representations used in both examples, Lemarie filter of p= 1 was used. In Figure 6.3, the impulse response of a Gaussian blur was: Aexv[-{k/(aN/2)) 2 }, 0
PAGE 91

79 Row 1 Figure 6.3: One-dimensional Deblurrings of a Gaussian Blur. Where: Row 1: The original signal Row 2: The blur filter and the degraded signal Row 3: The oracle Wiener filter and the deblurred signal Row 4: The Wiener filter and the deblurred signal Row 5: The gain-limited inverse filter and the deblurred signal Row 6: The overall frequency response of the overcomplete wavelet packet inversion and the deblurred signal Row 7: Channels of the overcomplete wavelet packet representation Note: The left column are frequency responses. The central column are signals of full-length. The right column are zoom-in of the feature region, where deblurring results were overlaid with the degraded signal of doted line.

PAGE 92

80 Figure 6.4: One-dimensional Deblurrings of an Uniform Blur. Where: Row 1: The original signal Row 2: The blur filter and the degraded signal Row 3: The oracle Wiener filter and the deblurred signal Row 4: The Wiener filter and the deblurred signal Row 5: The gain-limited inverse filter and the deblurred signal Row 6: The overall frequency response of the overcomplete wavelet packet inversion and the deblurred signal Row 7: Channels of the overcomplete wavelet packet representation Note: The left column are frequency responses. The central column are signals of full-length. The right column are zoom-in of the feature region, where deblurring results were overlaid with the degraded signal of doted line.


Table 6.1: Performance of 1D Deblurring Examples

                       ISNR_2 (dB)                       ISNR_1 (dB)
  Blur Type     OWF     WF     GLIF    OWPI       OWF     WF     GLIF    OWPI
  Gaussian      2.37    1.88   -9.04   1.90      -0.75   -1.45  -15.87   1.18
  Uniform       5.17    3.90   -4.32   3.62      -1.00   -2.40  -11.80   0.00

  Note: OWF = Oracle Wiener Filter; WF = Wiener Filter; GLIF = Gain-limited Inverse Filter; OWPI = Overcomplete Wavelet Packet Inversion.

6.4.2 Two-dimensional Images

Examples of image deblurring using the standard Lena image are included in Figures 6.5 and 6.6. Separable blurring sources were used: the same 1D blurring filters used in the 1D examples were applied rowwise and columnwise on the original Lena image. A statistical signal model of (6.13) with a = 3.5 and σ_x = 0.008 was used for both Wiener filtering and the overcomplete wavelet packet inversion. In both cases, white Gaussian noise with σ_n² = 0.5 was added such that SNR = 44.3 dB. The thresholding scheme of (6.14) with a = 25.2 was used for both cases. The shrinking operator was applied on the 2D components. The gain-limited inverse filter (6.1) with R = 0.08 was used for both the comparison and the overcomplete wavelet packet inversion. A separable overcomplete wavelet packet inversion algorithm was used to approximate a 2D overcomplete wavelet packet inversion. For the case of Gaussian blur, the approximation used 11 x 11 channels with ε = 0.005% (1D). For the case of uniform blur, the approximation used 25 x 25 channels with ε = 3.9% (1D). The deblurred images were all linearly converted to the same intensity range as the original image for the calculation of the ISNR indices.
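A minimal sketch of the separable degradation described above is given below; the image file name, blur parameters and code structure are placeholders and assumptions, not the exact procedure used for the figures.

N = 256; A = 1; alpha = 0.02;
k = [0:N/2-1, -N/2:-1];
d = A * exp(-(k/(alpha*N/2)).^2); d = d / sum(d);   % 1D blurring filter, unit DC gain
img = double(imread('lena.png'));                   % hypothetical 256 x 256 test image
D = fft(d);                                         % 1D frequency response
B = real(ifft2( fft2(img) .* (D.' * D) ));          % rowwise and columnwise (separable) circular blur
B = B + sqrt(0.5) * randn(size(B));                 % white Gaussian noise, variance 0.5

Because the blur is separable, its 2D frequency response is simply the outer product of the 1D response with itself, which is why the rowwise/columnwise application is equivalent to a 2D blur.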


Table 6.2: Performance of 2D Deblurring Examples

                       ISNR_2 (dB)                       ISNR_1 (dB)
  Blur Type     OWF     WF     GLIF    OWPI       OWF     WF     GLIF    OWPI
  Gaussian      3.20    3.17  -14.04   2.80       2.96    3.25  -16.82   2.69
  Uniform       3.55    1.78  -14.78   1.76       1.46   -0.60  -18.05  -0.64

  Note: OWF = Oracle Wiener Filter; WF = Wiener Filter; GLIF = Gain-limited Inverse Filter; OWPI = Overcomplete Wavelet Packet Inversion.

Table 6.2 compares deblurring performance in terms of ISNR. For the two particular blurring sources and the Lena image, it shows that the overall performance of the overcomplete wavelet packet inversion is about the same as that of Wiener filtering by both the ISNR_2 and ISNR_1 measures.

6.5 Summary and Discussion

We have demonstrated that the deblurring problem can be handled using an overcomplete wavelet packet representation. We showed that inverse filtering in the frequency domain and denoising in the time domain can be done in a single step. In other words, for this application the gain and shrinking operators can be combined into a single operation on the spatial-frequency representation. Moreover, our approach may be seen as an extension of the wavelet-vaguelette inversion into both the discrete-time and broader problem domains.

Although the one-step inversion and denoising may be conceptually appealing, its limitation is also apparent. Since nonlinear shrinkage works only within areas where the magnitude of signal components is stronger than that of noise components, the best representation for denoising is one that maximizes the local signal-to-noise ratio. For example, if a sharp edge is decomposed into many channels, its components in some channels may be indistinguishable from noise. Therefore, the best representations for approximating the inverse filter and for denoising are generally different. Having only one choice of representation, the approach has to limit its application to certain signals or blurring sources.


Figure 6.5: Deblurring a Gaussian-blurred Lena Image. Legend: (a) the original 256 x 256 image; (b) the degraded image with SNR = 44.3 dB; (c) deblurred by the oracle Wiener filter; (d) deblurred by a Wiener filter; (e) deblurred by a gain-limited inverse filter; (f) deblurred by a separable overcomplete wavelet packet inversion.


Figure 6.6: Deblurring a Uniform-blurred Lena Image. Legend: (a) the original 256 x 256 image; (b) the degraded image with SNR = 44.3 dB; (c) deblurred by the oracle Wiener filter; (d) deblurred by a Wiener filter; (e) deblurred by a gain-limited inverse filter; (f) deblurred by a separable overcomplete wavelet packet inversion.


This limitation may be the reason that our preliminary experimental results showed roughly the same performance as the much simpler Wiener filtering approach. For the Gaussian and uniform blurring sources, some channels with very narrow bandwidths are needed to achieve a satisfactory approximation using the gain vector. For those channels, the shrinking operation is very close to a multiplication.
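The combined inversion-and-denoising step discussed above can be pictured, for a single channel, as a gain followed by a soft shrinkage. The sketch below is only a generic illustration: the coefficients, gain and threshold are placeholders and do not reproduce the scheme of (6.14).

w = randn(1, 512);                    % placeholder channel coefficients of the degraded signal
g = 3.0;                              % placeholder channel gain approximating the inverse filter
T = 2.5;                              % placeholder shrinkage threshold
wg = g * w;                           % gain step (approximate inverse filtering) on the channel
wd = sign(wg) .* max(abs(wg) - T, 0); % soft shrinkage suppresses noise amplified by the gain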


CHAPTER 7
CONCLUSION

We examined both classes of Fourier-based and wavelet-based representations. We showed that orthogonal wavelet transforms lack translation invariance and the efficacy to represent convolution operators. In particular, we demonstrated that aliasing distortion due to critical subsampling limits their applications in image enhancement and restoration. On the other hand, overcomplete wavelet representations, although not as efficient as non-redundant representations, avoid those shortcomings. Moreover, overcomplete wavelet representations can be conveniently analyzed using linear time-invariant system theory. In this thesis, we evaluated their capability of time-frequency localization using the uncertainty factor. We also identified gain, shrinking and envelope operators to approximate convolution operators, suppress noise and extract power density distributions. We investigated the criterion of minimum mean square error and designed an optimization algorithm. We extended wavelet shrinkage to overcomplete representations. We also presented two envelope detection algorithms and compared their performance.

The framework of the overcomplete wavelet representation was applied to two difficult problems: segmentation of textured images and image deblurring. We demonstrated that channel envelopes used as feature vectors produced satisfactory results on both natural and synthetic textures. We showed that the gain and shrinking operators may be used for image deblurring and pointed out the limitations of this approach.

The issues of time-frequency representations, pattern recognition and noise suppression are very fundamental ones. Our knowledge about them is going to be accumulative and evolving. This thesis increases our level of understanding and points to many opportunities for future research.


The issue of time-frequency localization was touched upon but not fully explored. Although the uncertainty factor was used for the evaluation of individual channels, we lack a criterion of optimality for a representation as a whole. Once such a criterion is found, we may be able to search for the optimal representation of textured images, which was not explored here. Denoising is certainly another front. The optimal representation and the adaptive selection of thresholds are going to be intertwined and challenging.


APPENDIX A
PROOFS RELATED TO DISCRETE-TIME WAVELETS

Some basic proofs related to (bi)orthogonal discrete-time wavelets are included here in order to make this thesis self-contained.

Proposition A.0.1 The necessary condition for a pair of discrete sequences to be orthogonal (as defined by (A.1)) is (A.2):

    Σ_{m=−∞}^{∞} a(2n − m) b(m − 2l) = λ δ(n − l),                                (A.1)

    A(e^{jω}) B(e^{jω}) + A(e^{j(ω+π)}) B(e^{j(ω+π)}) = 2λ,                        (A.2)

where λ is a constant, and A(e^{jω}) and B(e^{jω}) are the discrete-time Fourier transforms of a(n) and b(n), respectively.

Proof: By the change of variable k = m − 2l, the left side of (A.1) becomes Σ_{k=−∞}^{∞} a[2(n − l) − k] b(k). It is then equivalent to prove that

    Σ_{k=−∞}^{∞} a(2l − k) b(k) = λ δ(l).

Notice that the left side of the above is a 2-fold decimation of the series

    p(l) = Σ_{k=−∞}^{∞} a(l − k) b(k).

Since p(l) is the convolution of a(l) and b(l), its discrete-time Fourier transform is known to be P(e^{jω}) = A(e^{jω}) B(e^{jω}).


Using the result on the M-fold decimator proved in [63], the Fourier transform of p(2l) is equal to

    (1/2) [ P(e^{jω/2}) + P(e^{j(ω+2π)/2}) ],

and therefore the Fourier transform of (A.1) is

    P(e^{jω/2}) + P(e^{j(ω+2π)/2}) = 2λ.

Finally, by substituting P(e^{jω}) = A(e^{jω}) B(e^{jω}) back, we have proved

    A(e^{jω/2}) B(e^{jω/2}) + A(e^{j(ω+2π)/2}) B(e^{j(ω+2π)/2}) = 2λ.

Proposition A.0.2 The necessary condition for (A.3) is (A.4):

    Σ_{l=−∞}^{∞} a(2l − m) b(n − 2l) + Σ_{l=−∞}^{∞} c(2l − m) d(n − 2l) = δ(m − n),   (A.3)

    A(e^{jω}) B(e^{jω}) + C(e^{jω}) D(e^{jω}) = 2,
    A(e^{jω}) B(e^{j(ω+π)}) + C(e^{jω}) D(e^{j(ω+π)}) = 0.                            (A.4)

Proof: Consider both sides of (A.3) as two-dimensional series in the index (m, n), and apply the two-dimensional Fourier transform to both sides. Let p(m, n) = Σ_{l=−∞}^{∞} a(2l − m) b(n − 2l); then

    P(e^{jω_x}, e^{jω_y}) = Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} p(m, n) e^{−jmω_x} e^{−jnω_y}
                          = Σ_{l=−∞}^{∞} [ Σ_{n=−∞}^{∞} b(n − 2l) e^{−jnω_y} ] [ Σ_{m=−∞}^{∞} a(2l − m) e^{−jmω_x} ]
                          = Σ_{k=−∞}^{∞} 2π δ(2ω_x + 2ω_y + 2kπ) A(e^{jω_x}) B(e^{jω_y}).


Similarly, the 2D Fourier transform of the right side of (A.3) is

    Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} δ(m − n) e^{−jmω_x} e^{−jnω_y} = Σ_{k=−∞}^{∞} 2π δ(ω_x + ω_y + 2kπ).

Thus, the 2D Fourier transform of (A.3) is

    Σ_{k=−∞}^{∞} δ(ω_x + ω_y + kπ) [ A(e^{jω_x}) B(e^{jω_y}) + C(e^{jω_x}) D(e^{jω_y}) ] = Σ_{k=−∞}^{∞} 2 δ(ω_x + ω_y + 2kπ).

By combining terms with even and odd indices k, the above equation can be rewritten as

    Σ_{k=−∞}^{∞} δ(ω_x + ω_y + 2kπ) [ X(e^{jω_x}, e^{jω_y}) − 2 ]
      + Σ_{k=−∞}^{∞} δ(ω_x + ω_y + (2k + 1)π) X(e^{jω_x}, e^{jω_y}) = 0,

where X(e^{jω_x}, e^{jω_y}) = A(e^{jω_x}) B(e^{jω_y}) + C(e^{jω_x}) D(e^{jω_y}). Using the properties of the δ function, the coefficients of the above δ series must satisfy

    X(e^{j(ω+2kπ)}, e^{jω}) = X(e^{jω}, e^{jω}) = 2,
    X(e^{j(ω+(2k+1)π)}, e^{jω}) = X(e^{j(ω+π)}, e^{jω}) = 0.

Finally, by substituting X(e^{jω_x}, e^{jω_y}) back, we have proved

    A(e^{jω}) B(e^{jω}) + C(e^{jω}) D(e^{jω}) = 2,
    A(e^{j(ω+π)}) B(e^{jω}) + C(e^{j(ω+π)}) D(e^{jω}) = 0,

which is equivalent to (A.4).
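Both frequency-domain conditions can be checked numerically. The short MATLAB fragment below verifies (A.2) for the simple orthonormal pair a(n) = [1 1]/sqrt(2) with b(n) = a(−n), so that B(e^{jω}) is the conjugate of A(e^{jω}) and λ = 1; it is a sanity check only, not part of the proofs.

N = 256; w = 2*pi*(0:N-1)/N;
a  = [1 1]/sqrt(2);
A1 = a(1) + a(2)*exp(-sqrt(-1)*w);            % A(e^{jw}) = sum_n a(n) e^{-jwn}
A2 = a(1) + a(2)*exp(-sqrt(-1)*(w + pi));     % A(e^{j(w+pi)})
lhs = A1.*conj(A1) + A2.*conj(A2);            % A(w)B(w) + A(w+pi)B(w+pi) when b(n) = a(-n)
err = max(abs(lhs - 2))                       % close to zero, i.e. 2*lambda with lambda = 1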


Figure A.1: Equivalent Structures for (a) Convolution and Decimation and (b) Expansion and Convolution.

Proposition A.0.3 Two cascaded stages of convolution and decimation (expansion) are equivalent to a single stage of convolution and decimation (expansion), as illustrated in Figure A.1.

Proof: Assume input x(n) and output y(n), and let a(n) and b(n) be the inverse DFTs of A(ω) and B(ω). For (a),

    y(n) = Σ_{m=−∞}^{∞} [ Σ_{k=−∞}^{∞} x(k) a(2m − k) ] b(2n − m)
         = Σ_{k=−∞}^{∞} x(k) [ Σ_{m=−∞}^{∞} a(2m − k) b(2n − m) ]
         = Σ_{k=−∞}^{∞} x(k) [ Σ_{l=−∞}^{∞} a(4n − k − 2l) b(l) ]
         = Σ_{k=−∞}^{∞} x(k) p(4n − k),

where p(n) = Σ_{l=−∞}^{∞} a(n − 2l) b(l), or P(e^{jω}) = A(e^{jω}) B(e^{j2ω}). Similarly, for (b),

    y(n) = Σ_{m=−∞}^{∞} [ Σ_{k=−∞}^{∞} x(k) c(m − 2k) ] d(n − 2m)
         = Σ_{k=−∞}^{∞} x(k) [ Σ_{m=−∞}^{∞} c(m − 2k) d(n − 2m) ]
         = Σ_{k=−∞}^{∞} x(k) [ Σ_{l=−∞}^{∞} c(l) d(n − 4k − 2l) ]
         = Σ_{k=−∞}^{∞} x(k) q(n − 4k),

where q(n) = Σ_{l=−∞}^{∞} c(l) d(n − 2l), or Q(e^{jω}) = C(e^{j2ω}) D(e^{jω}).
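Proposition A.0.3(a) is easy to confirm numerically: two cascaded stages of filtering and 2-fold decimation agree with a single filtering by the equivalent filter of frequency response A(ω)B(2ω) followed by 4-fold decimation. In the sketch below the filters and the input are arbitrary random sequences.

x = randn(1, 256); a = randn(1, 8); b = randn(1, 6);
v  = conv(x, a);  w1 = v(1:2:end);            % first stage:  filter by a, keep every other sample
z  = conv(w1, b); y1 = z(1:2:end);            % second stage: filter by b, keep every other sample
bu = zeros(1, 2*length(b)); bu(1:2:end) = b;  % b expanded by 2, i.e. frequency response B(2w)
c  = conv(a, bu);                             % equivalent filter, A(w)B(2w)
y2 = conv(x, c); y2 = y2(1:4:end);            % single stage, 4-fold decimation
L  = min(length(y1), length(y2));
err = max(abs(y1(1:L) - y2(1:L)))             % zero up to round-off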


Figure A.2: Illustration of Adding One More Channel by Splitting a Channel into Two.

Proposition A.0.4 The basis functions for a binary tree structured filter bank can be expressed as:

    V_0(e^{jω}) = G(e^{jω}),   U_0(e^{jω}) = G̃(e^{jω}),   k = 0,

    V_k(e^{jω}) = [ Π_{l=0}^{k−1} H(e^{j2^l ω}) ] G(e^{j2^k ω}),   U_k(e^{jω}) = [ Π_{l=0}^{k−1} H̃(e^{j2^l ω}) ] G̃(e^{j2^k ω}),   1 ≤ k ≤ M − 2,

    V_{M−1}(e^{jω}) = Π_{l=0}^{M−1} H(e^{j2^l ω}),   U_{M−1}(e^{jω}) = Π_{l=0}^{M−1} H̃(e^{j2^l ω}),   k = M − 1,

where M ≥ 2. For an orthogonal wavelet basis satisfying condition (2.11), we have U_k(e^{jω}) = V_k*(e^{jω}), or u_k(n) = v_k*(−n).

Proof: By direct application of Proposition A.0.3.

Proposition A.0.5 Basis functions generated by a general binary tree with filters h_l(n) and g_l(n) satisfying (2.9) satisfy the following biorthogonality:

    Σ_{m=−∞}^{∞} u_c(m − n_c l) v_k(n_k n − m) = δ(c − k) δ(l − n).                  (A.5)

Proof: By induction. For a filter bank of 2 channels, u_0 = g̃_0(n), u_1 = h̃_0(n), v_0 = g_0(n) and v_1 = h_0(n). In this case, the orthonormality of the basis functions is the given property. Now, assume (A.5) is true for up to M channels. To construct an M + 1 channel filter bank, we split an arbitrary channel c into two, as illustrated in Figure A.2.


Since the transform coefficient w_c(n) before splitting is

    w_c(n) = Σ_{m=−∞}^{∞} x(m) v_c(n_c n − m),

the new coefficients w_{c1}(n) and w_{c2}(n) are

    w_{c1}(n) = Σ_{l=−∞}^{∞} w_c(l) g_c(2n − l)
              = Σ_{m=−∞}^{∞} x(m) [ Σ_{l=−∞}^{∞} v_c(n_c l − m) g_c(2n − l) ]
              = Σ_{m=−∞}^{∞} x(m) v_{c1}(2 n_c n − m),

where v_{c1}(n) = Σ_{l=−∞}^{∞} v_c(n − n_c l) g_c(l). Similarly,

    v_{c2}(n) = Σ_{l=−∞}^{∞} v_c(n − n_c l) h_c(l),
    u_{c1}(n) = Σ_{l=−∞}^{∞} u_c(n − n_c l) g̃_c(l),
    u_{c2}(n) = Σ_{l=−∞}^{∞} u_c(n − n_c l) h̃_c(l).

Similarly, we can prove

    Σ_{m=−∞}^{∞} u_{c2}(m − n_{c1} i) v_{c2}(n_{c1} n − m) = δ(n − i),
    Σ_{m=−∞}^{∞} u_{c1}(m − n_{c1} i) v_{c2}(n_{c1} n − m) = 0,
    Σ_{m=−∞}^{∞} u_{c2}(m − n_{c1} i) v_{c1}(n_{c1} n − m) = 0.

We have thus proved that the basis functions of the new filter bank constructed in this way are also orthonormal. Notice that orthonormality holds under quite general conditions: 1) the filters g_c, h_c, g̃_c and h̃_c may differ from those used in other channels, as long as they satisfy condition (2.9); 2) the splitting can be done arbitrarily.


APPENDIX B
NUMERICAL COMPUTATION OF BATTLE-LEMARIE WAVELET

Battle-Lemarie wavelets are built from polynomial splines of order 2p + 1 [42]. Although they are not compactly supported, they do possess two important properties: symmetry and orthogonality. The symmetry property is especially desirable for some image processing applications such as edge detection, character recognition [39], and texture segmentation [62], but it is mutually exclusive with the compact support property [13]. In this appendix, we present an algorithm to compute the scaling function, the wavelet, and the associated quadrature mirror filter of Battle-Lemarie wavelets.

B.1 Functions in Finite Terms

The Fourier transforms of the scaling function φ, the wavelet ψ, and the associated quadrature mirror filter h may be written as [42]:

    Φ_p(ω) = 1 / ( ω^{2p+2} sqrt( Σ_{4p+4}(ω) ) ),                                          (B.1)

    Ψ_p(ω) = ( e^{−jω/2} / ω^{2p+2} ) sqrt( Σ_{4p+4}(ω/2 + π) / ( Σ_{4p+4}(ω) Σ_{4p+4}(ω/2) ) ),   (B.2)

    H_p(ω) = sqrt( Σ_{4p+4}(ω) / ( 2^{4p+4} Σ_{4p+4}(2ω) ) ),                                (B.3)

where p is a positive integer and

    Σ_n(ω) = Σ_{k=−∞}^{+∞} 1 / (ω + 2kπ)^n.


Therefore, the computation of the function Σ_n(ω) is the key. Fortunately, we have a closed form for n = 2 [42],

    Σ_2(ω) = Σ_{k=−∞}^{+∞} 1 / (ω + 2kπ)^2 = 1 / ( 4 sin^2(ω/2) ),

and a closed-form expression of the function Σ_n(ω) based on Σ_2(ω):

    Σ_n(ω) = ( (−1)^n / (n − 1)! ) d^{n−2} Σ_2(ω) / dω^{n−2}.

Thus, all we need is an algorithm to compute derivatives of the function Σ_2(ω), which we derive in the following.

Proposition B.1.1 For any integer n ≥ 1, the n-th order derivative of the function Σ_2(ω) has the form

    d^n Σ_2(ω) / dω^n = g_n(cos(ω/2)) / sin^{n+2}(ω/2),                                     (B.4)

where g_n(cos(ω/2)) is a polynomial of order n that satisfies the following recursive relation:

    g_{n+1}(x) = −(1/2) (1 − x^2) g_n'(x) − ((n + 2)/2) x g_n(x).                            (B.5)

Proof: By mathematical induction. For n = 1,

    d Σ_2(ω)/dω = −0.25 cos(ω/2) / sin^3(ω/2),

so that g_1(cos(ω/2)) = −0.25 cos(ω/2). Assume that for some n ≥ 1, (B.4) is true and g_n(cos(ω/2)) is a polynomial of order n. Then, for n + 1,

    d^{n+1} Σ_2(ω)/dω^{n+1} = d/dω [ g_n(cos(ω/2)) / sin^{n+2}(ω/2) ]
      = [ (d/dω) g_n(cos(ω/2)) · sin^{n+2}(ω/2) − g_n(cos(ω/2)) · (n + 2) sin^{n+1}(ω/2) · cos(ω/2) · (1/2) ] / sin^{2n+4}(ω/2)
      = [ −(1/2) g_n'(cos(ω/2)) (1 − cos^2(ω/2)) − ((n + 2)/2) g_n(cos(ω/2)) cos(ω/2) ] / sin^{(n+1)+2}(ω/2)
      = g_{n+1}(cos(ω/2)) / sin^{(n+1)+2}(ω/2),


where g_{n+1}(x) = −(1/2)(1 − x^2) g_n'(x) − ((n + 2)/2) x g_n(x), which is clearly of order n + 1.

Proposition B.1.2 The coefficients {a_{n,k}} of the polynomial g_n(x) = Σ_{k=0}^{n} a_{n,k} x^k satisfy the following recursive relation:

    a_{0,0} = 0.25,   a_{0,1} = 0,
    a_{n,0} = −0.5 a_{n−1,1},
    a_{n,k} = 0.5 [ (k − n − 2) a_{n−1,k−1} − (k + 1) a_{n−1,k+1} ],   1 ≤ k ≤ n,            (B.6)

with the convention that a_{n−1,k} = 0 for k < 0 or k > n − 1; in particular, a_{n,n−1} = −1.5 a_{n−1,n−2} and a_{n,n} = −a_{n−1,n−1}.

Proof: Substitute g_n(x) = Σ_k a_{n,k} x^k into the right side of (B.5), expand, and compare the coefficients of equal powers of x on both sides; this yields (B.6).

Proposition B.1.3 For an even order n of the polynomial g_n(x), only the terms with an even index are nonzero. For an odd order n of the polynomial g_n(x), only the odd-index


terms are nonzero. (This property halves the number of calculations needed for the coefficients.)

Proof: By induction. For n = 1, a_{1,0} = −0.5 a_{0,1} = 0 and a_{1,1} = −a_{0,0} = −0.25. For n = 2, a_{2,0} = −0.5 a_{1,1} = 0.125, a_{2,1} = −1.5 a_{1,0} = 0, and a_{2,2} = −a_{1,1} = 0.25. Assume the assertion is valid for n − 1. For even n, let n = 2m; then

    a_{2m,2m} = −a_{2m−1,2m−1},   a_{2m,2m−1} = −1.5 a_{2m−1,2m−2},   a_{2m,0} = −0.5 a_{2m−1,1},
    a_{2m,l} = 0.5 [ (l − 2m − 2) a_{2m−1,l−1} − (l + 1) a_{2m−1,l+1} ].

Since 2m is even, 2m − 1 is odd, and for odd l, l − 1 is even. Using the induction hypothesis, all the odd-index terms are zero. A similar argument applies to odd-order polynomials.

Based on the above properties, we can now write Σ_n(ω) as

    Σ_n(ω) = (−1)^n g_{n−2}(cos(ω/2)) / ( (n − 1)! sin^n(ω/2) ).                             (B.9)

By combining (B.9) and (B.3), we obtain

    H_p(ω) = cos^{2p+2}(ω/2) sqrt( g_{4p+2}(cos(ω/2)) / g_{4p+2}(cos ω) ).

From Eqs. (B.9), (B.1) and (B.2), we obtain

    Φ_p(ω) = ( sin(ω/2) / (ω/2) )^{2p+2} sqrt( (4p + 3)! / ( 2^{4p+4} g_{4p+2}(cos(ω/2)) ) ),

and therefore

    Φ_p(0) = sqrt( (4p + 3)! / ( 2^{4p+4} g_{4p+2}(1) ) ),   Ψ_p(0) = 0.
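The coefficient recursion of Proposition B.1.2 is easily tabulated. The fragment below builds a_{n,k} for n up to 4 and reproduces the hand-computed values used in the proof of Proposition B.1.3 (a_{1,1} = −0.25, a_{2,0} = 0.125, a_{2,2} = 0.25); the array name and size are arbitrary choices.

nmax = 4;
A = zeros(nmax+1, nmax+3);       % A(n+1, k+1) stores a_{n,k}; the extra columns stay zero
A(1,1) = 0.25;                   % a_{0,0} = 1/4, all other a_{0,k} = 0
for n = 1:nmax
  for k = 0:n
    if k == 0
      A(n+1,1) = -0.5 * A(n,2);                              % a_{n,0} = -0.5 a_{n-1,1}
    else
      A(n+1,k+1) = 0.5*((k - n - 2)*A(n,k) - (k + 1)*A(n,k+2));
    end
  end
end
% A(2,2) = -0.25, A(3,1) = 0.125, A(3,3) = 0.25; odd/even indices vanish as stated

Note that the general formula already yields the boundary cases a_{n,n−1} = −1.5 a_{n−1,n−2} and a_{n,n} = −a_{n−1,n−1}, because out-of-range coefficients are zero.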


Having derived analytical expressions, we are able to calculate these three functions of any order at any frequency point.

B.2 Numerical Computation Using FFT

The three functions that we investigated in the last section are continuous functions in the frequency domain. To compute their waveforms in the time domain, we used the FFT (Fast Fourier Transform) approach.

H_p(ω) is a 2π-periodic function. It is the discrete-time Fourier transform of the quadrature mirror filter h(n), which is a discrete function in the time domain. Recall that sampling on the unit circle of the frequency domain results in a periodic extension of h(n) in the time domain. However, if the sampling period is small enough, aliasing in the time domain is negligible and the h(n) obtained will be accurate. The sampling values of H_p(ω) can be obtained from the closed-form expression derived in the last section.

Φ_p(ω) and Ψ_p(ω) are the Fourier transforms of the scaling function φ(t) and the wavelet ψ(t), respectively. Notice that φ(t) and ψ(t) are continuous functions in the time domain; by numerical methods we can only obtain sampled values in the time domain. Recall that the DFT (Discrete Fourier Transform) of the sampled values φ(nT) and ψ(nT) is related to the Fourier transforms of the continuous functions through periodization (aliasing). If the sampling interval T is small enough, the aliasing is negligible, and the samples φ(nT) and ψ(nT) can therefore be computed accurately by an inverse FFT of the sampled values of Φ_p(ω) and Ψ_p(ω).

Table B.1: Coefficients of g_24(x) (a) and Truncated Impulse Response of h_1(n) (b).

  (a)                                      (b)
   k     a_{24,k}                           n     h_1(n)
   0     3679416778537.75                   0     0.54173576
   2     475582801692177.0                  1     0.30682964
   4     8011347151626016.5                 2    -0.03549798
   6     40907559988277171.5625             3    -0.07780792
   8     81471913154443867.5                4     0.02268462
  10     69736261315895408.25               5     0.02974682
  12     26157606919852821.0                6    -0.01214549
  14     4123375115615239.5                 7    -0.01271542
  16     243456149817686.25                 8     0.00614143
  18     4245937488920.0                    9     0.00579932
  20     13213718716.5                     10    -0.00307863
  22     2097148.875                       11    -0.00274529
  24     0.25                              12     0.00154624

  Note: h_p(−n) = h_p(n).

On the issue of implementation, we notice that the coefficients of the polynomial g_n(x) grow very fast. As an example, the coefficients of g_24(x) are shown in Table B.1. In our numerical calculations, 64-bit precision was used. Notice that g_n(x) appears in both the numerator and the denominator of the formula for H_p(ω). This suggests another way of preventing overflow in calculating H_p(ω): use k(n) g_n(x), where k(n) is an appropriate multiplier that should be embedded into the iterative process of (B.6).


Figure B.1: Transition Band of Filters H_p(ω) with p = 1, 3, 9.

As the order p of the filter increases, the filter H_p(ω) becomes closer to the ideal half-band filter. Figure B.1 shows the transition band of the filters with p = 1, 3, 9. Two MATLAB m-files for calculating the QMF of the B-L wavelets are listed in the following.

function [H,G] = Lfilter(p,N)
% This function builds a quadrature mirror filter pair H, G
% H: low-pass filter; G: high-pass filter
% p: 2p + 1 is the order of the Lemarie wavelet
% N: length of the filters
L = 4*p + 2;
C = gcoef(L);                   % length of C is L+1
H = zeros(1,N); G = zeros(1,N);
H(1) = sqrt(2);                 % at w = 0, H = sqrt(2)
H(N/2 + 1) = 0.0;               % at w = pi, H = 0
dw = 2*pi/N;
for i = 1 : N/2 - 1
  omega = i * dw;
  u = 0; v = 0;
  for k = 1 : 2 : L + 1
    v = v + C(k) * cos(omega/2)^(k - 1);
    u = u + C(k) * cos(omega)^(k - 1);
  end;
  % assumed closed-form H_p(w) from Section B.1, normalized so that H(0) = sqrt(2)
  H(i + 1) = sqrt(2) * cos(omega/2)^(2*p + 2) * sqrt(v/u);
  H(N - i + 1) = H(i + 1);      % h(n) is symmetric, so H(w) is real and even
end;
% assumed standard QMF relation for the high-pass filter: G(w) = exp(-jw)*H(w + pi)
kk = [0 : N-1];
G = exp(-sqrt(-1)*2*pi*kk/N) .* H(rem(kk + N/2, N) + 1);

function [c] = gcoef(n)
c = zeros(1, n+1);
t = zeros(1, n+1);
c(1) = 0.25; c(2) = 0;
if( n > 8 )
  k = n/4;
else
  k = n;
end;
for i = 1:1:n
  for j = 0:1:i
    if j == 0
      t(1) = -0.5*c(2);
    elseif( j > 0 & (j <= i-2) )
      t(j + 1) = 0.5*((j - i - 2)*c(j) - (j + 1)*c(j + 2));
    elseif j == (i - 1)
      t(j + 1) = -1.5*c(i - 1);
    else
      t(j + 1) = -c(i);
    end;
  end;
  if( i > k )                   % apply multiplier
    c(1 : i + 1) = t(1 : i + 1)/(i - k);
  else
    c(1 : i + 1) = t(1 : i + 1);
  end;
end;
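A hedged usage sketch for the two m-files above: it builds the p = 1 pair on 256 frequency samples, recovers the impulse response by an inverse FFT, and checks the power-complementary property |H(ω)|² + |H(ω + π)|² = 2 expected of an orthonormal scaling filter. The sample count and the check itself are illustrative choices, not part of the original listings.

N = 256;
[H, G] = Lfilter(1, N);                       % DFT samples of the QMF pair for 2p + 1 = 3
h = real(ifft(H));                            % symmetric impulse response, h(-n) = h(n)
Hshift = H([N/2+1:N, 1:N/2]);                 % samples of H(w + pi)
qmf_err = max(abs( abs(H).^2 + abs(Hshift).^2 - 2 ))   % expected to be small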


APPENDIX C
NUMERICAL COMPUTATION OF THE UNCERTAINTY FACTOR

Eq. (3.7) defines the uncertainty factor of a lowpass filter in the continuous frequency domain. In order to compute uncertainty factors of filters in a filter bank, two issues need to be resolved. First, the definition for a lowpass filter has to be extended to bandpass and highpass filters. Second, the formulas have to be modified for computation in the discrete frequency domain.

The general definition of the variance of the frequency distribution, including bandpass filters, adopted by Liu and Akansu [40] is:

    σ_ω² = ∫_{−π}^{π} (ω − ω̄)² |F(e^{jω})|² dω / ∫_{−π}^{π} |F(e^{jω})|² dω,
    ω̄ = ∫_{−π}^{π} ω |F(e^{jω})|² dω / ∫_{−π}^{π} |F(e^{jω})|² dω.                         (C.1)

The drawback of applying this definition to bandpass filters is that for real filters |F(e^{jω})| = |F(e^{−jω})|, and thus ω̄ = 0, which is not the central frequency of the passband. Therefore, this is not a fair comparison between lowpass and bandpass filters. Moreover, the variance will also depend on the location of the passband through the factor (ω − ω̄)² = ω², and thus it is not fair even among bandpass filters. To overcome this problem, we notice that the idea behind the variance σ_ω is the energy distribution within a complete passband. For bandpass filters, a single passband on the positive frequency side should be used. We thus adopted the following definition:


Figure C.1: Channel Bandwidths of Overcomplete Wavelet Representations.

    σ_ω² = ∫_{ω_l}^{ω_h} (ω − ω̄)² |F(e^{jω})|² dω / ∫_{ω_l}^{ω_h} |F(e^{jω})|² dω,
    ω̄ = ∫_{ω_l}^{ω_h} ω |F(e^{jω})|² dω / ∫_{ω_l}^{ω_h} |F(e^{jω})|² dω,                    (C.2)

where

    (ω_l, ω_h) = (−π, π)   for lowpass filters,
                 (0, π)    for bandpass filters,
                 (π, 2π)   for highpass filters.

Notice that this definition is good only for filters whose passband is completely contained in the range [ω_l, ω_h], although sidebands are allowed within this range. Also notice that the bandwidth of the lowpass band is twice the width of the bandpass bands by the construction of (3.3), as illustrated in Figure C.1.

In the discrete frequency domain, Eqs. (C.1) and (C.2) should be modified as:

    σ_ω² = [ Σ_{k=k_l}^{k_h} (k − k_c)² |F(k)|² / Σ_{k=k_l}^{k_h} |F(k)|² ] (2π/N)²,
    k_c = Σ_{k=k_l}^{k_h} k |F(k)|² / Σ_{k=k_l}^{k_h} |F(k)|²,

where N is the number of sampling points in [0, 2π], and

    (k_l, k_h) = (−N/2, N/2)   for lowpass filters,
                 (0, N/2)      for bandpass filters,
                 (N/2, N)      for highpass filters.

Notice that |F(k)| is N-periodic such that |F(k)| = |F(k + N)|, and therefore |F(−k)| = |F(N − k)|. Similarly, due to the periodic nature of the sequence f(n) in the time domain, the variance σ_n should be modified as

    σ_n² = min_{d} [ Σ_{n=0}^{N−1} (n − n̄)² |f(n + d)|² / Σ_{n=0}^{N−1} |f(n + d)|² ],


where f(n + d) is the original f(n) circularly shifted by d.

A MATLAB function ufactor() for computing the uncertainty factor is included in the following. Also included is a MATLAB function lwuf() for building the analyzing filters of an overcomplete wavelet packet from a prototype Lemarie filter and computing their uncertainty factors; the function Lfilter() is given in Appendix B.

function [u] = ufactor(H)
% H is the Discrete Fourier Transform of a discrete filter h(n)
N = length(H);
W = H .* conj(H);
vmax = max(W);
if W(1) > 0.95*vmax                    % low-pass filter
  F = [W(N/2 + 1 : N), W(1 : N/2)];
elseif W(N/2) > 0.95*vmax              % high-pass filter
  F = W;
else                                   % band-pass filter
  F = W(1 : N/2);
end;
L = length(F);
fc = [0 : L-1] * F' / sum(F);
sf = (([0 : L-1] - fc).^2) * F' / sum(F) * (2*pi/N)^2;
% transform to time domain
h = real(ifft(H));
st = 1e10;
for d = 2 : N-1                        % circular shift to find the minimum
  x = [h(d : N), h(1 : d-1)];
  W = x.^2;
  tc = [0 : N-1] * W' / sum(W);
  m = (([0 : N-1] - tc).^2) * W' / sum(W);
  if( m < st )
    st = m;
  end;
end;
u = st * sf;

function [] = lwuf(shell, p, nlevel)
% shell : =1, autocorrelation shell; =0, regular
% p     : 2p + 1 is the order of the Lemarie wavelet
% nlevel: number of levels
N = 2^nlevel;
SIZE = 256;
omega = 2*pi*[0 : SIZE-1]/SIZE;
[H, G] = Lfilter(p, SIZE);
H = H/sqrt(2); G = G/sqrt(2);
if shell == 1
  H = H .* conj(H);


  G = G .* conj(G);
end;
W = ones(N, SIZE);
HH = ones(1, SIZE); GG = ones(1, SIZE);
d = 1;
for n = 1 : nlevel
  HH = H(rem([0 : SIZE-1]*d, SIZE) + 1);
  GG = G(rem([0 : SIZE-1]*d, SIZE) + 1);
  for j = 2^n : -2 : 1
    pt = ceil(j/2);
    if rem(pt, 2) == 0
      W(j,:)   = W(pt,:) .* GG;
      W(j-1,:) = W(pt,:) .* HH;
    else
      W(j,:)   = W(pt,:) .* HH;
      W(j-1,:) = W(pt,:) .* GG;
    end;
  end;
  d = d*2;
end;
clg;
mx = 1.1 * max(max(abs(W(:,:))));
for n = 1 : N
  f = abs(W(n,:));
  if n <= N/2
    plot(omega, f);
  else
    plot(omega, f, ':');
  end;
  uf = ufactor(W(n,:));
  str = sprintf('%.2f', uf);
  text(((n - 1) + 0.1)*pi/N, 0.95*mx, str);
  if n == 1
    axis([0 pi 0 mx]);
    hold;
  end;
end;
hold off;
of = sprintf('p%dl%ds%d.eps', p, nlevel, shell);
eval(['print -deps ' of]);
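A brief usage sketch for ufactor(): it evaluates the uncertainty factor of the low-pass prototype and of one band-pass channel H(ω)G(2ω) built from it, assuming the m-files above. The order p, the number of samples and the choice of channel are illustrative only.

N = 256;
[H, G] = Lfilter(3, N);                          % prototype pair from Appendix B
u_low = ufactor(H)                               % low-pass prototype
band  = H .* G(rem([0:N-1]*2, N) + 1);           % second-level band-pass channel H(w)G(2w)
u_band = ufactor(band)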


APPENDIX D
BOUNDARY TREATMENTS OF FINITE-LENGTH SEQUENCES

D.1 Periodic Extensions

For convolutions involving finite-length discrete sequences, we face the boundary problem. One solution is to construct an infinite-length signal from the finite-length sequence. For a finite-length sequence s(n) of length N, two constructions are commonly used:

1) N-periodic extension. An infinite-length signal s̃(n) of period N is constructed as s̃(n) = s(n) for 0 ≤ n < N, with s̃(n + N) = s̃(n) for all n.

2) 2N-periodic mirror extension. An infinite-length signal of period 2N is constructed as s̃(n) = s(n) for 0 ≤ n < N and s̃(n) = s(2N − 1 − n) for N ≤ n < 2N, with s̃(n + 2N) = s̃(n) for all n; this extension is symmetric about n = −1/2.
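Both extensions are easy to realize by index arithmetic; the sketch below builds them for a short placeholder sequence.

s = [1 2 3 4 5];                       % finite-length sequence, N = 5 (placeholder values)
N = length(s);
sm = [s, fliplr(s)];                   % one period of the 2N-periodic mirror extension
n = -4 : 2*N;                          % sample the extensions over an arbitrary index range
s_per    = s(mod(n, N) + 1);           % N-periodic extension
s_mirror = sm(mod(n, 2*N) + 1);        % mirror extension, symmetric about n = -1/2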

D.2 Symmetric Sequences

For an integer m ≥ 1, the m-fold expansion of a sequence s(n) inserts m − 1 zeros between consecutive samples; we denote it by ↑m(s(n)) = s_{↑m}(n). Equivalently, we may write

    s_{↑m}(n) = Σ_{k=−∞}^{∞} s(k) δ(n − km).                                        (D.1)

We then define symmetric/antisymmetric sequences as:

Definition D.2.1 For a discrete sequence s(n), if there exist integers c and m such that s_{↑m}(c − n) = s_{↑m}(c + n) (respectively s_{↑m}(c − n) = −s_{↑m}(c + n)), the sequence is said to be symmetric (antisymmetric) at c/m, and C = c/m is said to be the symmetry center.

For symmetry sequences, we may define a symmetry index S: S = 1 for symmetric sequences, and S = −1 for antisymmetric sequences. Symmetry sequences possess some special properties. We will prove that the set of symmetry sequences is closed under convolution. We first prove a proposition.

Proposition D.2.2 The expansion operation is distributive over convolution:

    ↑m(s ∗ f)(n) = ↑m(s(n)) ∗ ↑m(f(n)).

Proof: According to (s ∗ f)(n) = Σ_{k=−∞}^{∞} s(k) f(n − k) and (D.1), the left side is

    ↑m(s ∗ f)(n) = Σ_{l=−∞}^{∞} Σ_{k=−∞}^{∞} s(k) f(l − k) δ(n − lm)
                 = Σ_{k=−∞}^{∞} s(k) Σ_{l=−∞}^{∞} f(l − k) δ(n − lm)
                 = Σ_{k=−∞}^{∞} s(k) Σ_{i=−∞}^{∞} f(i) δ(n − (i + k)m),

and the right side is

    ↑m(s(n)) ∗ ↑m(f(n)) = Σ_{k=−∞}^{∞} [ Σ_{l=−∞}^{∞} s(l) δ(k − lm) ] [ Σ_{i=−∞}^{∞} f(i) δ(n − k − im) ]
                        = Σ_{l=−∞}^{∞} s(l) Σ_{i=−∞}^{∞} f(i) Σ_{k=−∞}^{∞} δ(k − lm) δ(n − k − im)
                        = Σ_{l=−∞}^{∞} s(l) Σ_{i=−∞}^{∞} f(i) δ(n − lm − im).

The two sides are equal. We now prove the closure property under convolution.


Theorem D.2.3 Convolution of two symmetry sequences x(n) and f(n) results in a symmetry sequence y(n) with symmetry index S_y = S_x S_f and symmetry center C_y = C_x + C_f.

Proof: Based on the given conditions, we have

    x_{↑m}(c_x − n) = S_x x_{↑m}(c_x + n),
    f_{↑m}(c_f − n) = S_f f_{↑m}(c_f + n).

Applying Proposition D.2.2, we have

    y_{↑m}(c_x + c_f − n) = Σ_{k=−∞}^{∞} x_{↑m}(k) f_{↑m}(c_x + c_f − n − k)
                          = Σ_{l=−∞}^{∞} x_{↑m}(c_x − l) f_{↑m}(c_f − n + l)
                          = Σ_{l=−∞}^{∞} S_x x_{↑m}(c_x + l) S_f f_{↑m}(c_f + n − l)
                          = S_x S_f Σ_{k=−∞}^{∞} x_{↑m}(k) f_{↑m}(c_x + c_f + n − k)
                          = S_x S_f y_{↑m}(c_x + c_f + n).

Since a translation of the index n does not actually alter a discrete sequence, without loss of generality, symmetry sequences may be classified according to their symmetry index S and symmetry center C as below:

• Type-I symmetric: s(n) = s(−n). The symmetry center is C = 0.

• Type-II symmetric: s(n) = s(−n − 1). The symmetry center is C = −1/2 (c = −1, m = 2). The 2N-periodic mirror extended sequences introduced previously belong to this class.

• Type-III symmetric: s(n) = s(−n + 1). The symmetry center is C = 1/2 (c = 1, m = 2).


• Type-I antisymmetric: s(n) = −s(−n). The symmetry center is C = 0. Notice that s(0) = 0 is a constraint.

• Type-II antisymmetric: s(n) = −s(−n − 1). The symmetry center is C = −1/2 (c = −1, m = 2).

• Type-III antisymmetric: s(n) = −s(−n + 1). The symmetry center is C = 1/2 (c = 1, m = 2).

The closure property of symmetric sequences may reduce the computational burden significantly. For symmetric filters and the mirror extension, we only need to carry out the convolution on at most N + 1 sample points instead of 2N.
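Theorem D.2.3 can be illustrated with two short symmetric (palindromic) filters, whose convolution must again be symmetric with the symmetry centers added; the particular numbers below are arbitrary.

x = [1 2 3 2 1];            % symmetric, center 0 when indexed from -2 to 2
f = [1 3 3 1];              % symmetric, center -1/2 when indexed from -2 to 1
y = conv(x, f);             % [1 5 12 18 18 12 5 1], center -1/2 + 0 = -1/2
isequal(y, fliplr(y))       % true: the convolution is again a symmetric sequence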


REFERENCES

[1] A. Aldroubi and M. Unser, editors. Wavelets in Medicine and Biology. CRC Press, Boca Raton, Florida, 1996.
[2] H. C. Andrews and B. R. Hunt. Digital Image Restoration. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
[3] M. R. Banham, N. P. Galatsanos, H. L. Gonzalez, and A. K. Katsaggelos. Multichannel restoration of single channel images using a wavelet-based subband decomposition. IEEE Transactions on Image Processing, 3:821-833, 1994.
[4] M. G. Bello. A combined Markov random field and wave-packet transform-based approach for image segmentation. IEEE Transactions on Image Processing, 3(6):834-846, 1994.
[5] Z. Berman and J. S. Baras. Properties of the multiscale maxima and zero-crossings representations. IEEE Transactions on Signal Processing, 41(12):3216-3231, 1993.
[6] A. C. Bovik. Analysis of multichannel narrowband filters for image texture segmentation. IEEE Transactions on Signal Processing, 39:2025-2043, 1991.
[7] P. Brodatz. Textures: A Photographic Album for Artists and Designers. Dover Publications, New York, 1966.
[8] T. Chang and C.-C. J. Kuo. Texture analysis and classification with tree-structured wavelet transform. IEEE Transactions on Image Processing, 2(4):429-441, 1993.
[9] J. Chen and A. Kundu. Rotation and gray scale transform invariant texture identification using wavelet decomposition and hidden Markov model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(2):208-214, 1993.
[10] R. R. Coifman and Y. Meyer. Orthonormal wavelet packet bases. Preprint, Department of Mathematics, Yale University, 1990.
[11] I. Daubechies. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, XLI:909-996, 1988.
[12] I. Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, 36(5):961-1005, 1990.
[13] I. Daubechies. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1992.


[14] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Technical report, Department of Statistics, Stanford University, 1992.
[15] D. L. Donoho. Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data. In Proceedings of Symposia in Applied Mathematics, volume 00, pages 173-205, 1993.
[16] D. L. Donoho. Nonlinear solution of linear inverse problems by wavelet-vaguelette decomposition. Applied and Computational Harmonic Analysis, 2:101-126, 1995.
[17] D. L. Donoho and I. M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. Technical report, Department of Statistics, Stanford University, 1992.
[18] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. John Wiley & Sons, New York, 1973.
[19] D. Dunn, W. E. Higgins, and J. Wakeley. Texture segmentation using 2-D Gabor elementary functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(2):130-149, 1994.
[20] J. Fan and A. Laine. Multiscale contrast enhancement and denoising in digital radiographs. In A. Aldroubi and M. Unser, editors, Wavelets in Medicine and Biology, chapter 7, pages 163-189. CRC Press, 1996.
[21] D. Gabor. Theory of communication. Journal of the IEE, 93:429-457, 1946.
[22] R. C. Gonzalez and P. Wintz. Digital Image Processing. Addison-Wesley Publishing Company, Reading, Massachusetts, second edition, 1987.
[23] R. M. Haralick and L. G. Shapiro. Image segmentation techniques. Computer Vision, Graphics, and Image Processing, 29:100-132, 1985.
[24] J. Y. Hsiao and A. A. Sawchuk. Supervised textured image segmentation using feature smoothing and probabilistic relaxation techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:1279-1292, 1989.
[25] R. A. Hummel and S. W. Zucker. On the foundations of relaxation labeling processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 5(3):267-287, 1983.
[26] L. W. Couch II. Digital and Analog Communication Systems. Macmillan Publishing Company, New York, third edition, 1990.
[27] J. M. H. du Buf, M. Kardan, and M. Spann. Texture feature performance for image segmentation. Pattern Recognition, 23:291-309, 1990.
[28] A. K. Jain. Fundamentals of Digital Image Processing. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
[29] A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, New Jersey, 1988.
[30] A. K. Jain and F. Farrokhnia. Unsupervised texture segmentation using Gabor filters. Pattern Recognition, 24(12):1167-1186, 1991.


[31] B. Jawerth, M. L. Hilton, and T. L. Huntsberger. Local enhancement of compressed images. Journal of Mathematical Imaging and Vision, 3:39-49, 1993.
[32] B. Jawerth and W. Sweldens. An overview of wavelet based multiresolution analyses. SIAM Review, 36(3):377-412, 1994.
[33] A. K. Katsaggelos, editor. Digital Image Restoration. Springer-Verlag, New York, 1991.
[34] A. Kundu and J. Chen. Texture classification using QMF bank-based subband decomposition. CVGIP: Graphical Models and Image Processing, 54(5):369-384, 1992.
[35] A. Laine and J. Fan. An adaptive approach for texture segmentation by multichannel wavelet frames. In SPIE Proceedings on Mathematical Imaging: Wavelet Applications in Signal and Image Processing, volume 2034, pages 288-299, 1993.
[36] A. Laine and J. Fan. Texture classification by wavelet packet signatures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1186-1191, 1993.
[37] A. Laine and J. Fan. Frame representations for texture segmentation. IEEE Transactions on Image Processing, 5(5):771-780, 1996.
[38] A. Laine, S. Schuler, J. Fan, and W. Huda. Mammographic feature enhancement by multiscale analysis. IEEE Transactions on Medical Imaging, 13(4):725-740, 1994.
[39] A. Laine, S. Schuler, and V. Girish. Orthonormal wavelet representations for recognizing complex annotations. Machine Vision and Applications, 6:110-123, 1993.
[40] Y. Liu and A. N. Akansu. An evaluation of time-frequency localization in transforms and filter banks. In IEEE International Conference on Acoustics, Speech and Signal Processing, pages 261-263, 1993.
[41] S. Mallat. Multifrequency channel decompositions of images and wavelet models. IEEE Transactions on Acoustics, Speech and Signal Processing, 37:2091-2110, 1989.
[42] S. Mallat. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:674-693, 1989.
[43] S. Mallat and W. L. Hwang. Singularity detection and processing with wavelets. IEEE Transactions on Information Theory, 38(2):617-643, 1992.
[44] S. Mallat and S. Zhong. Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(7):710-732, 1992.
[45] U. Manber. Introduction to Algorithms: A Creative Approach. Addison-Wesley Publishing Company, Reading, Massachusetts, 1989.
[46] P. Maragos, J. F. Kaiser, and T. F. Quatieri. On amplitude and frequency demodulation using energy operators. IEEE Transactions on Signal Processing, 41(4):1532-1550, 1993.


[47] D. Marr. Vision. W. H. Freeman and Company, New York, 1982.
[48] P. Moulin. Wavelet thresholding techniques for power spectrum estimation. IEEE Transactions on Signal Processing, 42(11):3126-3136, 1994.
[49] A. V. Oppenheim and R. W. Schafer. Discrete-Time Signal Processing. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
[50] P. Desarte, B. Macq, and D. T. Slock. Signal-adapted multiresolution transform for image coding. IEEE Transactions on Information Theory, 38(2):897-904, 1992.
[51] A. Papoulis. Signal Analysis. McGraw-Hill Book Company, New York, 1977.
[52] J.-C. Pesquet, H. Krim, and H. Carfantan. Time-invariant orthonormal wavelet representations. IEEE Transactions on Signal Processing, 44:1964-1970, 1996.
[53] L. R. Rabiner and R. W. Schafer. Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, New Jersey, 1978.
[54] T. R. Reed and H. Wechsler. Segmentation of textured images and Gestalt organization using spatial/spatial-frequency representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:1-12, 1990.
[55] K. H. Rosen. Discrete Mathematics and Its Applications. Random House, New York, 1988.
[56] S. A. Ruzinsky and E. T. Olsen. L_1 and L_∞ minimization via a variant of Karmarkar's algorithm. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37:245-253, 1989.
[57] N. Saito. Local Feature Extraction and Its Applications Using a Library of Bases. PhD thesis, Yale University, 1994.
[58] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger. Shiftable multiscale transforms. IEEE Transactions on Information Theory, 38(2):587-607, 1992.
[59] G. Strang. Wavelets and dilation equations: A brief introduction. SIAM Review, 31(4):614-627, 1989.
[60] A. N. Tikhonov and V. Y. Arsenin. Solutions of Ill-Posed Problems. V. H. Winston & Sons, Washington, D.C., 1977.
[61] M. Tuceryan and A. K. Jain. Texture analysis. In C. H. Chen, L. F. Pau, and P. S. P. Wang, editors, The Handbook of Pattern Recognition and Computer Vision, pages 235-276. World Scientific Publishing Company, Singapore, 1993.
[62] M. Unser. Texture classification and segmentation using wavelet frames. IEEE Transactions on Image Processing, 4(11):1549-1560, 1995.
[63] P. P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice-Hall, Englewood Cliffs, New Jersey, 1993.
[64] R. Wilson and M. Spann. Finite prolate spheroidal sequences and their applications II: Image feature description and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10:193-203, 1988.


BIOGRAPHICAL SKETCH

Jian Fan was born in Fuzhou City, Fujian Province, the People's Republic of China, on April 15, 1954. In 1978, he began his studies at Xiamen University, Xiamen, PRC, where he received a BS degree in radio physics (1982) and an MS degree in marine physics (1985). From 1985 to 1989, he served on the faculty of the Oceanography Department of Xiamen University. He arrived in the United States on January 13, 1989, to pursue graduate studies. He continued his studies at the University of Florida, Gainesville, where he received MS degrees in electrical engineering (1990) and in computer and information sciences (1992). He joined the Hewlett-Packard Company as a software design engineer in 1995.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Andrew F. Laine, Chairman
    Associate Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Gerhard Ritter
    Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Sartaj Sahni
    Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Dave Wilson
    Professor of Mathematics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    John G. Harris
    Professor of Electrical and Computer Engineering


This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August 1997

    Winfred M. Phillips
    Dean, College of Engineering

    Karen A. Holbrook
    Dean, Graduate School