Efficient computational methods for early vision problems


Material Information

Title:
Efficient computational methods for early vision problems
Physical Description:
xiii, 188 leaves : ill. ; 29 cm.
Language:
English
Creator:
Lai, Shang-Hong, 1964-
Publication Date:

Subjects

Subjects / Keywords:
Electrical and Computer Engineering thesis, Ph. D
Dissertations, Academic -- Electrical and Computer Engineering -- UF
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1995.
Bibliography:
Includes bibliographical references (leaves 181-187).
Additional Physical Form:
Also available online.
Statement of Responsibility:
by Shang-Hong Lai.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 022049735
oclc - 33876876
System ID:
AA00025708:00001

Full Text

EFFICIENT COMPUTATIONAL METHODS FOR EARLY VISION PROBLEMS


By

SHANG-HONG LAI















A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA


1995


To my wife Yih-Chou
and my son Patrick


ACKNOWLEDGMENTS


I am deeply grateful to my advisor and committee chairman, Dr. Baba C. Ve-

muri, whose guidance, encouragement and support made the completion of this work

possible. I thank him for his academic as well as nonacademic advice and assistance

during my study at the University of Florida. Many ideas in this dissertation were

produced through discussions with him. I would also like to thank Dr. Staudham-

mer, Dr. Doty, Dr. Harris, Dr. Sigmon and Dr. Rao for serving on my supervisory

committee and advising me on various aspects of this dissertation. I sincerely thank

the National Science Foundation for supporting me financially to finish this work

under the grant number ECS-9210648. Special thanks go to my parents, who never

hesitated to give their love, encouragement and support to enrich my life. Finally,

I would like to dedicate this dissertation to my beloved wife, Yih-Chou, who shared

the happiness and struggle I experienced during the work of this dissertation, and to

my son, Patrick.


TABLE OF CONTENTS




ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTERS

1 INTRODUCTION

1.1 Formulations of Early Vision Problems
1.2 Computational Algorithms
1.3 Main Contributions of this Thesis
1.4 Thesis Overview

2 PROBLEM FORMULATIONS

2.1 Deterministic Framework
2.2 Probabilistic Framework
2.3 Variational Formulations for Some Early Vision Problems
2.3.1 Surface Reconstruction
2.3.2 Shape from Shading
2.3.3 Optical Flow Computation

3 GENERALIZED CAPACITANCE MATRIX THEOREMS

3.1 Introduction
3.2 Generalized Capacitance Matrix Theorems
3.3 Discussion

4 GENERALIZED CAPACITANCE MATRIX ALGORITHM

4.1 Structure of the Algorithm
4.2 Choice of the Splitting
4.2.1 Dense Data Surface Reconstruction
4.2.2 Sparse Data Surface Reconstruction
4.3 Numerical Solution
4.3.1 ADI Method for the Lyapunov Equation
4.3.2 Modified BCG Method for the Capacitance Matrix Equation
4.4 Experimental Results
4.4.1 Surface Reconstruction
4.4.2 Shape from Shading

5 ADAPTIVE PRECONDITIONING TECHNIQUE

5.1 Introduction
5.2 Adaptive Preconditioning in a Wavelet Basis
5.2.1 Preconditioning Technique
5.2.2 Regularization Filter
5.2.3 Wavelet Basis
5.2.4 Adaptive Preconditioning via the Modulation in a Wavelet Basis
5.3 Adaptive Preconditioned Conjugate Gradient Algorithm
5.3.1 Surface Reconstruction
5.3.2 Shape from Shading
5.3.3 Optical Flow Computation
5.4 Experimental Results
5.4.1 Surface Reconstruction
5.4.2 Shape from Shading
5.4.3 Optical Flow Computation
5.5 Summary and Discussion

6 ROBUST OPTICAL FLOW COMPUTATION

6.1 Introduction
6.2 Modified Gradient-based Method
6.3 SSD-based Regularization Method
6.3.1 Nonlinear Conjugate Gradient Algorithm
6.3.2 Modified Search Direction Scheme
6.4 Experimental Results
6.4.1 Modified Gradient-based Method
6.4.2 SSD-based Regularization Method
6.5 Summary and Discussion

7 RESEARCH DIRECTIONS

7.1 Introduction
7.2 Previous Energy Minimization Methods
7.2.1 Deterministic Optimization Methods
7.2.2 Stochastic Optimization Methods
7.3 Proposed Hybrid Search Algorithm
7.3.1 Elements of Genetic Algorithms
7.3.2 Informed Genetic Algorithm
7.3.3 Preliminary Experimental Results
7.3.4 Discussion

8 CONCLUSIONS

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF TABLES



6.1 Summary of Square 2 results

6.2 Summary of Diverging Tree results

6.3 Summary of Sinusoid 1 results

6.4 Summary of Translating Tree results


LIST OF FIGURES


4.1 The flow chart of the generalized capacitance matrix algorithm.

4.2 (a) Data constraint molecule and (b) membrane computational molecules periodically imposed on the domain Ω.

4.3 Surface reconstruction example; (a) original sparse data, (b) reconstructed surface using membrane smoothness, (c) reconstructed surface using thin-plate smoothness, obtained using the generalized capacitance matrix algorithm.

4.4 Surface reconstruction with prespecified discontinuities along a horizontal line; (a) original sparse data set with depth discontinuities along the indicated horizontal line; reconstructed surfaces using a membrane spline after (b) 1 iteration and (c) 10 iterations of the ADI method; (d) reconstructed surface using a thin-plate spline after 16 iterations of the ADI method.

4.5 Surface reconstruction with prespecified discontinuities along a diagonal line; (a) the original sparse data set with depth discontinuities along the indicated diagonal line, (b) reconstructed surface using a membrane spline after 10 iterations of the ADI method.

4.6 Shape from shading example, (a) shaded sphere image (b) irregular occluding boundary imbedded in a rectangle (c) recovered surface orientation, (d) reconstructed surface shape.

5.1 The implementation of the wavelet transform via the QMF structure from one level to the next, (a) decomposition (b) reconstruction, adopted from Mallat [1989].

5.2 Division of the frequency domain in a wavelet basis.

5.3 The sparse data sets used for the surface reconstruction, (a) synthetic data (b) synthetic symmetric data (c) real laser range data.

5.4 (a), (d), (g) & (j) are the computed solutions using the APCG algorithm, (b), (e), (h) & (k) are the computed solutions using the HCG algorithm, (c), (f), (i) & (l) are the computed solutions using the conjugate gradient algorithm, after 1, 5, 10 and 20 iterations, respectively.

5.5 The computed solutions after (a) 10 iterations of APCG (b) 25 iterations of HCG (c) 30 iterations of CG using the same time.

5.6 The convergence for different preconditioners in the preconditioned conjugate gradient algorithm in the surface reconstruction problem with the synthetic sparse data set and (a) with controlled-continuity stabilizer (ρ = 1, τ = 0.1) (b) with thin-plate stabilizer.

5.7 The convergence for different preconditioners in the preconditioned conjugate gradient algorithm for the surface reconstruction problem with discontinuities specified in the controlled-continuity stabilizer.

5.8 The computed solutions for the surface reconstruction with discontinuities using the APCG algorithm after (a) 10, (b) 20, (c) 40 and (d) 60 iterations, respectively.

5.9 The convergence of Pentland's preconditioner in conjunction with the conjugate gradient algorithm is worse than that of the conjugate gradient algorithm without preconditioning for the surface reconstruction problem with the synthetic sparse data set using a membrane stabilizer.

5.10 Row profile of the wavelet transformed membrane matrix K_s of size 64 x 64 along different rows, (a) row 3 (b) row 10 (c) row 30 (d) row 50. This figure establishes the falsity of the diagonal dominance behavior of K_s claimed in [19].

5.11 The convergence for different preconditioners in the preconditioned conjugate gradient algorithm for the surface reconstruction problem with the sparse symmetric data set using a controlled-continuity stabilizer.

5.12 The computed solutions using the adaptive preconditioned conjugate gradient algorithm after (a) 10 iterations, (b) 40 iterations, (c) 100 iterations, in the third surface reconstruction experiment.

5.13 (a) The convergence for the APCG, HCG and CG algorithms for the surface reconstruction problem with the real laser range data set using a controlled-continuity stabilizer, (b) the computed solution after 10 APCG iterations.

5.14 Shaded images of bump with different light source directions.

5.15 The convergence of the APCG, HCG, and CG algorithms for the shape-from-shading problem.

5.16 Shaded images of bump after 100 iterations of the APCG algorithm without using any boundary condition.

5.17 Shaded images of bump after (a) & (b) 10 iterations, (c) & (d) 20 iterations, (e) & (f) 40 iterations of the APCG algorithm with the occluding boundary condition.

5.18 The computed (a) gradient and (b) height after 40 iterations of the APCG algorithm.

5.19 The convergence curves of the (a) energy, (b) (p, q) error, (c) z error for the CG, HCG and APCG algorithms.

5.20 One frame from the (a) square image sequence and (b) rotating Rubic cube image sequence.

5.21 Computed flow of the (a) translating square and (b) rotating Rubic cube.

5.22 The convergence curves of the CG, HCG and APCG algorithms for the (a) translating square (b) rotating Rubic image sequence.

6.1 The discrepancy between the gradient directions of the functions I_0 and I_1 at the i-th location is small and thus the use of the gradient direction as the search direction for the i-th component is robust, while the j-th component of the gradient direction is unreliable for a local search.

6.2 Frames from the (a) Square 2 and (b) Diverging Tree image sequences.

6.3 Computed optical flow of the (a) Square 2 and (b) Diverging Tree image sequences after 40 iterations of the incomplete Cholesky preconditioned CG algorithm.

6.4 (a) One frame from the Rotating Rubic Cube sequence (b) computed optical flow after 40 iterations of the incomplete Cholesky preconditioned CG algorithm.

6.5 Frames from the (a) Sinusoid 1 and (b) Translating Tree image sequences.

6.6 Computed optical flow of the (a) Sinusoid 1 and (b) Translating Tree image sequences, using the preconditioned nonlinear CG algorithm.

6.7 (a) One frame from the Hamburg Taxi sequence (b) computed optical flow after 200 iterations of the preconditioned nonlinear CG algorithm.

7.1 The crossover operation example.

7.2 The sparse data set used for the surface reconstruction with discontinuity detection, (a) original surface (b) sparsely sampled data set.

7.3 (a) Clique for the line process (x: line process, o: nodal variable for surface), (b) clique potentials for different line process configurations.

7.4 Surface reconstruction example: (a) the reconstructed surface (b) the recovered discontinuity map (solid line) obtained after 3 generations of the informed GA and imposed on the true discontinuity map (dashed line).

7.5 Surface reconstruction with noise example: (a) the noisy sparse data set, (b) the reconstructed surface (c) the recovered discontinuity map (solid line) obtained after 5 generations of the informed GA and imposed on the true discontinuity map (dashed line).


Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


EFFICIENT COMPUTATIONAL METHODS FOR EARLY VISION PROBLEMS

By

SHANG-HONG LAI

August 1995


Chairman: Dr. Baba C. Vemuri
Major Department: Electrical and Computer Engineering


Early vision has been one of the most fundamental research topics in computer

vision. The task of early vision is to recover physical properties of three-dimensional

objects (such as shape) from two-dimensional sensed data. Example tasks include

surface reconstruction, optical flow computation, shape from shading, shape from

stereo, edge detection, the lightness problem and image segmentation. Early vision

problems may be formulated either in a deterministic or a probabilistic framework.

A popular technique in the deterministic framework is based on the regularization

theory, and the probabilistic formulation exploits the Markov random field (MRF)

model to characterize the function being estimated. Both formulations lead to min-

imization of equivalent energy functions. When discontinuity detection is involved

in the formulation, the energy functions to be minimized become nonconvex. In this

thesis, we propose efficient computational algorithms for minimizing these energy

functions with and without discontinuity detection.

This thesis contains the following four main contributions:


1. A new theoretical and algorithmic framework based on the capacitance matrix

method for solving elliptic partial differential equations arising in early vision

and engineering problems.


2. A novel adaptive preconditioning algorithm using a wavelet transform for solv-

ing ill-conditioned large sparse linear systems arising in regularized solutions of

ill-posed problems in early vision and other domains.


3. Two new, robust and efficient algorithms for accurate optical flow computation

namely, a modified gradient-based formulation and an SSD-based regularization

formulation.


4. A new hybrid (stochastic + deterministic) search algorithm involving an in-

formed genetic algorithm (GA) and either one of the two aforementioned fast

numerical algorithms, for solving nonconvex optimization problems in early

vision.


The efficiency of the proposed algorithms is demonstrated through experi-

ments on several early vision problems, including surface reconstruction, shape from

shading, and optical flow computation. In addition to computer vision applications,
the theory and algorithms developed in this thesis can also be applied to other ar-

eas such as cartography/terrain mapping, reverse engineering, MPEG image sequence

compression, material science applications (velocity estimation of dry granular flows),

engineering applications and mathematical physics.


CHAPTER 1
INTRODUCTION


Early vision has been one of the most fundamental research topics in computer

vision. The primary goal of early vision is to recover the geometric or physical

properties of three-dimensional objects from two-dimensional sensed data. Typical

properties of objects are the shape ( depth or orientation ), texture, color, reflectance

and motion parameters. Example tasks of early vision include surface reconstruction,

optical flow computation, shape from shading, stereo matching, edge detection and

the lightness problem [28, 41, 40, 38, 63, 6, 72]. The results of early vision can be used

for higher level tasks such as bin picking, object recognition, passive navigation and

motion tracking.

Early vision problems can be regarded as inverse problems. During the imaging

process, some information is lost because three-dimensional objects are projected onto two-

dimensional images. Therefore, it is usually not possible to solve the inverse problems

from the data alone. In fact, most early vision problems are ill-posed problems

[6]. A problem is said to be well-posed if its solution (a) exists, (b) is unique, and

(c) depends continuously on the input data [62]. We can transform an ill-posed

problem into a well-posed one by imposing additional constraints (such as smoothness

constraints) on the solution.








1.1 Formulations of Early Vision Problems

Early vision problems may be formulated either in a deterministic or a prob-

abilistic framework. A popular technique in the deterministic framework is based

on regularization theory [82, 80] and leads to the minimization of energy function-

als. The probabilistic formulation exploits the Markov random field (MRF) model to

characterize the function being estimated. We can relate the smoothness constraints

in the regularization theory to the prior distribution of the MRF model. Both the

regularization formulation and the MRF formulation of early vision problems lead

to the minimization of equivalent energy functions. We will give the deterministic

formulations of some early vision problems in the next chapter.

Discontinuities contain very crucial information in computer vision. There are

different treatments of discontinuities in early vision research. Early work on vi-

sual reconstruction problems reported in literature ignores discontinuities altogether

[40, 28, 62], which yields undesirable smoothing over discontinuity locations. How-

ever, more recent problem formulations allow for the incorporation of prespecified dis-

continuities [81, 75]. In both cases, the aforementioned energy function is convex

and quadratic for most early vision problems. Minimization of a convex quadratic

function can be achieved by solving the associated linear system. In this thesis, we

present several efficient and novel computational algorithms to minimize the energy

functions for early vision problems with prespecified discontinuities.

There have been many methods proposed to detect discontinuities in the under-

lying function from the observed data. An elegant method to treat discontinuities

is via the introduction of a set of binary variables called the line process. Each line

process variable takes a value of 1 in the presence of discontinuities and zero other-

wise. The line process has been included in both the deterministic [53, 8] and the MRF

[22] formulations. The resulting energy function involves real variables (for surface

or image) coupled with binary variables (for discontinuities) and is therefore a non-

convex function. Both formulations lead to problems involving the minimization of

nonconvex functions.

1.2 Computational Algorithms

For the prespecified discontinuities case, numerous iterative methods have been

employed to minimize the energy functions namely, Gauss-Seidel [14], successive

overrelaxation (SOR), conjugate gradient (CG) [26], hierarchical conjugate gradi-

ent (HCG) [75], and multigrid methods [79], the slowest being Gauss-Seidel, taking

O(N²) time to converge (N = n² is the number of discretization nodes) [26], while the

fastest being multigrid methods, taking O(N) time to converge for some regular class

of problems [29]. Although the multi-grid techniques have been applied successfully

to a general class of problems, the proofs of convergence are restricted to a special

class of problems [29, 9, 91]. For these results (posed as solutions to PDE problems),

discontinuity in the solution is not considered. Therefore, the inclusion of discontinuities

in the problem is bound to deteriorate the convergence speed of multigrid methods.

In addition, the multigrid construction needs to be modified for nonrectangular

regions, and this modification is quite complicated to implement.








Direct solutions to the associated linear systems from minimization of the energy

functions are based on a theorem called the capacitance matrix theorem which was

stated and proved in Buzbee et al. [10]. However, this capacitance matrix theorem

imposes restrictions for some early vision problems and on the design of numerical algo-

rithms [45]. The direct solutions using the capacitance matrix theorem employ the

Fourier-Toeplitz method in combination with LU decomposition [72], the Cholesky

decomposition [10] or the conjugate gradient [65] technique. The Fourier-Toeplitz

method requires O(N log N) time and the LU decomposition, Cholesky factoriza-

tion or conjugate gradient requires O(n³) = O(N√N) time, thus making the overall

complexity O(N√N).

In this thesis, we present two efficient algorithms for the minimization of the en-

ergy functions in early vision. One is a generalized capacitance matrix algorithm and

the other is an adaptive preconditioning technique in conjunction with a conjugate

gradient or nonlinear conjugate gradient algorithm. The development of the gener-

alized capacitance matrix algorithm is based on the generalized capacitance matrix

theorems that we state and prove in this thesis. In our algorithm, we first transform

the linear system into a Lyapunov matrix equation or a cascade of two Lyapunov

matrix equations with an appropriate right-hand side (RHS). This RHS is obtained

by solving an associated linear system with a capacitance matrix. The Lyapunov ma-

trix equations are solved using the alternating direction implicit method (ADI) while

the solution to the capacitance matrix linear system is obtained using a modified


bi-conjugate gradient technique. We prove that the ADI method takes a constant

number of iterations with each iteration taking O(N) time.
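
As an aside on what such an iteration looks like, the sketch below applies the classical two-half-step ADI update to a Lyapunov equation AX + XA = B with symmetric A. The shift sequence and the dense solves are placeholder assumptions of the sketch; with the banded matrices arising from our discretizations each sweep costs O(N), which this illustrative dense version does not attempt to achieve.

    import numpy as np

    def adi_lyapunov(A, B, shifts):
        """Illustrative ADI sweeps for A X + X A = B, with A symmetric.

        `shifts` is an assumed sequence of positive ADI parameters; the shift
        selection analyzed in chapter 4 is not reproduced here.
        """
        I = np.eye(A.shape[0])
        X = np.zeros_like(B)
        for p in shifts:
            # half-step in the first direction: (A + pI) X' = B - X (A - pI)
            X = np.linalg.solve(A + p * I, B - X @ (A - p * I))
            # half-step in the second direction: X'' (A + pI) = B - (A - pI) X'
            X = np.linalg.solve(A + p * I, (B - (A - p * I) @ X).T).T
        return X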

Our adaptive preconditioning technique is used in conjunction with a conjugate

gradient or nonlinear conjugate gradient algorithm to minimize the energy function.

This adaptive preconditioner is constructed in a wavelet basis and adapted to the

spectral characteristics of the imposed smoothness constraint as well as the data

constraint in the energy function. The construction of the preconditioner makes

use of the division of frequency domain property in a wavelet basis. Previously,

there have been some preconditioners proposed in computer vision literature [75, 61,

92]. However, they are either not adapted to the problem [75, 92] or constructed

inappropriately [61]. We empirically show the superiority of our preconditioner over

other preconditioners on various early vision problems.
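
To convey the flavor of such a construction, here is a loudly hypothetical sketch: it applies a Haar wavelet transform (via the PyWavelets package) to a 2D residual r and modulates each subband by a crude spectral guess, lam * 4^(scale) + kappa, standing in for the combined smoothness-plus-data spectrum. The actual modulation function is derived in chapter 5; only the transform-scale-invert structure is meant to carry over.

    import pywt

    def apply_preconditioner(r, lam=1.0, kappa=0.1, levels=3):
        """Sketch of M^{-1} r: scale wavelet subbands by an assumed spectral model."""
        coeffs = pywt.wavedec2(r, 'haar', level=levels)
        # approximation band: low frequencies, where the data term dominates
        scaled = [coeffs[0] / (lam * 4.0 ** (-levels) + kappa)]
        for j, (cH, cV, cD) in enumerate(coeffs[1:]):  # j = 0 is the coarsest detail
            s = lam * 4.0 ** (j + 1 - levels) + kappa  # symbol grows toward fine scales
            scaled.append((cH / s, cV / s, cD / s))
        return pywt.waverec2(scaled, 'haar')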

To solve the early vision problems with discontinuity detection, both regular-

ization and MRF formulations lead to the minimization of a coupled (binary-real)

nonconvex energy function. Numerous algorithms have been proposed to solve this

coupled nonconvex minimization problem. They can be categorized into the deter-

ministic methods [8] [44] [21] and the stochastic methods [22] [51]. Most of the deter-

ministic methods can only deal with simple constraints on the line process [8, 21, 31];

in addition, they cannot find the global minimum solution. The stochastic optimiza-

tion methods are primarily the simulated annealing (SA) type algorithms. Simulated

annealing is a general global optimization algorithm, but its convergence rate is very

slow, thus making it impractical for many applications.

In this thesis, we will briefly discuss existing methods for this coupled noncon-

vex minimization problem and propose a novel hybrid search algorithm as a possible

solution toward solving this problem. This hybrid search algorithm is a combi-

nation of stochastic and deterministic search techniques. For the stochastic search,

we propose an informed genetic algorithm (GA) whereas for the deterministic search,

we employ either one of the two deterministic algorithms mentioned above. Some

promising preliminary experimental results of applying this algorithm on sparse data

surface reconstruction along with discontinuity detection problems are presented.
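
A minimal sketch of that outer loop, under assumed interfaces: inner_energy(l) runs one of the fast deterministic solvers for a fixed binary line-process configuration l (a NumPy 0/1 vector) and returns the minimized energy, and random_lines() draws an initial configuration. The "informed" initialization of chapter 7 is not reproduced.

    import numpy as np

    def hybrid_search(inner_energy, random_lines, pop_size=20, generations=5, p_mut=0.01):
        """Sketch: GA over line processes, deterministic solver scoring each member."""
        pop = [random_lines() for _ in range(pop_size)]
        for _ in range(generations):
            energies = np.array([inner_energy(l) for l in pop])
            fit = np.exp(-(energies - energies.min()))       # lower energy -> fitter
            probs = fit / fit.sum()
            children = []
            for _ in range(pop_size):
                i, j = np.random.choice(pop_size, size=2, p=probs)
                cut = np.random.randint(1, pop[i].size)      # one-point crossover
                child = np.concatenate([pop[i][:cut], pop[j][cut:]])
                flips = np.random.rand(child.size) < p_mut   # bit-flip mutation
                child[flips] = 1 - child[flips]
                children.append(child)
            pop = children
        return min(pop, key=inner_energy)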

1.3 Main Contributions of this Thesis

This thesis contains the following four main contributions:

1. A new theoretical and algorithmic framework based on the capacitance matrix

method is proposed for solving elliptic partial differential equations arising in

early vision and engineering problems. The algorithm is based on the develop-

ment of the generalized capacitance matrix theorems which we state and prove

in this thesis. We make use of the ADI method as a fast elliptic solver in this

algorithm and prove that it takes a constant number of iterations to converge

within a given error tolerance with each iteration taking O(N) time.


2. A novel adaptive preconditioning algorithm using a wavelet transform is devel-

oped for solving ill-conditioned large systems of equations arising in regularized

solutions of ill-posed problems in early vision and other domains. Our pre-

conditioner is constructed by approximating the spectral characteristics of the

imposed smoothness as well as data constraints. This spectral approximation is


achieved by using the division of frequency domain property in a wavelet basis.

By using this construction, our preconditioner is adapted to the smoothness

constraint as well as data constraint.


3. Two robust algorithms are proposed for computing optical flow. The first is a

modified gradient-based regularization method, and the other is an SSD-based

regularization method. In our modified gradient-based method, we selectively

combine the image flow constraint and the contour-based flow constraint into

the data constraint in a regularization framework to amend the errors in the

image flow constraint caused by brightness discontinuities. Our SSD-based

regularization method uses the SSD (Sum of Squared Differences) measure as

the data constraint in a regularization framework. Experimental results for

these two algorithms are given to demonstrate their superior performance.


4. A novel hybrid (stochastic + deterministic) search algorithm is proposed to

solve the nonconvex optimization problems for early vision involving disconti-

nuity detection. This hybrid search algorithm consists of an informed genetic

algorithm (GA) and either one of the two aforementioned numerical algorithms.

The informed GA is used for the line process only in the outer loop of our algo-

rithm, while the deterministic algorithm is employed to solve the subproblems

of minimizing the energy function given a fixed line process configuration.

The efficiency of the proposed algorithms is demonstrated through experiments

on several early vision problems, including surface reconstruction, shape from shad-

ing, and optical flow computation. In addition to computer vision applications, the


theory and algorithms developed in this thesis can also be applied to many other

areas, such as cartography/terrain mapping, reverse engineering, MPEG image se-

quence compression, material science applications (velocity estimation of dry granular

flows), engineering applications and mathematical physics.

1.4 Thesis Overview

The rest of this thesis is organized as follows.

In chapter 2, the deterministic and probabilistic frameworks of the early vision

problems are briefly reviewed and the formulations of some early vision problems

namely, surface reconstruction, shape from shading and optical flow computation,

are given. The deterministic framework is primarily based on the regularization

formulation. For the probabilistic framework, we discuss the maximum a posteriori

(MAP) estimation in a coupled Markov Random Field (MRF) model.

In chapter 3, the generalized capacitance matrix theorems are stated with com-

plete proofs. These theorems generalize the use of the Sherman-Morrison-Woodbury

formula [26] to solve linear systems and the capacitance matrix theorem proposed by

Buzbee et al. [10]. The structure of our generalized capacitance matrix algorithm

can be obtained directly from these generalized theorems.

Chapter 4 gives the complete generalized capacitance matrix algorithm which

involves using the ADI method for solving the Lyapunov matrix equations and using

the modified bi-conjugate gradient (BCG) algorithm for solving the associated linear

system with a capacitance matrix. We consider the membrane smoothness, thin-plate

smoothness or a linear combination of the two in our algorithm. We demonstrate the

efficiency of this algorithm by applying it to surface reconstruction and shape from

shading problems.

In chapter 5, the new adaptive preconditioning technique is presented as an

efficient solution to general early vision problems. Our adaptive preconditioner is

constructed in a wavelet basis based on the approximation to the spectral character-

istics of the smoothness and data constraints. We empirically show the superiority of

this adaptive preconditioner over other preconditioners previously proposed in vision

literature through experiments on surface reconstruction, shape from shading and

optical flow computation.

Two new, robust and efficient algorithms for optical flow estimation are pre-

sented in chapter 6. The first is a modified gradient-based algorithm which combines

the image flow constraint and the contour-based flow constraint into the data con-

straint in a regularization framework to amend the errors in the image flow constraint

caused by brightness discontinuities. We employ the adaptive preconditioned conju-

gate gradient algorithm, presented in chapter 5, to solve the linear system for this new

formulation. The second is an SSD-based regularization method that uses the SSD

measure as the data constraint in a regularization framework. The preconditioned

nonlinear conjugate gradient with a modified search direction scheme is developed to

minimize the resulting energy function. Experimental results for these two algorithms

are given to demonstrate their performance.


In chapter 7, we list several directions for conducting future research based on

the research reported in this thesis. For the early vision with discontinuity detec-

tion problem, we discuss numerous existing techniques and propose a new hybrid

search algorithm as a possible approach toward solving this problem. By includ-

ing the line process in the optimization of the energy function, the early vision along

with discontinuity detection problem is formulated as the minimization of a coupled

(binary-real) nonconvex energy function. Our hybrid search algorithm consists of

stochastic and deterministic search techniques. For the stochastic search, we pro-

pose an informed genetic algorithm (GA) as a global minimizer for the binary line

process only, whereas for the deterministic search, we employ either one of the two

deterministic algorithms mentioned above to solve the minimization problem for each

fixed line process configuration inside the GA. Promising preliminary experimental

results of using this algorithm on the sparse data surface reconstruction problems are

given.


Finally, we summarize the results of this thesis in chapter 8.


CHAPTER 2
PROBLEM FORMULATIONS


There have been two classes of formulations proposed for the early vision prob-

lems; one is the deterministic formulation, and the other is the probabilistic one. The

deterministic approach is based on the regularization theory proposed by Tikhonov

[82]. The regularization method introduces a stabilizing operator to render the original

problem well-posed and leads to a variational principle. The most popular probabilistic

formulation to early vision problems involves the use of an MRF (Markov Random

Field) model. Although there is some correspondence between the aforementioned

stabilizing operator and the MRF model, the MRF model is known to be more ver-

satile for modeling surfaces or images than the deterministic model [51] [74].

2.1 Deterministic Framework

Variational formulations for various early vision problems have been reported

in [62, 6]. These formulations make use of regularization theory. In a regularization

framework, generic smoothness assumptions are imposed on the solution space prior

to attempting any functional minimization. The smoothness constraints are well

characterized by a class of generalized multi-dimensional spline functionals [80]. The

formulations involve minimization of an energy functional E, which is the sum of the

energy contribution from the smoothness constraint (E_s) and the data constraint (E_d),

i.e.,

Find u ∈ H such that E(u) = inf_{v ∈ H} E(v)    (2.1)

where H defines the linear admissible space of smooth functions defined on R² and

the functional

E(v) = D(v) + λ S(v)    (2.2)

where λ is called the regularization parameter that controls the contribution of the

smoothness term, S(v): H → R is a functional on H that is a measure of smoothness

of an admissible function v(x, y), and D(v): H → R is a functional that measures

the discrepancy between the observation model and the observed data [80].

To solve the variational problem, a finite difference method [37] or a finite el-

ement method [78] can be used to discretize the problem domain. The infinite-

dimensional minimization problem in equation 2.1 can be reduced to a finite-dimensional

minimization problem as follows:


min_u E(u)  with  E(u) = D(u, d) + λ S(u)    (2.3)



where u ∈ R^{n²} is the discretized surface, d ∈ R^{n²×1} is the data vector, and the

discretization mesh is n x n. In general, the observation model can be nonlinear,

e.g. the shape from shading problem with a nonlinear reflectance map function.

Let's denote this model by K_d u = d, where the operator K_d is a matrix when the

model is linear, and choose the stabilizing (or smoothing) operator as the first-order

derivative along x and y, then



D(u, d) = ||K_d u − d||²    (2.4)

S(u) = (1/2) u^T K_s u    (2.5)



For the choice of first-order differential stabilizer, i.e. the membrane smoothness, the

operator K_s is a Laplacian matrix.
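
For instance, under the usual five-point discretization on an n x n grid (Dirichlet boundaries assumed for this sketch), K_s can be assembled from the 1D second-difference matrix:

    from scipy import sparse

    n = 32                                                    # grid side; N = n^2
    T = sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))   # 1D second difference
    I = sparse.identity(n)
    Ks = sparse.kron(I, T) + sparse.kron(T, I)                # N x N Laplacian matrix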

The regularization method imposes the smoothness constraint over the entire

domain. It also smooths out discontinuities, which is very undesirable. Incorpo-

rating discontinuity detection into the early vision problem, the energy function

to be minimized can be modified as [8]



E(u, h, v) = ||K_d u − d||² + λ Σ_{i,j} ((u_{i+1,j} − u_{i,j})² (1 − h_{i,j}) + (u_{i,j+1} − u_{i,j})² (1 − v_{i,j}))

+ Σ_{i,j} V(h, v)    (2.6)


where h_{i,j} and v_{i,j} are the horizontal and vertical line processes, respectively, which

are binary variables with a value 1 denoting a vertical or horizontal discontinuity

at that location and a 0 otherwise. The line energy V(h,v) is a term that may

be used to penalize the inclusion of discontinuities or to impose the smoothness of

the line process. Most of the deterministic formulations take very simple functions

for V(h, v) merely based on some heuristics; for example, Blake and Zisserman [8]

take V(h, v) = h_{i,j} + v_{i,j}, which has no interaction between individual line process

variables.
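
As a concrete reading of equation 2.6, the sketch below evaluates the coupled energy for given fields; the array shapes (u of size n x n, h of size (n-1) x n, v of size n x (n-1)) and the callable V are assumptions of this illustration.

    import numpy as np

    def coupled_energy(u, h, v, d, Kd, lam, V):
        """Evaluate E(u, h, v) of equation 2.6 (illustrative shapes assumed)."""
        data = np.sum((Kd @ u.ravel() - d) ** 2)
        smooth = np.sum((u[1:, :] - u[:-1, :]) ** 2 * (1 - h)) \
               + np.sum((u[:, 1:] - u[:, :-1]) ** 2 * (1 - v))
        return data + lam * smooth + np.sum(V(h, v))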

The energy in equation 2.6 is a nonconvex function of real variables coupled

with boolean variables. It is difficult to solve a nonconvex minimization problem

since traditional gradient descent methods are easily trapped in local minima.

The deterministic formulation leads to satisfactory results for some early vision

problems, but this formulation is not general enough to include arbitrary constraints

for the line process. We will see in the next section that the probabilistic formu-

lation using the MRF model is more flexible in this context than the deterministic

formulation.

2.2 Probabilistic Framework

An MRF (Markov Random Field) is a generalization of a 1D Markov process to

2D. To define an MRF [22], we have to define a neighborhood system first. Let S be

a finite set of n x n sites, and G = {G_{i,j}, (i,j) ∈ S} be a neighborhood system for S

such that


(1) (i,j) ∉ G_{i,j}, for all (i,j) ∈ S.

(2) (i,j) ∈ G_{k,l} if and only if (k,l) ∈ G_{i,j}, for all (i,j), (k,l) ∈ S.


Let F = {F_{i,j}, (i,j) ∈ S} be a random field, which is a family of random variables

indexed by the two-dimensional site (i,j) ∈ S. Any possible sample realization

f = (f_{1,1}, ..., f_{n,n}) is called a configuration of the field. Let Ω be the set of all

possible configurations, i.e. the sample space, and P be a probability measure on Ω.

Then F is an MRF with respect to G if


(1) P(F = f) > 0, for all f ∈ Ω.

(2) P(F_{i,j} = f_{i,j} | F_{k,l} = f_{k,l}, (k,l) ≠ (i,j)) = P(F_{i,j} = f_{i,j} | F_{k,l} = f_{k,l}, (k,l) ∈ G_{i,j}).


It is not clear how to find a probability distribution function for an MRF directly

from the above definition. Fortunately, the Hammersley-Clifford theorem [22] states

that F is an MRF on S with respect to the neighborhood system G if and only if

the probability distribution is a Gibbs distribution with respect to G. Therefore, we

can construct an MRF by designing a Gibbs distribution function as the distribution

function of the MRF.

To give a definition of a Gibbs distribution, we have to define a "clique" first.

A subset C ⊂ S is a clique if every pair of distinct sites in C are neighbors. Let C

denote the set of all cliques.

A Gibbs distribution relative to {S, G} is a probability measure π on Ω such

that

π(ω) = (1/Z) e^{−U(ω)/T}    (2.7)


where Z is the normalizing constant (also called the partition function), T is a parameter,

and the energy function U(ω) is of the form



U(ω) = Σ_{C ∈ C} V_C(ω)    (2.8)


The function V_C(ω) is called the clique potential. Its value is determined by the ω_{i,j} with

(i,j) ∈ C.
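
A tiny worked instance of equations 2.7 and 2.8: for a 2 x 2 binary field with pairwise cliques and an assumed Ising-like potential (V_C = 1 when the two sites of a clique disagree, 0 otherwise), Z and the Gibbs probabilities can be computed by exhaustive enumeration.

    import itertools, math

    T = 1.0
    sites = [(0, 0), (0, 1), (1, 0), (1, 1)]
    cliques = [((0, 0), (0, 1)), ((1, 0), (1, 1)),    # horizontal neighbor pairs
               ((0, 0), (1, 0)), ((0, 1), (1, 1))]    # vertical neighbor pairs

    def U(w):
        # assumed clique potential: 1 iff the two sites of the clique disagree
        return sum(1.0 for a, b in cliques if w[a] != w[b])

    configs = [dict(zip(sites, v)) for v in itertools.product([0, 1], repeat=4)]
    Z = sum(math.exp(-U(w) / T) for w in configs)     # partition function
    pi = {tuple(w[s] for s in sites): math.exp(-U(w) / T) / Z for w in configs}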

For the sake of exposition, let's consider the surface reconstruction problem

here, but the discussion to follow should apply to most early vision problems with

minor modifications. A coupled MRF model may be constructed for the surface

reconstruction problem by defining an MRF, F, for the surface and a boolean MRF, L,

for the line process. The prior probability can be written as

P(F = f, L = l) = (1/Z₁) e^{−U(f,l)/T}    (2.9)

U(f, l) = Σ_C V_C(f, l)    (2.10)


The parameter T is assumed to be 1 for simplicity. The prior energy U(f, l) is

similar to the last two terms of E(u, h, v) in equation 2.6, which are the smoothness

constraints on the surface and line process.

Let the observed data be denoted by d; then the estimates of f and l

from d can be obtained by MAP (maximum a posteriori) estimation. The MAP

estimate is obtained by using Bayes' rule and maximizing the posterior probability

P(F = f, L = 1 D = d). Assume the observation model is given by D = H(F) + N,

where N is zero mean white Gaussian random field with variance 0 for each variable

in n. After applying the Bayes rule, we can see that the MAP estimation is equivalent

to maximizing P(D = d F = f, L = 1)P(F = f, L = 1), and


P(D d F f, L 1) = (2a)-"/2 exp-d-Hf)2 (2.11)
P(D = d F = f,L 1) = (2 a2)-n2/2exp- J2oI (2.11)

Combining equations 2.9 and 2.11, we can see that the MAP estimate can be obtained

by minimizing the following energy function



U(f, l | d) = (1/(2σ²)) ||d − H(f)||² + U(f, l)    (2.12)

= (1/(2σ²)) ||d − H(f)||² + Σ_C V_C(f, l) + Σ_C V_C(l)    (2.13)


Investigating equations 2.6 and 2.13, we can see that these two energy functions are

very similar. If we choose the parameters and the clique potential to make these

two energy functions identical, then both the deterministic formulation and the

MRF formulation give rise to minimizing the same energy function. Therefore, the

correspondence between the two formulations is obvious.

The advantages of the probabilistic model over a deterministic model are two-

fold:


1. The MRF model can include very general constraints on the line process, while

the deterministic model is restricted in the types of constraints considered.

The MRF model regards the imposed constraints as a prior probability, which is

determined by the clique potential. In fact, we can estimate the clique potential

functions for different object models instead of ad hoc or heuristic choices [54].


2. A probabilistic model can provide second-order (covariance) or higher order statistics

of the estimate in addition to the MAP estimate. The higher order statis-

tical information is very useful in the context of dynamic vision [74], sensor

fusion, and integration of multiple modules [51]. This cannot be achieved by

a deterministic model.


Therefore, it can be concluded that the MRF model is more versatile than the

deterministic models. The efficient computational algorithms proposed in this thesis

can be applied to solve the minimization problems for both the deterministic and

probabilistic formulations.

2.3 Variational Formulations for Some Early Vision Problems

From the previous two sections, we can see the link between the regulariza-

tion and MRF formulations, which lead to the minimization of equivalent energy

functions. In this section, we give the deterministic formulation of some early vi-

sion problems namely, surface reconstruction, shape from shading and optical flow

computation. The corresponding probabilistic formulation can be directly obtained

from the equivalence between these two formulations [51]. For ease of exposition, the

locations of surface and orientation discontinuities (if any) are assumed to be known

in the discussion below. However, we will discuss the issue of discontinuity detection

in chapter 7.

Variational formulations for various early vision problems have been reported

in [79] and references therein. These formulations make use of the popular theory

of regularization. In a regularization framework, generic smoothness assumptions

are imposed on the solution space prior to attempting any functional minimization.

The smoothness constraints are well characterized by a class of generalized multi-

dimensional spline functionals [80]. The formulations involve minimization of an

energy functional E, which is the sum of the energy contribution from the smoothness

constraint (S) and the data constraint (P) (see [80, 75] for details of this formulation).

2.3.1 Surface Reconstruction

For the surface reconstruction problem, we are faced with recovering surface

shape from either sparse or dense range data. For a first order stabilizer, the smooth-

ness constraint is given by


S(v) = ∫∫_Ω (v_x² + v_y²) dx dy,    (2.14)


where v(x, y) is the admissible function and v_x, v_y its partial derivatives, assumed to

be small. For the case of a second order stabilizer, we have the following smoothness

constraint

S(v) = ∫∫_Ω (v_xx² + 2 v_xy² + v_yy²) dx dy,    (2.15)


The first and second order stabilizers can be combined to form the controlled-continuity

stabilizers [80]. To the smoothness constraint, we add the data constraints in the

form of penalty terms. The following penalty term which measures the discrepancy

between the surface and data weighted by the uncertainty in the data may be used,


P(v) = Σ_i c_i (v(x_i, y_i) − d_i)²    (2.16)

where d_i are the depth data points specified in the domain Ω and c_i are the un-

certainties associated with the data. The goal is to find a u that minimizes the total

potential energy E(v) = S(v) + P(v).

The numerical solution to the above problem is computed by discretizing the

functionals S(v) and P(v) using finite element techniques [75]. The resulting energy

function is a quadratic in x (a discretization of v) given by E(x) = (1/2) x^T K x − x^T b + c,

where K is a very large and sparse N x N matrix [79], with N being the number

of discretization nodes. The minimum x* of this energy function is found by solving

the large sparse linear system Kx = b. We present two very fast solutions to this

problem in this thesis. One is a generalized capacitance matrix algorithm and the

other is a novel adaptive preconditioning technique, which will be given in subsequent

chapters.
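
Before turning to those algorithms, a plain sketch of the system being solved may be helpful. Everything below (grid size, unit confidences c_i, random data sites) is illustrative, and an unpreconditioned conjugate gradient stands in for the fast solvers of chapters 4 and 5.

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import cg

    n, lam = 32, 1.0
    T = sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
    Ks = sparse.kron(sparse.identity(n), T) + sparse.kron(T, sparse.identity(n))

    rng = np.random.default_rng(0)
    idx = rng.choice(n * n, size=50, replace=False)    # sparse data locations
    c = np.zeros(n * n); c[idx] = 1.0                  # data confidences (assumed unit)
    d = np.zeros(n * n); d[idx] = rng.standard_normal(50)

    K = lam * Ks + sparse.diags(c)                     # membrane stiffness + data term
    x, info = cg(K, c * d)                             # minimizer satisfies K x = b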

2.3.2 Shape from Shading

Numerous solution methods using variational principle formulations of the shape

from shading (SFS) problem have been proposed. For a comprehensive set of papers

on this topic, we refer the reader to the book edited by Horn and Brooks [39] and the

work in [79, 72, 38, 76]. In this problem, it is required to recover the shape of surfaces

from image irradiance which depends on surface geometry and reflectance, scene

illuminance and imaging geometry. The image irradiance can be expressed directly as

a function of the surface orientation if illuminance, reflectance and imaging geometry

are assumed to be constant. Two deterministic formulations for SFS problem are

discussed in the following.


PDE formulation

The shape from shading problem is commonly posed as a nonlinear, first or-

der partial differential equation in two unknowns, called the image irradiance equa-

tion, E(x,y) = R(p,q) [39], where E(x,y) is the image irradiance at a point (x,y),

p = v_x, q = v_y are the first partials of the surface function v(x, y), and R(p, q) is the re-

lation between surface orientation (p, q) and image irradiance E(x, y). To overcome

the ambiguity caused along occluding contours in the gradient space parameteriza-

tion (p, q), a reparameterization of surface orientation in terms of a stereographic

mapping: f = 2ap, g = 2aq with a = 1/(1 + √(1 + p² + q²)) is used. With this repa-

rameterization, the overall energy function (sum of stabilizer and data energies) to

be minimized is expressed as


E(f, g) = λ ∫∫ (f_x² + f_y²) + (g_x² + g_y²) dx dy + ∫∫ [E(x, y) − R(f, g)]² dx dy.    (2.17)


The Euler Lagrange equations which express the necessary condition for a minimum

are given by the system of coupled PDEs


Δf = λ [R(f, g) − E(x, y)] R_f    (2.18)
Δg = λ [R(f, g) − E(x, y)] R_g


The above equations may be discretized and solved using the algorithm of Sim-

chony et al. [72], which enforces the integrability condition on p and q. The method

involves solving three different discretized Poisson equations, namely K x₁ = b₁,

K x₂ = b₂, and K₁ x₃ = b₃, where K is the stiffness matrix obtained by discretizing

the Laplacian operator (on the LHS of the equation 2.18) with Dirichlet boundary

conditions and K1 is the stiffness matrix obtained by discretizing the Poisson equation

for the height from orientation problem i.e.,


ΔZ = p_x + q_y    (2.19)



with Neumann boundary conditions. The vectors b₁ and b₂ are the appropriate

RHSs obtained by discrete computation of the RHS of equation 2.18 at the current

estimated value of (f, g). b₃ is the discrete form of the RHS of the above height-from-

orientation equation with the estimated (p, q) from equation 2.18. The structure of

the stiffness matrices K and K₁ can be analyzed using computational molecules [81]. In

this dissertation, we apply our generalized capacitance matrix algorithm to solve the

above discretized Poisson equations.
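
For orientation, a hedged sketch of one such fixed-point sweep follows: R, R_f and R_g are assumed callables (e.g. a Lambertian reflectance map written in the stereographic variables), each half of equation 2.18 becomes a discrete Poisson solve with Dirichlet boundaries, and a generic sparse solver stands in where the dissertation uses the generalized capacitance matrix algorithm.

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import spsolve

    def sfs_step(f, g, E, R, Rf, Rg, lam):
        """One illustrative update of the coupled PDEs (2.18)."""
        n = f.shape[0]
        T = sparse.diags([1, -2, 1], [-1, 0, 1], shape=(n, n))
        L = (sparse.kron(sparse.identity(n), T)
             + sparse.kron(T, sparse.identity(n))).tocsc()    # discrete Laplacian
        resid = R(f, g) - E                                   # R(f, g) - E(x, y)
        f_new = spsolve(L, lam * (resid * Rf(f, g)).ravel()).reshape(n, n)
        g_new = spsolve(L, lam * (resid * Rg(f, g)).ravel()).reshape(n, n)
        return f_new, g_new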

Energy minimization formulation

The other formulation for the SFS problem is to minimize the total energy func-

tional containing the data constraint, the integrability constraint and the smoothness

constraint. The minimization is taken with respect to the surface orientation (p, q)

and the surface height Z altogether. The total energy functional E(Z, p, q) is the sum

of the smoothness constraint and the penalty terms in this problem, i.e.,


E(Z, p, q) = λ ∫∫ (p_x² + p_y² + q_x² + q_y²) dx dy + μ ∫∫ ((Z_x − p)² + (Z_y − q)²) dx dy

+ ∫∫ [E(x, y) − R(p, q)]² dx dy.


The first term in the above equation is a weak smoothness constraint and the coeffi-

cient λ controls its contribution toward the final solution. The second term enforces

the integrability constraint, ensuring that the gradients (p, q) of the recovered surface

Z match those of a valid surface. Z_x, Z_y are the first partials of the surface Z(x, y).

The above energy can be discretized directly by using finite element techniques;

however, the same set of discrete domain equations can be arrived at via the use of

finite difference approximations for this application [76]. The discrete energy is given

by

E = Σ (E(x, y) − R(p, q))² + μ Σ ((Z_x − p)² + (Z_y − q)²) + λ Σ (p_x² + p_y² + q_x² + q_y²)    (2.20)


where the summations are over all the pixels in the image. In vector notation, the

above equation can be written as


E(u) = (1/2) u^T K u − b^T u + c    (2.21)


where u = [Z^T p^T q^T]^T is a concatenation of the vector representations of the fields

Z, p and q. The minimum of the above quadratic is obtained by solving Ku = b, with

Ku containing the linear terms in u, while the constant and the nonlinear terms form the b

vector. This is an (N x N) sparse nonlinear system which is usually solved using

iterative methods. These iterative methods to solve the nonlinear system normally

have the following two steps in each iteration: (1) compute the b vector using

the current u, (2) update the solution u by treating b as a constant. In the past,

multigrid techniques have been employed in solving this nonlinear system [79] wherein

a hierarchy of problems at different resolutions needs to be explicitly constructed.

Szeliski [76] presented a fast shape from shading algorithm which employs hierarchical

basis for preconditioning the conjugate gradient algorithm used to solve the above

nonlinear system. Unlike the multigrid methods, this technique does not require

the explicit construction of a multiresolution pyramid of problems. In this thesis we

apply our adaptive preconditioning technique that leads to a faster convergence than

the one in Szeliski [76]. Our method makes use of the spectral characteristics of the

data constraint, the imposed smoothness constraint and the integrability constraint.

This spectral function is then used to modulate the frequency characteristics of the

chosen wavelet basis leading to the construction of the preconditioner.
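
The two-step iteration described above reduces to a very small loop. In this sketch, b_of_u is an assumed callable that recomputes the nonlinear right-hand side from the current estimate, and the optional M slot is where a hierarchical-basis or adaptive wavelet preconditioner would enter.

    from scipy.sparse.linalg import cg

    def solve_sfs_system(K, b_of_u, u0, outer_iters=20, M=None):
        """Sketch of K u = b(u): refresh b, then update u with b held fixed."""
        u = u0
        for _ in range(outer_iters):
            b = b_of_u(u)                  # step (1): nonlinear RHS at current u
            u, _ = cg(K, b, x0=u, M=M)     # step (2): linear solve, b constant
        return u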

2.3.3 Optical Flow Computation

A third early vision problem that we consider in this dissertation is the compu-

tation of optical flow from an image sequence. Many techniques for computing optical

flow have been proposed in the literature. These can be classified into gradient-based,

correlation-based, energy-based, and phase-based methods [3]. Recently, Barron et

al., [3] have presented a comprehensive survey of various optical flow computation

techniques along with a quantitative comparison of them. We note that the survey did

not compare the computation time required by the various algorithms.

Two new and robust algorithms namely, a modified gradient-based algorithm

and an SSD-based regularization algorithm, for accurate optical flow computation

are presented in this dissertation. In the modified gradient-based algorithm, we

propose to selectively combine the image flow constraint and the contour-based flow

constraint into the data constraint in a regularization framework to amend the errors

in the image flow constraint caused by the brightness discontinuities. The image flow

constraint is disabled in the neighborhood of discontinuities, while the contour-based

flow constraint is active at discontinuity locations. The SSD-based regularization

algorithm uses the SSD measure as the data constraint in a regularization framework,

which leads to a nonlinear optimization problem. These two different regularization

formulations are described below. These two robust algorithms will be discussed in

detail in chapter 6. In this section, we briefly describe our modified gradient-based

formulation.

Our modified gradient-based formulation modifies the Horn and Schunck for-

mulation [40] to incorporate a separate condition along discontinuities in the image

intensity [35]. This modification drastically enhances the performance of our method

over other gradient-based methods because the optical flow constraint equation is in-

valid along intensity discontinuities and causes large errors in the computed flow. We

get drastically better and more robust estimates of optical flow than any gradient-

based optical flow technique reported in Barron et al. [3].

Our variational formulation of the optical flow problem involves minimizing a

combination of the optical flow constraint equation with a global smoothness term

and an additional constraint term along edges leading to


∫ α((∇E · u) + E_t)² + λ (||∇u||² + ||∇v||²) dx + ∫ β((∇(∇²G∗E) · u) + (∇²G∗E)_t)² dx    (2.22)

where u(x, t) = (u(x, t), v(x, t)) is the velocity field to be estimated, E is the image

brightness function, and ∇²G is the Laplacian of Gaussian operator [35]. The first

term is the gradient constraint which is active away from discontinuities, the second

term is the smoothness constraint on the velocity field with λ controlling the contri-

bution of the smoothness term and the third term is the flow constraint along edges

which is disabled away from the edges. A discrete version of the above energy can

be written as



E = Σ (Ē_x ū + Ē_y v̄ + Ē_t)² + λ Σ ((ū_x² + ū_y²) + (v̄_x² + v̄_y²))    (2.23)


where ū and v̄ form the discretized flow vector, ū_x, ū_y, v̄_x and v̄_y represent the dis-

cretized partials of the velocity field, and Ē_x, Ē_y and Ē_t correspond to the discretized

partials in the x, y and t directions respectively, collectively obtained from the

first and the third terms of equation 2.22. Note that Ē_x, Ē_y, and Ē_t are the

discretized partials of the brightness function E when the location of differentiation

is away from discontinuities, and they represent the discretized partials of ∇²G∗E

when the differentials are taken at the discontinuity locations. The minimization of

this discretized energy leads to solving a linear system of equations Ku = b where

K contains the discretized differential (smoothness) operator along with Ē_x, Ē_y and









the b contains the terms corresponding to the optical flow constraint equation. The

gradient constraint in the above energy function implies the weighting (Ē_x² + Ē_y²) on

(ū, v̄) at each discretized spatial location. This weighting is not reasonable since the

gradient constraints at locations of high gradient are not necessarily reliable. In fact,

they tend to be unreliable because the brightness gradients close to discontinuities are

normally higher than those away from discontinuities. Therefore, we eliminate this

weighting for each gradient constraint via normalization. To have a uniform contri-

bution at each pixel from the flow constraint, we normalize the entries Ē_x², Ē_y², Ē_x Ē_y

in the matrix K with Ē_x² + Ē_y².

To solve the above quadratic and convex energy minimization problem, we em-

ploy our adaptive preconditioned conjugate gradient algorithm presented in this

dissertation. This novel adaptive preconditioning technique is applicable to any

gradient-based optical flow computation method that uses a smoothness constraint

involving the 2D Laplacian operator. The efficiency as well as accuracy of our adap-

tive preconditioning technique and new optical flow algorithms will be demonstrated

through experimental results on synthetic and real image sequences in chapters 5 and

6.














CHAPTER 3
GENERALIZED CAPACITANCE MATRIX THEOREMS


3.1 Introduction

There are numerous applications of the Sherman-Morrison-Woodbury formula in

various fields [30] [85]. These applications can be classified into two categories. One is

to compute the inverse of a matrix that has low-rank difference w.r.t. another matrix

whose inverse is available or can be efficiently computed; the other is to solve a linear

system whose matrix differs from a well-structured matrix by a low-rank matrix and

very efficient methods exist to solve the linear system with the well-structured matrix.

The latter case is also called the capacitance matrix algorithm. This algorithm has

been widely used for solving linear systems arising from the discretization of the

boundary value problems, primarily for the elliptic partial differential equations. In

this chapter, we state and prove some very general theorems which are then applied

in developing a new capacitance matrix algorithm in the next chapter.

When $A \in \mathbb{R}^{n \times n}$, $U, V \in \mathbb{R}^{n \times p}$, and both the matrices $A$ and $(I + V^TA^{-1}U)$ are nonsingular, the Sherman-Morrison-Woodbury formula [26] gives an expression for the inverse of $(A + UV^T)$ as



$$(A + UV^T)^{-1} = A^{-1} - A^{-1}U(I + V^TA^{-1}U)^{-1}V^TA^{-1}. \qquad (3.1)$$








The above assumptions imply that the matrix $(A + UV^T)$ is also nonsingular. The matrix $(I + V^TA^{-1}U)$ is called the capacitance matrix and is denoted by $C$. Note that a similar formula can be obtained for the inverse of $(A + UGV^T)$, where the matrix $G \in \mathbb{R}^{p \times p}$ is nonsingular [30].

Let $B = A + UV^T$, with $A$, $B$, and $C$ being nonsingular; then the linear system $Bx = b$ can be solved using the traditional capacitance matrix algorithm, which involves the following steps (a small numerical sketch follows the list):

1. Solve $A\hat{x} = b$ for $\hat{x}$.

2. Compute $W = A^{-1}U$ by solving $AW = U$ column by column.

3. Form the capacitance matrix $C = I + V^TW$.

4. Solve $C\beta = V^T\hat{x}$ for $\beta$.

5. Compute the solution $x = \hat{x} - W\beta$.
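The following minimal NumPy sketch implements these five steps under the stated assumptions (dense, nonsingular $A$, $B$, and $C$); the function name and the random test system are illustrative only:

    import numpy as np

    def capacitance_solve(A, U, V, b):
        # Traditional capacitance matrix algorithm for (A + U V^T) x = b,
        # assuming A, B = A + U V^T, and C = I + V^T A^{-1} U are nonsingular.
        p = U.shape[1]
        x_hat = np.linalg.solve(A, b)            # step 1: A x_hat = b
        W = np.linalg.solve(A, U)                # step 2: W = A^{-1} U
        C = np.eye(p) + V.T @ W                  # step 3: capacitance matrix
        beta = np.linalg.solve(C, V.T @ x_hat)   # step 4: C beta = V^T x_hat
        return x_hat - W @ beta                  # step 5: x = x_hat - W beta

    rng = np.random.default_rng(0)
    n, p = 50, 3
    A = 5 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
    U, V = rng.standard_normal((n, p)), rng.standard_normal((n, p))
    b = rng.standard_normal(n)
    x = capacitance_solve(A, U, V, b)
    assert np.allclose((A + U @ V.T) @ x, b)

Only two kinds of solves appear: $p + 1$ solves with the well-structured matrix $A$ and one small $p \times p$ solve, which is the source of the method's efficiency when $p \ll n$.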

This capacitance matrix algorithm is derived directly from the Sherman-Morrison-Woodbury formula, and therefore it can be applied only when $A$, $B$, and $C$ are all nonsingular, making it unsuitable for general linear system problems. In addition, the choice of a well-structured matrix $A$ which is close to the original matrix is limited to nonsingular matrices.

Buzbee et al. [10] proposed a fast algorithm to solve the Poisson equation (a second-order elliptic PDE) on irregular regions. Their algorithm used a domain imbedding technique to imbed the irregular region in a rectangular region and utilized a fast Poisson solver (on the rectangular region) which is based on the capacitance








matrix algorithm. The discrete Poisson operator on the rectangular domain is the Laplacian matrix, which is singular with rank $n - 1$. Therefore, they presented a theorem to generalize the traditional capacitance matrix algorithm to the case when rank$(A) = n - 1$ and $B$ is nonsingular. This generalization is based on projecting the

right-hand-sides (RHSs) of the linear systems appearing in the first and second steps

onto the range of A through a unit vector. In [58], a generalized capacitance matrix

algorithm was presented for rank-deficient matrices A and B. This generalization

is accomplished by establishing the relationship between the solutions of the linear

systems with matrices A and B using their singular value decompositions. The

resulting algorithm involves the projection of the RHSs of the linear systems with

matrix $A$ onto its range through the eigenvectors in the null space of $A^T$. In addition, two auxiliary matrices $\bar{A}$ and $\bar{B}$, which are rank-augmented versions of $A$ and $B$, are constructed in this algorithm to apply the Sherman-Morrison-Woodbury formula directly. Hence, this capacitance matrix algorithm requires the computation of all the eigenvectors in the null spaces of $A$, $A^T$, $B$ and $B^T$.

The capacitance matrix algorithm has also been employed as a discrete anal-

ogy to the potential theory for partial differential equations by Proskurowski and Widlund [65, 66], and others. However, the framework of their capacitance matrix

algorithms is limited to solving second-order elliptic boundary value problems.

The other possibility to generalize the capacitance matrix algorithm for singular

A and B is to directly apply a generalized inverse form for (A + UVT) to solving

the linear system. Some works on the extension of the Sherman-Morrison-Woodbury








formula in equation 3.1 have been reported in [11], [34], [67], and references therein.

However, these generalized inverse forms of (A + UVT) have been derived under

very restrictive assumptions. In [11], a generalized inverse form was given for the

case when U and V are both vectors, which means B is a rank-one modification of

A. Although the generalized inverse of a rank-p modification of A can be obtained by

using the rank-one modification generalized inverse formula iteratively, the resulting

algorithm is very inefficient in terms of storage and computation when the size of

the matrix A is very large. Henderson and Searle [34] derived several generalized

inverse forms for (A + UVT) under the assumptions that range(UVT) C range(A)

and both the matrices A and UVT are symmetric. Recently, Riedel [67] presented a

generalized inverse form for (A + UVT) with a very restrictive assumption, which is

implied by the condition rank$(B - A) <$ dim(null$(A)$). This condition implies that

Riedel's result at best can give a generalized inverse for a rank-q modification of A,

where q < dim(null(A)). This condition becomes very restrictive and impractical

when the dimension of the null space of A is small. Although the aforementioned

generalized inverse formulas can be used to design the corresponding capacitance

matrix algorithms, the associated assumptions restrict them from being applied for

general matrices A and B.

In this chapter, we present several theorems which we call generalized capaci-

tance matrix theorems that lead to a generalized capacitance matrix algorithm which

can be applied to solve very general linear systems. In the main theorem, there are

no restrictions on the rank of the matrices A or B and it can be applied to any linear








system as long as it has a solution. In addition, with this generalized capacitance

matrix theorem, we have ample freedom to choose a well-structured matrix A for de-

signing more efficient capacitance matrix algorithms. Unlike in [58], no construction

of auxiliary rank-augmented matrices is required since we do not use the Sherman-

Morrison-Woodbury formula in our generalization, instead our resulting algorithm

uses only the eigenvectors in null(AT)\null(BT) to project the RHSs of the linear

systems with matrix A onto range(A). To achieve the projection of the RHS onto

range$(A)$, we subtract the RHS along the vectors $u_j$, $1 \le j \le m_A - m$, instead of

subtracting along the eigenvectors in the null space of A as in [58]. By using our

projection technique, the sparsity of the RHS can be preserved, while the sparsity of

the RHS is usually destroyed after subtracting along the eigenvectors as in [58]. We

take advantage of this sparsity property to obtain fast numerical algorithms in our

work. Unlike in [58], the capacitance matrix and the capacitance matrix equation

are explicitly given in our generalized capacitance theorems and the algorithm. By

explicitly specifying the capacitance matrix and the capacitance matrix equation, it

is possible to modify the RHSs of these equations in accordance with the singularities

of A and B. With the structure of the capacitance matrix equation given explicitly,

it is easy to design an efficient numerical algorithm to solve this equation. When

comparing the generalized inverse forms of (A + UVT) proposed in [34] with our

generalized capacitance matrix theorems, we observe that the generalized inverse for-

mula of Henderson and Searle [34] is a degenerate result of our Theorem 2, which has a more relaxed assumption, i.e., null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$, than the assumption range$(B) \subseteq$ range$(A)$ for symmetric $A$ and $B$ made in [34]. When

applied to solve linear systems using the capacitance matrix approach with the matri-

ces A and B satisfying the aforementioned assumption, both the generalized inverse

formula in [34] and our theorem 2 lead to the same algorithm.

We will state and prove the generalized capacitance matrix theorems for different

types of relationships between the matrices A and B subsequently. In the next

chapter, the generalized capacitance matrix algorithm will be constructed based on

the theorems presented in this chapter.

3.2 Generalized Capacitance Matrix Theorems

We categorize the generalized capacitance matrix theorems according to the

type of relationship between the matrices $A$ and $B$. The first theorem is for the case when null$(B) \subseteq$ null$(A)$ and range$(A) \subseteq$ range$(B)$, while the second theorem is for the assumption null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$. There are no

assumptions on A and B in the third theorem. The generalized capacitance matrix

theorem for the first case is stated as follows.


Theorem 1 Given $B = A + UGV^T$, where $A, B \in \mathbb{R}^{n \times n}$, $G \in \mathbb{R}^{p \times p}$ is nonsingular, $U = [u_1, \ldots, u_p] \in \mathbb{R}^{n \times p}$, $V = [v_1, \ldots, v_p] \in \mathbb{R}^{n \times p}$, with both the sets $\{u_1, \ldots, u_p\}$ and $\{v_1, \ldots, v_p\}$ containing linearly independent vectors. Assume null$(B) \subseteq$ null$(A)$ and range$(A) \subseteq$ range$(B)$. Let rank$(B) = n - m_B$ and rank$(A) = n - m_A$, $0 \le m_B \le m_A < n$; then there exist two sets of orthogonal bases, $\{q_1, \ldots, q_{m_A}\}$ and $\{q'_1, \ldots, q'_{m_A}\}$, for null$(A^T)$ and null$(A)$ respectively, such that null$(B^T) = \mathrm{span}\{q_{m_A-m_B+1}, \ldots, q_{m_A}\}$ and null$(B) = \mathrm{span}\{q'_{m_A-m_B+1}, \ldots, q'_{m_A}\}$. Let $q_i^Tu_j = a_i\delta(i,j)$, $a_i \neq 0$, for $1 \le i, j \le m_A - m_B$.

Let $\hat{x}$ be a solution to

$$A\hat{x} = b - \sum_{j=1}^{m_A-m_B}\frac{q_j^Tb}{q_j^Tu_j}u_j, \qquad (3.2)$$

where $b \in$ range$(B)$. For $1 \le i \le p$, let $\eta_i$ be a solution to

$$A\eta_i = u_i - \sum_{j=1}^{m_A-m_B}\frac{q_j^Tu_i}{q_j^Tu_j}u_j, \qquad (3.3)$$

and choose $\eta_1, \ldots, \eta_{m_A-m_B}$ to be linearly independent and linear combinations of $q'_1, \ldots, q'_{m_A-m_B}$. Define the capacitance matrix $C$ as

$$C = G^{-1} + V^T\eta - \sum_{j=1}^{m_A-m_B}G^{-1}e_j\frac{q_j^TU}{q_j^Tu_j}, \qquad (3.4)$$

where $\eta = [\eta_1, \ldots, \eta_p]$ and $e_j$ is the $j$-th standard basis vector in $\mathbb{R}^p$. Then there exists a unique solution $\beta$ to the capacitance matrix equation

$$C\beta = V^T\hat{x} - \sum_{j=1}^{m_A-m_B}\frac{q_j^Tb}{q_j^Tu_j}G^{-1}e_j, \qquad (3.5)$$

and $\hat{x} - \eta\beta$ is a solution to the linear system $Bx = b$.


We will use the following proposition to prove this theorem.








Proposition 1 Under the assumptions in Theorem 1, we have $v_i^Tq'_j = u_i^Tq_j = 0$ for $1 \le i \le p$ and $m_A - m_B + 1 \le j \le m_A$. In addition, $V^T[q'_1 \cdots q'_{m_A-m_B}]$ and $U^T[q_1 \cdots q_{m_A-m_B}]$ are of full rank.

Proof: From the assumption in Theorem 1, $Bq'_j = Aq'_j = 0$ for $m_A - m_B + 1 \le j \le m_A$. Since $Bq'_j = (A + UGV^T)q'_j$, we have $UGV^Tq'_j = 0$ for $m_A - m_B + 1 \le j \le m_A$. From the assumption that $U \in \mathbb{R}^{n \times p}$, $p < n$, has full rank and $G$ is nonsingular, it is obvious that $V^Tq'_j = 0$. Therefore, we have $v_i^Tq'_j = 0$ for $1 \le i \le p$ and $m_A - m_B + 1 \le j \le m_A$. Similarly, we can show that $u_i^Tq_j = 0$ for $1 \le i \le p$ and $m_A - m_B + 1 \le j \le m_A$.

We use proof by contradiction to show that the matrix $V^T[q'_1 \cdots q'_{m_A-m_B}]$ is of full rank. Let's assume this matrix is rank deficient, i.e., there exists a nonzero vector $a^T = (a_1, \ldots, a_{m_A-m_B})$ such that $V^T[q'_1 \cdots q'_{m_A-m_B}]a = 0$. Let $r = [q'_1 \cdots q'_{m_A-m_B}]a$; then $r \in$ null$(A)$ and $r$ is orthogonal to null$(B)$. Post-multiplying both sides of the equation $B = A + UGV^T$ by $r$ yields $Br = Ar + UGV^Tr$. Since $Ar = 0$ and $V^Tr = 0$, we obtain $Br = 0$. But, from the above, $r$ is a nonzero vector orthogonal to null$(B)$, i.e., $Br \neq 0$, leading to a contradiction. Consequently, the assumption that $V^T[q'_1 \cdots q'_{m_A-m_B}]$ is rank deficient is not true, implying that $V^T[q'_1 \cdots q'_{m_A-m_B}]$ is of full rank. Similarly, we can prove that $U^T[q_1 \cdots q_{m_A-m_B}]$ is of full rank. ∎

We are now equipped to prove Theorem 1. The proof is in four parts: we will show that (1) there exist two sets of orthogonal bases, $\{q_1, \ldots, q_{m_A}\}$ and $\{q'_1, \ldots, q'_{m_A}\}$, for null$(A^T)$ and null$(A)$ respectively, such that null$(B^T) = \mathrm{span}\{q_{m_A-m_B+1}, \ldots, q_{m_A}\}$ and null$(B) = \mathrm{span}\{q'_{m_A-m_B+1}, \ldots, q'_{m_A}\}$, (2) there exist solutions to equations 3.2 and 3.3, (3) the capacitance matrix $C$ is nonsingular, implying that equation 3.5 has a unique solution, and (4) $B(\hat{x} - \eta\beta) = b$.

Proof for Theorem 1:

(1) Since rank$(A) = n - m_A$ and rank$(B) = n - m_B$, it is obvious from their singular value decompositions that there exist $m_A$ orthogonal eigenvectors $\phi_1, \ldots, \phi_{m_A}$ for $A$ and $m_B$ orthogonal eigenvectors $\psi_1, \ldots, \psi_{m_B}$ for $B$ such that null$(A) = \mathrm{span}\{\phi_1, \ldots, \phi_{m_A}\}$ and null$(B) = \mathrm{span}\{\psi_1, \ldots, \psi_{m_B}\}$. From the assumption null$(B) \subseteq$ null$(A)$, we can write each $\psi_i$, $i = 1, \ldots, m_B$, as a linear combination of the basis vectors $\phi_1, \ldots, \phi_{m_A}$. Thus, we have

$$\begin{bmatrix}\psi_1^T \\ \vdots \\ \psi_{m_B}^T\end{bmatrix} = T\begin{bmatrix}\phi_1^T \\ \vdots \\ \phi_{m_A}^T\end{bmatrix}, \qquad (3.6)$$

where $T \in \mathbb{R}^{m_B \times m_A}$ is of full rank. By using the Householder or Givens transformation [26], we can obtain the factorization of $T$ as follows:

$$T = [\,0 \;\; \tilde{T}\,]\,Q, \qquad (3.7)$$

where $0$ is an $m_B \times (m_A - m_B)$ zero matrix, $\tilde{T} \in \mathbb{R}^{m_B \times m_B}$ is nonsingular, and $Q \in \mathbb{R}^{m_A \times m_A}$ is orthogonal. Define the vectors $q'_1, \ldots, q'_{m_A}$ by the following relation:

$$\begin{bmatrix}q_1'^T \\ \vdots \\ q_{m_A}'^T\end{bmatrix} = Q\begin{bmatrix}\phi_1^T \\ \vdots \\ \phi_{m_A}^T\end{bmatrix}. \qquad (3.8)$$

We can see that $\{q'_1, \ldots, q'_{m_A}\}$ is a set of orthogonal basis vectors for null$(A)$ and its subset $\{q'_{m_A-m_B+1}, \ldots, q'_{m_A}\}$ forms an orthogonal basis for null$(B)$. Thus, the existence of the basis $\{q'_1, \ldots, q'_{m_A}\}$ which satisfies the aforementioned requirements is proved. Similarly, we can prove that there exists an orthogonal basis $\{q_1, \ldots, q_{m_A}\}$ for null$(A^T)$ such that null$(B^T) = \mathrm{span}\{q_{m_A-m_B+1}, \ldots, q_{m_A}\}$ is satisfied.

(2) From Proposition 1, we have $u_i^Tq_j = 0$ for $1 \le i \le p$ and $m_A - m_B + 1 \le j \le m_A$. The assumption $b \in$ range$(B)$ implies $b^Tq_j = 0$ for $m_A - m_B + 1 \le j \le m_A$. Combining these facts and the orthogonality of the eigenvectors $q_j$, $j = 1, \ldots, m_A$, we can see that the RHSs of equations 3.2 and 3.3 are orthogonal to $q_j$ for $j = 1, \ldots, m_A$; thus, the RHSs are orthogonal to null$(A^T)$. Consequently, the RHSs of equations 3.2 and 3.3 are both in the range of $A$, which means there always exist solutions to equations 3.2 and 3.3.
equations 3.2 and 3.3.

(3) To prove that $C$ is nonsingular, we show that $C\beta = 0 \Rightarrow \beta = 0$. Let's suppose $C\beta = 0$; then, from equation 3.4, we have

$$V^T\eta\beta - \sum_{j=1}^{m_A-m_B}G^{-1}e_j\frac{q_j^TU\beta}{q_j^Tu_j} = -G^{-1}\beta. \qquad (3.9)$$

By definition, $B\eta\beta = A\eta\beta + UGV^T\eta\beta$. Substituting equations 3.3 and 3.9 into this equation, we get $B\eta\beta = 0$. Thus, $\eta\beta \in$ null$(B)$. Since null$(B) \subseteq$ null$(A)$, we have $\eta\beta \in$ null$(A)$, i.e., $A\eta\beta = 0$. Substituting $A\eta$ by equation 3.3 and denoting $\beta = (\beta_1, \ldots, \beta_p)^T$, we obtain

$$\sum_{i=1}^{p}\beta_iu_i - \sum_{i=1}^{p}\sum_{j=1}^{m_A-m_B}\beta_i\frac{q_j^Tu_i}{q_j^Tu_j}u_j = 0. \qquad (3.10)$$

After rearranging, equation 3.10 becomes

$$\sum_{j=1}^{m_A-m_B}\left(\beta_j - \frac{q_j^TU\beta}{q_j^Tu_j}\right)u_j + \sum_{j=m_A-m_B+1}^{p}\beta_ju_j = 0. \qquad (3.11)$$

Since $u_1, \ldots, u_p$ are linearly independent, equation 3.11 imposes the following conditions on $\beta$:

$$\beta_j = \frac{q_j^TU\beta}{q_j^Tu_j}, \quad j = 1, \ldots, m_A - m_B, \qquad (3.12)$$

$$\beta_j = 0, \quad j = m_A - m_B + 1, \ldots, p. \qquad (3.13)$$

Substituting these conditions into the RHS of equation 3.9, we obtain

$$V^T[\eta_1 \cdots \eta_{m_A-m_B}]\begin{bmatrix}\beta_1 \\ \vdots \\ \beta_{m_A-m_B}\end{bmatrix} = 0. \qquad (3.14)$$








For $i = 1, \ldots, m_A - m_B$, the RHS of equation 3.3 equals 0; thus, $\eta_i \in$ null$(A)$. From the assumption that $\eta_1, \ldots, \eta_{m_A-m_B}$ are chosen to be linearly independent linear combinations of $q'_1, \ldots, q'_{m_A-m_B}$, we can write $[\eta_1 \cdots \eta_{m_A-m_B}]$ in terms of $[q'_1 \cdots q'_{m_A-m_B}]$ as follows:

$$[\eta_1 \cdots \eta_{m_A-m_B}] = [q'_1 \cdots q'_{m_A-m_B}]\,\hat{T}, \qquad (3.15)$$

where $\hat{T} \in \mathbb{R}^{(m_A-m_B)\times(m_A-m_B)}$ is nonsingular. By substituting equation 3.15 into equation 3.14, we get

$$V^T[q'_1 \cdots q'_{m_A-m_B}]\,\hat{T}\begin{bmatrix}\beta_1 \\ \vdots \\ \beta_{m_A-m_B}\end{bmatrix} = 0. \qquad (3.16)$$

From Proposition 1, the matrix $V^T[q'_1 \cdots q'_{m_A-m_B}]$ is of full rank and is of size $p \times (m_A - m_B)$ with $p \ge m_A - m_B$, which implies

$$\hat{T}\begin{bmatrix}\beta_1 \\ \vdots \\ \beta_{m_A-m_B}\end{bmatrix} = 0. \qquad (3.17)$$

Since $\hat{T}$ is nonsingular, this makes $\beta_i = 0$, $i = 1, \ldots, m_A - m_B$. Combining this with equation 3.13, we obtain $\beta = 0$. Thus, concluding that $C\beta = 0$ implies $\beta = 0$; therefore, $C$ is nonsingular.








(4) Now we will show that $B(\hat{x} - \eta\beta) = b$. Substituting $A + UGV^T$ for $B$ and expanding, we have

$$B(\hat{x} - \eta\beta) = A\hat{x} - A\eta\beta + UGV^T\hat{x} - UGV^T\eta\beta. \qquad (3.18)$$

Rewriting $V^T\eta\beta$ using equations 3.4 and 3.5 and some algebraic manipulation, we get

$$V^T\eta\beta = V^T\hat{x} - \sum_{j=1}^{m_A-m_B}\frac{q_j^Tb}{q_j^Tu_j}G^{-1}e_j - G^{-1}\beta + \sum_{j=1}^{m_A-m_B}G^{-1}e_j\frac{q_j^TU\beta}{q_j^Tu_j}. \qquad (3.19)$$

Substituting for $A\hat{x}$, $A\eta$ and $V^T\eta\beta$ from equations 3.2, 3.3 and 3.19 respectively, equation 3.18 leads to $B(\hat{x} - \eta\beta) = b$. ∎

In Theorem 1, we assume that the matrices $U$ and $V$ contain linearly independent columns and $G$ is nonsingular, which is equivalent to saying that $U$, $V$ and $G$ are all of full rank; in addition, the condition $q_i^Tu_j = a_i\delta(i,j)$ must be satisfied for $1 \le i, j \le m_A - m_B$. In fact, we can prove that given any two matrices $A, B \in \mathbb{R}^{n \times n}$, there always exist full-rank matrices $U, V \in \mathbb{R}^{n \times p}$ and $G \in \mathbb{R}^{p \times p}$ such that $UGV^T = B - A$ and $p = $ rank$(B - A)$. This can be verified by taking the singular value decomposition of the matrix $B - A$ as $\bar{U}\Sigma\bar{V}^T$, where both $\bar{U}$ and $\bar{V}$ are orthogonal matrices and $\Sigma$ is a diagonal matrix with rank $p = $ rank$(B - A)$. We can reduce the size of the matrix $\Sigma$ by retaining the $p$ nonzero diagonal entries and taking out the remaining rows and columns of zeros. This reduced matrix $G \in \mathbb{R}^{p \times p}$ is nonsingular. In addition, we take out the columns in $\bar{U}$ and $\bar{V}$ corresponding to the zero singular values in $\Sigma$; then the reduced matrices $U, V \in \mathbb{R}^{n \times p}$ are obtained. It is not difficult to see that $UGV^T = B - A$ and the matrices $U$, $V$ and $G$ are of full rank.
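This construction is easy to realize numerically; the sketch below recovers such a full-rank factorization from the SVD of $B - A$ (the tolerance used to decide the numerical rank is an illustrative choice):

    import numpy as np

    def split_low_rank(B, A, tol=1e-10):
        # Factor B - A = U G V^T with full-rank U, V and a nonsingular
        # diagonal G, via the singular value decomposition of B - A.
        Ubar, s, Vbar_T = np.linalg.svd(B - A)
        p = int(np.sum(s > tol))        # numerical rank of B - A
        U = Ubar[:, :p]                 # keep columns for nonzero singular values
        G = np.diag(s[:p])              # reduced, nonsingular diagonal G
        V = Vbar_T[:p, :].T
        return U, G, V

    rng = np.random.default_rng(1)
    n = 40
    A = rng.standard_normal((n, n))
    B = A + rng.standard_normal((n, 2)) @ rng.standard_normal((2, n))  # rank-2 change
    U, G, V = split_low_rank(B, A)
    assert np.allclose(U @ G @ V.T, B - A)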

As for the condition $q_i^Tu_j = a_i\delta(i,j)$, $1 \le i, j \le m_A - m_B$, we can find such a $U$ by transformation. If $\{U, G, V\}$ satisfies $UGV^T = B - A$ and each is of full rank, then the alternative set $\{UQ, Q^{-1}G, V\}$ also satisfies these requirements when the matrix $Q \in \mathbb{R}^{p \times p}$ is nonsingular. From Proposition 1, $U^T[q_1 \cdots q_{m_A-m_B}]$ has full rank. We can find a transformation matrix $Q$ to make the upper $(m_A - m_B) \times (m_A - m_B)$ block of the transformed matrix $Q^TU^T[q_1 \cdots q_{m_A-m_B}]$ diagonal. Then the condition $q_i^Tu_j = a_i\delta(i,j)$, $1 \le i, j \le m_A - m_B$, is satisfied.

There are an infinite number of choices for the matrices $U$, $V$ and $G$ that satisfy the aforementioned requirements. The above choice from the singular value decomposition of $B - A$ is just a particular one that leads to a diagonal $G$ matrix. This makes the computation of $G^{-1}$ in equation 3.4 trivial. The choice of $A$, $U$, $V$ and $G$ is very crucial for the efficiency of the numerical algorithms based on the capacitance matrix algorithm to solve a linear system.

Now let's turn to the other case, when null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$. Note that these two constraints are equivalent to each other when both $B$ and $A$ are symmetric matrices, as in [34].


Theorem 2 Given $B = A + UGV^T$, where $A, B \in \mathbb{R}^{n \times n}$, $G \in \mathbb{R}^{p \times p}$ is nonsingular, $U = [u_1, \ldots, u_p] \in \mathbb{R}^{n \times p}$, $V = [v_1, \ldots, v_p] \in \mathbb{R}^{n \times p}$, with both the sets $\{u_1, \ldots, u_p\}$ and $\{v_1, \ldots, v_p\}$ containing linearly independent vectors. Assume null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$. Let $\hat{x}$ be a solution to

$$A\hat{x} = b, \qquad (3.20)$$

where $b \in$ range$(B)$. For $1 \le i \le p$, let $\eta_i$ be a solution to

$$A\eta_i = u_i. \qquad (3.21)$$

Define the capacitance matrix $C$ as $G^{-1} + V^T\eta$, where $\eta = [\eta_1, \ldots, \eta_p]$. Then there exists a solution $\beta$ to the capacitance matrix equation

$$C\beta = V^T\hat{x}, \qquad (3.22)$$

and $\hat{x} - \eta\beta$ is a solution to the linear system $Bx = b$.


We will use the following proposition to prove the above theorem.


Proposition 2 In Theorem 2, let $\{q_1, \ldots, q_{m_A}\}$ and $\{q'_1, \ldots, q'_{m_A}\}$ be the sets of orthogonal eigenvectors in null$(A^T)$ and null$(A)$ respectively, where $m_A = n - $rank$(A)$. Then $q_i^Tu_j = q_i'^Tv_j = 0$ for $i = 1, \ldots, m_A$ and $j = 1, \ldots, p$.

Proof: From the assumption that null$(A) \subseteq$ null$(B)$, we can see that the eigenvectors $q'_i$, $i = 1, \ldots, m_A$, are also in null$(B)$. Since $Bq'_i = Aq'_i + UGV^Tq'_i$, we obtain

$$UGV^Tq'_i = 0 \qquad (3.23)$$

for $1 \le i \le m_A$. The matrix $U$ is of full rank; consequently $GV^Tq'_i = 0$. From the assumption that $G$ is nonsingular, we have $V^Tq'_i = 0$. Therefore, $q_i'^Tv_j = 0$ for $i = 1, \ldots, m_A$ and $j = 1, \ldots, p$.

Similarly, we obtain the equation $VG^TU^Tq_i = 0$ from the condition $B^Tq_i = A^Tq_i = 0$, which is derived from the assumption range$(B) \subseteq$ range$(A)$. Using similar reasoning for this equation leads to the conclusion that $q_i^Tu_j = 0$, $i = 1, \ldots, m_A$ and $j = 1, \ldots, p$. ∎

We are now poised to prove Theorem 2.

Proof for Theorem 2: The proof is presented in three parts: (1) we will show that there exist solutions for equations 3.20 and 3.21, (2) we will establish that there exists a solution for equation 3.22, and (3) show that $B(\hat{x} - \eta\beta) = b$.

(1) To show that there exist solutions for equations 3.20 and 3.21, we prove that the RHSs of both equations are in the range of $A$. It is obvious that $b \in$ range$(A)$ for equation 3.20, since $b \in$ range$(B)$ and range$(B) \subseteq$ range$(A)$. For equation 3.21, from Proposition 2 we can see that $u_i$, $1 \le i \le p$, is orthogonal to null$(A^T)$, which means $u_i \in$ range$(A)$ for $i = 1, \ldots, p$.

(2) As for equation 3.22, we need to show that $V^T\hat{x}$ is in the range of $C$. We use proof by contradiction in the following. Assume that $V^T\hat{x}$ is not in the range of $C$; then there exists a nonzero vector $\gamma \in$ null$(C^T)$, i.e., $C^T\gamma = 0$, such that

$$\gamma^T(V^T\hat{x}) \neq 0. \qquad (3.24)$$








Using the generalized inverse¹ $A^+$ of the matrix $A$, we can write $\hat{x}$ as follows:

$$\hat{x} = A^+b + \sum_{i=1}^{m_A}a_iq'_i, \qquad (3.25)$$

where $m_A = n - $rank$(A)$, $q'_1, \ldots, q'_{m_A}$ are the orthogonal eigenvectors in null$(A)$, and $a_i \in \mathbb{R}$, $\forall i$. Then,

$$V^T\hat{x} = V^TA^+b + \sum_{i=1}^{m_A}a_iV^Tq'_i. \qquad (3.26)$$

From Proposition 2, we have $V^Tq'_i = 0$ for $i = 1, \ldots, m_A$. Substituting into equation 3.26, we have $V^T\hat{x} = V^TA^+b$. Substituting for $V^T\hat{x}$ in equation 3.24 yields

$$(A^{+T}V\gamma)^Tb \neq 0. \qquad (3.27)$$


We can show that $A^{+T}V\gamma \in$ null$(B^T)$ as follows:

$$B^T(A^{+T}V\gamma) = (A + UGV^T)^TA^{+T}V\gamma \qquad (3.28)$$
$$= A^TA^{+T}V\gamma + VG^TU^TA^{+T}V\gamma. \qquad (3.29)$$

Since $A^TA^{+T} = I - \sum_{i=1}^{m_A}q'_iq_i'^T$, using a result of Proposition 2, namely $q_i'^TV = 0$ for $i = 1, \ldots, m_A$, we have

$$A^TA^{+T}V = V - \sum_{i=1}^{m_A}q'_iq_i'^TV = V. \qquad (3.30)$$


¹Let the singular value decomposition of $A$ be $A = \bar{U}\Sigma\bar{V}^T$, where $\bar{U}$ and $\bar{V}$ are orthogonal matrices and $\Sigma = \mathrm{diag}(0, \ldots, 0, \lambda_{m_A+1}, \ldots, \lambda_n)$. Then its generalized inverse is $A^+ = \bar{V}\Sigma^+\bar{U}^T$, where $\Sigma^+ = \mathrm{diag}(0, \ldots, 0, \lambda_{m_A+1}^{-1}, \ldots, \lambda_n^{-1})$ [26].








Substituting equation 3.30 into equation 3.29 yields

$$B^T(A^{+T}V\gamma) = VG^T(G^{-1} + V^TA^+U)^T\gamma. \qquad (3.31)$$

Again, using Proposition 2, we can show that $C = G^{-1} + V^TA^+U$. Consequently, we obtain $B^T(A^{+T}V\gamma) = VG^TC^T\gamma = 0$, since $C^T\gamma = 0$ by assumption. Thus, $A^{+T}V\gamma \in$ null$(B^T)$. Combining this with equation 3.27 leads to the conclusion that $b$ is not in range$(B)$. This is a contradiction of the assumption $b \in$ range$(B)$. Therefore, the assumption that $V^T\hat{x}$ is not in the range of $C$ is untrue. Hence, we can conclude that $V^T\hat{x} \in$ range$(C)$. This proves that there exists a solution to equation 3.22.

(3) The proof for $B(\hat{x} - \eta\beta) = b$ is very similar to part (4) of the proof for Theorem 1 and is omitted here. ∎

If we restrict the above constraints on the relation between B and A to null(B) =

null(A) and range(B) = range(A), then the capacitance matrix C is nonsingular,

i.e., the capacitance matrix equation has a unique solution. This result is stated in

the following corollary.


Corollary 1 In Theorem 2, if null(A) = null(B) and range(A) = range(B), then

the capacitance matrix C is nonsingular.


Proof: To prove that $C$ is nonsingular, we show that $C\beta = 0 \Rightarrow \beta = 0$. Suppose $C\beta = 0$; then, from the definition of $C$, we have $V^T\eta\beta = -G^{-1}\beta$, which can be substituted into the equation $B\eta\beta = A\eta\beta + UGV^T\eta\beta$, along with the equation for $A\eta$ from (3.21), to get $B\eta\beta = 0$, i.e., $\eta\beta \in$ null$(B)$. Since null$(A) = $ null$(B)$, we have $A\eta\beta = 0$. Using equation 3.21 for $A\eta$, we get $U\beta = 0$. From the assumption on the matrix $U$, namely that $u_1, \ldots, u_p$ are linearly independent, we conclude that $\beta = 0$. Therefore, $C$ is nonsingular. ∎

In the above theorems, we impose constraints on the relationship between $B$ and $A$ either via null$(B) \subseteq$ null$(A)$ and range$(A) \subseteq$ range$(B)$ or via null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$. Now we consider the general case, i.e., no constraint on the relationship between $B$ and $A$ is imposed. The following theorem is given for this general case and can be used for any matrices $B$ and $A$ in $\mathbb{R}^{n \times n}$ as long as there exists a solution for the associated capacitance matrix equation.


Theorem 3 Given $B = A + UGV^T$, where $B, A \in \mathbb{R}^{n \times n}$, $G \in \mathbb{R}^{p \times p}$ is nonsingular, $U = [u_1, \ldots, u_p] \in \mathbb{R}^{n \times p}$, $V = [v_1, \ldots, v_p] \in \mathbb{R}^{n \times p}$, with both the sets $\{u_1, \ldots, u_p\}$ and $\{v_1, \ldots, v_p\}$ containing linearly independent vectors. Let rank$(A) = n - m_A$, $\dim($null$(B) \cap$ null$(A)) = m'$, and $\dim($null$(B^T) \cap$ null$(A^T)) = m$, $0 \le m', m \le m_A < n$; then there exist two sets of orthogonal bases, $\{q_1, \ldots, q_{m_A}\}$ and $\{q'_1, \ldots, q'_{m_A}\}$, for null$(A^T)$ and null$(A)$ respectively, such that null$(B^T) \cap$ null$(A^T) = \mathrm{span}\{q_{m_A-m+1}, \ldots, q_{m_A}\}$ and null$(B) \cap$ null$(A) = \mathrm{span}\{q'_{m_A-m'+1}, \ldots, q'_{m_A}\}$. Let rank$(B) = n - m_B$, $0 \le m', m \le m_B < n$; then there exist $m_B - m'$ vectors $r'_{m'+1}, \ldots, r'_{m_B}$ such that the set $\{q'_{m_A-m'+1}, \ldots, q'_{m_A}, r'_{m'+1}, \ldots, r'_{m_B}\}$ is an orthogonal basis for null$(B)$. Let $q_i^Tu_j = a_i\delta(i,j)$, $a_i \neq 0$, for $1 \le i, j \le m_A - m$.

Let $\hat{x}$ be a solution to

$$A\hat{x} = b - \sum_{j=1}^{m_A-m}\frac{q_j^Tb}{q_j^Tu_j}u_j, \qquad (3.32)$$

where $b \in$ range$(B)$. For $1 \le i \le p$, let $\eta_i$ be a solution to

$$A\eta_i = u_i - \sum_{j=1}^{m_A-m}\frac{q_j^Tu_i}{q_j^Tu_j}u_j, \qquad (3.33)$$

and choose $\eta_1, \ldots, \eta_{m_A-m}$ to be linearly independent and linear combinations of $q'_1, \ldots, q'_{m_A-m}$. Define the capacitance matrix $C$ as

$$C = G^{-1} + V^T\eta - \sum_{j=1}^{m_A-m}G^{-1}e_j\frac{q_j^TU}{q_j^Tu_j}, \qquad (3.34)$$

where $\eta = [\eta_1, \ldots, \eta_p]$ and $e_j$ is the $j$-th standard basis vector in $\mathbb{R}^p$. Assume the two sets of vectors $\{Ar'_{m'+1}, \ldots, Ar'_{m_B}\}$ and $\{u_1, \ldots, u_p\}$ are linearly independent; then there exists a unique solution $\beta$ to the capacitance matrix equation

$$C\beta = V^T\hat{x} - \sum_{j=1}^{m_A-m}\frac{q_j^Tb}{q_j^Tu_j}G^{-1}e_j, \qquad (3.35)$$

and $\hat{x} - \eta\beta$ is a solution to the linear system $Bx = b$.


The proof of this theorem is omitted here since it is similar to the proof of Theorem 1. The proof requires the use of the fact that $q_i^Tu_j = 0$ for $m_A - m + 1 \le i \le m_A$ and $1 \le j \le p$, and that the matrix $V^T[q'_1 \cdots q'_{m_A-m}]$ is of full rank.








In general, Theorems 1 and 2 can be regarded as special cases of Theorem 3, with the exception of the additional assumption on the linear independence of the two sets of vectors $\{Ar'_{m'+1}, \ldots, Ar'_{m_B}\}$ and $\{u_1, \ldots, u_p\}$ made in Theorem 3 to guarantee the uniqueness of the solution of the capacitance matrix equation. This assumption is automatically satisfied for Theorem 1, since the set $\{Ar'_{m'+1}, \ldots, Ar'_{m_B}\}$ is empty and the vectors $u_1, \ldots, u_p$ are linearly independent. However, there is no such assumption in Theorem 2, which gives the existence of a solution to the capacitance matrix equation instead of the uniqueness of the solution given in Theorem 3.

3.3 Discussion

In this chapter, we presented the generalized capacitance matrix theorems for

solving linear systems. Our generalized capacitance matrix theorems can be used for

very general matrices A and B, therefore the resulting algorithm can be applied to a

very broad class of problems and we have ample freedom to choose the matrix A ap-

propriately for designing efficient numerical algorithms. The traditional capacitance

matrix algorithm [10, 65, 66] has been primarily applied to solve the second-order

elliptic boundary value problems, whereas our generalized capacitance matrix algo-

rithm can be used to solve any-order elliptic boundary value problems. Also, we can

handle cases such as the shape from shading (SFS) problem on an irregular domain with Neumann boundary conditions, which specifically yields a singular $K$ matrix. This case cannot be solved using the capacitance matrix technique of Buzbee et al. [10], as mentioned in [72], because the capacitance matrix theorem of [10] requires a








nonsingular K. Our algorithm gives a solution to this problem (recovers a surface)

within a scale factor in height.

Previous works [11, 34, 67] on deriving the generalized inverse forms of (A +

UVT) were presented under restrictive assumptions. Although our generalized capac-

itance matrix theorems are developed for solving general linear systems, it is possible

to derive the generalized inverse form of (A + UGVT) for more general cases based

on these theorems.














CHAPTER 4
GENERALIZED CAPACITANCE MATRIX ALGORITHM


4.1 Structure of the Algorithm

From the generalized capacitance matrix theorems presented in chapter 3, we can generalize the standard capacitance matrix algorithm for general matrices $A$ and $B$ as follows.


1. Solve equation 3.32 for $\hat{x}$.

2. Solve equation 3.33 for $\eta_i$, for $i = 1, \ldots, p$.

3. Form the capacitance matrix $C$ from equation 3.34.

4. Solve the capacitance matrix equation (i.e., eq. 3.35) for $\beta$.

5. The solution is $x = \hat{x} - \eta\beta$.


This generalized capacitance matrix algorithm can always be applied to the cases when null$(B) \subseteq$ null$(A)$ and range$(A) \subseteq$ range$(B)$ or when null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$, as stated in Theorems 1 and 2 respectively. However, to apply this algorithm to more general cases, we need to make sure that a solution exists to the capacitance matrix equation in step 4. Theorem 3 gives a sufficient condition for the uniqueness of the solution to the capacitance matrix equation in the general case. A minimal end-to-end sketch for the Theorem 2 case is given below.
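As a concrete illustration, the following NumPy sketch runs the five steps for the Theorem 2 case (null$(A) \subseteq$ null$(B)$ and range$(B) \subseteq$ range$(A)$). The pseudoinverse is used here merely as one convenient way of picking particular solutions of the singular systems 3.20 and 3.21; the periodic Laplacian test matrix is illustrative only:

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 30, 2
    # Singular, symmetric A: periodic 1-D Laplacian (null space = constants).
    A = 2 * np.eye(n) - np.roll(np.eye(n), 1, 0) - np.roll(np.eye(n), -1, 0)
    # Columns of U = V orthogonal to the constant vector, so the Theorem 2
    # conditions null(A) in null(B) and range(B) in range(A) hold.
    U = rng.standard_normal((n, p))
    U -= U.mean(axis=0)
    V, G = U, np.eye(p)
    B = A + U @ G @ V.T
    b = B @ rng.standard_normal(n)            # guarantees b in range(B)

    x_hat = np.linalg.pinv(A) @ b             # step 1: solve A x_hat = b
    eta = np.linalg.pinv(A) @ U               # step 2: solve A eta_i = u_i
    C = np.linalg.inv(G) + V.T @ eta          # step 3: C = G^{-1} + V^T eta
    beta = np.linalg.lstsq(C, V.T @ x_hat, rcond=None)[0]   # step 4
    x = x_hat - eta @ beta                    # step 5
    assert np.allclose(B @ x, b)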








Note that step 2 of the algorithm requires the solutions $\eta_1, \ldots, \eta_p$ of equation 3.33 with different RHSs. The computational cost of solving this equation $p$ times is very high when $p$ is large. However, the vectors $\eta_1, \ldots, \eta_p$ can be easily obtained

when the matrix $A$ is chosen to be circulant Toeplitz and the vectors $u_i$ have a very sparse set of nonzero entries. The circulant Toeplitz property of the matrix $A$ makes it shift-invariant, which means $AS(u) = S(Au)$ for all $u \in \mathbb{R}^n$, where $S$ is a shift operator [59]. A linear shift-invariant operator can be completely characterized by its impulse response, which is the solution of the corresponding linear system $Au = e$, where the RHS $e$ is a unit vector. Thus, the solution $\eta_i$ can be obtained via the convolution of the impulse response of this shift-invariant linear system with the RHS in equation 3.33, which is sparse since the vectors $u_1, \ldots, u_p$ themselves are sparse. The convolution of the impulse response with a sparse vector can be easily obtained by a linear combination of the shifted impulse response vectors. When the vectors $v_i$, $i = 1, \ldots, p$, are also sparse, the computation of the matrix $V^T\eta$ for forming the capacitance matrix $C$ can be further simplified by obtaining the $(i,j)$-th entry of $V^T\eta$ from the impulse response corresponding to the relative locations of the nonzero components of $v_i$ and those of $u_1, \ldots, u_{m_A-m}$ and $u_j$.
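The sketch below illustrates this idea in one dimension: for a circulant $A$ the system is diagonalized by the FFT, so a single impulse response suffices, and zeroing the Fourier modes with zero eigenvalue effects the projection onto range$(A)$. The stencil and the sparse right-hand side are illustrative:

    import numpy as np

    def circulant_solve(first_col, f, tol=1e-12):
        # Solve A u = f for circulant A via the FFT, zeroing modes with
        # (numerically) zero eigenvalue, i.e., projecting onto range(A).
        lam = np.fft.fft(first_col)     # eigenvalues of the circulant A
        f_hat = np.fft.fft(f)
        u_hat = np.zeros_like(f_hat)
        nz = np.abs(lam) > tol
        u_hat[nz] = f_hat[nz] / lam[nz]
        return np.fft.ifft(u_hat).real

    n = 64
    first_col = np.zeros(n)
    first_col[0], first_col[1], first_col[-1] = 2, -1, -1   # periodic Laplacian
    f = np.zeros(n)
    f[5], f[9] = 1.0, -1.0              # sparse RHS, orthogonal to null(A)
    u = circulant_solve(first_col, f)
    A = np.column_stack([np.roll(first_col, k) for k in range(n)])
    assert np.allclose(A @ u, f)

Because the impulse response is computed once, each additional sparse right-hand side costs only a few shifted copies of that response, exactly as described above.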

The direct computation of $\eta\beta$ in the final step of the generalized capacitance matrix algorithm requires $O(pn)$ operations, which is very costly when $p$, the rank of the difference between $B$ and $A$, is large. Instead, we solve the following linear system

$$Ay = U\beta - \sum_{j=1}^{m_A-m}\frac{q_j^TU\beta}{q_j^Tu_j}u_j \qquad (4.1)$$

to get the solution $y$, which has the same projection on the range of $A$ as $\eta\beta$. As for the projection of $\eta\beta$ on the null space of $A$, it can be easily computed from the inner product of $\beta$ and $a_i = (a_{i,1}, \ldots, a_{i,p})$, where $a_{i,j} = q_i'^T\eta_j$, for $1 \le i \le m_A - m$. The coefficients $a_i$ can be chosen arbitrarily as long as the assumption that $\eta_1, \ldots, \eta_{m_A-m}$ are linearly independent and linear combinations of $q'_1, \ldots, q'_{m_A-m}$ is satisfied. Thus, $\eta\beta$ can be obtained from the following equation:

$$\eta\beta = y + \sum_{i=1}^{m_A-m}\frac{a_i\beta}{q_i'^Tq'_i}\,q'_i. \qquad (4.2)$$


The flowchart of the generalized capacitance matrix algorithm is given in figure

4.1.

4.2 Choice of the Splitting

From the variational formulations of the early vision problems, we can see that it is necessary to solve linear systems of the form $Kx = b$, where the matrix $K \in \mathbb{R}^{N \times N}$ $(N = n^2)$ is the sum of the matrices $K_d$ and $\lambda K_s$, obtained from the data constraint and smoothness constraint respectively. To apply the generalized capacitance matrix algorithm to solve these linear systems, we first have to choose a splitting scheme for $K$. We discuss the choice of splitting $K$ in this section.

Assume we split the matrix K into two components Ko and UVT. Note that the

matrices K and K0 can be replaced by the matrices B and A, respectively, used in

chapter 3. The notation $K$ and $K_0$ (popular in the computational vision literature) is used instead of $B$ and $A$ (widely used in the field of capacitance matrix methods)

































Figure 4.1. The flow chart of the generalized capacitance matrix algorithm.

for the rest of this chapter. Using the generalized capacitance matrix algorithm to

solve the linear system Kx = b with the above splitting for K, we need to solve several

linear systems of the form $K_0x = b$, as shown in equations 3.2 and 3.3 respectively. In particular, equation 3.3 must be solved $p$ times, where $p$ is the rank of $K - K_0$. Therefore, it is very crucial to choose a well-structured $K_0$ so as to efficiently compute the solution in the capacitance matrix method.

In general, there are three criteria that dictate the choice of $K_0$. The first is that the difference between $K_0$ and $K$ must be small in the sense that rank$(K - K_0)$ is small. Secondly, $K_0$ must possess some nice structure so as to admit a fast numerical solution of the associated linear system. Lastly, $K_0$ should be translation invariant;








this property has the advantage of saving computational cost in solving equation 3.3 for each $u_i$, $1 \le i \le p$, since there are only one or two nonzero components in each $u_i$. Therefore, we need to compute the solution $h_0$ of a linear system consisting of the matrix $K_0$ and an RHS vector which is the projection of the unit vector onto the range of $K_0$. Because $h_0$ is translation invariant, we can obtain the solution $\eta_i$ in equation 3.3 without having to solve any of the linear systems associated with each $\eta_i$.

For ease of exposition, we consider the splitting scheme for the surface recon-

struction problem in the following. This splitting scheme can be extended to other early vision problems.

The stabilizer discussed in the previous section imposes smoothness constraints on the surface reconstruction problem leading to a $C^0$ surface, i.e., the membrane spline, a $C^1$ surface, namely the thin-plate spline, or a combination thereof called the thin-plate-membrane spline. As mentioned earlier, the numerical solution to the

surface reconstruction problem involves solving a linear system Kx = b. The matrix

K is large and sparse but not tightly banded. This is primarily due to the fact

that a vector has been used to represent a two-dimensional array, which leads to representing neighboring interactions in two dimensions by interactions between very distant elements in one dimension. The matrix $K$ is a sum of two matrices $K_s$ and $K_d$, with $K_s$ containing the membrane or thin-plate molecules [81] defined on the interior and boundary of $\Omega$. The $K_d$ matrix is a diagonal matrix containing non-zero weights.








By making use of the structure of the matrix $K$, we can split $K$ into two components $K_0$ and $UV^T$, with $K_0 \in \mathbb{R}^{N \times N}$ and $U, V \in \mathbb{R}^{N \times p}$, such that $K_0$ is close to $K$, i.e., the rank of $UV^T$ is small, and there exists a fast algorithm to solve a linear

system with the matrix K0. Then, the solution to the linear system Kx = b can

be obtained by using the generalized capacitance matrix algorithm which involves

solving the well-structured linear systems with the matrix K0 and the capacitance

matrix linear system.

The choice of $K_0$ depends on the data compatibility matrix $K_d$ and the smoothness constraint matrix $K_s$. We have different choices of $K_0$ for the sparse (scattered) and dense data cases. They are given in the following sections.

4.2.1 Dense Data Surface Reconstruction

We will first describe our choice of the matrix K0 and the equivalent Lyapunov

matrix equation for the dense data surface reconstruction with a membrane spline.

Then we will present the matrix K0 for the thin-plate-membrane spline surface re-

construction based on the construction scheme for the membrane problem. The thin-

plate spline is a special case of thin-plate-membrane spline and will not be discussed

separately. In addition, the selection of the matrices U and V will be discussed for

both cases.

Membrane spline for the dense data case

The data compatibility matrix Kd for the dense data case with uniform weight-

ing is an identity matrix. Focusing on the surface reconstruction problem in a rectan-

gular domain, the matrix K0 is chosen to contain both the membrane computational







molecules and the data constraint molecules [81] all over the domain, which are shown in figure 4.2. Using this choice of $K_0$, we can transform the original linear system $Kx = b$ to an equivalent Lyapunov matrix equation.

Figure 4.2. (a) Data constraint molecule and (b) membrane computational molecules periodically imposed on the domain $\Omega$.

With the above choice, the matrix $K_0$ can be written as follows:

$$K_0 = \lambda K_{mem} + I, \qquad (4.3)$$

where

$$K_{mem} = \begin{bmatrix} D & -I & & & -I \\ -I & \ddots & \ddots & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & -I \\ -I & & & -I & D \end{bmatrix}, \qquad (4.4)$$

$$D = \begin{bmatrix} 4 & -1 & & & -1 \\ -1 & 4 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 4 & -1 \\ -1 & & & -1 & 4 \end{bmatrix} \in \mathbb{R}^{n \times n}. \qquad (4.5)$$


The matrix $K_{mem}$ can be regarded as the result obtained from the 5-point approximation to the Laplacian operator on a doubly periodic imbedded rectangular do-

main. From a computational molecule [81] point of view, only the membrane molecule

is used at every node of the doubly periodic rectangular domain. Therefore the molec-

ular inhibitions due to depth discontinuities (if any) and imposed doubly periodic

smoothness are included in the matrix UVT. Given the matrices K and Ko, we can

show that the choice of the matrices $U$ and $V$ such that $UV^T = K - K_0$, where both matrices $U$ and $V$ are of full column rank, is not unique. Our selection method for the matrices $U$ $(= [u_1, \ldots, u_p])$ and $V$ $(= [v_1, \ldots, v_p])$ is as follows. Let each column vector $u_i$, $1 \le i \le p$, contain only two nonzero entries, $+1$ and $-1$, located at the two neighboring sites separated by the $i$-th discontinuity. Then select the column vector $v_i$ to be $-\lambda u_i$, $1 \le i \le p$. Thus the matrices $U$ and $V$ are of full column rank $p$, where $p$ is the number of discontinuities. Our selection scheme has the advantage that the column vectors of $U$ and $V$ are very sparse (only two nonzero entries in each column), a structure that is very conducive to the design of an efficient numerical algorithm. We will discuss more of this in section 4.3.









We can see that this particular choice of $K_0$ makes it circulant Toeplitz and thus translation invariant. In addition, it can be decomposed as the sum of two tensor products of matrices $A$ and $I$, i.e.,

$$K_0 = A \otimes I + I \otimes A, \qquad (4.6)$$

with

$$A = \begin{bmatrix} \tfrac{1}{2} + 2\lambda & -\lambda & & & -\lambda \\ -\lambda & \tfrac{1}{2} + 2\lambda & -\lambda & & \\ & \ddots & \ddots & \ddots & \\ & & -\lambda & \tfrac{1}{2} + 2\lambda & -\lambda \\ -\lambda & & & -\lambda & \tfrac{1}{2} + 2\lambda \end{bmatrix} \in \mathbb{R}^{n \times n},$$

where $\otimes$ is the Kronecker product. By using this special structure of the matrix $K_0$, we can rewrite any linear system of the form $K_0z = f$ as the following Lyapunov matrix equation

$$AZ + ZA = F, \qquad (4.7)$$

where $Z$ and $F$ are $n \times n$ matrices corresponding to their concatenated $n^2 \times 1$ vectors $z$ and $f$ respectively. It should be noted that $n^2 = N$, and that the matrix $A$ in equation 4.7 is circulant Toeplitz and symmetric positive-definite. We can use the ADI (Alternating Direction Implicit) method to solve this Lyapunov matrix equation in $O(N)$ operations. We will prove this $O(N)$ computational complexity of the ADI method in section 4.3.
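A small numerical check of this structure is given below: $K_{mem}$ is assembled as the Kronecker sum of the periodic 1-D Laplacian $C$ with itself, $K_0 = \lambda K_{mem} + I$ is verified to equal $A \otimes I + I \otimes A$ with $A = \lambda C + \frac{1}{2}I$, and a random $K_0z = f$ is checked against the Lyapunov form (the sizes and the value of $\lambda$ are illustrative):

    import numpy as np

    n, lam = 8, 2.0
    Cmat = 2 * np.eye(n) - np.roll(np.eye(n), 1, 0) - np.roll(np.eye(n), -1, 0)
    I = np.eye(n)
    K_mem = np.kron(Cmat, I) + np.kron(I, Cmat)    # equations 4.4 and 4.5
    K0 = lam * K_mem + np.eye(n * n)               # equation 4.3
    A = lam * Cmat + 0.5 * I                       # so A(x)I + I(x)A = K0
    assert np.allclose(np.kron(A, I) + np.kron(I, A), K0)

    rng = np.random.default_rng(3)
    Z = rng.standard_normal((n, n))
    F = A @ Z + Z @ A                              # Lyapunov form, equation 4.7
    # column-major vec identity: K0 vec(Z) = vec(A Z + Z A) for symmetric A
    assert np.allclose(K0 @ Z.flatten(order="F"), F.flatten(order="F"))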








Thin-plate-membrane spline for the dense data case

The construction of the matrix K0 for the thin-plate-membrane spline prob-

lem is based on the above construction for the membrane spline. Analogous to the

matrix $K_{mem}$, which is a well-structured matrix close to the smoothness constraint matrix $K_s$ constructed for the membrane problem, a matrix $K_{tp}$ is used for the smoothness constraint matrix discretized from the thin-plate smoothness energy.

The differentiation of the membrane smoothness energy function corresponds to a Laplacian operator $\Delta$, while the corresponding operator for the thin-plate smoothness energy is a biharmonic operator $\Delta^2$. Thus, we can extend the selection scheme for the membrane problem to the thin-plate problem by choosing the matrix $K_{tp}$ as

$$K_{tp} = K_{mem}^2. \qquad (4.8)$$


The smoothness matrix for the thin-plate-membrane spline is a convex combination of the membrane and thin-plate smoothness matrices. Let's define the matrix $K_{tpm}$ as a convex combination of $K_{mem}$ and $K_{tp}$, i.e.,

$$K_{tpm} = \alpha K_{tp} + (1 - \alpha)K_{mem}, \qquad (4.9)$$

where $0 \le \alpha \le 1$. Similar to that of the membrane spline, the matrix $K_0$ is chosen to be $\lambda K_{tpm} + I$, which can be written as

$$K_0 = \lambda\alpha K_{mem}^2 + \lambda(1-\alpha)K_{mem} + I = (aK_{mem} + I)(bK_{mem} + I), \qquad (4.10)$$

where

$$a, b = \frac{\lambda(1-\alpha) \pm \sqrt{\lambda^2(1-\alpha)^2 - 4\lambda\alpha}}{2}. \qquad (4.11)$$

When $\lambda \ge \frac{4\alpha}{(1-\alpha)^2}$, $a$ and $b$ are always real and positive. Then the matrix $K_0$ can be factored into two real matrices, $(aK_{mem} + I)$ and $(bK_{mem} + I)$, which are of the same form as the matrix $K_0$ for the membrane dense data problem. Thus, the solution to a linear system with the matrix $K_0$ can be obtained by solving two Lyapunov matrix equations, which can be solved using the ADI method in $O(N)$ time. However, when the above condition is violated, $a$ and $b$ as well as the factoring matrices in equation 4.10 become complex. It is more complicated to use the ADI method to solve the complex Lyapunov matrix equations and analyze the computational complexity. Therefore, we limit our discussion to the case when the aforementioned condition is satisfied for the dense data surface reconstruction problem.

Using the above choice for the matrix K0, we can see that the matrices U and

V encode the inhibition of membrane and thin-plate computational molecules due to

the presence of depth and orientation discontinuities (if any) and the imposed doubly

periodic smoothness over the imbedded rectangular domain.

4.2.2 Sparse Data Surface Reconstruction

The splitting of matrix K for the sparse data surface reconstruction problem is

discussed in this section. The only difference between the splitting schemes for the








sparse data problem and the dense data problem is the treatment of the data compatibility matrix $K_d$. For the dense data case discussed above, the matrix $K_d$ is an identity matrix and is arranged into $K_0$, which can be rewritten as the sum of two Kronecker products of well-structured matrices. For the sparse data case, the nonzero elements of the matrix $K_d$ are sparsely distributed along the diagonal; thus every nonzero entry of the sparse diagonal matrix $K_d$ is decomposed into the outer product of $\bar{u}_i$ and $\bar{v}_i$ and included into $U$ and $V$ as follows:

$$U = [\,\bar{u}_1 \cdots \bar{u}_k \;\; u'_1 \cdots u'_{k'}\,], \qquad V = [\,\lambda\bar{v}_1 \cdots \lambda\bar{v}_k \;\; v'_1 \cdots v'_{k'}\,],$$

where the vectors $u'_1, \ldots, u'_{k'}$, $v'_1, \ldots, v'_{k'}$ account for the molecule inhibition due to depth and orientation discontinuities (if any) and the imposed doubly periodic smoothness over the embedded rectangular domain. With the above choice of $U$ and $V$, each column of $U$ and $V$ is still very sparse, containing only one or two nonzero entries. Our numerical algorithm will take advantage of this sparsity property.

Similar to the dense data case, the matrix $K_0$ can be decomposed as the sum of two Kronecker products of $A$ and $I$. Therefore, a linear system of the form $K_0z = f$ can be written as the Lyapunov matrix equation $AZ + ZA = F$.

4.3 Numerical Solution

In this section, we discuss the convergence of the ADI method applied to the Lyapunov matrix equation and show that the computational cost for solving this equation is $O(N)$, by proving that the ADI method can converge in a constant number of iterations independent of $N$. In addition, we present a modified biconjugate gradient method that is used in solving the capacitance linear system.

4.3.1 ADI Method for the Lyapunov Equation

The ADI method for solving a Lyapunov matrix equation is described in [48]. For any Lyapunov matrix equation of the form $AX + XA = B$, the ADI method involves the following steps in each iteration $j = 1, 2, \ldots, J$:

$$(A + p_jI)X_{j-\frac{1}{2}} = B - X_{j-1}(A - p_jI), \qquad (4.12)$$
$$X_j(A + p_jI) = B - (A - p_jI)X_{j-\frac{1}{2}}. \qquad (4.13)$$

For our problem, note that the matrix $A$ is circulant Toeplitz and symmetric positive semidefinite, as discussed in section 4.2. The special structure of the matrix $A$ helps in advancing the ADI iterations very efficiently in each step of the technique. On the left-hand side (LHS) of equations 4.12 and 4.13 respectively, the matrix $A + p_jI$ is very close to being tridiagonal, except for the nonzero entries at the corners, namely the entries $(1,n)$ and $(n,1)$. These nonzero corner entries are caused by the doubly periodic boundary conditions discussed in the last section. In addition, $A + p_jI$ is SPD for any positive parameter $p_j$. Therefore, we can compute the LU decomposition of this matrix and then use forward and back substitution to compute the updated solution. Due to the special structure of the matrix $A + p_jI$, the solution update requires only $O(N)$ time per iteration.
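The sketch below implements the sweep of equations 4.12 and 4.13, using dense solves in place of the $O(n)$ banded LU factorization that the near-tridiagonal structure of $A + p_jI$ permits; the diagonal SPD test matrix and the fixed parameter list are illustrative:

    import numpy as np

    def adi_lyapunov(A, B, params):
        # ADI iterations for A X + X A = B (equations 4.12 and 4.13).
        n = A.shape[0]
        I = np.eye(n)
        X = np.zeros((n, n))
        for p in params:
            # half step: (A + p I) X_half = B - X (A - p I)
            X = np.linalg.solve(A + p * I, B - X @ (A - p * I))
            # full step: X (A + p I) = B - (A - p I) X_half (solve transposed)
            X = np.linalg.solve((A + p * I).T, (B - (A - p * I) @ X).T).T
        return X

    rng = np.random.default_rng(4)
    n = 20
    A = np.diag(2.0 + rng.random(n))       # SPD stand-in for the circulant A
    X_true = rng.standard_normal((n, n))
    B_mat = A @ X_true + X_true @ A
    X = adi_lyapunov(A, B_mat, params=[2.5] * 10)
    assert np.allclose(X, X_true)

With the spectrum of this test matrix confined to $[2,3]$, the single parameter $p_j = 2.5$ already reduces each error component by a factor of roughly $(0.5/4.5)^2$ per iteration, illustrating how tightly the parameter set and the spectral interval interact.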








We now examine the convergence of the ADI method for the Lyapunov matrix equation in our algorithm. Let the initial $X_0 = 0$ (zero matrix) and $\Delta X_t = X_t - X^*$, where $X^*$ is the true solution of the Lyapunov matrix equation, and let $\lambda_0, \lambda_1, \ldots, \lambda_{n-1}$ and $q_0, q_1, \ldots, q_{n-1}$ be the eigenvalues and the eigenvectors of $A$ respectively. For this symmetric matrix $A$, we have

$$A = QDQ^T, \qquad (4.14)$$

where $Q = [q_0\ q_1\ \cdots\ q_{n-1}]$, $D = \mathrm{diag}[\lambda_0, \lambda_1, \ldots, \lambda_{n-1}]$, and $\lambda_k = 4\sin^2(\frac{\lceil k/2\rceil\pi}{n})$ for $k = 0, \ldots, n-1$. We express the eigenvalue $\lambda_k$ in this form so as to have nondecreasing eigenvalues with respect to the indices. The range of the eigenvalues is between 0 and 4, and the zero eigenvalue (when $k = 0$) is simple. The orthogonal matrix $Q$ contains the discrete sine or cosine function in each column, with increasing frequency corresponding to increasing eigenvalue, i.e., as the index increases. For example, the eigenvector corresponding to the simple zero eigenvalue is a flat function whose frequency is 0.

By taking the tensor products of the eigenvectors, $q_k \otimes q_l$, $k, l = 0, \ldots, n-1$, we can generate an orthogonal basis for $\mathbb{R}^{n \times n}$ and express the true solution $X^*$ in this basis as follows:

$$X^* = \sum_{k=0}^{n-1}\sum_{l=0}^{n-1}a_{kl}(q_kq_l^T). \qquad (4.15)$$

Subtracting each of the equations 4.12 and 4.13 from the Lyapunov equation $AX + XA = B$, and expressing $X$ in the above tensor product basis, we have the following result for the error after $t$ iterations:

$$\Delta X_t = \sum_{k=0}^{n-1}\sum_{l=0}^{n-1}\left[\prod_{j=1}^{t}\frac{(\lambda_k - p_j)(\lambda_l - p_j)}{(\lambda_k + p_j)(\lambda_l + p_j)}\right]a_{kl}(q_kq_l^T). \qquad (4.16)$$

The function inside the bracket is the error reduction factor in the basis $q_k \otimes q_l$. Notice that the Lyapunov matrix equation has an infinite number of solutions due to the singularity of the matrix $A$. The error reduction in each step is always less than 1 for eigenvectors with positive eigenvalues and positive parameters $p_j$, while it is always 1 for the basis $q_0 \otimes q_0$ (which is evident from equation 4.16 after substituting $k = l = 0$ and $\lambda_0 = 0$). A solution to the Lyapunov matrix equation can be obtained without reducing this factor at the component $k = l = 0$.

The classical ADI minimax parameter problem is to find the ADI parameters $p_j$ that minimize the maximum of the error reduction function excluding the zero eigenvalue, i.e., to minimize the function

$$\epsilon_t = \max_{\forall(k,l)\setminus(0,0)}\ \prod_{j=1}^{t}\left|\frac{(\lambda_k - p_j)(\lambda_l - p_j)}{(\lambda_k + p_j)(\lambda_l + p_j)}\right|. \qquad (4.17)$$

Applying the result from the classical ADI minimax analysis [48] leads to $O(\log N)$ iterations for convergence [85]. This in turn yields an overall computational complexity of $O(N\log N)$ for the ADI method, since $O(N)$ time is required per iteration in

the ADI method. An $O(N\log N)$ computational complexity is unacceptable for our purposes (recall that we seek an $O(N)$ solution). Thus, in order to design an $O(N)$ algorithm, it is essential to use additional knowledge about the problem, e.g., the spectral characteristics of the differential operator in the PDE. The classical ADI convergence analysis does not make use of any such knowledge. In this thesis, we reformulate the ADI convergence analysis by incorporating the spectral characteristics of the Laplacian operator. This is facilitated by making the following assumption.


Assumption 1 Let the projection of $B$, the RHS of the Lyapunov matrix equation, on the basis $q_k \otimes q_l$ be $b_{kl}$. Assume there exist a constant $a \in \mathbb{R}^+$ and an $\omega_m > 0$ such that, $\forall(k,l)\setminus(0,0)$, $|b_{kl}| \le \frac{\omega_m}{((\frac{k}{n})^2 + (\frac{l}{n})^2)^m}$ for some $m > -0.5$, and $\sum_k\sum_l\frac{b_{kl}^2}{(\lambda_k + \lambda_l)^2} \ge \frac{a}{16}\,\omega_m^2\,n^{4m+4}$.

For our Lyapunov matrix equation $AX + XA = B$, the relationship between the solution $X$ and the data $B$ in the eigenbasis of the matrix $K_0$ is $a_{kl} = \frac{b_{kl}}{\lambda_k + \lambda_l}$. Using the relation $\frac{2}{\pi}x \le \sin x \le x$ for $x \in [0, \pi/2]$, we have $c_1k^2 \le n^2\lambda_k \le c_2k^2$ with $c_1 = 4$ and $c_2 = 4\pi^2$. Therefore, the bound on $|b_{kl}|$ amounts to requiring that the solution satisfy the constraint $|a_{kl}| \le \frac{\tilde{\omega}_m n^2}{((\frac{k}{n})^2 + (\frac{l}{n})^2)^{m+1}}$ for some $m > -0.5$.
-(k2+J2)M+I
This assumption can be given the following spectral interpretation. Smaller values of $k$ or $l$ correspond to lower frequencies along the vertical or horizontal direction, and vice versa. Consequently, the assumption means that the overall frequency content of the input data, or the right hand side, can be bounded by an increasing function of $\|\omega\|_2 = (k^2 + l^2)^{0.5}$ of degree less than 1. This can be easily satisfied when the overall frequency distribution of the input data is decreasing or uniform. This is true in reality since the low frequency content in signals is normally much higher than the high frequency content.








We first define two matrices $X'$ and $\Delta X'_t$ for use in the convergence analysis of the ADI method. Let $X'$ and $\Delta X'_t$ be the projections of the true solution $X^*$ and the error in the computed solution (at iteration $t$) $\Delta X_t$ on the range of the matrix $K_0$ $(= A \otimes I + I \otimes A)$, respectively. Note that $X'$ and $X^*$ are almost the same, except that $X'$ has no projection on the basis $q_0 \otimes q_0$. In the ADI literature [48], the convergence criterion is normally defined as follows: the ADI method for the Lyapunov matrix equation is said to have converged when the relative Frobenius norm of the error is reduced to within a specified error tolerance $\epsilon$ lying between 0 and 1, i.e., $\|\Delta X'_t\|_F \le \epsilon\|X'\|_F$. This means we don't need to reduce the error component in the basis corresponding to the zero eigenvalue, i.e., in the null space of $K_0$. The problem is to find the number of iterations required in the ADI method and the associated ADI parameter set to achieve $\|\Delta X'_t\|_F \le \epsilon\|X'\|_F$.

We make use of this convergence criterion and Assumption 1 to reformulate the convergence theory of the ADI method as follows. The Frobenius norm $\|\Delta X'_t\|_F$ can be written as

$$\|\Delta X'_t\|_F^2 = \sum_k\sum_l a_{kl}^2\prod_{j=1}^{t}\left(\frac{\lambda_k - p_j}{\lambda_k + p_j}\right)^2\left(\frac{\lambda_l - p_j}{\lambda_l + p_j}\right)^2, \qquad (4.18)$$

where the double summation takes every $k$ and $l$ between 0 and $n-1$ except $k = l = 0$ (simultaneously). As a consequence of Assumption 1, the ADI method for our Lyapunov matrix equation can converge in a constant number of iterations. We state and prove this convergence result in the following theorem.

Theorem 4 Given Assumption 1 and an error tolerance $\epsilon$ between 0 and 1, there exists a constant integer $t$ such that $\forall n$, $\|\Delta X'_t\|_F \le \epsilon\|X'\|_F$.






Proof: From the discussion of Assumption 1, we have the following inequality for $\|X'\|_F$:

$$\|X'\|_F^2 = \sum_k\sum_l\left(\frac{b_{kl}}{\lambda_k + \lambda_l}\right)^2 \ge \frac{a}{16}\,\omega_m^2\,n^{4m+4}. \qquad (4.19)$$

Applying Assumption 1 to equation 4.18, we have the following inequality:

$$\|\Delta X'_t\|_F^2 \le \sum_k\sum_l\frac{\omega_m^2}{\left(\left(\frac{k}{n}\right)^2 + \left(\frac{l}{n}\right)^2\right)^{2m}(\lambda_k + \lambda_l)^{2-q}}\;R(p_1, p_2, \ldots, p_t), \qquad (4.20)$$

where $q$ is a positive number between 0 and $2m+1$ and

$$R(p_1, p_2, \ldots, p_t) = \frac{1}{(\lambda_k + \lambda_l)^q}\prod_{j=1}^{t}\left(\frac{\lambda_k - p_j}{\lambda_k + p_j}\right)^2\left(\frac{\lambda_l - p_j}{\lambda_l + p_j}\right)^2.$$

Using the fact that $\lambda_k = 4\sin^2(\frac{\lceil k/2\rceil\pi}{n}) \ge 4(\frac{k}{n})^2$ for $k > 0$ and substituting into the above inequality, we have

$$\|\Delta X'_t\|_F^2 \le \frac{\omega_m^2}{4^{2-q}}\sum_k\sum_l\frac{1}{\left(\left(\frac{k}{n}\right)^2 + \left(\frac{l}{n}\right)^2\right)^{2m+2-q}}\;R(p_1, p_2, \ldots, p_t). \qquad (4.21)$$

Then we have

$$\|\Delta X'_t\|_F^2 \le \frac{\omega_m^2}{4^{2-q}}\left\{\sum_k\sum_l\frac{1}{\left(\left(\frac{k}{n}\right)^2 + \left(\frac{l}{n}\right)^2\right)^{2m+2-q}}\right\}\max_{\forall(k,l)\setminus(0,0)}R(p_1, p_2, \ldots, p_t). \qquad (4.22)$$

Note that the double summation on the RHS of equation 4.22 takes every integer $k$ and $l$ between 0 and $n-1$ except $k = l = 0$ simultaneously. Let the set of all the pairs $(k,l)$ in this double summation be denoted by $\mathcal{A}$. By using a split of the set $\mathcal{A}$, we have

$$\sum_k\sum_l f(k,l) = f(0,1) + f(1,0) + f(1,1) + \sum_{k=2}^{n-1}f(k,0) + \sum_{l=2}^{n-1}f(0,l) + \sum_{(k,l)\in\mathcal{B}}f(k,l), \qquad (4.23)$$

where $f(k,l) = \left(\left(\frac{k}{n}\right)^2 + \left(\frac{l}{n}\right)^2\right)^{-(2m+2-q)}$ and $\mathcal{B} = \{(k,l)\setminus(1,1)\mid k,l = 1, \ldots, n-1\}$. By using the fact that $f(k,l)$ is a strictly decreasing function of $k$ and $l$, we can bound the summation terms in equation 4.23 by corresponding terms in integral form as follows:

$$\sum_k\sum_l f(k,l) \le f(0,1) + f(1,0) + f(1,1) + n\int_{\frac{1}{n}}^{1}f(nx,0)\,dx + n\int_{\frac{1}{n}}^{1}f(0,ny)\,dy + n^2\iint_S\frac{dx\,dy}{(x^2 + y^2)^{2m+2-q}}, \qquad (4.24)$$

where the integration domain is $S' = \{(x,y)\mid(x,y)\in[0,1]\times[0,1]\setminus[0,\frac{1}{n})\times[0,\frac{1}{n})\}$. By transforming from Cartesian to polar coordinates, the double integral in equation 4.24 can be bounded by the following integral:

$$\iint_{S'}\frac{dx\,dy}{(x^2 + y^2)^{2m+2-q}} \le \int_0^{\frac{\pi}{2}}\int_{\frac{1}{n}}^{\sqrt{2}}\frac{r\,dr\,d\theta}{r^{4m+4-2q}} \le \frac{\pi\,n^{4m+2-2q}}{4(2m+1-q)}. \qquad (4.25)$$

Substituting equation 4.25 into equation 4.24 and calculating the integrals in equation 4.24, we obtain the following inequality:

$$\sum_k\sum_l f(k,l) \le \theta\,n^{4m+4-2q}, \qquad (4.26)$$

where $\theta = 2 + 2^{-(2m+2-q)} + \frac{2}{4m+3-2q} + \frac{\pi}{4(2m+1-q)}$ is a constant.

Using equations 4.19, 4.22 and 4.26, the convergence criterion $\|\Delta X'_t\|_F \le \epsilon\|X'\|_F$ can be met by the following derived condition:

$$\max_{\forall(k,l)\setminus(0,0)}R(p_1, p_2, \ldots, p_t) \le \frac{4^{2-q}\epsilon^2a}{16\,\theta}\,n^{2q} = M\,n^{2q}, \qquad (4.27)$$

where $M$ is the constant defined as $M := \frac{4^{2-q}\epsilon^2a}{16\,\theta}$. Furthermore, we can bound the function $\max_{(k,l)\setminus(0,0)}R(p_1, p_2, \ldots, p_t)$ by a function of $\lambda_k$ alone as follows:

$$\max_{\forall(k,l)\setminus(0,0)}R(p_1, \ldots, p_t) = \max_{\forall(k,l)\setminus(0,0)}\frac{1}{(\lambda_k + \lambda_l)^q}\prod_{j=1}^{t}\left(\frac{\lambda_k - p_j}{\lambda_k + p_j}\right)^2\left(\frac{\lambda_l - p_j}{\lambda_l + p_j}\right)^2 \le \max_{k\neq 0}\frac{1}{\lambda_k^q}\prod_{j=1}^{t}\left(\frac{\lambda_k - p_j}{\lambda_k + p_j}\right)^2.$$

The maximum, taken over the values of this function at the nonzero eigenvalues, can be bounded by the maximum of the same function on an interval containing all the nonzero eigenvalues. Then convergence can be reached by using the following modified requirement:

$$\max_{x\in[4\sin^2(\frac{\pi}{n}),\,4]}\frac{1}{x^q}\prod_{j=1}^{t}\left(\frac{x - p_j}{x + p_j}\right)^2 \le M\,n^{2q}. \qquad (4.28)$$

For any positive $p_j$, the function $\frac{1}{x^q}\prod_{j=1}^{t}(\frac{x - p_j}{x + p_j})^2$ is always less than the right-hand side when $x > \frac{1}{M^{1/q}n^2}$. Therefore, in equation 4.28, the interval $(4\sin^2(\frac{\pi}{n}), 4)$ can be replaced by $S = (4\sin^2(\frac{\pi}{n}), \min(4, \frac{1}{M^{1/q}n^2}))$. The convergence requirement in equation 4.28 can be made less stringent by using the inequality

$$\max_{x\in S}\frac{1}{x^q}\prod_{j=1}^{t}\left(\frac{x - p_j}{x + p_j}\right)^2 \le \left(\max_{x\in S}\prod_{j=1}^{t}\left(\frac{x - p_j}{x + p_j}\right)^2\right)\left(\max_{x\in S}\frac{1}{x^q}\right), \qquad (4.29)$$

leading to the following modified requirement which, when satisfied, automatically makes the inequality 4.28 true:

$$\max_{x\in S}\prod_{j=1}^{t}\left(\frac{x - p_j}{x + p_j}\right)^2 \le (M\,n^{2q})\left(\frac{1}{\max_{x\in S}x^{-q}}\right) = (M\,n^{2q})\left(4\sin^2\left(\frac{\pi}{n}\right)\right)^q =: M'. \qquad (4.30)$$


This bound $M'$ approaches a constant when $n \to \infty$. Thus, the problem becomes one of finding the number of iterations $t$ and the parameters $p_j$ such that the requirement in 4.30 is satisfied. This requirement is of the same form as the classical ADI minimax problem, which was solved by Jordan as quoted in [87]. We can use the result in [87] to determine the number of iterations required for convergence. The number of iterations needed to meet this requirement is

$$J = \left\lceil\frac{\ln M'}{2\ln k'}\right\rceil, \qquad (4.31)$$

$$k' = \frac{\sqrt{\hat{v}} - 1}{\sqrt{\hat{v}} + 1}, \qquad (4.32)$$

$$\hat{v} = \frac{1}{2}\left(\sqrt{v} + \frac{1}{\sqrt{v}}\right), \qquad (4.33)$$

where $\lceil z\rceil$ denotes the smallest integer larger than $z$ and $v$ is the expansion factor of an interval $(a,b)$, defined as $\frac{b}{a}$. The expansion factor $v$ of the interval of interest $S$ is fixed when $n$ approaches infinity, i.e.,

$$\lim_{n\to\infty}v = \frac{1}{4\pi^2M^{1/q}}. \qquad (4.34)$$

In the classical ADI minimax formulation, $v$ is the expansion factor of the eigenvalue interval being considered for $A$, which is shown to be of the order of $n^2$ [85]. Since we use a different formulation for convergence and impose the spectral assumption on the data, the resulting $v$ becomes $\min(4, \frac{1}{M^{1/q}n^2})/(4\sin^2(\frac{\pi}{n}))$. When $n \to \infty$, the ratio $v$ approaches the constant given in equation 4.34 and $M'$ also approaches a constant, as mentioned above. Therefore, we conclude that $\lim_{n\to\infty}J$ is a constant, i.e., there is a constant $t$ such that $\|\Delta X'_t\|_F \le \epsilon\|X'\|_F$ is satisfied for all $n$. ∎

In [87], an optimal set of ADI parameters for the general ADI iterations was

given. We may use these parameters to solve the minimax problem in equation 4.30.

Although this set of ADI parameters is not the optimal set to meet our convergence

requirement, it provides us with a parameter set which can achieve the convergence

criterion in a constant number of iterations. Therefore, we use the result in [87] to

select a suboptimal ADI parameter set. Solving for the optimal ADI parameters in

our problem requires complicated analysis using transformation theory [87] and is

currently under investigation.








4.3.2 Modified BCG Method for the Capacitance Matrix Equation

In this section we present an algorithm to solve the capacitance matrix equa-

tion defined earlier. The conjugate gradient (CG) method is a very powerful iterative scheme for solving symmetric positive definite linear systems. A natural generalization of the CG-type method to solve a general nonsymmetric linear system is called the biconjugate gradient (BCG) method [46]. The linear system with the capacitance matrix is a nonsymmetric and indefinite system. The size of the capacitance matrix is determined by the rank of the matrix $UV^T$, which contains the difference between $K$ and $K_0$. With the $K_0$ proposed in section 4.2, the size of the associated capacitance matrix is usually $O(n)$. The BCG method requires the computation of a matrix-vector product in each iteration. Since the capacitance matrix $C$ is dense, this matrix-vector product takes $O(N)$ operations. In this section, we present a modified biconjugate gradient method to solve this linear system by applying the wavelet transform technique to approximate the multiplication and reduce the computational cost to $O(n)$. Since the BCG converges in $O(n)$ iterations, the computational complexity of the modified BCG method is $O(n^2) = O(N)$.

The BCG algorithm [46] for the nonsymmetric linear system $C\beta = f$ is as follows.

0. Choose $\beta_0$, and set $q_0 = r_0 = f - C\beta_0$; choose $\tilde{r}_0 \neq 0$, and set $\tilde{q}_0 = \tilde{r}_0$;

1. Compute $\alpha_k^N = \tilde{r}_{k-1}^T r_{k-1}$, $\alpha_k^D = \tilde{q}_{k-1}^T C q_{k-1}$, and $\alpha_k = \alpha_k^N / \alpha_k^D$;

2. Update $\beta_k = \beta_{k-1} + \alpha_k q_{k-1}$;

3. Set $r_k = r_{k-1} - \alpha_k C q_{k-1}$, and $\tilde{r}_k = \tilde{r}_{k-1} - \alpha_k C^T \tilde{q}_{k-1}$;

4. Compute $\rho_k^N = \tilde{r}_k^T r_k$, and $\rho_k = \rho_k^N / \alpha_k^N$;

5. Set $q_k = r_k + \rho_k q_{k-1}$, and $\tilde{q}_k = \tilde{r}_k + \rho_k \tilde{q}_{k-1}$;

6. If $r_k \approx 0$ or $\tilde{r}_k \approx 0$, stop; else go to 1.

Like the CG method, the operations and storage required per step of the BCG are constant. Its convergence behavior is similar to that of the CG method, except that the BCG algorithm is susceptible to breakdowns and numerical instabilities. To be more specific, division by 0 (or a number very close to 0) may occur in computing $\alpha_k$ and $\rho_k$. These two different breakdowns may be avoided by using the look-ahead Lanczos algorithm or the quasi-minimal residual (QMR) algorithm [19].
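A minimal NumPy transcription of the BCG recursion above is given below for concreteness; the dense matrix `C`, the right-hand side `f`, and the stopping tolerance are illustrative placeholders, and the wavelet-approximated products described next would replace the two dense multiplications per iteration. No breakdown protection (look-ahead or QMR) is included.

```python
import numpy as np

def bcg(C, f, tol=1e-8, max_iter=1000):
    """Biconjugate gradient for a general nonsymmetric system C beta = f."""
    beta = np.zeros_like(f)
    r = f - C @ beta          # residual
    rt = r.copy()             # shadow residual (any nonzero vector may be chosen)
    q, qt = r.copy(), rt.copy()
    alpha_N = rt @ r
    for _ in range(max_iter):
        Cq, Ctqt = C @ q, C.T @ qt   # the two matrix-vector products per iteration
        alpha = alpha_N / (qt @ Cq)
        beta += alpha * q
        r -= alpha * Cq
        rt -= alpha * Ctqt
        if np.linalg.norm(r) < tol or np.linalg.norm(rt) < tol:
            break
        rho_N = rt @ r
        rho = rho_N / alpha_N
        q = r + rho * q              # new search direction
        qt = rt + rho * qt           # new shadow direction
        alpha_N = rho_N
    return beta
```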

In the above algorithm, two matrix-vector multiplications, $Cq_{k-1}$ and $C^T\tilde{q}_{k-1}$, are required in each iteration. Since the matrix $C$ is dense, direct multiplication takes $O(N)$ operations, which is not acceptable in our pursuit of an $O(N)$ algorithm for the Poisson equation. Thus, we propose to carry out the multiplication in a wavelet basis to reduce the computational cost [1, 7]. With an appropriate wavelet basis [7], the transformed matrix can be approximated to high accuracy by a sparse matrix. After this approximation, the multiplication of the sparse matrix and the transformed vector can be reduced to $O(n)$ or $O(n \log n)$ operations [7]. Consequently, the modified BCG method is computationally efficient.

The wavelet transform of a matrix $D$ and a vector $q$ can be written as follows:

\[
\tilde{D} = \Phi D \Phi^{T}, \qquad \tilde{q} = \Phi q,
\]

where $\Phi$ is the wavelet transform matrix with each row corresponding to a wavelet basis vector. If we use an orthogonal wavelet transform, then $Dq = \Phi^{T}(\tilde{D}\tilde{q})$. Therefore, we compute the matrix-vector multiplication in the wavelet domain and then transform back to the original domain by using the wavelet reconstruction (inverse wavelet transform) to obtain the product $Dq$.
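The following sketch makes this concrete, with an orthonormal Haar transform standing in for the wavelet basis of [7]; the recursive construction of the transform matrix, the truncation threshold, and the function names are illustrative assumptions (a practical implementation would use a fast $O(n)$ transform rather than a dense matrix).

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar wavelet transform matrix (n must be a power of 2)."""
    if n == 1:
        return np.array([[1.0]])
    H = haar_matrix(n // 2)
    top = np.kron(H, [1.0, 1.0])                 # scaling (average) rows
    bot = np.kron(np.eye(n // 2), [1.0, -1.0])   # wavelet (difference) rows
    return np.vstack([top, bot]) / np.sqrt(2.0)

def wavelet_matvec(D, q, thresh=1e-6):
    """Approximate D @ q by sparsifying D in the wavelet domain."""
    Phi = haar_matrix(q.size)
    Dt = Phi @ D @ Phi.T                 # transform the matrix (precomputable)
    Dt[np.abs(Dt) < thresh] = 0.0        # discard insignificant coefficients
    return Phi.T @ (Dt @ (Phi @ q))      # multiply in wavelet domain, transform back
```

In practice $\tilde{D}$ would be computed and sparsified once, so that each BCG iteration pays only for the sparse product and the two fast transforms.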

It is crucial to know the structure of the capacitance matrix in order to achieve an accurate and efficient approximation of the matrix. In equation 3.4, the capacitance matrix is composed of three matrices, namely the identity matrix $I$, the matrix $\sum_{j=0}^{l-1} e_{j+1} q_j^T U$, and the matrix $V^T\Gamma$. The first two matrices are extremely sparse, but the third is a dense matrix. When computing the matrix-vector product of the capacitance matrix and a vector, we compute the products with the first two sparse matrices directly, while the aforementioned wavelet approximation scheme is used only for the matrix $V^T\Gamma$.
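Schematically, reusing `wavelet_matvec` from the sketch above (the variable names are illustrative, and the sparse pieces are assumed to be assembled in a sparse matrix format):

```python
def capacitance_matvec(v, sparse_term, VtGamma, thresh=1e-6):
    """Apply C v for C = I + (sparse term) + V^T Gamma.

    sparse_term: the sparse part of C minus the identity (e.g. a scipy.sparse
    CSR matrix); VtGamma: the dense block, multiplied approximately in a
    wavelet basis via wavelet_matvec from the previous sketch.
    """
    return v + sparse_term @ v + wavelet_matvec(VtGamma, v, thresh)
```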

4.4 Experimental Results

We have successfully applied our generalized capacitance matrix algorithm to a host of early vision problems that use the membrane and/or thin-plate spline to constrain the solution [85, 45]. These early vision problems include surface reconstruction, shape from shading, optical flow computation, the lightness problem, etc. [85, 45]. These problems, when formulated as variational principles and discretized, lead to one or more large sparse linear systems. In this section, we present the results of applying our algorithm to the surface reconstruction and shape from shading problems.








4.4.1 Surface Reconstruction

We implemented our generalized capacitance matrix algorithm for sparse-data surface reconstruction with membrane and thin-plate splines. We synthesized a sparse range/elevation data set on a 64 x 64 grid. The input data set is very sparse and contains 15 data points randomly scattered in the plane, as shown in figure 4.3(a). We apply our algorithm to these data to obtain a surface approximation.

In the first experiment, we reconstruct a surface approximating this sparse data set using membrane and thin-plate smoothness constraints. Figure 4.3(b) shows the reconstructed surface using membrane smoothness after 10 ADI iterations. We observe that only 10 ADI iterations are needed to attain an error tolerance of $10^{-5}$. The reconstructed surface using thin-plate smoothness after 16 ADI iterations is shown in figure 4.3(c). We can see that the reconstructed surface using thin-plate smoothness is much smoother than that using membrane smoothness.

The second experiment is surface reconstruction on the same data set with prespecified discontinuities along a horizontal line between the coordinates (1,32) and (30,32), as shown in figure 4.4(a). We depict the computed surface with membrane smoothness obtained using our generalized capacitance matrix algorithm with 1 and 10 ADI iterations in figures 4.4(b) & (c), respectively. Figure 4.4(d) shows the result of the same experiment with thin-plate smoothness after 16 ADI iterations of our generalized capacitance matrix algorithm. A similar experiment with depth discontinuities along the diagonal line segment extending from the coordinate (1,63)













Figure 4.3. Surface reconstruction example; (a) original sparse data, (b) reconstructed surface using membrane smoothness, (c) reconstructed surface using thin-plate smoothness, obtained using the generalized capacitance matrix algorithm.








to (63,1) was performed. The data along with the recovered surface using membrane

smoothness are shown in figure 4.5(a) & (b) respectively.















Figure 4.4. Surface reconstruction with prespecified discontinuities along a horizontal line; (a) original sparse data set with depth discontinuities along the indicated horizontal line; reconstructed surfaces using a membrane spline after (b) 1 iteration and (c) 10 iterations of the ADI method; (d) reconstructed surface using a thin-plate spline after 16 iterations of the ADI method.


4.4.2 Shape from Shading

In this problem, we tested our algorithm on a synthetically generated image of

a Lambertian sphere illuminated by a distant point source and viewed from the same

location as the point source. Figure 4.6(a) depicts the 64 x 64 original shaded image























Figure 4.5. Surface reconstruction with prespecified discontinuities along a diagonal line; (a) the original sparse data set with depth discontinuities along the indicated diagonal line, (b) reconstructed surface using a membrane spline after 10 iterations of the ADI method.

with 8-bit gray-level resolution per pixel. The sphere is assumed to have Lambertian reflectance properties with a constant albedo. Both depth and orientation data are specified on the occluding contour depicted in figure 4.6(b). Note that we have to solve three systems of equations, namely $Kx_1 = b_1$, $Kx_2 = b_2$ and $K_1x_3 = b_3$ (see section 2.3). The first two systems recover the surface normals, and the last recovers the depth from the surface orientation map. We start the ADI iteration for the linear systems with the initial values $p = q = z = 0$. The recovered surface orientation is shown in figure 4.6(c), and the final surface shape obtained using the orientation information is depicted in figure 4.6(d). For each computed $(p, q)$ on the RHS of equation 2.18, the ADI error tolerance was set to $10^{-5}$. The same tolerance was used for the ADI iterations in the depth-from-orientation problem.







Although we have not compared the computational time of existing solution methods for the discretized Poisson equation with that of our algorithm, we believe that our method will outperform most other existing methods. Future research efforts will focus on empirically demonstrating this claim.





























Figure 4.6. Shape from shading example; (a) shaded sphere image, (b) irregular occluding boundary embedded in a rectangle, (c) recovered surface orientation, (d) reconstructed surface shape.















CHAPTER 5
ADAPTIVE PRECONDITIONING TECHNIQUE


5.1 Introduction

The generalized capacitance matrix algorithm presented in chapter 4 is very efficient for solving the linear systems arising in early vision problems. However, the technique can prove inefficient for dense-data problems with nonuniform weighting on the data constraints, such as the optical flow problem, since the size of the associated (dense) capacitance matrix is too large. Although this problem may be circumvented by incorporating the capacitance matrix technique into an iterative scheme [72], it was pointed out in [12] that such a semi-direct numerical scheme converges only when the regularization parameter $\lambda$ is very large. In this chapter, we present a novel adaptive preconditioning technique which, when used in conjunction with a conjugate gradient algorithm, solves these linear systems very efficiently. This adaptive preconditioning technique can be universally applied to any early vision problem.

Numerical iterative methods have been popular for solving the linear systems arising in early vision problems. Several multiresolution preconditioning techniques [75, 61, 92] have been proposed in the computer vision literature to accelerate the convergence of these iterative methods. In this chapter, we introduce a physically-based adaptive preconditioning technique that outperforms, in computational efficiency, all previously proposed preconditioning methods in the vision literature, namely the hierarchical basis preconditioned conjugate gradient of Szeliski [75] and the change of basis methods of Yaou et al. [92] and Pentland [61]. In addition, it places no restriction on the choice of the regularization parameter $\lambda$ for its convergence.

In [75, 76], Szeliski presented a preconditioning technique based on the use of hierarchical basis functions. The technique is based on the multi-level splitting of finite element spaces presented in [93]. The hierarchical basis functions naturally lead to a pyramidal representation of the unknown function being solved for. Empirical convergence results were shown for the preconditioned conjugate gradient in comparison with the standard conjugate gradient method. A key point overlooked in Szeliski's work [75, 76] is that no information about the problem being solved was used in designing the preconditioner for the conjugate gradient algorithm; i.e., once a basis set was chosen, the same preconditioner was used regardless of the imposed smoothness or data constraints. Thus, the true power of a preconditioning transform was not exploited to its fullest.

More recently, Pentland [61] presented a technique for surface interpolation using a change of basis to orthonormal wavelets as a preconditioning transform. He uses the result of Beylkin et al. [7], which states that applying $(N \times N)$ matrices corresponding to pseudo-differential and Calderon-Zygmund operators to arbitrary vectors requires either $O(N)$ or $O(N \log N)$ operations, depending on whether the nonstandard or standard wavelet transform is used. The result of Beylkin et al. primarily shows a reduction in the bandwidth of the matrices corresponding to pseudo-differential operators in a wavelet basis, and gives a technique whereby $O(N)$ coefficients can be used to approximate the $(N \times N)$ operator to a given tolerance. This compression takes $O(N)$ time if the structure of the singularities of the matrix is known a priori.

There are three issues to be noted with regard to Pentland's reported results. Firstly, after a change of basis to orthonormal wavelets, the off-diagonal terms were discarded in [61] under the claim that the transformed stiffness matrix was diagonally dominant. This diagonal dominance was depicted by profiling a row of the transformed stiffness matrix at a high resolution. Upon close examination, we found that the diagonal dominance property does not hold in general at lower resolutions (see figure 5.10); thus, discarding the off-diagonal terms is unjustified and leads to a solution that may be far from the true solution. Secondly, Pentland's preconditioner requires the computation of the diagonal entries of a wavelet-transformed stiffness matrix. Although the stiffness matrix is banded, the computation of these diagonal entries takes $O(N \log N)$ operations, which is too expensive for large $N$. Thirdly, it is not known whether the matrix $(K + S)$ in his linear system $(K + S)U = D$, solved for the interpolation problem, satisfies the conditions set forth in Beylkin et al. [7] for their bandwidth reduction scheme to be applicable. Further, we found that his algorithm fails to converge in a reasonable number of iterations to the correct solution for the very sparse data used in the examples of this thesis.

Yaou and Chang [92] report a fast surface interpolation algorithm using the multiresolution wavelet transform. This work closely parallels that of Szeliski, except that Yaou and Chang use a wavelet basis instead of the hierarchical basis, and a Jacobi iteration [26] instead of the conjugate gradient method used by Szeliski [75]. As in Szeliski's work, the full potential of a preconditioning technique was not exploited: the design of the preconditioner did not make use of any information about the problem being solved. The stiffness matrix in the surface interpolation problem was first transformed to a bi-orthonormal wavelet basis, and the transformed linear system was then solved using a Jacobi iteration. Note that, in a preconditioning technique applied to the positive definite stiffness matrix $K$, one tries to approximate $K$ as well as possible with another positive definite matrix $P$, known as the preconditioner, such that $P^{-1}K$ is a good approximation to the identity matrix. Such an approximation $P$ was never constructed in Yaou and Chang [92]. Further, upon implementing their algorithm, we found that their published results on the rate of convergence are valid only for data from periodic functions, since their implementation took advantage of periodic wavelet transforms.

In this chapter, we present a very effective way to construct a physically-based adaptive preconditioner in a wavelet basis for several early vision problems, namely surface reconstruction (SR), shape from shading (SFS) and the computation of optical flow (OF). This physically-based preconditioner is adapted to the membrane spline, the thin-plate spline, or a convex combination of the two. Experimental results depicting the performance of our algorithm in comparison to the existing methods discussed above are presented for synthetic and real data.

The rest of this chapter is organized as follows. In section 5.2, an adaptive preconditioner constructed in a wavelet basis is presented, followed by the adaptive preconditioned conjugate gradient algorithm in section 5.3. Implementation results on synthetic and real data are presented in section 5.4. In section 5.5, we end with a discussion.

5.2 Adaptive Preconditioning in a Wavelet Basis

Iterative methods are preferable to direct methods for solving large and sparse

linear systems in early vision problems, since direct methods require the formation

of intermediate matrices which are not necessarily sparse. Although the sparsity

can be preserved when using direct methods on 1D problems with the smoothness

constraint, it will be destroyed in the case of 2D problems. This makes the number

of operations and the amount of storage too large for 2D and higher dimensional

problems.

The efficiency of an iterative method for solving a linear system depends on its rate of convergence, which is a function of the condition number of the coefficient matrix. Unfortunately, many early vision problems lead, after regularization, to large, ill-conditioned linear systems. Therefore, most iterative methods converge very slowly to the true solution and are thus computationally







impractical. One way to speed up the convergence of an iterative method is via preconditioning. In this section, we present a new adaptive preconditioning technique wherein the preconditioner is constructed in a wavelet basis. Since the proposed preconditioner is designed to adapt to the spectral characteristics (in the frequency domain) of the smoothness constraint in the regularized solution of different early vision problems, it can be expected to perform better than the existing "fixed" preconditioners [75, 42, 92].
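To fix ideas, here is a minimal sketch of the preconditioned conjugate gradient iteration for $Kx = b$; the preconditioner enters only through the solve $z = P^{-1}r$, for which a simple Jacobi (diagonal) scaling is used below purely as a placeholder for the adaptive wavelet-domain preconditioner developed in this chapter.

```python
import numpy as np

def pcg(K, b, apply_Pinv, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradient for an SPD matrix K."""
    x = np.zeros_like(b)
    r = b - K @ x                     # residual
    z = apply_Pinv(r)                 # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Kp = K @ p
        alpha = rz / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = apply_Pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # new search direction
        rz = rz_new
    return x

# placeholder preconditioner: Jacobi (diagonal) scaling
# x = pcg(K, b, lambda r: r / np.diag(K))
```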

5.2.1 Preconditioning Technique

For a linear system $Kx = b$ with a positive-definite matrix $K$, we assume there exists a positive-definite matrix $P$, the preconditioner for the matrix $K$, such that the condition number of $P^{-1}K$ is greatly reduced. Since $P$ is a positive-definite matrix, it can be shown that there exists a positive-definite matrix $C$ such that $C^2 = P$ [26].

Now, we can define the vectors $\tilde{x}$ and $\tilde{b}$ via the transformation matrix $C$ as follows:

\[
\tilde{x} = Cx, \qquad \tilde{b} = C^{-1}b. \tag{5.1}
\]

Thus, the linear system $Kx = b$ can be solved by solving the transformed linear system

\[
\tilde{K}\tilde{x} = \tilde{b}, \tag{5.2}
\]

where $\tilde{K} = C^{-1}KC^{-1}$. Note that the matrix $\tilde{K}$ in the transformed linear system is similar to the matrix $P^{-1}K$, i.e., the condition numbers of $\tilde{K}$ and $P^{-1}K$ are equal. The condition number of the transformed matrix is much smaller than