Group Title: Department of Computer and Information Science and Engineering Technical Reports
Title: Physically-based adaptive preconditioning for early vision
CITATION PDF VIEWER THUMBNAILS PAGE IMAGE
Full Citation
STANDARD VIEW MARC VIEW
Permanent Link: http://ufdc.ufl.edu/UF00095328/00001
 Material Information
Title: Physically-based adaptive preconditioning for early vision
Series Title: Department of Computer and Information Science and Engineering Technical Report ; 95-010
Physical Description: Book
Language: English
Creator: Lai, S. H.
Vemuri, Baba C.
Publisher: Department of Computer and Information Science, University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: March, 1995
Copyright Date: 1995
 Record Information
Bibliographic ID: UF00095328
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.

Downloads

This item has the following downloads:

1995172 ( PDF )


Full Text




















Physically-based Adaptive
Preconditioning for Early Vision1


S. H. Lai and B. C. Vemuri


UF-CIS Technical Report TR-95-010
Computer and Information Sciences Department
University of Florida
Bldg. CSE, Room 301
Gainesville, Florida 32611-6120
March 1995

















1This work was supported in part by the NSF grant ECS-9210648.
Manuscript submitted to the IEEE Trans. on Pattern Analysis and Machine Intelligence









Abstract


Several problems in early vision have been formulated in the past in a regularization frame-
work. These problems when discretized lead to large sparse linear systems. In this paper, we
present a novel physically-based adaptive preconditioning technique which can be used in con-
junction with a conjugate gradient algorithm to drastically improve the speed of convergence for
solving the aforementioned linear systems. A preconditioner based on the membrane spline or
the thin plate spline or a convex combination of the two is termed as a physically-based precon-
ditioner for obvious reasons. The adaptation of the preconditioner to an early vision problem
is achieved via the explicit use of the spectral characteristics of the regularization filter in con-
junction with the data. This spectral function is used to modulate the frequency characteristics
of a chosen wavelet basis leading to the construction of our preconditioner. The precondition-
ing technique is demonstrated for the surface reconstruction, shape from shading and optical
flow computation problems. We experimentally establish the superiority of our preconditioning
method over previously presented preconditioning techniques for the surface reconstruction and
shape from shading problems. Performance of the preconditioning technique is demonstrated
via experimental results on real and synthetic data.









1 Introduction


In the past decade, several problems in early vision have been formulated in a regularization frame-

work [9, 27, 26, 30, 3, 4, 12, 13]. These formulations result in partial differential equations which when

discretized lead to large sparse linear systems. Numerical iterative methods such as the Gauss-Seidal,

Jacobi and the conjugate gradient technique [8] have been popular until the inception of multi-grid

methods [24] and multiresolution methods [22, 19, 32]. More recently, the capacitance matrix tech-

nique [5] has been generalized to solve the linear system arising from the early vision problems very

efficiently t-"' 17]. However, this technique can prove to be inefficient for dense data problems with

nonuniform weighting on the data constraints, such as the optical flow problem, since the size of the

associated ( dense ) capacitance matrix is too large. Although this problem may be circumvented

by incorporating the capacitance matrix technique as part of an iterative scheme [21], however it

was pointed out in [6] that this semi-direct numerical scheme only converges when the regularization

parameter A is very large. In this paper, we will introduce a /'l. i .*, ,,ll.i-based adaptive preconditioning

technique which when used in conjunction with the conjugate gradient algorithm, will outperform -in

computational eff,' ., ,- i all 1,j' i.'... ili proposed preconditioning methods in literature ,,,ni 1, l. the

hierarchical basis preconditioned conjugate gradient of Szeliski [22] and the change of basis methods

of Yaou et al. [32] and Pentland [19]. In addition, it does not have any restriction on the choice of

the regularization parameter A for its convergence.

In [22, 23], Szeliski presented a preconditioning technique based on the use of hierarchical basis

functions. This technique is based on the multi-level splitting of the finite element spaces presented

in [33]. The hierarchical basis functions naturally lead to a pyramidal representation of the unknown

function being solved for. Empirical convergence results were shown for preconditioned conjugate

gradient in comparison with the standard conjugate gradient method. A key piece of information that

was overlooked in Szeliski's work [22, 23] was that, in designing a preconditioner for the conjugate

gradient algorithm, no information about the problem being solved was used i.e., once a basis set was

chosen, the same preconditioner was used regardless of the imposed smoothness or data constraints.









Thus, the true power of a preconditioning transform was not exploited to its fullest.

More recently, Pentland [19] presented a technique for surface interpolation using a change of basis

to orthonormal wavelets as a preconditioning transform. He uses the results of Beylkin et al., [2]

which states that the application of (N x N) matrices -corresponding to any pseudo-differential and

Calderon-Zygmund operators -to arbitrary vectors requires either O(N) or O(N log N) operations

based on whether nonstandard or standard wavelet transform is used. The result of Beylkin et al.,

primarily shows a reduction in the bandwidth of the matrices corresponding to pseudo-differential

operators in a wavelet basis and gives a technique whereby an O(N) coefficients can be used to

approximate the (N x N) operator, for a given tolerance. This compression takes O(N) time if the

structure of the singularities of the matrix are known a priori.

There are three issues to be noted with regards to Pentland's reported results. Firstly, after a change

of basis to orthonormal wavelets, the off-diagonal terms were discarded in [19], under the claim that

the transformed stiffness matrix was diagonally dominant. This diagonal dominance behavior was

depicted via the profiling of a row of the transformed stiffness matrix at a high resolution. Upon

close examination, we found that the diagonal dominance property does not hold in general at lower

resolutions (see figure 9) and thus, discarding the off-diagonal terms is unjustified and leads to a

solution which may be far from the true solution. Secondly, Pentland's preconditioner requires the

computation of the diagonal entries in a wavelet transformed stiffness matrix. Although the stiffness

matrix is banded, the computation of these diagonal entries takes O(N log N) operations, which

is too expensive for large N. Thirdly, it is not known if the matrix (K + S) in his linear system

(K + S)U = D being solved for the interpolation problem would satisfy the conditions set forth in

Beylkin et al., [2] in order for their bandwidth reduction scheme to be applicable in this case. Further,

we found that his algorithm fails to converge in reasonable number of iterations to the correct solution

for very sparse data that is used in the examples of this paper.

Yaou and Chang [32] report a fast surface interpolation algorithm using the multiresolution wavelet

transform. This work is almost exactly similar to that of Szeliski with the exception that Yaou and









Chang use a wavelet basis instead of the hierarchical basis and a Jacobi iteration [8] instead of

the conjugate gradients used in Szeliski [22]. As in Szeliski, the full potential of a preconditioning

technique was not exploited in Yaou and Chang i.e., the design of the preconditioner did not make use

of any information about the problem being solved. The stiffness matrix in the surface interpolation

problem was first transformed to a bi-orthonormal wavelet basis and then the transformed linear

system was solved using a Jacobi iteration. Note that, in a preconditioning technique applied to the

the positive definite stiffness matrix K, one tries to approximate K as well as possible with another

positive definite matrix P known as the preconditioner, such that P-1K is a good approximation

to the Identity matrix. Such an approximation P was never constructed in Yaou and Chang [32].

Further, upon implementing their algorithm, it was found that their published results about rate of

convergence were only valid for data from periodic function since their implementation took advantage

of periodic wavelet transforms.

In this paper, we will present a very effective way to construct a physically-based adaptive precon-

ditioner in a wavelet basis for several early vision problems namely, the surface reconstruction (SR),

shape from shading (SFS) and computation of optical flow (OF). This physically-based precondi-

tioner is adapted to the membrane spline or the thin plate spline or a convex combination of the two.

Experimental results depicting the performance of our algorithm in comparison to existing methods

discussed above are presented for synthetic and real data.

The rest of the paper is organized as follows. In section 2 we will very briefly present the variational

formulations for the three early vision problems namely, the SR, SFS and the computation of OF,

the last of which is our new and original formulation. Preconditioner construction in wavelet basis

is presented in section 3 followed by the preconditioned conjugate gradient algorithm in section 4.

Algorithm implementation on synthetic and real data are presented in section 5. In section 6, we end

with a discussion and conclusion.








2 Variational Formulations


Variational formulations for various early vision problems have been reported in [24] and references

therein. These formulations make use of the popular theory of regularization. In a regularization

framework, generic smoothness assumptions are imposed on the solution space prior to attempting

any functional minimization. The smoothness constraints are well characterized by a class of gener-

alized multi-dimensional spline functionals [25]. The formulations involve minimization of an energy

functional 8, which is the sum of the energy contribution from the smoothness constraint (S) and

the data constraint (7) (see [25, 22] for details of this formulation).

For the surface reconstruction problem, we are faced with recovering surface shape from either

sparse or dense range data. The location of surface and orientation discontinuities (if any) are

assumed to be known in the below discussion. For a first order stabilizer, the smoothness constraint

is given by

S(v)= (v ++v ) dxdy, (1)

where v(x, y) is the admissible function and v,, vy its partial derivatives assumed to be small. For

the case of a second order stabilizer, we have the following smoothness constraint

S(v) = (v + 2vxy + v) dxdy, (2)

The first and second order stabilizer can be combined to form the controlled-continuity stabilizers

[25]. To the smoothness constraint, we add the data constraints in the form of penalty terms. The

following penalty term which measures the discrepancy between the surface and data weighted by

the uncertainty in the data may be used,
1
P(v) = ci (v(xi, y) d)2 (3)

Where, di are the depth data points specified in the domain Q and c, are the uncertainty associated

with the data. The goal is to find a u that minimizes the total potential energy S(v) = S(v) + P(v).

The numerical solution to the above problem is computed by discretizing the functionals S(v)

and P(v) using finite element techniques [22]. The resulting energy function is a quadratic in x








(a discretization of v) given by, E(x) = lXTKx XTb + c. Where, K is a very large and sparse

N x N matrix [24], with N being the number of discretization nodes. The minimum x* of this energy

function is found by solving the large sparse linear system Kx = b. We present a very fast solution

to this problem, based on a novel preconditioning technique, in a subsequent section.

Numerous solution methods using variational principle formulations of the SFS problem have been

proposed. For a comprehensive set of papers on this topic, we refer the reader to the book edited by

Horn and Brooks [12] and the work in [24, 21, 11, 23]. In this problem, it is required to recover the

shape of surfaces from image irradiance which depends on surface geometry and reflectance, scene

illuminance and imaging geometry. The sum of the smoothness constraint and the penalty terms in

this problem are,


S(Z, p, q) = (p + P) + (q + q ,1,,J + I (Z p)2 + (Z q)2 dxdy

+ [E(x,) R(p, q)]2dxdy.

Where, E(x, y) is the image irradiance at a point (x, y) and R(p, q) is the relation between surface

orientation (p,q) and image irradiance E(x, y). The first term in the above equation is a weak

smoothness constraint and the coefficient A controls its contribution toward the final solution. The

second term enforces the ',,i .I,,.i7i1,'l constraint, ensuring that the gradients (p, q) of the recovered

surface Z match those of a valid surface. Z,, Zy are the first partial of the surface Z(x, y).

The above energy can be discretized directly by using finite element techniques however, the same

set of discrete domain equations can be arrived at via the use of finite difference approximations for

this application [23]. The discrete energy is given by


1 2 A E2 2
E= (E(x,y) R(p,))2 + (Z p)2 + ( q)2+ (p + +q + qf ) (4)

where the summations are over all the pixels in the image. In vector notation, the above equation

can be written as

E(u) = U TKu bu +C (5)
2









Where, u = [zTpTqT] is a concatenation of the vector representation of the fields Z,p and q. The

minimum of the above quadratic is obtained by solving Ku = b with Ku containing linear u terms

and the constant and the nonlinear terms form the b vector. This is an (N x N) sparse nonlinear

system which is usually solved using iterative methods. These iterative methods to solve the nonlinear

system normally have the following procedures in each iteration: (1) compute the b vector using the

current u, (2) update the solution u by treating b as a constant. In the past, multigrid techniques

have been employed in solving this nonlinear system [24] wherein a hierarchy of problems at different

resolutions needs to be explicitly constructed. In [23], Szeliski presented a fast shape from shading

algorithm which employs hierarchical basis for preconditioning the conjugate gradient algorithm used

to solve the above nonlinear system. Unlike the multigrid methods, this technique does not require

the explicit construction of a multiresolution pyramid of problems. In this paper we present a novel

preconditioning method that leads to a faster convergence than the one in [23] and uses a wavelet

basis. Our method makes use of the spectral characteristics of the data and the imposed smoothness

constraint. This spectral function is then used to modulate the frequency characteristics of the chosen

wavelet basis leading to the construction of the preconditioner.

A third early vision problem that we consider in this paper is the computation of optical flow from

an image sequence. Recently, Barron et al., [1] have presented a comprehensive survey of various

optical flow computation techniques along with a quantitative comparison of them. We note that

the survey did not do any comparison of the computation time required by the various algorithms. In

this paper, we will develop a novel preconditioning technique applicable to any gradient based optical

flow computation technique that uses a smoothness constraint involving the 2D Laplacian operator.

We develop a new formulation which modifies the Horn and Schunk formulation [13] to incorporate

a separate condition along discontinuities in the image intensity [10]. This modification d*,, /. ,,lii

enhances the performance of our method over other gradient-based methods because the optical flow

constraint equation is invalid along intensity discontinuities and causes large errors in the computed

flow. We get d *,, ,/ ,ll i better and more robust estimates of optical flow than any gradient based

optical flow technique reported in Barron et al., [1]. Using wavelet basis, we develop a preconditioner









for this problem which assists our algorithm in converging very fast.

Our variational formulation of the optical flow problem involves minimizing a combination of the

optical flow constraint equation with a global smoothness term and an additional constraint term

along edges leading to


Sa((VE u) + E)2 + A( + VVu v l)dx + ((V(V2G E) u) + (V2 E),)2dx (6)

Where u(x,t) = (u(x, t), v(x, t)) is the velocity field to be estimated, E is the image brightness

function, and V2G is the Laplacian of Gaussian operator [10]. The first term is the gradient constraint

which is active away from discontinuities, the second term is the smoothness constraint on the velocity

field with A controlling the contribution of the smoothness term and the third term is the flow

constraint along edges which is disabled away from the edges. A discrete version of the above energy

can be written as

E(Eu + Ev + E,)2 + (2 + 2 ) + (2 ) + 2) (7)

Where, u and v form the discretized flow vector, ux, uy, v, and vy represent the discretized partial

of the velocity field, Ex, Ey and Et correspond to the discretized partial in the x, y and t directions

respectively -collectively obtained from the first and the third terms of the equation 6. Note

that E,, Ey and ET are the discretized partial of the brightness function E when the location of

differentiation is away from discontinuities, and they represent the discretized partial of V2G*E when

the differentials are taken at the discontinuity locations. The minimization of this discretized energy

leads to solving a linear system of equations Ku = b where K contains the discretized differential

(smoothness) operator along with Ex, Ey and the b contains the terms corresponding to the optical

flow constraint equation. The gradient constraint in the above energy function implies the weighting
--2
(E, + EY) on (u, v) at each discretized spatial location. This weighting is not reasonable since the
gradient constraints at locations of high gradient are not necessarily reliable. In fact, they tend to

be unreliable because the brightness gradients close to discontinuities are normally higher than those

away from discontinuities. Therefore, we eliminate this weighting for each gradient constraint via

normalization. To have a uniform contribution at each pixel from the flow constraint, we normalize








-2 -2 2 2
the entries E4, EE, in the matrix K with E,+ E.



3 Adaptive Preconditioning in a Wavelet Basis


Iterative methods are preferable to direct methods for solving large and sparse linear systems in

early vision problems, since direct methods require the formation of intermediate matrices which

are not necessarily sparse. Although the sparsity can be preserved when using direct methods on

1D problems with the smoothness constraint, it will be destroyed in the case of 2D problems. This

makes the number of operations and the amount of storage too large for 2D and higher dimensional

problems.

The efficiency of applying an iterative method to solve a linear system depends on its rate of

convergence, which is a function of the condition number of the coefficient matrix in a linear system.

Unfortunately, many early vision problems after regularization lead to large ill-conditioned linear

systems. Therefore, most iterative methods converge very slowly to the true solution and are thus

computationally impractical. One way to speed up the convergence of an iterative method is via

preconditioning. In this section, we present a new adaptive preconditioning technique wherein the

preconditioner is constructed in a wavelet basis. Since the proposed preconditioner is designed to

adapt to the spectral characteristics of the smoothness constraint in the regularization solution (in

the frequency domain) of different early vision problems, it is bound to perform better than the

existing "fixed" preconditioners.


3.1 Preconditioning Technique


For a linear system Kx = b with positive-definite matrix K, we assume there exists a positive-definite

matrix P, the preconditioner for the matrix K, such that the condition number of P-1K is greatly

reduced. Since P is a positive-definite matrix, it can be shown that there exists a positive-definite

matrix C such that C2 = P [8]. Now, we can define the vectors x and b via the transformation









matrix C as follows


x = Cx, b = C-lb. (8)

Thus, the linear system Kx = b can be solved by solving the transformed linear system

Kx = b, (9)

where K = C-1KC-1. Note that the matrix K in the transformed linear system is similar to the

matrix P-1K, i.e., the condition numbers of K and P-1K are equivalent. The condition number of the

transformed matrix is much smaller than that of the original matrix according to the aforementioned

criterion of choosing a preconditioner P. Consequently, the convergence of the iterative methods can

be accelerated by a transformation of the original linear system to a better conditioned linear system.

Although the conditioning of a linear system can be improved after an appropriate transformation,

it is not practical to form the matrix K to solve equation 9 since, the transformation of a matrix

requires a large number of operations and the sparsity structure of the matrix K could be destroyed

after the transformation. Fortunately, it is not necessary to compute this transformation of the

matrix K to achieve the preconditioning effect in a conjugate gradient algorithm. Instead, the same

preconditioning can be achieved via solving an auxiliary linear system Pz = r, where r is the residual

vector, in each conjugate gradient iteration. (The preconditioned conjugate gradient algorithm is

given in section 4.) Therefore, the second consideration for a good preconditioner is to choose the

matrix P such that the linear slli i,, Pz = r can be I
The above two criteria for designing a good preconditioner in conjugate gradient are equally im-

portant. Reducing the condition number can accelerate the convergence of the iterative methods

employed. Choosing the preconditioner P such that Pz = r is easily solvable is very crucial for

efficiently updating the solution in each iteration of the preconditioned conjugate gradient (PCG)

algorithm. Our preconditioner is designed in a wavelet basis by approximating the spectral char-

acteristics of the matrix K. This approximation is achieved via the frequency domain analysis of

the regularization operators and using the frequency band partitioning behavior of a wavelet basis.

Therefore, our adaptive preconditioner is akin to a spectrally equivalent operator for the original








matrix K. Furthermore, the solution of a linear system using our adaptive preconditioner can be

obtained by taking the wavelet transform, scaling the coefficients, and taking the inverse wavelet

transform. By using a very compactly supported wavelet, the solution update can be achieved very

efficiently.


3.2 Regularization Filter


The filtering interpretation of regularization operators has been discussed in vision literature [25, 27].

The inverse of the control-continuity stabilizer (Laplace + Bi-harmonic operators) may be interpreted

as a low-pass filter hence, the solution to an early vision problem in a regularization framework can

be obtained by applying this low-pass filter to the data. We will use this filtering behavior of the

regularization operator to develop a preconditioner which approximates this filter in a wavelet basis.

In section 2, we have seen the variational formulations of various early vision problems that lead

to minimizing an energy functional. This minimization can be achieved by solving the associated

Euler-Lagrange equation which for the surface reconstruction problem in the continuous data case

(with a controlled-continuity stabilizer) can be written as

p(x){(l r(x))A2v(x) T(x)Av(x)} + c(x)v(x) = c(x)d(x), (10)

where A is the Laplace operator, A2 is the bi-harmonic operator, c(x) is the weighting function

associated with the data d(x), the functions p(x) and r(x) control the rigidity and tension of the

solution surface respectively. Now let's consider the case when p(x) and r(x) are constant functions,

i.e. p(x) = p and r(x) = r. In addition, let's also assume the function c(x) to be a constant function,

which means the Gaussian noise associated with each data measurement has the same variance.

Taking the Fourier transform of equation 10 we get


V ( w ) =--------2- D ( w ) ( 1 1 )
( p(l r)||1 w 114 2+ pr\ I II +c(
Where w = (l, .. wd) is the d-dimensional frequency domain variable while, V(w) and D(w) are the

Fourier transforms of v(x) and d(x) respectively. Equation 11 can be given a filtering interpretation








namely, V(w) is obtained by low-pass filtering the data D(w). This low-pass filter is characterized by

the first term on the right hand side in equation 11 called the transfer function. Our preconditioner

is constructed in a wavelet basis to approximate this transfer function.

Equation 11 expresses a linear shift-invariant low-pass filtering of the data in frequency domain,

derived from a regularization framework. This filtering interpretation can be generalized to the

regularly sampled discrete data case in a straightforward manner. However, this linear shift-invariant

filter will become a nonlinear and spatially varying filter in the general cases when the functions

p(x), r(x) and c(x) are not constant. These cases include discontinuity-preserving regularization
and sparse-data interpolation. Although the corresponding nonlinear and spatially varying filter is

difficult to characterize fully in the frequency domain, it is in general a low-pass filter and we can

approximate its spectral characteristics in the frequency domain. In the next section we show how

to achieve this approximation in a wavelet basis.


3.3 Wavelet Basis


Wavelet transform has been used widely in the signal and image processing communities since it

provides a powerful tool for multiresolution analysis. The construction of a wavelet basis is achieved

by defining a scaling function O(x), such that the integer translates of the dilated scaling func-

tion 2--j (2- x) form an orthonormal basis of L2(R), where j e Z. This orthonormal basis

(= { \_ (2- x k)}kez) forms a subspace Vj C L2(R), and the subspaces Vj satisfy the contain-

ment hierarchy

Vj C V_ Vj E Z. (12)

Now we can introduce the difference space Wj between the two consecutive subspaces Vj and Vj_i

such that Vj f Wj = Vj_1, where a denotes the direct sum operator. The key idea of the wavelet

basis is the difference space Wj is spanned by an orthonormal basis which are obtained via integer

translations of the associated wavelet function ((x), i.e. Wj = { 2/2-j0(2- x k)}kez. Therefore, a









multiresolution orthogonal decomposition of the space Vo can be achieved via


Vj Wj Wj_ ... Wi = Vo, (13)


where J is a positive integer. From this multiresolution orthogonal space decomposition, we can

construct a multiresolution orthonormal wavelet basis, since each of the subspaces on the left hand

side of equation 13 are orthogonal to each other and there exists an orthonormal basis for each

subspace as discussed above.

The wavelet decomposition and reconstruction can be efficiently achieved via a QMF (Quadrature

Mirror Filters) implementation as shown in figure 1 (see [18]). In this figure, Ajf is the discrete

approximation of the function f in the subspace Vj, i.e. Ajf contains the values of the inner products

of the function f and the orthonormal basis functions in Vj, and Djf encodes the projection of the

function f on the subspace Wj. The filters H ( in decomposition) and H ( in reconstruction) are

low-pass filters, while the filters G ( in decomposition) and G ( in reconstruction) are high-pass

filters. They are usually designed to be the FIR filters, which can be implemented via convolution

operations. With an appropriate choice of a wavelet basis, we can find the associated FIR filters

with very short length/span and the QMF filter implementation for the wavelet transform can be

accomplished very efficiently.

The QMF structure decomposes the space from one level to the next coarser level with each stage

involving a low-pass and high-pass filtering. From a frequency domain analysis, we can see that a

wavelet decomposition divides the frequency domain into two separate bands in each stage. Therefore,

each subspace Wj has its own frequency band, which means that all the wavelet basis functions in any

particular level are fully contained in the frequency band associated with that level. This behavior of

dividing the frequency domain [20, 29] exhibited by a wavelet decomposition is illustrated in figure

2. Notice that the closer the QMF filters are to the ideal low-pass and high-pass filters, the better

it is for achieving a good frequency-domain division, i.e., the overlap between the frequency bands

associated with each level is minimized with close approximations of the ideal low and high-pass

filters.














-Ajf H -- 2 -Ajf-


S2 : Keep one sample out of two

X : Convolve with filter X


--Ajf 2 H + A f--

Djf- f2 G

2 : Put one zero between each sample

S: Convolve with filter X

S2 : multiplication by 2


Figure 1: The implementation of the wavelet transform via the QMF structure from one level to the
next, (a) decomposition (b) reconstruction, adopted from Mallat[1989].









f
tf 2fo 4fo 8

Figure 2: Division of the frequency domain in a wavelet basis.

In [18], Mallat constructed the 2D wavelet basis by using the tensor products of the 1D wavelet

and scaling functions. The QMF implementation of the wavelet transform was also extended to the

two-dimensional case in [18]. The division of frequency domain behavior for the 2D wavelet basis is

simply an extension of the 1D case.

After introducing the division of frequency domain behavior for a wavelet basis, we are ready to

construct our preconditioner in a wavelet basis and adapt its spectrum to that of the regularization

filter discussed in section 3.2.









3.4 Adaptive Preconditioning via the Modulation in a Wavelet Basis


In section 3.1, we discussed two criteria for selecting a good preconditioner. In this section, we

construct an adaptive preconditioner for early vision problems by following these two criteria.

Using techniques of frequency domain analysis, a low-pass filtering interpretation can be given to

the regularization formulation of early vision problems as was shown in section 3.2. The spectral

characteristics of the regularization filter in the frequency domain can be approximated using the

frequency domain subdivision behavior of the wavelet transform which can be implemented very

efficiently using a quadrature mirror based implementation.

To apply a preconditioner P to a matrix K, the first criterion is to drastically reduce the condition

number of P-1K; the second criterion for a good preconditioner is that the auxiliary linear system

Pz = r should be efficiently solvable. An obvious choice to reduce the condition number of P-1K is

to choose P = K, which reduces the condition number to 1. Although only one iteration suffices to

solve the problem, the complexity of solving the auxiliary linear system Pz = r is the same as that of

solving the original linear system. Therefore, nothing is gained by using P = K. A reasonable choice

for a good preconditioner is to find a matrix P that is spectrally close to K; in addition, the solution

to the auxiliary linear system Pz = r should be easily solvable. Our preconditioner is constructed

based on this very idea.

The matrix K is the stiffness matrix obtained from a discretization of the regularization formulation.

Therefore, its inverse has the same spectral characteristics as that of the regularization filter given in

equation 11. Now we approximate the spectral characteristics of the regularization filter in a wavelet

basis by using its frequency domain partitioning property. This approximation is accomplished by

modulating each wavelet basis function by the spectral value of the regularization filter taken at the

center of the corresponding f, 'p1, '. i, domain subregion. To be more specific, when Mallat's tensor

product strategy [18]is used for the 2D wavelet transform, each level (resolution) contains three sets

of wavelet basis functions, i.e. LH (0 0 ), HL (y0 p), and HH ( i 0), and each set corresponds









to its own subregion in the 2D frequency domain. The modulation for the wavelet basis functions in

each set is based on the spectral function of the regularization filter in its corresponding subregion.

For ease of implementation, we can just take the spectral function value at the center of the subregion

to be the modulation factor.

The inverse of the stiffness matrix is basically a low-pass filter, whose spectral function changes

rapidly in the lower frequency region and varies smoothly in the higher frequency region. An ap-

proximation to the spectral function of the low-pass filter achieved in a wavelet basis is very fine in

the lower frequency regions and coarse in the higher frequency regions. This is because the division

of the frequency domain becomes finer in the lower-frequency band with increasing levels (see figure

2). Therefore, we can achieve a better approximation when more levels are used. In this paper, the

number of levels used in the preconditioner is determined by the support of the wavelet basis used

and the problem size. T/,i before, for a given problem size and wavelet basis, the number of levels is

fixed.

Our preconditioner P can be written as


P = WMWT, (14)


where the matrix W is a 2D orthonormal wavelet transformation matrix and the matrix M is the

diagonal modulation matrix, whose diagonal values are obtained from the spectral function of the

regularization filter. From equation 14, we can see that the solution to the auxiliary linear system

Pz = r in the preconditioned conjugate gradient can be computed very efficiently when the wavelet

basis are chosen appropriately, i.e. the QMF filters are of short length/span. As long as the QMF

filters are of finite length, the solution can be obtained in O(N) operations. The solution to this

auxiliary linear system is obtained in the three steps namely, (1) take the wavelet transform of the

vector r, (2) modulate the transformed vector using the matrix M, and (3) take the inverse wavelet

transform.









4 Adaptive Preconditioned Conjugate Gradient Algorithm


As discussed in section 2, the discretization of the variational formulation leads to solving the system

of equations Kx = b. The stiffness matrix K is symmetric positive definite and therefore, the con-

jugate gradient algorithm can be used to solve the problem guaranteeing a unique solution. When a

preconditioner P is used to accelerate the convergence, we have what is known as the preconditioned

conjugate gradient algorithm (see [8]) given as follows.


0. Choose an initial xo, and compute ro = b Kxo; set k = 0.

1. Solve Pzk = rk; k = k + 1.

2. If k = 1, set pi = zo; else compute /3 = r _z -2 and 3k = _// f;

update Pk = zk- + 3kPk-1.

3N T DN D.
3. Compute ac = rk z_-1, ok = p Kpk, and ak = ca /af;

4. Update r = rk 1 akpk.

5. Update Xk = Xk-1 + ckpk.

6. If rk 0, stop; else go to step 1.


The only difference between the preconditioned conjugate gradient algorithm and the standard

conjugate gradient algorithm is the preconditioning process in step 1. Since the preconditioning

involves solving the auxiliary linear system PZk = rk in each iteration, it is absolutely necessary to

have a fast solution to this preconditioning equation when designing a good preconditioner.

The strategy for constructing our adaptive preconditioner for the early vision problems with regular-

ization formulation was discussed in section 3. Our preconditioner is based on the idea of modulation

in a wavelet basis to approximate the spectral characteristics of the matrix K. Accordingly, our

adaptive wavelet preconditioner P took the form WMWT with the diagonal modulation matrix M









encoding the spectral approximation in the wavelet basis. Now, the primary problem in this pre-

conditioner construction is how to construct the modulation matrix M for general problems. In the

following paragraphs, we show how to achieve the construction of an adaptive preconditioner for the

surface reconstruction, shape from shading, and optical flow computation problems.


4.1 Surface Reconstruction Problem


For the surface reconstruction, the stiffness matrix K(= Kd + Ks) contains the matrix Kd obtained

from the data constraint and the matrix Ks corresponding to the smoothness constraint. When

the controlled-continuity stabilizer is used, the matrix Ks is the linear combination of the discrete

Laplacian matrix ( from membrane) and the discrete biharmonic matrix ( from thin plate). Since the

spectral distributions for both matrices are known a priori, the spectral distribution for the matrix Ks

can be obtained from their linear combination. Now, what remains to be determined is the spectral

distribution for the matrix Kd. For the dense data case with uniform weights for each data constraint,

the matrix Kd is a scaled identity matrix, whose spectral distribution is a constant function. This

case corresponds to the discrete version of the linear shift-invariant low-pass filtering interpretation

for the regularization problem. For other cases such as sparse data surface reconstruction or dense

data with nonuniform weights, we somehow need to estimate the spectral distribution of Kd in the

frequency domain. We will elaborate further on this spectral estimation in the following paragraphs.

Since the spectral approximation of the matrix K is achieved in the wavelet basis, we can approx-

imate the spectral distribution of the matrix Kd via the wavelet transform. One way is to do the

wavelet transform of the matrix Kd and then compute the energy contained in each diagonal block

which corresponds to a particular division of the frequency band, and the computed energy is used

to represent the spectral value for this frequency band. To use this spectral estimation, we need to

develop a fast computational algorithm since the wavelet transform of an N x N matrix requires

O(N2) operations. Another ir.,i to estimate the spectral distribution of Kd is to take the wavelet

transform of its diagonal entries o,,li The N diagonal entries of the N x N diagonal matrix Kd









are grouped into an n x n matrix. Tli, r we compute the energy from the wavelet transform of this

matrix and associate it with the spectral value of each f, '/ ,. r. i. band. This method takes advantage

of the diagonal structure of the matrix Kd and can be accomplished in O(N) operations. We use this

method to estimate the spectrum of the matrix Kd in our experiment.

Using the above discussed spectral approximation to the matrices Ks and Kd separately in a

wavelet basis, we can obtain the diagonal approximation matrices Ms and Md, respectively. Then,

the modulation matrix M is taken to be the sum of Ms and Md. Since the matrix Md is obtained

from the spectral approximation and the matrix Ms has very small diagonal entries at the very coarse

resolution levels, the spectral estimation error in Md at these levels may cause significant error in

the spectral approximation to the matrix K. This problem may exist only in the case of sparse data

surface reconstruction, since the spectral estimation for Kd in the dense data case is quite accurate.

To amend the problem caused by the spectral estimation error, we introduce a weighting function

a(i), where i is the number of iterations, to appropriately weight the importance of Ms and Md

during the CG iterations, i.e.,

M(i) = M, + r(i)Md. (15)

The matrix M(i) is the modulation matrix in the i-th iteration. From the filtering correspondence

with the regularization formulation, we can see that the overall spectral characteristics for the matrix

K-1 is a low-pass filter. Note that the inverse of Ks is also a low-pass filter, while the inverse of

Kd is not. Therefore, the matrix Ms ( or Ks ) is more dominant than the matrix Md ( or Kd ) for

solving the linear system Kx = b. Based on this observation, we set the weighting function o(i) to

be very small initially and then gradually increase the values of the weighting function. Using this

schedule for the weighting function, the lower frequency components of the solution are recovered

at the beginning of the preconditioned conjugate gradient iterations, and then the higher frequency

components. This global-to-local shape recovery phenomenon can be predicted by the choice of the

weighting function a(i). In fact, the weighting function can be adapted during the iterations instead

of following a preset schedule. The adaptation concept is similar to the adaptive step scheme in the









scale space tracking [31].


The preconditioner construction for the shape-from-shading and the optical flow computation prob-

lems is very similar to the one outlined here for the surface reconstruction problem. We only state

their differences in the following paragraphs.


4.2 Shape from Shading Problem


The variational formulation of the SFS problem leads to solving a set of nonlinear partial differential

equations. After discretization, we need to solve the system Kx = b. If the linear approximation

of the reflectance function technique [11] is used to improve the convergence, the stiffness matrix

K e R3Nx3N has the following form.

pK, pI & D pD 0 I
K = [I 0 DT AK, + (Rp + p)I RpRqI (16)
pDT T I RpRI AK, + (Rq + p)I

where Ks E RNxN is the Laplacian matrix, I eG NxN is the identity matrix, Rp and Rq are the

first-order partial derivatives of the reflectance function with respect to p and q, respectively, taken

at the reference gradient (po, qo), and the matrix D sRnx" (N = n2) has nonzero entries only along

the main diagonal and upper diagonal with is along the main diagonal and -ls along the upper

diagonal, i.e.,
1 -1 0 ... ... 0
0 1 -1 0 ... 0
D = . (17)
0 ... ... ... 1 1
0 ... ... ... 0 0
Our preconditioner P has similar block structure as K and can be expressed as


W 0 0 M, 0 0 W T 0 0
P= W 0 0 M2 0 0 W 0 (18)
0 0 W 0 0 M3 0 0 WT
The modulation matrices M1, M2 and M3 are all diagonal and they are used to modulate the

wavelet basis functions corresponding to z, p, and q vectors, respectively. The modulation matrices









are constructed to approximate the spectral distributions of the corresponding diagonal blocks in the

matrix K, i.e. pKs, AK, + (R, + p)I and AK, + (R, + p)I. The construction procedure for each

block is the same as discussed in the surface reconstruction problem.


4.3 Optical Flow Computation Problem


Similar to the SFS problem, the stiffness matrix K for the optical flow estimation has the following

2 x 2 block structure.
K_ K s + E y QE
E,, K, + E,
K= Exy K s +Eiyy

where E,,, E,,, and E,, are all diagonal matrices with the normalized entries ~_ 2, and
Ex+Er Ex+Er
E
2. The construction of the block preconditioner is similar to that in the SFS problem, therefore
Ex+Ey
it is omitted here.



5 Experimental Results


In this section, we present implementation results of applying our adaptive preconditioner to the sur-

face reconstruction, shape-from-shading, and optical flow computation problems. Since our precon-

ditioner adapts to the problem, it is bound to perform better than the existing fixed preconditioners

described in literature. We will justify this claim through the experimental results.

Note that there is no restriction on the wavelet basis being used in the construction of our precon-

ditioner. However, the efficiency of our preconditioner is related to the localization of the wavelet

being used. The time localization makes the computation of the wavelet transform efficient; the

localization in the frequency domain yields a more accurate spectral approximation and therefore

faster convergence of the preconditioned conjugate gradient can be achieved. In our experiment, we

use the cubic B-spline wavelet given in [18] with the QMF filters truncated to a length of 11. This

truncation is used to reduce the operations involved in QMF implementation of the wavelet transform

and an insignificant error is introduced in the frequency domain.









In constructing our preconditioner, the weighting function o(i) is needed to control the contributions

of Ks and Kd to the modulation matrix. We -I.-. -1 that the weighting function be chosen as an

increasing function. We used an increasing function of the form (1 e-)d, where c and d are

constants, for the weighting function. In our experiment, c was usually set to 10 (for problems not

severely ill-conditioned) or 100 (for severely ill-conditioned problems), and accordingly d was set to

be 1 or 2.


5.1 Surface Reconstruction











(a) (b)


Figure 3: The sparse data sets used for the surface reconstruction, (a) synthetic data (b) real laser
range data


We present results of several experiments with sparse data, and compare the convergence rate of our

adaptive preconditioned conjugate gradient (APCG) algorithm with the hierarchical basis conjugate

gradient (HCG) technique [22], the conjugate gradient (CG) algorithm [8], and the bi-orthogonal

wavelet basis transfer scheme with Jaffard's diagonal preconditioning [15]. In [32], the biorthogonal

wavelet basis transfer is used in the preconditioning process. We found that this wavelet basis transfer

must be combined with appropriate scaling to improve the convergence, which was not mentioned in

[32]. In [15], Jaffard proposed the 2j, where j is the level, diagonal scaling in the wavelet basis as

a preconditioning technique. Jaffard's 2i diagonal scaling can be changed to 2"j scaling where m

is a positive integer. We combined this variant of Jaffard's diagonal scaling with the bi-orthogonal

wavelet basis transfer and implemented it for conducting the convergence comparisons. We use all of









the aforementioned preconditioners in conjunction with a conjugate gradient algorithm and test their

performance on one synthesized sparse data set (see figure 3) in the surface reconstruction experiments

and one real range data set from a laser scanned ball. Subsequent experiments depict the convergence

comparison of the aforementioned preconditioners applied to the SFS, and OF computation problems.

In the first experiment, we recover the underlying surface from the sparse data set in figure 3(a)

using the controlled-continuity stabilizer with the parameters p and r being 1 and 0.1, respectively.

The size of the problem is 64 x 64. Applying our adaptive preconditioner to this problem, we can

see that the iterations converge to the true solution very fast. The computed solutions after 1, 5,

10 and 20 steps of conjugate gradient, HCG, and our APCG algorithms are illustrated in figure 4.

With only five iterations of our APCG algorithm, the global shape of the surface is recovered. After

ten iterations of our algorithm, the solution is very close to the true solution. Comparing these

intermediate solutions in figure 4, we can see the solution after 20 iterations of HCG is still bumpy

and the computed solution after 20 CG iterations is far from the true solution. In figure 5(a), we

depict the convergence rate of the APCG algorithm in comparison to all the aforementioned iterative

algorithms. Figure 5(b) depicts the experiments with a thin plate stabilizer. The convergence curves

for our APCG algorithm, HCG, CG, and Yaou & Chang's bi-orthogonal basis transfer with Jaffard's

diagonal preconditioning are shown. As can be seen, our APCG algorithm has the best convergence

performance in both experiments.

We include the discontinuities in the above controlled-continuity stabilizer to test the performance

of our APCG algorithm. The discontinuities are located along a pre-specified diagonal line. Figure

6 illustrates the convergence rates of our APCG, HCG( with 4 levels) and CG algorithms. The

APCG algorithm still has the best convergence performance in this experiment. Since the inclusion

of discontinuities makes the conditioning of the stiffness matrix worse and our preconditioner was not

redesigned to account for the discontinuities in the stabilizer, the convergence rate in this experiment

is slower than that of APCG without discontinuities. The computed solutions after 10, 20, 40 and

60 iterations of our APCG algorithm are depicted in figure 7.













APCG


(1)


Figure 4: (a), (d), (g) & (j) are the computed solutions using the APCG algorithm, (b), (e), (h) &
(k) are the computed solutions using HCG algorithm, (c), (f), (i) & (1) are the computed solutions
using conjugate gradient algorithm ,after 1, 5, 10 and 20 iterations, respectively.


HCG


CG


RSr 1..



















L Hierachical Basis Conjugate Gradient m=l Bi-orthogonal wavelet transform with power of 2 scaling
m Biorthogonal wavelet transform with power of 2 scaling
10
CG

-- - m=0 5 10 .
CG - m=1

W" 10, -
10' L=3 HCG(L-4)
10 1

L=4 A APCG
.ACAPCG

0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100
Number of iterations Number of iterations



(a) (b)




Figure 5: The convergence for different preconditioners in the preconditioned conjugate gradient

algorithm in the surface reconstruction problem with the synthetic sparse data set and (a) with

controlled-continuity stabilizer (p = 1, r = 0.1) (b) with thin-plate stabilizer.


Number of iterations


Figure 6: The convergence for different preconditioners in the preconditioned conjugate gradient

algorithm for the surface reconstruction problem with discontinuities specified in the controlled-

continuity stabilizer.




















































Figure 7: The computed solutions for the surface reconstruction with discontinuities using the APCG
algorithm after (a) 10, (b) 20, (c) 40 and (d) 60 iterations, respectively.









10 -





I \ \ CG

10 Pentland's preconditioning

APCG

0 5 10 15 20 25 30 35 40 45 50
Number of iterations



Figure 8: The convergence of Pentland's preconditioner in conjunction with the conjugate gradient
algorithm is worse than that of conjugate gradient algorithm without preconditioning for the surface

reconstruction problem with the synthetic sparse data set using a membrane stabilizer.












o 1

S 02






(a) (b)






















We attempted to incorporate the preconditioner developed in Pentland's [19] into the comparison

however, we found that the convergence of conjugate gradient with this preconditioner is even slower

than that of the CG algorithm (without preconditioning) for the very sparse data case. Note that the

preconditioner was used in a gradient decent iteration in [19]. In our experiments, the preconditioners

are used in conjunction with the conjugate gradient algorithm, which is known to be much more

efficient than a gradient decent algorithm [8]. We use the sub-sampled data set in figure 3(a) with

a membrane stabilizer in a 32 x 32 problem for testing this preconditioner. The convergence curves

for Pentland's preconditioner with CG, CG and our APCG algorithms are shown in figure 8. We

can see that Pentland's preconditioning is worse than the conjugate gradient iterations without

preconditioning. This is because the diagonal dominance behavior of Ks after wavelet transform

claimed in [19] doesn't hold in general. Pentland's preconditioner was constructed by approximating










the wavelet transform of the stiffness matrix by its diagonal entries based on the diagonal dominance

assumption of Ks. The falsity of the diagonal dominance claim is evident in figure 9 where we profile

arbitrarily selected rows of the wavelet transformed membrane matrix Ks of size (64 x 64) for the

1D problem. In [19], the preconditioner was chosen as the diagonal of the stiffness matrix K in the

wavelet basis. Note that for the sparse data case, the matrix Ks dominates Kd. Therefore, bad

approximations to the wavelet transform of Ks will deteriorate the preconditioning effect and slow

down the convergence speed.





CG


102




HCG(m=5)
10
o '-_ U v- ; -...--. ---
HCG(m=6)
APCG
0 2 4 6 8 10 12 14 16 18 20
Number of Iterations


(a) (b)



Figure 10: (a) The convergence for the APCG, HCG and CG algorithms for the surface reconstruction
problem with the real laser range data set using a controlled-continuity stabilizer, (b) The computed
solution after 10 APCG iterations.


The other data set in our surface reconstruction experiments is the real laser range data from a ball.

Figure 3(b) shows the 148 randomly sampled data points from the original range data set generated

by the MSU (Michigan State University) Pattern Recognition and Image Processing Lab's Technical

Arts 100X scanner. The size of the problem domain is 152 x 193. The controlled-continuity stabilizer

with p = 1 and r = 0.1 was used in this experiment. The discontinuities are specified along the

boundary of the object. The discontinuities divide the problem domain into two non-overlapped

regions, where the inside region corresponds to the object surface and the outside region corresponds

to uniform background. Our adaptive preconditioning is used in the inside region only by enforcing









the solution values in the outside region to be zero after the preconditioning. The convergence curves

for our APCG, HCG (with different levels), and CG algorithms are shown in figure 10(a). Our

adaptive preconditioned CG algorithm still has the best convergence performance in this experiment

with real laser range data. Very accurate solution is obtained after 10 iterations of the APCG

algorithm, as shown in figure 10(b).


5.2 Shape from Shading


For the shape-from-shading experiment, we use the synthetic bump images with size 64 x 64 used

in [23, 11], as shown in figure 11. These two images are generated from the Lambertian reflectance

model with the light source directions (p,, q,, 1) being (0.34, 0.3467, 1) and (-0.34, 0.3467, 1) for the

left and right images, respectively. In this experiment, only the left image is taken as the input to

the SFS algorithm to recover the shape, and the right image is used to monitor the closeness of the

computed solution to the true solution.


Figure 11: Shaded images of bump with different light source directions.


The construction of our preconditioner for the shape-from-shading problem is described in the

previous section. The parameters A and p are set to be 0.5 in this experiment. Without including































Figure 12: The convergence
problem.


of the APCG, HCG, and CG algorithms for the shape-from-shading


Figure 13: Shaded images of bump after 100 iterations of APCG algorithm without using any bound
ary condition.


HC LL-2)
i ,H CG 38
---HCG(L-4)

..........................
- ---- - -- - -









any boundary condition in the energy function to be minimized, we compare the convergence of

our APCG algorithm with the HCG and CG algorithms. The convergence curves for the energy

function and the computed (p, q) error are shown in figure 12. Note that the (p, q) error is defined as

Z |pi -p |I + |qi q* |, where (pi, qi) and (pm, q ) are the computed and true (p, q) at the i-th location,

respectively. We can see the APCG algorithm has the best convergence behavior in figure 12(a) and

(b). The computed solution after 100 iterations of the APCG algorithm is fed back to the reflectance

model to generate the left and right bump images as shown in figure 13. Comparing these computed

images to those in figure 11, we can see that this computed solution is far from the true solution

in the upper left region, although its corresponding energy is close to 0. This is because the energy

function for the SFS problem is nonconvex and gradient-based methods (e.g. CG-type methods)

tend to get trapped in local minima. To obtain the true solution using the gradient-based method,

we need to have an appropriate initial guess that is closer to the true solution. But it is difficult to

find such initial guess that is guaranteed to converge to the true solution in the CG-type algorithms.

In fact, we always use 0 as the initial guess in our experiments.

To resolve the above problem for minimizing the SFS energy function by gradient-based methods,

we include the boundary conditions obtained from the occluding boundary [14] as an additional data

constraint in the energy function to be minimized. By incorporating the boundary conditions into

the SFS energy function, the CG-type algorithms can approach the true solution. In our experiment,

we use the active contour model [16] to locate the occluding boundary at first. By using the property

that the surface normal at a point on the occluding boundary is parallel to the normal of the silhouette

in the image plane [14], we impose the Direchelet boundary condition on (p, q) along the occluding

boundary. In addition, we use the occluding boundary to incorporate the discontinuities in (p, q) or

z representing orientation discontinuities or depth discontinuities respectively. In this experiment,

we incorporate the orientation discontinuities along the occluding contour. Figure 14 shows the left

(north west) and right (north east) images generated from the computed solutions for our APCG

algorithm after 10, 20 and 40 iterations. We can see that the computed solutions approach to the

true solution after incorporating the boundary conditions. The computed gradient and height after






















(a) (b)













(c) (d)













(e) (f)


Figure 14: Shaded images of bump after (a) & (b) 10 iterations, (c) & (d) 20 iterations, (e) & (f) 40
iterations of the APCG algorithm with the occluding boundary condition.
33









40 iterations are shown in figure 15.


(a) (b)


Figure 15: The computed (a) gradient and (b) height after 40 iterations of the APCG algorithm.


Figure 16: The convergence curves of the (a) energy, (b) p
and APCG algorithm.


q error, (c) z error for the CG, HCG


The convergence of our APCG algorithm was compared to the HCG and CG algorithms for the

SFS problem with the Direchelet boundary condition. The convergence curves for the energy, p q

error and z error reduction are shown in figure 16. Our APCG outperforms the other methods in

terms of convergence speed for all of the three different measures.









5.3 Optical Flow Computation


In section 2, we presented a new formulation of the optical flow computation problem. Our regularization-

based formulation combines the region-based [13] and contour-based [10] gradient constraints. The

gradient constraint for the brightness function E is active all over the image plane except in the the

neighborhood of the discontinuity locations, while the gradient constraint for the brightness function

after the Laplacian of Gaussian operation, i.e. V2G E, is active at the discontinuity locations. In

our implementation, we locate the zero-crossings of V2G E and label them as discontinuities [10].

Accurate estimation of E,, Ey and Et is necessary to obtain a reliable flow constraint. When E is

the brightness function, we use the Gaussian spatial-temporal pre-smoothing prior to applying the

two-point central difference to approximate the partial derivatives. When E is V2G E, a two-point

central difference is applied to E to approximate its partial derivatives. As discussed in section 2,

a normalization of the gradient constraints is used to make the contribution of each (active) flow

constraint uniform.

We present two experiments for the optical flow computation. The first experiment consists of a

synthetic translating square image sequence; and the second is a real image sequence consisting

of a rotating Rubic cube. One frame for each image sequence is shown in figure 17 Each frame for

the translating square and rotating Rubic cube image sequence is of size 100 x 100 and 256 x 240

respectively.

The construction of our adaptive preconditioner for the optical flow problem is described in section

4. This preconditioner is embedded in the conjugate gradient to accelerate the convergence speed

for solving the linear system in the optical flow computation. We use the CG, HCG ( with various

number of levels), and our APCG algorithms to solve the problem. Figure 19(a) and (b) shows the

convergence rates of the three algorithms for the translating square and the rotating Rubic examples,

respectively. We can see that our APCG algorithm has the best performance in convergence. Note

that the HCG algorithm also converges quite fast because the linear system for the optical flow

problem is not as ill-conditioned as the one in the surface reconstruction and shape-from-shading




























Figure 17: One frame from the (a) square image sequence and (b) rotating Rubic cube image sequence.


problems. Therefore, the difference between various preconditioning methods in not as pronounced

for the optical flow problem as in the other two (SR & SFS) problems. Although the HCG algorithm

with an appropriate number of levels L, exhibits fast convergence, it is difficult to determine the best

L for a problem in advance. As for the APCG algorithm, the number of levels is always chosen to

be the maximum possible which allows the linear convolution in the QMF implementation with the

mirror reflection at the border to be carried out with ease.

In our examples, the regularization parameter A was set to a value of 10 for the translating square

example and 1 for the rotating Rubic cube experiment. For the former, the computed optical flows

after 10 iterations of the APCG algorithm is shown in figure 18(a), and the average angular

error is 0.21 with the standard deviation 0.07. Compared to the results for the other gradient

based methods reported in Barron et al. [1], our new formulation gives the best solution with

101i' flow density. In fact, our adaptive preconditioner can also be used with the other gradient based

regularization formulations of the optical flow problem. For the rotating Rubic cube experiment, the

computed optical flow after 15 iterations of our APCG algorithm is shown in figure 18(b).





























































Figure 18: Computed flow of the (a) translating square and (b) rotating Rubic cube.


15
Number of Iterations


5 10 15 20
Number of Iterations


25 30


Figure 19: The convergence curves of the CG, HCG and APCG algorithms for (a) translating square

(b) rotating Rubic image sequence.










37


* * CG
- -HCG(L=3)
- HCG(L=4)
HCG(L=5)
S APCG








-- - - - -.


_ 1 1 1 1 ( (
_ 1 1 1 ( (
~ ~ _ 1 ( ~
) I I I _ _ ~ _ t I I I I I I
_ _ _ _ _ _ _ I t _ I I I I I _


........._ i..









6 Discussion and Conclusion


In this paper, we presented a novel physically-based adaptive preconditioning technique which was

used in conjunction with a conjugate gradient algorithm to drastically improve the speed of conver-

gence for solving various early vision problems. A preconditioner based on the membrane spline or the

thin plate spline or a convex combination of the two was termed as a physically-based preconditioner

for obvious reasons. The adaptation of the preconditioner to an early vision problem was achieved via

the explicit use of the spectral characteristics of the regularization filter in conjunction with the data.

This spectral function was used to modulate the frequency characteristics of a chosen wavelet basis

leading to the construction of our preconditioner. The preconditioning technique was demonstrated

for the surface reconstruction, shape from shading and optical flow computation problems. Through

experiments, we demonstrated the superiority of our preconditioning method over existing methods.

Our adaptive preconditioning technique can be used in conjunction with any wavelet basis. The

choice of the wavelet basis is crucial to achieve efficient preconditioning. The ideal wavelet needs

to be well localized in spatial (or time) and frequency domain. Good spatial localization makes the

QMF implementation very efficient, i.e., the wavelet transform can be accomplished very efficiently.

Good localization in frequency domain of the wavelet is necessary to achieve fast convergence since

the the accuracy of spectral approximation to the regularization filter depends on the localization

in frequency domain of the wavelet. But the localization in spatial and frequency domain cannot

be arbitrarily small, because the product of their bandwidths is lower bounded by a constant 1

which is known as the uncertainty principle or Heisenberg inequality. This implies that there is a

tradeoff between the spatial and frequency localization. Gaussian functions are optimal in the sense

that they meet the bound with equality [7]. If we use the wavelet with the optimal property, the

preconditioning should be more efficient.

In the construction of our adaptive preconditioner, we approximate the associated regularization

filter by a linear shift-invariant filter in a wavelet basis, since the modulation matrix in our precon-

ditioner gives uniform weighting to the wavelet basis functions corresponding to the same frequency









band. As discussed in section 3, the regularization filter is spatially varying in general, especially for

the regularization problems with discontinuities or sparse data constraints. However, it is possible

to construct a spatially varying filter in a wavelet basis to achieve a better approximation to the

regularization filter. This will be the focus of our future research.

The adaptive preconditioner constructed in a wavelet basis can be applied to any regularization

formulation in early vision. In fact, we can use similar construction of the adaptive preconditioner

for the linear system discretized from any of the boundary value problems. The extension is straight-

forward by replacing the spectral characteristics of the membrane or thin-plate stabilizer by that of

the associated differential operator. There is no restriction on the order of the partial differential

equation. From the success of our adaptive preconditioner in the early vision problems, we believe the

adaptive preconditioner will lead to drastic acceleration for the numerical solution of the boundary

value problems.



References


[1] J. L. Barron, D. J. Fleet, and S. S. Beauchemin. "Performance of optical flow techniques". Intl.

Journ. of Comput. Vision, 12((1)):43-77, 1994.

[2] G. Beylkin, R.R. Coifman, and V. Rokhlin. "Fast wavelet transforms and numerical algorithm

I. Comm. on Pure and Applied Math., 44:141 183, 1991.

[3] A. Blake and A. Zisserman. Visual Reconstruction. The MIT Press, Cambridge, MA, 1987.


[4] R.M. Bolle and B.C. Vemuri. "On three-dimensional surface reconstruction methods". IEEE

Trans. Pattern Anal. Machine Intell., 13(1):1-13, 1991.

[5] B. L. Buzbee, F. W. Dorr, J. A. George, and G. H. Golub. "The direct solution of the discrete

Poison equation on irregular regions, ". SIAM Journal of Numerical A,,,rlii.' 8(4):722-736,

December 1971.









[6] A.K. Chhabra and T.A. Grogan. "On Poisson solvers and semi- direct methods for computing

area based optical flow". IEEE Transactions on Pattern A,,,lii;.' and Machine Intelligence,

16(11), Nov. 1994.

[7] D. Gabor. "Theory of communication". Journal of the IEE, 93:429-457, 1946.

[8] G.H. Golub and C.F. Van Loan. Matrix Computations. The Johns Hopkins University Press,

2nd edition, 1989.

[9] W. E. L. Grimson. "an implementation of a computational theory of visual surface interpola-

tion". Computer Vision, Graphics, Image Processing, 22:39-69, '1- ;

[10] E. C. Hildreth. "Computations underlying the measurement of visual motion". Artificial Intel-

ligence, 23:309-354, 1984.

[11] B. K. P. Horn. "Height from shading". Intl. Journal of Computer Vision, 5((1)):174-208, 1990.

[12] B. K. P. Horn and M. J. Brooks, editors. sI'.,, from s .../.,lj The MIT Press, Cambridge, MA,

1989.

[13] B. K. P. Horn and B. G. Schunk. "Determining optical flow". Artificial Intelligence, 17:11- 203,

1981.

[14] K. Ikeuchi and B. K. P. Horn. "Numerical shape from shading and occluding boundaries".

Artificial Intelligence, 17:141-184, 1981.

[15] S. Jaffard. "Wavelet methods for fast resolution of elliptic problems ". SIAM Journal of Nu-

merical A,,,1,.,. 29(4):965 '*I., August 1992.

[16] M. Kass, A. Witkin, and D. Terzopoulos. "Snakes: active contour mod 1- Intl. Journal of

Computer Vision, 1:321-331, 1987.









[17] S. H. Lai and B. C. Vemuri. "An O(N) iterative solution to the Poisson equations in low-level

vision problems,". In IEEE conference on Computer Vision & Pattern Recognition, Seattle,

Washington, June 1994.

[18] S. G. Mallat. "A theory of multiresolution signal decomposition". IEEE Trans. Pattern Anal.

Machine Intell., PAMI-11(7):647-1.i ; 1989.

[19] A. Pentland. "Interpolation using wavelet ..--". IEEE Transactions on Pattern A,,,,lii-.' and

Machine Intelligence, PAMI-16(4):410-414, 1994.

[20] 0. Rioul and M. Vitterli. "Wavelets and signal processing". IEEE Signal Processing Magazine,

8:14-38, 1991.

[21] T. Simchony, R. Chellappa, and M. Shao. "Direct analytical methods for solving Poisson equa-

tions in computer vision probl. i.- IEEE Transactions on Pattern A,,,lii.', and Machine

Intelligence, 12(5), May 1990.

[22] R. Szeliski. "fast surface interpolation using hierarchical basis fun, I ;.-". IEEE Trans. Pattern

Anal. Machine Intell., 12(6):513-528, 1990.

[23] R. Szeliski. "Fast shape from shading". CVGIP: Image Understanding, 53(2):129-153, 1991.

[24] D. Terzopoulos. "Image analysis using multigrid relaxation methods". IEEE Trans. Pattern

Anal. Machine Intell., 8:129-139, 1',1.

[25] D. Terzopoulos. "Regularization of inverse visual problems involving discontinuil -. IEEE

Trans. Pattern Anal. Machine Intell., 8(4):413-424, 1'l-i.

[26] D. Terzopoulos. "The computation of visible-surface repr -. Il.l.r;,-, IEEE Trans. Pattern

Anal. Machine Intell., 10(4):417-438, 1'i-

[27] V. Torre and T. Poggio. "on edge detection". IEEE Trans. on Pattern A,1,Jnl-.' and Machine

Intelligence, PAMI-8:147-163, 1'1'1.









-] B. C. Vemuri and S. H. Lai. "A fast solution to the surface reconstruction problem,". In SPIE

conference on Mathematical Imaging: Geometric Methods in Computer Vision, pages 27-37, San

Diego, CA, July 1993. SPIE.

[29] B. C. Vemuri and A. Radisavljevic. "Multiresolution stochastic hybrid shape models with fractal

priors". ACM Trans. on Graphics, 13(2):177-207, April 1994.

[30] B.C. Vemuri, A. Mitiche, and J.K. Aggarwal. "Curvature-based representation of objects from

range data". Image and Vision Comput., 4(2):107-114, 1'i-i.

[31] G. Whitten. "Scale space tracking of deformable sheet models for computational vision". IEEE

Trans. Pattern Anal. Machine Intell., PAMI-15(7):697-706, July 1993.


[32] M. H. Yaou and W. T. Chang. "Fast surface interpolation using multiresolution wavelet trans-

form". IEEE Trans. Pattern Anal. and Machine Intell., PAMI-16(7):673-688, 1994.

[33] H. Yeserentant. "On multi-level splitting of finite element p..... ". Numeriche Mathematik,

49:379-412, i'l-i.




University of Florida Home Page
© 2004 - 2010 University of Florida George A. Smathers Libraries.
All rights reserved.

Acceptable Use, Copyright, and Disclaimer Statement
Last updated October 10, 2010 - - mvs