Robust and Efficient Computation of Optical Flow*
S. H. Lai and B. C. Vemuri
UF-CIS Technical Report TR-95-012
Computer and Information Sciences Department
University of Florida
Bldg. CSE, Room 301
Gainesville, Florida 32611-6120
email: hong@cis.ufl.edu, vemuri@cis.ufl.edu
April 1995
Key Phrases: Motion Analysis, Nonrigid and Rigid Motion Estimation, Object Tracking, Optical Flow Computation, Low-level Vision, Regularization.
* This work was supported in part by the NSF grant ECS-9210648.
Manuscript submitted to the Intl. Journal of Computer Vision
List of Figures
1 The discrepancy between the gradient directions of the functions I0 and I1 at the ith location is small, and thus the use of the gradient direction as the search direction for the ith component is robust, while the jth component of the gradient direction is unreliable for a local search.
2 Frames from the (a) Square 2 and (b) Diverging Tree image sequences.
3 Computed optical flow of the (a) Square 2 and (b) Diverging Tree image sequences after 40 iterations of the incomplete Cholesky preconditioned CG algorithm.
4 (a) One frame from the Rotating Rubik Cube sequence. (b) Computed optical flow after 40 iterations of the incomplete Cholesky preconditioned CG algorithm.
5 Frames from the (a) Sinusoid 1 and (b) Translating Tree image sequences.
6 Computed optical flow of the (a) Sinusoid 1 and (b) Translating Tree image sequences, using the preconditioned nonlinear CG algorithm.
7 (a) One frame from the Hamburg Taxi sequence. (b) Computed optical flow after 200 iterations of the preconditioned nonlinear CG algorithm.
Abstract
In this paper, we present two very efficient and accurate algorithms for computing optical flow. The first is a modified gradient-based regularization method, and the other is an SSD-based regularization method. To amend the errors in the image flow constraint caused by brightness discontinuities, we propose to selectively combine the image flow constraint and the contour-based flow constraint into the data constraint in a regularization framework. The image flow constraint is disabled in the neighborhood of discontinuities, while the contour-based flow constraint is active at discontinuity locations. To solve the linear system resulting from the regularization formulation, the incomplete Cholesky preconditioned conjugate gradient algorithm is employed, leading to an efficient algorithm. Our SSD-based regularization method uses the SSD measure as the data constraint in a regularization framework. A preconditioned nonlinear conjugate gradient algorithm with a modified search-direction scheme is developed to minimize the resulting energy function. Experimental results for these two algorithms are given to demonstrate their performance.
1 Introduction
Optical flow computation is a fundamental problem in the motion analysis of image sequences. It provides very important information for estimating 3D velocity fields, analyzing rigid and nonrigid motion, segmenting the image into regions based on their motion, and recovering the 3D structure of objects in the image. Many techniques for the computation of optical flow have been proposed in the literature. These can be classified into gradient-based, correlation-based, energy-based, and phase-based methods (Barron et al. 1994).

The gradient-based approach depends on the image flow constraint equation, which is derived from the brightness constancy assumption together with a first-order Taylor series approximation (Horn and Schunck 1981). The image flow constraint equation alone is not enough to compute the optical flow, since each equation involves two unknowns. Horn and Schunck (1981) introduced the membrane smoothness measure to constrain the flow field. Since the smoothness constraint is invalid across motion boundaries, Nagel and Enkelmann (1986) proposed the "oriented smoothness" measure to suppress the smoothness constraint in the direction perpendicular to boundaries. Alternatively, the problem of smoothing across motion boundaries can be resolved by using regularization with discontinuities (Terzopoulos 1986).
A major problem with the gradient-based approach is the unreliability of the image flow constraint equation in areas where the local brightness function is highly nonlinear. Such areas occur in scenes containing highly textured regions, motion discontinuities, or depth discontinuities. The first-order Taylor series approximation used in deriving the image flow constraint equation leads to inaccuracies when the higher-order terms are significant. These higher-order terms are neglected by assuming that the time step between consecutive frames is arbitrarily small, i.e., approaches 0, which is far from practical. Usually, the time step between consecutive images is fixed and not arbitrarily small. Therefore, the gradient constraint equation is reliable only when the higher-order derivatives of the brightness function are insignificant, i.e., when the local brightness function is well approximated by a linear function. Most gradient-based methods do not account for the reliability of the gradient constraint equation. However, Heitz and Bouthemy (1993) used a hypothesis-testing technique to determine whether the image flow constraint equation is valid at each site in the image plane.
The above-mentioned gradient constraint is very unreliable in the neighborhood of discontinuities of the brightness function. However, contour-based methods can give robust estimates of the optical flow along zero-crossing contours, which are the discontinuity locations of the image brightness function. The contour-based methods use a different flow constraint along the zero-crossing contours, obtained by replacing the image brightness function I by the convolution of I with the second derivative of a Gaussian smoothing function. In this paper, we propose to use the image flow constraint in areas away from brightness discontinuities and the contour-based flow constraint at the discontinuity locations. Thus, in our gradient-based regularization method, the unreliable image flow constraints near discontinuity locations are replaced by robust contour-based flow constraints at the zero-crossing locations. This yields an accurate estimate of the optical flow field.
The correlation-based approach locally finds the displacement vector (u, v) between two images I0 and I1 at the location (x, y) by minimizing the sum of squared differences (SSD) function

SSD(x, y; u, v) = Σ_{j=-k}^{k} Σ_{i=-k}^{k} w(i, j) [I1(x+i+u, y+j+v) - I0(x+i, y+j)]²    (1)

where w(i, j) is a weighting function. In this SSD function, the summation is performed in a window of size (2k+1) × (2k+1) centered at (x, y). Most correlation-based methods perform an exhaustive search for the displacement vector (u, v) over a finite set of integer pairs and take the pair with the smallest SSD value as the displacement at that location. Since the search region of the displacement vector is discretized, the accuracy of the computed optical flow is limited by this discretization. In addition, the correlation-based approach is incapable of producing reliable optical flow estimates in homogeneous regions.
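The SSD matching in equation 1 can be sketched directly in code. This is a minimal illustration, not the paper's implementation: it assumes uniform weights w(i, j) = 1, integer displacements, and hypothetical function names.

```python
import numpy as np

def ssd(I0, I1, x, y, u, v, k=2):
    """SSD between the (2k+1)x(2k+1) window of I0 centered at (x, y)
    and the window of I1 displaced by the integer offset (u, v).
    Uniform weights w(i, j) = 1 are assumed here."""
    total = 0.0
    for j in range(-k, k + 1):
        for i in range(-k, k + 1):
            d = I1[y + j + v, x + i + u] - I0[y + j, x + i]
            total += d * d
    return total

def best_displacement(I0, I1, x, y, k=2, search=3):
    """Exhaustive search over integer displacements, as in classical
    correlation-based matching; returns the (u, v) of minimal SSD."""
    candidates = [(u, v) for v in range(-search, search + 1)
                         for u in range(-search, search + 1)]
    return min(candidates, key=lambda uv: ssd(I0, I1, x, y, uv[0], uv[1], k))
```

The brute-force minimum over the integer grid makes the accuracy limitation mentioned above concrete: the recovered (u, v) can never be more precise than the grid spacing.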
Anandan (1989) treated the estimates provided by the matching process as data constraints for the optical flow with appropriate confidence measures, and incorporated a smoothness constraint on the optical flow to achieve better results. Recently, Szeliski and Coughlan (1994) used a two-dimensional spline model to represent the flow field and minimized the following SSD function

E(u) = Σ_i Σ_j [I1(i + u_{i,j}, j + v_{i,j}) - I0(i, j)]²    (2)

where the vector u is the concatenation of the flow components u_{i,j} and v_{i,j}. The Levenberg-Marquardt algorithm was then employed to solve this nonconvex optimization problem. They reported very accurate results using this method. The 2D spline models for the optical flow field assume the flow to be well approximated by 2D spline basis functions on preset patches. However, it is difficult to incorporate discontinuities into these 2D spline patches. In this paper, we use a standard finite difference discretization of the flow field and take the SSD as the data constraint energy in a regularization framework. A preconditioned nonlinear conjugate gradient algorithm is employed to minimize the total energy function. The experimental results on standard synthetic image sequences using our algorithm are better than or comparable to the best existing results reported in the literature. Unlike some other algorithms that only generate sparse optical flow estimates, our algorithms give dense optical flow estimates. In addition, motion discontinuities can be incorporated into our regularization framework.
Recently, Barron et al. (1994) presented a comprehensive survey of various optical flow computation techniques along with a quantitative comparison. From their experimental results, they argue that the phase-based method of Fleet and Jepson (1990) gives the most accurate results. However, most of the reported results are only for sparse optical flow measurements obtained after thresholding out unreliable components, which are present in the neighborhood of phase singularities.
The remainder of this paper is organized as follows. In the next section, we present our modified gradient-based method, which combines the region-based gradient constraint in regions away from brightness discontinuities with the contour-based gradient constraint at these discontinuity locations. In section 3, our SSD-based regularization technique for optical flow computation from two images of a time sequence is proposed. Experimental results for both algorithms on synthetic and real image sequences are presented in section 4. Finally, we conclude in section 5.
2 Modified Gradient-based Method
At the foundation of the standard gradient-based approach is the image flow constraint equation

I_x(x, y, t) u(x, y, t) + I_y(x, y, t) v(x, y, t) + I_t(x, y, t) = 0    (3)

where I(x, y, t) is the image brightness function at location (x, y) and time t; I_x, I_y and I_t are the partial derivatives of I with respect to x, y and t, respectively; and (u, v) is the optical flow field. The image flow constraint equation has been used in different ways to obtain optical flow estimates, as reported in the literature on gradient-based methods (Horn and Schunck 1981, Lucas and Kanade 1981, Nagel 1983, Uras et al. 1988). In a regularization framework, the image flow constraint equation is regarded as the data constraint, and an additional smoothness constraint is imposed on the optical flow.
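The residual of the image flow constraint can be evaluated numerically with finite differences. The sketch below is illustrative only: it assumes a unit time step, central spatial differences, and a hypothetical function name.

```python
import numpy as np

def flow_constraint_residual(I0, I1, u, v):
    """Residual Ix*u + Iy*v + It of the image flow constraint
    (equation 3), with central spatial differences on I0 and a
    two-frame temporal difference; a unit time step is assumed."""
    Iy, Ix = np.gradient(I0)   # np.gradient returns row- then column-derivative
    It = I1 - I0               # forward temporal difference
    return Ix * u + Iy * v + It
```

For a brightness pattern translating with the true flow, the residual is near zero away from boundary effects, which is exactly the sense in which equation 3 constrains (u, v).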
The image flow constraint equation is derived by taking the Taylor expansion of the brightness constancy equation dI(x, y, t)/dt = 0 up to first-order terms. The higher-order terms are neglected under the assumption that the time step between consecutive frames is arbitrarily small, which is most often violated in practice. In reality, the time step is usually fixed. Therefore, the neglected higher-order terms become the main source of error in the data constraint. The reliability of the image flow constraint equation depends on the magnitudes of the higher-order derivatives of the image brightness function. If the image brightness function in the neighborhood of a point is well approximated by a linear function, i.e., its higher-order derivatives are small, then the image flow constraint is very reliable at this point. On the contrary, the image flow constraint is very unreliable at locations with significant higher-order derivatives, which lie mainly in the neighborhood of brightness discontinuities. Since the use of an unreliable flow constraint will degrade the accuracy of the optical flow estimate, we propose to disable the unreliable image flow constraint in the neighborhood of discontinuities and enable the robust contour-based flow constraint at the discontinuity locations in a regularization framework.
2.1 Incorporating the Contour-based Flow Constraint
The contour-based methods (Hildreth 1984, Duncan and Chou 1992) compute the optical flow along zero-crossing contours using the following flow constraint equation

S_x(x, y, t) u(x, y, t) + S_y(x, y, t) v(x, y, t) + S_t(x, y, t) = 0    (4)

where S(x, y, t) is the convolution of I(x, y, t) with the second derivative of the Gaussian function, and S_x, S_y and S_t are the partial derivatives of S with respect to x, y and t, respectively. In Hildreth (1984), the function S was chosen to be the convolution of the Laplacian of Gaussian (LOG) filter with the image brightness function, i.e., S = ∇²G * I, where ∇² = ∂²/∂x² + ∂²/∂y², G(x, y) is the 2D spatial Gaussian function, and the convolution is performed in 2D.
Now we incorporate the above two flow constraint equations into the regularization framework. The image flow constraint in equation 3 is disabled in the neighborhood of brightness discontinuities, while the contour-based flow constraint is active along the zero-crossings. The variational formulation of the optical flow problem using this selective scheme for the data constraint leads to minimizing the following functional

∫_Ω [ α (∇I · u + I_t)² + λ (||∇u||² + ||∇v||²) ] dx + ∫_Z β (∇S · u + S_t)² dx    (5)

where u(x, t) = (u(x, t), v(x, t)) is the optical flow field to be estimated, x = (x, y) is a 2D vector, Ω is the 2D image domain, Z denotes the zero-crossing contours, and α, λ and β are the weighting functions associated with the image flow constraint, the smoothness constraint and the contour-based flow constraint, respectively. In this paper, the weighting functions λ and β are chosen to be constants, and α(x) takes the value 0 when x is in the neighborhood of discontinuities and a constant value elsewhere. A discrete version of the above energy can be written as

E(u, v) = Σ_i (E_{x,i} u_i + E_{y,i} v_i + E_{t,i})² + λ (u²_{x,i} + u²_{y,i} + v²_{x,i} + v²_{y,i})    (6)

where the index i denotes the ith location; the subscripts x, y and t denote partial derivatives along the corresponding directions; u and v are the discretized flow components; and the function E is defined as

E_i = α I_i   when i ∈ D \ N,
E_i = β S_i   when i ∈ Z,

where the set D contains all the discretized locations in the image domain, N is the set of locations in the neighborhood of discontinuities, and the set Z contains the discretized locations along the zero-crossing contours.
2.2 Rejecting Unreliable Data Constraints
The image flow data constraint and the contour-based flow constraint are obtained from first-order Taylor series approximations of the functions I and S, respectively. Each constraint is therefore unreliable in regions where the underlying function (I or S) is not well approximated locally by a linear function, i.e., where the function contains significant nonlinear terms. In this paper, we define a function δ(i) as a measure of the reliability of the data constraint at the ith location as follows:

δ(i) = ( |E_{xx,i}| + |E_{yy,i}| + |E_{tt,i}| + |E_{xy,i}| + |E_{xt,i}| + |E_{yt,i}| ) / ( |E_{x,i}| + |E_{y,i}| + |E_{t,i}| )    (7)

The partial derivatives of the function E in equation 7 can be estimated by standard finite difference methods. The reliability measure δ(i) is the ratio of the sum of the absolute values of the second-order partial derivatives of E to that of the first-order partial derivatives at the ith location. Therefore, the corresponding data constraint is reliable when δ is small and unreliable when δ is large. We can set a threshold to reject the data constraints with a high δ value; this threshold was set to 1.0 in our implementation.
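The reliability measure of equation 7 can be sketched on a spatio-temporal brightness volume using central differences. This is an illustrative sketch under assumed conventions (axis order t, y, x; hypothetical function name), not the paper's code.

```python
import numpy as np

def reliability(E):
    """delta(i) of equation 7: ratio of summed absolute second-order
    to summed absolute first-order partial derivatives of a brightness
    volume E indexed as E[t, y, x], via central differences."""
    Et, Ey, Ex = np.gradient(E)          # first-order partials
    Ett = np.gradient(Et)[0]             # second-order partials
    Eyy = np.gradient(Ey)[1]
    Exx = np.gradient(Ex)[2]
    Exy = np.gradient(Ex)[1]
    Ext = np.gradient(Ex)[0]
    Eyt = np.gradient(Ey)[0]
    num = sum(np.abs(d) for d in (Exx, Eyy, Ett, Exy, Ext, Eyt))
    den = np.abs(Ex) + np.abs(Ey) + np.abs(Et) + 1e-12
    return num / den
```

On an exactly linear brightness function the second-order derivatives vanish, so delta is zero everywhere; the constraint would be kept at every site under the 1.0 threshold.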
2.3 Normalizing the Data Constraint
The data constraint in the above energy function implicitly weights (u_i, v_i) at site i by I²_{x,i} + I²_{y,i} or S²_{x,i} + S²_{y,i}. This weighting is not reasonable, since the flow constraints at high-gradient locations are not necessarily more reliable than those at low-gradient locations. In fact, they tend to be less reliable, because the brightness gradients close to discontinuities are normally higher than those away from discontinuities. Therefore, we eliminate this weighting on each data constraint via normalization. To have a uniform contribution at each pixel from the flow constraint, we divide the flow constraint at the ith location by the normalization factor (I²_{x,i} + I²_{y,i})^{1/2} or (S²_{x,i} + S²_{y,i})^{1/2}. Denoting the normalized flow constraint by Ē_{x,i} u_i + Ē_{y,i} v_i + Ē_{t,i} = 0, the energy function to be minimized becomes

f(u) = Σ_i (Ē_{x,i} u_i + Ē_{y,i} v_i + Ē_{t,i})² + λ (u²_{x,i} + u²_{y,i} + v²_{x,i} + v²_{y,i})    (8)

where the vector u is the concatenation of all the components u_i and v_i.
2.4 Incomplete Cholesky Preconditioned Conjugate Gradient
The minimization of the above energy leads to solving a linear system of equations Ku = b, where the stiffness matrix K ∈ R^{2n²×2n²} is symmetric positive-definite and has the following 2 × 2 block structure:

K = [ K_s + E_xx    E_xy
      E_xy          K_s + E_yy ]    (9)

where K_s ∈ R^{n²×n²} is the discrete 2D Laplacian matrix from the membrane smoothness constraint, and E_xx, E_xy and E_yy are n² × n² diagonal matrices with entries Ē²_{x,i}, Ē_{x,i}Ē_{y,i} and Ē²_{y,i}, respectively.

To solve this linear system for the optical flow estimate, we use the preconditioned conjugate gradient algorithm (Golub and Van Loan 1989) with an incomplete Cholesky preconditioner P (Meijerink and van der Vorst 1977, Elman 1986), given as follows.
1. Initialize u_0; compute r_0 = b - K u_0; k = 0.
2. Solve P z_k = r_k; k = k + 1.
3. If k = 1, p_1 = z_0; else compute β_k = r_{k-1}^T z_{k-1} / r_{k-2}^T z_{k-2} and update p_k = z_{k-1} + β_k p_{k-1}.
4. Compute α_k = r_{k-1}^T z_{k-1} / p_k^T K p_k.
5. Update u_k = u_{k-1} + α_k p_k, r_k = r_{k-1} - α_k K p_k.
6. If r_k ≈ 0, stop; else go to step 2.
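The steps above can be sketched compactly. This is a generic PCG sketch, not the paper's implementation: `solve_P` stands in for the preconditioning solve P z = r (with an incomplete Cholesky preconditioner P = L Lᵀ this would be two triangular substitutions), and the usage below substitutes a simple Jacobi (diagonal) preconditioner purely for illustration.

```python
import numpy as np

def preconditioned_cg(K, b, solve_P, u0=None, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for the SPD system K u = b.
    `solve_P` solves the auxiliary system P z = r for a chosen
    preconditioner P."""
    u = np.zeros_like(b) if u0 is None else u0.copy()
    r = b - K @ u
    z = solve_P(r)
    p = z.copy()
    rz_old = r @ z
    for _ in range(max_iter):
        Kp = K @ p
        alpha = rz_old / (p @ Kp)      # step length along search direction
        u += alpha * p
        r -= alpha * Kp                # updated residual
        if np.linalg.norm(r) < tol:
            break
        z = solve_P(r)                 # preconditioning step
        rz_new = r @ z
        p = z + (rz_new / rz_old) * p  # new conjugate search direction
        rz_old = rz_new
    return u
```

A better preconditioner only changes `solve_P`; the outer iteration is unchanged, which is why the quality of the incomplete Cholesky factors directly controls the iteration count.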
The preconditioner P is chosen to be an incomplete Cholesky factorization of the matrix K. A good preconditioner can drastically accelerate the convergence of the conjugate gradient algorithm. There are two criteria for designing a good preconditioner P for the matrix K. First, P has to be a good approximation to K, so that the condition number of the preconditioned linear system is drastically reduced. Second, there must exist a very fast numerical method for solving the auxiliary linear system Pz = r required in the preconditioned CG algorithm. Taking these two criteria into consideration, a good preconditioner for the above stiffness matrix K of the optical flow problem can be obtained via the incomplete Cholesky factorization of K.

The standard Cholesky factorization of the sparse matrix K "fills in" entries in the band between nonzero off-diagonals, which means the sparsity structure is destroyed by the factorization. The idea of incomplete Cholesky factorization is to find an approximate Cholesky factorization K ≈ LL^T such that the lower triangular matrix L has a sparsity structure similar to that of K. In addition, the product LL^T agrees with K at the locations of the nonzero entries of L and L^T. Therefore, the preconditioner P = LL^T is a good approximation to K. Since the matrix K is sparse and well structured, the matrix L is also sparse and well structured. Thus, the solution of the auxiliary linear system Pz = r in the preconditioning step can be obtained very efficiently via forward and backward substitutions. Furthermore, we use the block version (Golub and Van Loan 1989) of the incomplete Cholesky factorization to take advantage of the 2 × 2 block structure of the matrix K.
The matrix K in equation 9 has diagonal blocks K_s + E_xx and K_s + E_yy, each of which is block tridiagonal. The off-diagonal blocks of K are diagonal matrices. In addition, the off-diagonal blocks of the block tridiagonal matrices K_s + E_xx and K_s + E_yy are diagonal. The factor matrix in the incomplete Cholesky factorization has a sparsity structure similar to that of K and takes the following form:

L = [ L11    0
      L21    L22 ]    (10)

where the submatrices L11, L21 and L22 are of size n² × n². Our incomplete Cholesky factorization is based on the construction of an incomplete block preconditioner in Golub and Van Loan (1989). The submatrices L11, L22 and L21^T are chosen to be block lower bidiagonal matrices, with the diagonal blocks being lower bidiagonal and the subdiagonal blocks being upper bidiagonal.
That is, each L_ij has diagonal blocks G_ij^(k), k = 1, ..., n, and subdiagonal blocks H_ij^(k), k = 1, ..., n-1, where each G_ij^(k) is lower bidiagonal with diagonal entries α_{ij,l}^{(k)} and subdiagonal entries γ_{ij,l}^{(k)}, and each H_ij^(k) is upper bidiagonal with diagonal entries β_{ij,l}^{(k)} and superdiagonal entries δ_{ij,l}^{(k)}, for (i, j) ∈ {(1, 1), (2, 2)} and correspondingly for L21.
The nonzero entries in the sparse matrix L are computed by equating the entries of the product LL^T to those in the matrix K at the locations with nonzero entries in L. The entries α_{11,l}^{(k)}, β_{11,l}^{(k)}, γ_{11,l}^{(k)} and δ_{11,l}^{(k)} of L11 are obtained first by a row-by-row recursion (equations 11-16), with the index p = (k-1)n + l. The nonzero entries in L21 then follow similarly; for example,

α_{21,l}^{(k)} = E_{xy,p} / α_{11,l}^{(k)},   1 ≤ k, l ≤ n,    (17)

with analogous recursions for its remaining entries (equations 18-20). The nonzero entries in the submatrix L22 have forms similar to those in L11, given by equations 13-16, except that the matrix K is replaced by K_s + E_yy - L21 L21^T in the equations for the α, β, γ and δ entries.
The incomplete Cholesky factorization of the matrix K given above can be computed in O(N) operations, where N (= n²) is the number of discretization points. After the factorization, the preconditioner is chosen as P = LL^T, which is close to K and has a nice structure. Thus, the preconditioned conjugate gradient algorithm requires O(N) operations per iteration.
3 SSD-based Regularization Method
The SSD function in equation 2 can be used as the data constraint energy in a regularization framework. Incorporating the membrane smoothness and SSD constraints into the regularization formulation, we obtain the total energy function

f(u) = Σ_{i∈D} (I1(x_i + u_i, y_i + v_i) - I0(x_i, y_i))² + λ (u²_{x,i} + u²_{y,i} + v²_{x,i} + v²_{y,i})    (21)

where (x_i, y_i) is the ith discretized location in the image domain. Note that the data constraint, I1(x_i + u_i, y_i + v_i) = I0(x_i, y_i), is not linear in the flow components u_i and v_i. If the Taylor expansion of the data constraint is taken up to first-order terms, we obtain the image flow constraint in equation 3. Consequently, the gradient-based regularization approach can be viewed as an approximation to the SSD-based regularization formulation. The energy function to be minimized in the gradient-based regularization is quadratic and convex, while the energy function in equation 21 is neither quadratic nor convex. For the quadratic and convex energy function of gradient-based regularization, the minimization can be accomplished by solving a symmetric positive-definite linear system. The minimization of the nonquadratic and nonconvex energy function is more difficult and requires more computational effort. The SSD-based regularization can give more accurate estimates than gradient-based regularization, since its data constraint is more precise; however, it requires more computational work to minimize the energy function.
The total energy function in equation 21 is a nonlinear function with a large number of variables. We use the nonlinear conjugate gradient method (Gill et al. 1981) with a modified search direction to minimize it. The nonlinear conjugate gradient algorithm is very efficient for the unconstrained optimization of large-scale nonlinear functions (Gill et al. 1981). Although this algorithm can only find local minima, we can obtain a very accurate solution when the initial solution is close to the true solution. For example, small displacements between consecutive frames can easily be found when the nonlinear CG is started from zero initial values. In fact, a very good initial solution can be obtained from the optical flow estimated in the previous frames. In addition, the scale-space tracking scheme (Whitten 1993) can be employed to handle large displacements and to accelerate the convergence of the nonlinear CG algorithm.
3.1 Nonlinear Conjugate Gradient Algorithm
The nonlinear conjugate gradient method for minimizing general nonlinear functions is outlined as follows (Gill et al. 1981):
1. d_0 = r_0 = -f'(u_0).
2. Find α_i that minimizes f(u_i + α_i d_i).
3. u_{i+1} = u_i + α_i d_i.
4. r_{i+1} = -f'(u_{i+1}).
5. β_{i+1} = r_{i+1}^T r_{i+1} / r_i^T r_i, or β_{i+1} = max{ r_{i+1}^T (r_{i+1} - r_i) / r_i^T r_i, 0 }.
6. d_{i+1} = r_{i+1} + β_{i+1} d_i.
7. If r_{i+1} ≈ 0, stop; else go to step 2.

Step 2 requires a line search method to minimize the function f(u_i + α_i d_i) along the direction d_i. The Newton-Raphson method and the secant method are two popular line search schemes for approximating the best α_i. We employ the Newton-Raphson method in our nonlinear CG. The Taylor series approximation of the function f up to second-order terms is used to derive the approximate step length

α_i = - f'(u_i)^T d_i / ( d_i^T f''(u_i) d_i )    (22)

In step 5, there are two choices for β: the first is the Fletcher-Reeves formula and the other is the Polak-Ribiere formula. We adopt the latter since it usually converges much more quickly, although the nonlinear CG needs to restart when β = 0. Note that f'(u) and f''(u) in this algorithm denote the gradient vector and the Hessian matrix of the function f.
From equation 21, we can obtain

f'(u) = [ f'_{d,u} + λ K_s u
          f'_{d,v} + λ K_s v ]    (23)

f''(u) = [ K_{d,uu} + λ K_s    K_{d,uv}
           K_{d,uv}            K_{d,vv} + λ K_s ]    (24)

where the matrix K_s is the Laplacian matrix obtained from the membrane smoothness constraint; the vectors f'_{d,u} and f'_{d,v} are the concatenations of the components f'_{d,u,i} and f'_{d,v,i}, respectively; and K_{d,uu}, K_{d,uv} and K_{d,vv} are diagonal matrices with ith diagonal entries K_{d,uu,i}, K_{d,uv,i} and K_{d,vv,i}, respectively. The components f'_{d,u,i}, f'_{d,v,i}, K_{d,uu,i}, K_{d,uv,i} and K_{d,vv,i} corresponding to the ith location are defined as follows:

f'_{d,u,i} = 2w ∂I1(x_i + u_i, y_i + v_i)/∂x    (25)
f'_{d,v,i} = 2w ∂I1(x_i + u_i, y_i + v_i)/∂y    (26)
K_{d,uu,i} = 2 (∂I1(x_i + u_i, y_i + v_i)/∂x)² + 2w ∂²I1(x_i + u_i, y_i + v_i)/∂x²    (27)
K_{d,uv,i} = 2 (∂I1(x_i + u_i, y_i + v_i)/∂x)(∂I1(x_i + u_i, y_i + v_i)/∂y) + 2w ∂²I1(x_i + u_i, y_i + v_i)/∂x∂y    (28)
K_{d,vv,i} = 2 (∂I1(x_i + u_i, y_i + v_i)/∂y)² + 2w ∂²I1(x_i + u_i, y_i + v_i)/∂y²    (29)

where w = I1(x_i + u_i, y_i + v_i) - I0(x_i, y_i). In the above equations, the values of the image function I1 and its partial derivatives at (x_i + u_i, y_i + v_i) must be computed using an interpolation technique. In our implementation, we used the interpolating spline under tension (Nielson and Franke 1984) to compute these values.
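The nonlinear CG iteration with the Polak-Ribiere β and the Newton-Raphson step of equation 22 can be sketched generically. This is an illustrative sketch on user-supplied gradient and Hessian callbacks (hypothetical names), not the paper's implementation, which additionally uses the modified search direction and preconditioning described below.

```python
import numpy as np

def nonlinear_cg(grad, hess, u0, tol=1e-8, max_iter=100):
    """Polak-Ribiere nonlinear CG with Newton-Raphson step length
    alpha = -f'(u)^T d / (d^T f''(u) d)  (equation 22).
    `grad` and `hess` return f'(u) and f''(u)."""
    u = u0.copy()
    r = -grad(u)          # negative gradient
    d = r.copy()          # initial search direction
    for _ in range(max_iter):
        alpha = (r @ d) / (d @ hess(u) @ d)   # Newton-Raphson step
        u = u + alpha * d
        r_new = -grad(u)
        # PR+ beta; the max with 0 restarts CG along the steepest descent
        beta = max((r_new @ (r_new - r)) / (r @ r), 0.0)
        d = r_new + beta * d
        r = r_new
        if np.linalg.norm(r) < tol:
            break
    return u
```

On a convex quadratic this reduces to linear CG and terminates in at most as many steps as there are variables; on the nonconvex energy of equation 21 it converges only to a local minimum, which is why a good initial flow estimate matters.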
3.2 Modified Search Direction Scheme
The nonlinear conjugate gradient algorithm is a local search method that approximates the local landscape of the energy function by a quadratic function.

Figure 1: The discrepancy between the gradient directions of the functions I0 and I1 at the ith location is small, and thus the use of the gradient direction as the search direction for the ith component is robust, while the jth component of the gradient direction is unreliable for a local search.

The use of the gradient to obtain the search direction is not appropriate at locations where the local landscape of the energy function has large higher-order derivatives, i.e., where the local landscape is highly nonlinear. We propose a new search direction scheme that modifies the gradient direction in the nonlinear conjugate gradient algorithm. The idea of the modified search direction scheme is to check the smoothness of the local energy function for every component of the flow vector. From equation 21, we can see that the smoothness energy is already quadratic; therefore, we need only consider the smoothness of the first term, i.e., the data constraint energy. To measure the local smoothness of the data constraint energy for the flow vector (u_i, v_i), we use the inner product of the normalized gradients of the two image brightness functions as a smoothness measure σ(u_i, v_i), i.e.,

σ(u_i, v_i) = ∇I0(x_i, y_i) · ∇I1(x_i + u_i, y_i + v_i) / ( ||∇I0(x_i, y_i)|| ||∇I1(x_i + u_i, y_i + v_i)|| )    (30)

Figure 1 depicts the gradient directions of I0 and I1 at two locations. The gradient directions at the ith location are close; therefore their use as the search direction is robust for the ith component. The two gradient directions at the jth location differ by a significant angle, which means the jth component of the gradient is unreliable as a local search direction.
We can set a threshold on the smoothness measure in order to reject the unreliable components of the gradient direction as the basis of a local search direction. In addition, this smoothness measure is also used as a confidence measure for the flow vectors. We use this confidence measure in a preconditioning process for the nonlinear conjugate gradient algorithm, a topic of discussion in the next section.
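The smoothness measure of equation 30 can be sketched as follows. This is illustrative only: it assumes integer coordinates and central differences (the paper interpolates I1 at sub-pixel locations), and the function name is hypothetical.

```python
import numpy as np

def smoothness_measure(I0, I1, x, y, u, v):
    """sigma(u_i, v_i) of equation 30: inner product of the normalized
    spatial gradients of I0 at (x, y) and I1 at (x+u, y+v).
    Values near 1 mean the gradient direction is locally reliable."""
    g0y, g0x = np.gradient(I0)
    g1y, g1x = np.gradient(I1)
    g0 = np.array([g0x[y, x], g0y[y, x]])
    g1 = np.array([g1x[y + v, x + u], g1y[y + v, x + u]])
    return (g0 @ g1) / (np.linalg.norm(g0) * np.linalg.norm(g1) + 1e-12)
```

Thresholding this value implements the rejection described above: components with sigma near 1 keep their gradient direction, while components with a large angular discrepancy are excluded from the local search.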
3.3 Preconditioning Technique
For the nonlinear conjugate gradient algorithm, an ideal preconditioner is the Hessian matrix of f. However, it is too expensive to solve the auxiliary linear system associated with the Hessian preconditioner. In this paper, we use a diagonal preconditioning technique, which has the advantage that updating the solution is very simple. Alternatively, the physically-based adaptive preconditioner of Lai and Vemuri (1995) or the incomplete Cholesky preconditioner discussed in section 2 could be used to achieve faster convergence. For the optical flow problem, our diagonal preconditioning matrix P has the following structure:

P = [ P_u    0
      0      P_v ]    (31)

where P_u and P_v are both diagonal matrices. In our preconditioner, we choose P_u = P_v. The ith diagonal entries of P_u and P_v are chosen to be

P_{u,i} = P_{v,i} = ( (∂I1(x_i + u_i, y_i + v_i)/∂x)² + (∂I1(x_i + u_i, y_i + v_i)/∂y)² + ε ) / ( σ(u_i, v_i) + 1 )    (32)

where ε is a small positive number used to prevent division by zero in the preconditioning process.
4 Experimental Results
In this section, our modified gradient-based regularization algorithm and SSD-based regularization algorithm are tested on a variety of synthetic and real image sequences. Comparisons of our results with those previously reported in Barron et al. (1994) and Szeliski and Coughlan (1994) are made to demonstrate the performance of our proposed algorithms.

The angular measure of error (Barron et al. 1994) between the computed optical flow and the true flow is adopted in the experiments as a basis for comparison with the previously reported results. Unlike some other methods, which only produce sparse optical flow, both of the regularization methods presented in this paper give dense optical flow estimates with 100% density. To achieve a fair comparison, only those techniques reporting 100% flow density in the literature are included in our comparative study.
4.1 Modified Gradient-based Method
In the implementation of our modified gradient-based regularization method, we treat the zero-crossing points of S(x, y, t) as the discontinuity locations. The computation of the derivatives I_x, I_y and I_t is very error-sensitive. Therefore, we pre-smooth the image brightness function with a Gaussian filter prior to using the central difference method to approximate the derivatives. As for S_x, S_y and S_t, we directly use the central difference scheme to approximate them, since the convolution with the LOG (Laplacian of Gaussian) filter used to generate the function S already involves smoothing.

Two standard synthetic image sequences were used to test our modified gradient-based regularization method. These are the images of a translating square and the Diverging Tree sequence reported in Barron et al. (1994). Figure 2 shows one frame from each sequence. The image sizes for the square and the diverging tree are 100 × 100 and 150 × 150, respectively.
Figure 2: Frames from the (a) Square 2 and (b) Diverging Tree image sequences.
The motion in the Square 2 sequence is a constant translation. In our experiment,
the regularization parameter λ is chosen to be 10, and the weighting parameters α
and β for the image flow constraint and contour-based constraint are set to 1. The
optical flow obtained after 40 iterations of the incomplete Cholesky preconditioned
conjugate gradient algorithm is shown in figure 3(a). The errors of our modified
gradient-based regularization method and the other gradient-based methods for
this translating square example are given in Table 1. Only the results from those
gradient-based methods that yield 100% density flow estimates reported in Barron
et al. (1994) are included in this table for comparison. Our modified gradient-
based regularization method drastically outperforms the other gradient-based
methods in this example.
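The incomplete Cholesky preconditioned conjugate gradient iteration referred to above follows the standard preconditioned CG template. The sketch below is generic: `M_solve` applies the preconditioner inverse, which for an incomplete Cholesky factor L would be a forward and a backward triangular solve (we leave the preconditioner choice to the caller, so this is not the report's exact solver).

```python
import numpy as np

def preconditioned_cg(A, b, M_solve, x0=None, iters=40, tol=1e-10):
    """Preconditioned conjugate gradient for A x = b with A symmetric
    positive definite.  M_solve(r) applies the preconditioner inverse."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x                 # initial residual
    z = M_solve(r)                # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(iters):
        Ap = A @ p
        alpha = rz / (p @ Ap)     # step length along search direction
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_solve(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p # conjugate update of search direction
        rz = rz_new
    return x
```

With a simple Jacobi (diagonal) preconditioner, `M_solve` is just an elementwise division; the incomplete Cholesky variant used in the paper replaces it with sparse triangular solves but leaves the outer iteration unchanged.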
The other synthetic data set is the more realistic Diverging Tree image sequence.
The underlying motion in this example is a bilinear expansion from the center of
the image. We use the regularization parameter λ = 0.1 and α = β = 1 in this ex-
ample. The error in the computed optical flow after 40 iterations of the incomplete
Technique Average Error Standard Deviation Density
Horn and Schunck (modified) 32.81° 13.67° 100%
Nagel 34.57° 14.38° 100%
Our method (Gradient-based) 0.21° 0.06° 100%
Table 1: Summary of Square 2 results
Figure 3: Computed optical flow of the (a) Square 2 and (b) Diverging Tree image sequences after
40 iterations of the incomplete Cholesky preconditioned CG algorithm.
Cholesky preconditioned CG algorithm is presented in Table 2, along with results from
other gradient-based methods in the literature that yield 100% flow density (Barron
et al. 1994). Once again, our modified gradient-based regularization method pro-
duces more accurate optical flow than the other methods. The computed optical
flow is shown in figure 3(b).
Technique Average Error Standard Deviation Density
Horn and Schunck (modified) 2.55° 3.67° 100%
Nagel 2.94° 3.23° 100%
Uras et al. 4.64° 3.48° 100%
Our method (Gradient-based) 1.60° 1.15° 100%
Table 2: Summary of Diverging Tree results
Our gradient-based algorithm was also tested on a real data set, namely the Ro-
tating Rubic cube. Figure 4 shows one frame of the Rubic cube sequence and
the computed optical flow after 40 iterations of the incomplete Cholesky precon-
ditioned conjugate gradient algorithm.
4.2 SSD-based Regularization Method
The experimental results of applying our SSD-based regularization method to two
synthetic image sequences and one real image sequence are presented here. The two
synthetic image sequences are the Sinusoid 1 and the Translating Tree image se-
quences. Our results on these two examples are compared to the best results with
100% density optical flow estimates reported in Barron et al. (1994) and Szeliski
and Coughlan (1994).
The Sinusoid 1 image sequence has a constant translation with a velocity of
(1.585, 0.863) pixels/frame. The image size for each frame of this image sequence
Figure 4: (a) One frame from the Rotating Rubic cube sequence (b) Computed optical flow after 40
iterations of the incomplete Cholesky preconditioned CG algorithm.
Figure 5: Frames from the (a) Sinusoid 1 and (b) Translating Tree image sequences.
is 100 x 100. Each frame is a highly textured image whose brightness function
changes rapidly almost everywhere. The regularization parameter λ is set to 300.
A very accurate estimate is obtained after 300 preconditioned nonlinear conjugate
gradient iterations. Figure 6(a) shows the computed optical flow. Our result is
the best among the three depicted in Table 3. Note that our method uses only two
consecutive images to estimate the optical flow. The results of Fleet and Jepson's
method for this example were obtained using a 21-frame image sequence (Barron
et al. 1994). To achieve a fair comparison, we quote the reported results of Szeliski
and Coughlan (1994) for the 2-frame image sequence. With the incorporation of more
frames of data, the results in Szeliski and Coughlan do improve over those
depicted in the table.
Technique # of Frames Average Error Standard Deviation Density
Fleet and Jepson 21 0.03° 0.01° 100%
Szeliski and Coughlan 2 0.17° 0.02° 100%
Our method (SSD-based) 2 0.02° 0.01° 100%
Table 3: Summary of Sinusoid 1 results
For the translating tree example, each frame contains more complicated, highly
textured regions, which result in multiple local minima in the associated energy
function to be minimized. To avoid being trapped in a local minimum during the
minimization of a nonconvex energy function, we employ the scale-space tracking
scheme (Whitten 1993). Scale-space tracking is achieved by applying our al-
gorithm to the smoothed image pair, with the degree of smoothing gradually
reduced during the energy minimization. We start with σ = 1.1547 for Gaus-
sian smoothing, reduce it to σ = 0.5774 after 100 iterations of the preconditioned
nonlinear CG algorithm, and finally set σ to 0 after 200 iterations. The computed
optical flow is shown in figure 6(b). The error in the computed optical flow is de-
picted in Table 4 along with the errors of the best results with 100% density reported
in Barron et al. (1994) and Szeliski and Coughlan (1994). Our SSD-based regular-
ization method gives the most accurate result among these methods. Note
that the result of Szeliski and Coughlan in this table is quoted for the two-frame
case, to be compared with our two-frame algorithm, although more ac-
curate estimates can be obtained by incorporating more frames into the energy
function, as shown in Szeliski and Coughlan (1994).
Figure 6: Computed optical flow of the (a) Sinusoid 1 and (b) Translating Tree image sequences,
using the preconditioned nonlinear CG algorithm.
Technique # of Frames Average Error Standard Deviation Density
Uras et al. 15 0.62° 0.52° 100%
Szeliski and Coughlan 2 0.35° 0.34° 100%
Our method (SSD-based) 2 0.30° 0.19° 100%
Table 4: Summary of Translating Tree results
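The scale-space tracking schedule described above can be sketched as a coarse-to-fine loop: each stage smooths the image pair at the current sigma, runs the minimizer, and warm-starts the next stage from the resulting flow. The function names and the simple zero-padded smoother are our own; `minimize_flow` stands in for the preconditioned nonlinear CG solver.

```python
import numpy as np

def gaussian_smooth(I, sigma):
    """Minimal separable Gaussian smoother (zero padding; illustrative)."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    I = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, I)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, I)

def scale_space_flow(I0, I1, minimize_flow,
                     schedule=((1.1547, 100), (0.5774, 100), (0.0, 100))):
    """Scale-space tracking (Whitten 1993): run the flow minimizer on
    progressively less smoothed copies of the image pair, warm-starting
    each stage from the previous flow estimate.  The default schedule
    mirrors the sigma values quoted in the text."""
    u = np.zeros(I0.shape)
    v = np.zeros(I0.shape)
    for sigma, iters in schedule:
        J0 = gaussian_smooth(I0, sigma) if sigma > 0 else I0
        J1 = gaussian_smooth(I1, sigma) if sigma > 0 else I1
        u, v = minimize_flow(J0, J1, u, v, iters)   # warm start from (u, v)
    return u, v
```

Smoothing convexifies the energy at the early stages, so the minimizer is less likely to be captured by a spurious local minimum before the fine-scale details are reintroduced.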
Finally, our SSDbased regularization method is applied to a real image sequence
namely, the Hamburg Taxi sequence. One frame of this sequence and the computed
optical flow are shown in figure 7. The computed optical flow between two con-
secutive images was obtained after 200 iterations of the preconditioned nonlinear
conjugate gradient algorithm.
Figure 7: (a) One frame from the Hamburg Taxi sequence (b) Computed optical flow after 200
iterations of the preconditioned nonlinear CG algorithm.
5 Conclusion
In this paper, we presented two new algorithms for robust and efficient computation
of optical flow from an image sequence. One is the modified gradient-based
regularization algorithm, and the other is the SSD-based regularization algorithm.
The modified gradient-based regularization method selectively combines the image
flow constraint and the contour-based flow constraint into the data constraint. The
image flow constraint is disabled in the neighborhood of discontinuities, while the
contour-based flow constraint is active at discontinuity locations. This selective
combination scheme yields a more precise data constraint than the one used by
other gradient-based methods in the literature. Flow estimates obtained
using our method are more accurate than those obtained from techniques reported
in the literature (Barron et al. 1994), as is evident from the experimental results.
Our SSD-based regularization method uses the SSD measure as the data con-
straint in a regularization framework. The resulting energy function is neither
quadratic nor convex. A preconditioned nonlinear conjugate gradient algorithm with a
modified search direction was developed to minimize this energy function. Very ac-
curate results were obtained by applying this algorithm to two consecutive images
of a sequence.
Comparing the two algorithms proposed in this paper, we conclude that the
modified gradient-based regularization algorithm is more computationally efficient
than the SSD-based regularization algorithm, while the latter gives more accurate
results than the former. Also, we have experimentally demonstrated the superior-
ity of our algorithms for optical flow computation over those reported to date in
the literature. Future work will focus on automatically detecting motion discontinu-
ities in the framework presented here.
References
Anandan, P. 1989. A computational framework and an algorithm for the measurement of visual motion. Intern. J. Comput. Vision 2: 283-310.
Barron, J. L., Fleet, D. J. and Beauchemin, S. S. 1994. Performance of optical flow techniques. Intern. J. Comput. Vision 12(1): 43-77.
Duncan, J. H. and Chou, T.-C. 1992. On the detection of motion and the computation of optical flow. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-14(3): 346-352.
Elman, H. C. 1986. A stability analysis of incomplete LU factorizations. Mathematics of Computation 47(175): 191-217.
Fleet, D. J. and Jepson, A. D. 1990. Computation of component image velocity from local phase information. Intern. J. Comput. Vision 5: 77-104.
Gill, P. E., Murray, W. and Wright, M. H. 1981. Practical Optimization. Academic Press, London; New York.
Golub, G. H. and Van Loan, C. F. 1989. Matrix Computations. 2nd edn, The Johns Hopkins University Press.
Heitz, F. and Bouthemy, P. 1993. Multimodal estimation of discontinuous optical flow using Markov random fields. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-15: 1217-1232.
Hildreth, E. C. 1984. Computations underlying the measurement of visual motion. Artificial Intelligence 23: 309-354.
Horn, B. K. P. and Schunck, B. G. 1981. Determining optical flow. Artificial Intelligence 17: 185-203.
Lai, S. H. and Vemuri, B. C. 1995. Physically-based adaptive preconditioners for early vision. IEEE Workshop on Physics-Based Modeling in Computer Vision, Cambridge, MA; also submitted to IEEE Trans. Patt. Anal. Mach. Intell.
Lucas, B. and Kanade, T. 1981. An iterative image registration technique with an application to stereo vision. Proc. DARPA Image Understanding Workshop, pp. 121-130.
Meijerink, J. A. and van der Vorst, H. A. 1977. An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix. Mathematics of Computation 31(137): 148-162.
Nagel, H.-H. and Enkelmann, W. 1986. An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-8(5): 565-593.
Nagel, H.-H. 1983. Constraints for the estimation of displacement vector fields from image sequences. Proc. IJCAI-83, Karlsruhe, Germany, pp. 945-951.
Nielson, G. M. and Franke, R. 1984. A method for construction of surfaces under tension. Rocky Mountain Journal of Mathematics 14(1): 203-221.
Szeliski, R. and Coughlan, J. 1994. Hierarchical spline-based image registration. Proc. IEEE Conf. Comput. Vis. Patt. Recog., Seattle, Washington, pp. 194-201.
Terzopoulos, D. 1986. Regularization of inverse visual problems involving discontinuities. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-8(4): 413-424.
Uras, S., Girosi, F., Verri, A. and Torre, V. 1988. A computational approach to motion perception. Biol. Cybern. 60: 79-87.
Whitten, G. 1993. Scale space tracking of deformable sheet models for computational vision. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-15(7): 697-706.
