UFDC Home |   Help  |   RSS

 Group Title: Department of Computer and Information Science and Engineering Technical Reports Title: Modifying a sparse Cholesky factorization
CITATION PDF VIEWER THUMBNAILS PAGE IMAGE ZOOMABLE
Full Citation
STANDARD VIEW MARC VIEW
 Material Information Title: Modifying a sparse Cholesky factorization Series Title: Department of Computer and Information Science and Engineering Technical Report ; 97-003 Physical Description: Book Language: English Creator: Davis, Timothy A.Hager, William W Affiliation: University of Florida -- Department of Computer and Information Science and EngineeringUniversity of Florida -- Department of Mathematics Publisher: Department of Computer and Information Science and Engineering, University of Florida Place of Publication: Gainesville, Fla. Publication Date: April 28, 1997 Copyright Date: 1997
 Record Information Bibliographic ID: UF00095401 Volume ID: VID00001 Source Institution: University of Florida Holding Location: University of Florida Rights Management: All rights reserved by the source institution and holding location.

1997239 ( PDF )

Full Text

Modifying a Sparse Cholesky

Factorization*

Timothy A. Davist
Department of Computer and Information Science and Engineering
University of Florida, Gainesville, FL 32611
and
William W. HI, it
Department of Mathematics
University of Florida, Gainesville, FL 32611

April 28, 1997

Technical Report TR-97-003, Department of Computer and Information Science
and Engineering, University of Florida.

Abstract
Given a sparse symmetric positive definite matrix AAT and an associated
sparse ('C!i .i -:. factorization LLT, we develop sparse techniques for obtaining
the new factorization associated with either adding a column to A or deleting
a column from A. Our techniques are based on an analysis and manipulation
of the underlying graph structure and on ideas of Gill, Golub, I ii .y, and
Saunders for modifying a dense ('I!n -I:. factorization. Our algorithm involves
a new sparse matrix concept, the multiplicity of an entry in L. 1I II multiplicity
is essentially a measure of the number of times an entry is modified during
symbolic factorization. We show that our methods extend to the general case
where an arbitrary sparse symmetric positive definite matrix is modified. Our
methods are optimal in the sense that they take time proportional to the
number of nonzero entries in L that change.

*This work was supported by National Science Foundation grants D -1S-9404431 and D !IS-
9504974.
i, l ,- '.. 1-. i L 1i1, http://www.cise.ufl.edu/~davis
S"!i ,' i i! I I [ 1i !! L [, http://www.math.ufl.edu/~hager

Key words. numerical linear algebra, direct methods, C111, -l.y factorization,
sparse matrices, mathematical software, matrix updates.

AMS(MOS) subject classifications 65F05, 65F50, 65-04.

1 Introduction

This paper presents a method for updating and downdating the sparse Cl -lI.y
factorization LLT of the matrix AAT, where A is m by n. '.1 .re precisely, we evaluate
the Cl11 .I -I.y factorization of AAT + aww'T where either a is +1 (corresponding to
an update) and w is arbitrary, or a is -1 (corresponding to a downdate) and w is a
column of A. Both AAT and AAT + wwT must be symmetric and positive definite.
Ti, techniques we develop for the matrix AAT can be extended to determine the
effects on the C11 .1 -I.y factors of a general symmetric positive definite matrix M of
any symmetric change of the form M + awwT that preserves positive definiteness.
Our methods are optimal in the sense that they take time proportional to the number
of nonzero entries in L that change.
Tli, 1 are many applications of the techniques presented in this paper. In
the Linear Program Dual Active Set Algorithm (LP DASA) [19], the A matrix
corresponds to the basic variables in the current basis of the linear program, and
in successive iterations, we bring variables in and out of the basis, leading to changes
of the form AAT + awwT. Other application areas where the techniques developed
in this paper are applicable include least-squares problems in statistics, the analysis
of electrical circuits, structural mechanics, sensitivity analysis in linear programming,
boundary condition changes in partial differential equations, domain decomposition
methods, and boundary element methods. For a discussion of these application areas
and others, see [18].
Section 2 introduces our notation. For an introduction to sparse matrix
techniques, see [4, 8]. In 3 we discuss the structure of the nonzero elements in
the C11 .1 -ly factorization of AAT and in 4, we discuss the structure associated
with the C11.1i -I y factors of AAT+ t wwT. Tli symbolic update and downdate
methods provide the framework for our sparse version of ':. i!, 1 i C5 of Gill, Golub,
.ii, iiy, and Saunders [15] for modifying a dense Cl11 -I.y factorization. We discuss
our sparse algorithm in 5. Section 6 presents the general algorithm for modifying
the sparse Cl11 .1 -I.y factorization for any sparse symmetric positive definite matrix.
Implementation details for the modification algorithm associated with AAT + awwT
are summarized in 7, while the results of a numerical experiment with a large
optimization problem from N:'. 1ilI [3] are presented in 8. Section 9 concludes with a
discussion of future work.

2 Notation

Throughout the paper, matrices are capital bold letters like A or L, while vectors are
lower case bold letters like x or v. Sets and multisets are in calligraphic style like A,
, or P. Scalars are either lower case Greek letters, or italic style like a, k, or m.
Given the location of the nonzero elements of AAT, we can perform a symbolic
factorization (this terminology is introduced by George and Liu in [8]) of the matrix
to predict the location of the nonzero elements of the CI1 .I -l.y factor L. In actuality,
some of these predicted nonzeros may be zero due to numerical cancellation during
the factorization process. Ti, statement "lij = 0" will mean that lij is symbolically
nonzero. Ti, diagonal of L is always nonzero since the matrices that we factor are
positive definite (see [24, p. 253]). Ti, nonzero pattern of column j of L is denoted
2j:
L = {i : 4 0},
while L denotes the collection of patterns:

C = {f1, 2 , }-

Similarly, Aj denotes the nonzero pattern of column j of A,

Aj = {i : aij 0},

while A is the collection of patterns:

A ={A1,A2,.. .,An}.

Tli, elimination tree can be defined in terms of a parent map 7 (see [20]), which
gives the row index associated with a given node j of the first nonzero element in
column j of L beneath the diagonal element:

7(j) = min Lj \ {j},

where "min X" denotes the smallest element of X:

min X = min i.
iEX

Our convention is that the min of the empty set is zero. Note that j < 7(j) except
in the case where the diagonal element in column j is the only nonzero element. Tli,
inverse r-1 of the parent map is the children multifunction. T11ii is, the children of
node k is the set defined by

7-(k) = {j : 7(j) = k}.

Thi ancestors of a node j, denoted P(j), is the set of successive parents:

P() = (), 7(7()),. } {- I)= fU), ), 2,(), }.

Here the powers of a map are defined in the usual way: 7 is the identity while 71
for i > 0 is the i-fold composition of 7 with itself. Thi sequence of nodes j, 7(j),
r(7r(j)), * forming P(k), is called the path from j to the associated tree root. Til,
collection of paths leading to a root form an elimination tree, and the set of all trees
is the elimination forest. Typically, there is a single tree whose root is rn, however,
if column j of AAT has only one nonzero element, the diagonal element, then j will
be the root of a separate tree.
Ti, number of elements (or size) of a set X is denoted IXI, while IAI or I| denote
the sum of the sizes of the sets they contain. Define the directed graph G(M) of an m
by m matrix M with nonzero pattern AM as the vertex and edge sets: G(M) = {V, E},
where V = {1,2, n} is the vertex set and E = {(i,j) i C Mj} is the directed
edge set.

3 Symbolic factorization

Any approach for generating the pattern set is called symbolic factorization [5, 6, 7,
8, 23]. Thi symbolic factorization of a matrix of the form AAT is given in Algorithm 1
(see [9, 20]).

Algorithm 1 (Symbolic factorization of AAT)
7(j) = 0 for each j
for j = 1 to m do

Li = D}Iu U L,\{c}]u U Ak
c (ETl(j) / \min Ak=j
7(j) = min j \ {j}
end for

Algorithm 1 basically says that the pattern of column j of C can be expressed as
the union of the patterns of each column of L whose parent is j and the patterns of the
columns of A whose first nonzero element is j. Thi elimination tree, connecting each
child to its parent, is easily formed during the symbolic factorization. Algorithm 1
can be done in O(|C\ + JAI) time1 [9, 20].
A -mptotic !il 1. -1 notation is defined in [2]. We write f(n) = O(-,f.,i') if there exist positive
constants c and no such that 0 < f(n) < .,.'.!i for all n > no.

Observe that the pattern of the parent of node j contains all entries in the pattern
of column j except j itself [22]. T11,ii is,

,\fA}==Ln{i:7nj) i}C (j).
Proceeding by induction, if k is an ancestor of j, then

{i : i j,k < i} c .k (1)

This leads to the following relation between Lj and the path P(j). TIl, first part of
this proposition, and its proof, is given in [22]. Our proof differs slightly from the one
in [22]. We include it here since the same proof technique is exploited later.

Proposition 3.1 For each j, we have j C 'P(j); furthermore, for each k and j E
P(k), Ij C P(k).

Proof. Obviously, j G P(j). Let i be any given element of Cj with i \$ j. Since
j < i, we see that the following relation holds for I = 0:

<0((j< < ) < < (j) < i. (2)

Now suppose that (2) holds for some integer I > 0, and let k denote ('(j). By (1)
and the fact that k < i, we have i G k, which implies that

i > 7(k) = 7 (71()) = 7+11).

Hence, either i = tr+~(j) or (2) holds with I replaced by I + 1. Since (2) is violated
for I sufficiently large, we conclude that there exists an I for which i = tr+1(j).
Consequently, i 'P(j). Since each element of Lj is contained in P(j), we have
j C 'P(j). If j G 'P(k) for some k, then j is an ancestor of k, and P(j) C P(k).
Since we have already shown that Lj C P(j), the proof is complete. O
As we will see, the symbolic factorization of AAT + wwT can be obtained by
updating the symbolic factorization of AAT using an algorithm that has the same
structure as that of Algorithm 1. Thl new pattern Lj is equal to the old pattern Lj
union entries that arise from new children and from the pattern of the new column
w. (We put a bar over a matrix or a set or a multiset to denote its value after the
update or downdate is complete.)
Downdating is not as easy. Once a set union has been computed, it cannot be
undone without knowledge of how entries entered the set. We can keep track of
this information by storing the elements of L as multisets rather than as sets. Til,
multiset associated with column j has the form

= {(i, m(i,j)) :i G L},

where the multiplicity m(i, j) is the number of children of j that contain row index i
in their pattern plus the number of columns of A whose smallest entry is j and that
contain row index i. Equivalently,

m(i,j) = I{k E 7-(j) : i C4k}l + I{k : min Ak = and i C Ak}\.

With this definition, we can undo a set union by subtracting multiplicities.
We now define some operations involving multisets. First, if X is a multiset
consisting of pairs (i, m(i)) where m(i) is the multiplicity associated with i, then X
is the set obtained by removing the multiplicities. In other words, the multiset X
and the associated base set X satisfy the relation:

X = {(i, rn(i)): i c X}.

We define the addition of a multiset X and a set y in the following way:

X + y = {(i, m'(i)) : i X or i C Y},

where
S1 if i X and i Y,
r'(i) = r(i) if i X and i i Y,
r(i) + 1 if i X and i Y.
Similarly, the subtraction of a set y from a multiset X is defined by

X 3 = {(i, m'(i)) : i C X and m'(i) > 0},

where
m() m(i) if i Y,
rn(i) 1 ifi Y.
Ti1 i multiset subtraction of y from X undoes a prior addition. Tli,~ is, for any
multiset X and any set y, we have

((X+ Y) ) = X

In contrast ((X U y) \ y) is equal to X if and only if X and Y are disjoint sets.
Algorithm 2 below performs a symbolic factorization of AAT, with each set union
operation replaced by a multiset addition. This algorithm is identical to Algorithm 1
except for the bookkeeping associated with multiplicities.

Algorithm 2 (Symbolic factorization of AAT, using multisets)
7(j) = 0 for each j
forj = 1 to do

L= {(j, 1)}
for each c C 7-1(j) do
/L2 = L. + (L, \ {c})
end for
for each k where min Ak = j do

end for
7(j) = min L \ {j}
end for

We conclude this section with a result concerning the relation between the patterns
of AAT and the patterns of AAT + wwT.

Proposition 3.2 Let C and D be the patterns associated with the symmetric positive
definite matrices C and D respectively. Neglecting numerical cancellation, Cj C Dj
for each j implies that (c)j C (LD)j for each j, where Cc and CD are the patterns
associated with the Cli,, I. -.j factors of C and D respectively.

Proof. In [8, 21] it is shown that an edge (i,j) is contained in the graph of the
C111, I -.y factor of a symmetric positive definite matrix C if and only if there is a path
from i to j in the graph of C with each intermediate vertex of the path between 1 and
min {i,j}. If Cj C Dj for each j, then the paths associated with the graph of C are
a subset of the paths associated with the graph of D. It follows that (c)j C (CD)j
for each j. r
Ignoring numerical cancellation, the edges in the graph of AAT are a subset of the
edges in the graph of AAT +wwT. By Proposition 3.2, we conclude that the edges in
the graphs of the associated Cl1 I -I y factors satisfy the same inclusion. As a result,
if the columns of A and the vectors w used in the update AA + wwT are all chosen
from the columns of some fixed matrix B, and if a fill-reducing permutation P can be
found for which the C11.1 I-ly factors of PBB PT are sparse ([1] for example), then
by Proposition 3.2, the Cl1, .1 -.y factors of PAATPT and of P(AAT + wwT)pT will
be at least as sparse as those of PBBTPT.

4 Modifying the symbolic factors

Let A be the modified version of A. Again, we put a bar over a matrix or a set or a
multiset to denote its value after the update or downdate is complete. In an update
A is obtained from A by appending the column w on the right, while in a downdate,
A is obtained from A by deleting the column w from A. Hence, we have

AAT = AAT + OwwT,

where a is either +1 and w is the last column of A (update) or a is -1 and w is a
column of A (downdate). Since A and A differ by at most a single column, it follows
from Proposition 3.2 that Lj C j for each j during an update, while Lj C Lj during
a downdate. '. i reover, the multisets associated with the C11 .1 -Iy factor of either the
updated or downdated matrix have the structure described in the following theorem:

Theorem 4.1 Let k be the index associated with the first nonzero component of w.
For an update, P(k) C P(k) and - = C2 for all i E P(k)', the complement of P(k).
T1,.., is, 4i = C for all i except when i is k or one of the new ancestors of k. For
a downdate, P(k) C P(k) and = for all i P(k)Y. T., is, i = for all i
except when i is k or one of the old ancestors of k.

Proof. To begin, let us consider an update. We will show that each element of P(k)
is a member of P(k) as well. Clearly, k lies in both P(k) and P(k). Proceeding by
induction, suppose that 7r (k), 7l(k), 72 (k), j(k) c P(k), and define I= 7 (k).
We need to show that 7r() = r(jr(k)) = 7r+l(k) c P(k) to complete the induction.
Since I G P(k), we have I= wh(k) for some h. If w(l) = r(l), then
Th+1 (k) = T(7h(k)) = T(1) = 7"(1) = 7 (7iJ(k)) = 7iJ+1(k).
Since 7J+l(k) = h+l(k) c P(k), the induction step is complete, and 7J+l(k) c P(k).
If w(l) 7 r(1), then by Proposition 3.2, T(1) < r(l), and the following relation
holds for p = 1:
S(1) < y2( < .. < pP(1) < 7 (1). (3)
Now suppose that (3) holds for some integer p > 1, and let q denote 7P(1). By
Proposition 3.2, 7r() c 1 C Ci, and combining this with (3), q = 7w(1) < 7r() G 1.
It follows from (1) that r(1) G q for q = 7P(1). By the definition of the parent,

7(q) = 7(TP(I)) = TP1(1) < 7(1) G q.
Hence, either wr+1(1) = 7 (1) or (3) holds with p replaced by p + 1. Since (3) is
violated for p sufficiently large, we conclude that there exists an integer p such that
TP(1) = r(1), from which it follows that
7rj1(k) = 7(J(k)) = 7(I) = P(1) = p(Th((k)) = ph(k) c P (k).

Since 7J+l(k) c P(k), the induction step is complete and 'P(k) C P(k).
Suppose that I c P(k)c. It is now important to recall that k is the index of the
first nonzero component of w, the vector appearing in the update. Observe that I
cannot equal k since I G P(k)Y and k E P(k). Th, proof that = C is by induction
on the depth d defined by

d(1) = max{i : 7r'(j) = I for some j}.

If d(1) = 0, then I has no children, and the child loop of Algorithm 2 will be skipped
when either 4 or C are evaluated. And since I \$ k, the pattern associated with w
cannot be added into Hence, when d(l) = 0, the identity I = L is trivial. Now,
assuming that for some p > 0, we have i = L whenever I G P(k)e and d(l) < p,
let us suppose that d(l) = p + 1. If i c 7-1(1), then d(i) < p. Hence, , = L for
i G 7-1(1) by the induction assumption. And since I \$ k, the pattern of w is not
added to C,. Consequently, when Algorithm 2 is executed, we have < = , which
completes the induction step.
Now consider the downdate part of the theorem. Rearranging the downdate
relation AA = AA wwT' we have

AAT = AA + wwT.

Hence, in a downdate, we can think of A as the updated version of A. Consequently,
the second part of the theorem follows directly from the first part. ]

4.1 Symbolic update algorithm
We now present an algorithm for evaluating the new pattern C associated with an
update. Based on Ti1r riem 4.1, the only sets Cj that change are those associated
with P(k) where k is the index of the first nonzero component of w. Referring to
Algorithm 2, we can set j = k, j = w(k), j = w2(k), and so on, marching up the path
from j, and evaluate all the changes induced by the additional column in A. In order
to do the bookkeeping, there are at most four cases to consider:
Case 1. j = k: At the start of the new path, we need to add the pattern for w to CL.

Case 2. c G P(k), j G P(k), j > k, and c c wr-(j) Fn r-1(j): In this case, c is a child
of j in both the new and the old elimination tree. Since the pattern L, may
differ from L,, we need to add the difference to j. Since j has a unique child
on the path P(k), there is at most one node c that satisfies these conditions.
Also note that if
c Gr nl(j) r-(J) n P(k),
then by T1 i'rem 4.1, LC = C, and hence, this node does not lead to an

Case 3. j G P(k), j > k, and c c w-l(j) \ 7-1(j): In this case, c is a child of j in the
new elimination tree, but not in the old tree, and the entire set c should be
added to since it was not included in C. By T!i -rem 4.1, w(p) = r(p) for
all p G P(k). Since c e wr1(j), but c w r1l(j), it follows that r(c) = j r(c),
and hence, c P(k)c, or equivalently, c c P(k). Again, since each node on the

path '(k) from k has only one child on the path, there is at most one node c
satisfying these conditions, and it lies on the path P(k).

Case 4. j c P(k), j > k, and c c 7-r1(j) \ -'l(j): In this case, c is a child of j
in the old elimination tree, but not in the new tree, and the set L, should be
subtracted from since it was previously added to L. Since r(p) = 7(p) for
each p G P(k)', the fact that r(c) = j 7(c) implies that c G P(k). In the
algorithm that follows, we refer to nodes c that satisfy these conditions as lost
children.

In each of the cases above, every node c that led to adjustments in the pattern was
located on the path P(k). In the detailed algorithm that appears below, we simply
march up the path from k to the root making the adjustments enumerated above.

Algorithm 3 (Symbolic update, add new column w)
Case 1: first node in the path
k = min {i : .' 7 0}
S= L + {i: P 0}
r(k) = min \ {k}
c=k
j = w(c)
while j 0 do
if j = r(c) then
Case 2: c is an old child of j, possibly changed
L = L + (L \ L,)

else

Case 3: c is a new child of j and
place c in lost-child-ueue of (c)
place c in lost-child-queue of 7(c)

end if
Case 4: consider each lost child of j
for each c in lost-child-queue of j do
= (, \ {c})
end for
m(j) = min L \ {j}
c=j
j = 7f(c)
end while
= . and r(j) = 7(j) for all j P(k)Y
end Algorithm 3

a lost child of 7r(c)

ThI time taken by this algorithm is given by the following lemma.

Lemma 4.2 T7 time to execute Algorithm 3 is bounded above by a constant times
the number of entries associated with patterns for nodes on the new path P(k). T,..i
is, the time is

Proof. In Algorithm 3, we simply march up the path P(k) making adjustments to
as we proceed. At each node j, we take time proportional to I j, plus the time
taken to process the children of j. Each node is visited as a child c at most twice,
since it falls into one or two of the four cases enumerated above (a node c can be a
new child of one node, and a lost child of another). If this work (proportional to I\,
or I,c) is accounted to step c instead of j, the time to make the adjustment to the
pattern is bounded above by a constant times either ICjl or ICjl. Since ICj < ICjl
by ThI -rem 4.1, the proof is complete. O
In practice, we can reduce the execution time for Algorithm 3 by skipping over
any nodes for which = T11., is, if the current node j has no lost children, and
if its child c falls under case 2 with = -c, then we have C = and the update
can be skipped. Til, execution time for this modified algorithm, which is the one we
implement, is

P(k) jEP(k)AL j

4.2 Symbolic downdate algorithm
Let us consider the removal of a column w from A and let k be the index of the first
nonzero entry in w. Thl symbolic downdate algorithm is analogous to the symbolic
update algorithm, but the roles of P(k) and P(k) are interchanged in accordance
lost child queues, we have new child queues, instead of walking up the path P(k), we
walk up the path P(k) D P(k).

Algorithm 4 (Symbolic downdate, remove column w)
Case 1: first node in the path
k = min {i : -. 0}
=L {i: 0}
7r(k) = min k \ {k}

c=k
j = 7(c)
while j \$ 0 do
if j = 7(c) then
Case 2: c is an old child of j, possibly changed

else
Case 3: c is a lost child of j and a new child of w(c)
L = ; (c \ {c})
place c in new-child-queue of r(c)
end if
Case 4: consider each new child of j
for each c in new-child-queue of j do
12j = L + (L \{c})
end for
w(j) = min j \ {j}
c=j
j = (c)
end while
i = ^L and T(j) = 7(j) for all j E P(k)
end Algorithm 4
Similar to Algorithm 3, the execution time obeys the following estimate:
Lemma 4.3 T',, time to execute Algorithm 4 is bounded above by a constant times
the number of entries associated with patterns for nodes on the old path P(k). T1,.i
is, the time is

O ( E ) L|l)
EP(k) j
Again, we achieve in practice some speedup when we check whether or not C2}
changes for any given node j G P(k). Ti, time taken by this modified Algorithm 4
is

O (P(k) jP(k) I

5 The numerical factors

When we add or delete a column in A, we update or downdate the symbolic
factorization in order to determine the location in the C11, -l.y factor of either new

nonzero entries or nonzero entries that are now zero. Knowing the location of the
nonzero entries, we can update the numerical value of these entries. We first consider
the case when A and L are dense and draw on the ideas of [15]. Tli, n we show how
the method extends to the sparse case.

5.1 Dense matrices
Our algorithm to implement the numerical update and downdate is based on the
.1i, -!di C5 in [15] for dense matrices. To summarize their approach, we start by
writing
AA = AAT + TwwT LLT + (wwT = L(I + avvT)LT, (4)
where v = L-w. Observe that

I + (vvT = (I + 7vvT)2, (5)

where 7 = c/(1 + i1 + (c v| -). T11, quantity 1 + ( 1| -' is positive on account
of our assumption that both AAT and AA are positive definite. Tli,. is, since
AA = L(I + avvT)LT, the matrices AA and I + (TvvT are congruent, and by
Sylvester's law of inertia (see [24]), they have the same number of positive, negative,
and zero eigenvalues. Tli, eigenvalues of I + avvvT are one, with multiplicity m 1
(corresponding to the eigenvectors orthogonal to v) and 1 + (T v | (corresponding
to the eigenvector v). Since AA is positive definite, its eigenvalues are all positive,
which implies that 1 + (T -' > 0.
Combining (4) and (5), we have

AA = L(I + 7vvT)(I + 7VVT)LT.

A sequence of Givens rotations, the product being denoted G1, is chosen so that
v'G1 = v-| pe where el is the vector with every entry zero except for the first entry.
Tli, matrix (I + 7vvT)GI has a lower Hessenberg structure. A second sequence of
Givens rotations, the product being denoted G2, is chosen so that (I + rvvT)GiG2
is a lower triangular matrix, denoted G, of the following form:

_71 62
G = V371 V372 63

Vm'71 Vm'72 Vm73 ... m

where 6 and 7 are vectors computed in Algorithm 5 below. Ti, diagonal elements of
G are given by gjj = 6j, while the elements below the diagonal are gij = yj.

Algorithm 5 (Compute G, dense case)
v = L-1w
w = max {i : vi 0}
P= %1
if w = 1 then
s5 = V~ + up2
else
t Vw
find G1, to zero out all but first entry of v:
forj = w 1 down to 1 do

3j = Vj/(st)
sj = t/s
t= S
end for
q = 1/p + ap/(l + V + p2)
find G2, and combine to obtain G:
for j = 1 to w 1 do
6 = ^(p))2 + s2

c (vj)/8j
s8 -Sj/6j

77 = C[}j 8 T]
end for
6, = vUjT]
end if
6i = 1 for w < i < m
end Algorithm 5

In Algorithm 6 below, we evaluate the new C11i,.1 -I.y factor L = LG, without
forming G explicitly [15] by taking advantage of the special structure of G. Tli,
product is computed column by column, moving from right to left. In practice, L can
be overwritten with L.

Algorithm 6 (Compute L = LG, dense case)
tl...n = 0
for j =w down to 1 do
for i j tom do
lij = 6jlij tiyj

ti = ti Vj lij
end for
end for
end Algorithm 6

Observe that about 1/3 of the multiplies can be eliminated if L is stored as a
product LD, where D is a diagonal matrix. Thl components of 6 can be absorbed
into D, with 7 adjusted accordingly, and 8 in Algorithm 6 can be eliminated. VWe did
not exploit this simplification for the numerical results reported in 8.

5.2 The sparsity pattern of v
In the sparse case, v = L-1w is sparse, and its nonzero pattern is crucial. Since the
elements of G satisfy gij = vyj for each i > j, we conclude that only the columns
of L associated with the nonzero components of v enter into the computation of L.
Thi nonzero pattern of v can be found using the following lemma.

Lemma 5.1 TI/, nodes reachable from any given node k by path(s) in the directed
.,.i,,l' G(LT) coincide with the path P(k).

Proof. If P(k) has a single element, the lemma holds. Proceeding by induction,
suppose that the lemma holds for all k for which |P(k)| < j. Now, if P(k) has j + 1
elements, then by the induction hypothesis, the nodes reachable from r(k) by path(s)
in the directed graph G(LT) coincide with the path P(r(k)). Ti, nodes reachable
in one step from k consist of the elements of k4. By Proposition 3.1, each of the
elements of LC is contained in the path P(k). If i C k, i 7 k, then |P(i)| < j.
By the induction hypothesis, the nodes reachable from i coincide with P(i) C P(k).
T!i, nodes reachable from k consist of {k} union the nodes reachable from Lk. Since
k C P(k), it follows that the nodes reachable from k are contained in P(k). On
the other hand, for each p, the element of LT in row rP(k) and column P1 l(k) is
nonzero. Hence, all the elements of P(k) are reachable from k. Since the nodes in
P(k) coincide with the the nodes reachable from k by path(s) in the directed graph
G(LT), the induction step is complete. D

Theorem 5.2 During symbolic downdate AA = AAT wwT (where w is a column
of A), the nonzero pattern of v = L- w is equal to the path P(k) in the (old)
elimination tree of L where

k = min {i : ,' 0}. (6)

Proof. Let W = {i : .,' 0}. Thl..rem 5.1 of Gilbert [10, 11, 14] states that
the nonzero pattern of v is the set of nodes reachable from the nodes in W by paths

in the directed graph G(LT). By Algorithm 1, W C Ck. Hence, each element of W
is reachable from k by a path of length one, and the nodes reachable from W are a
subset of the nodes reachable from k. Conversely, since k E W, the nodes reachable
from k are a subset of the nodes reachable from W. Combining these inclusions, the
nodes reachable from k and from W are the same, and by Lemma 5.1, the nodes
reachable from k coincide with the path P(k). [

Corollary 5.3 During symbolic update AA = AAT + wwT, the nonzero pattern
of v = L- w is equal to the path P(k) in the (new) elimination tree of L where k is
defined in (6).

Proof. Since LLT = AA wwT, we can view L as the Cl i. -l.y factor for the
downdate AAT wwT. Hence, we can apply Thi -rem 5.2, in effect replacing P by
P. n

5.3 Sparse matrices
In the dense algorithm presented at the start of this section, we write

AA = L(I + TvvT)(I + TVT)LT (7)

where v = L-1w. To be specific, let us consider the case where (7) corresponds to
an update. T1ll nonzero elements of I + TVVT either lie along the diagonal or at
the intersection of rows i G P(k) and columns j G P(k), where k is defined in (6).
In essence, we can extract the submatrix of I + TVVT corresponding to these rows
and columns, we can apply Algorithm 5 to this dense submatrix, we can modify the
submatrix of L consisting of rows and columns associated with P(k), and then we
can reinsert this dense submatrix in the m by m sparse matrix L. At the algebraic
level, we can think of this process in the following way: If P is the permutation matrix
with the property that the columns associated with P(k) are moved to the first |P(k)|
columns by multiplication on the right, then we have

AAT = P(PTLP)(PT(I + TVVT)P)(PT(I + TVVT)P)(PTLTP)PT.

Thl matrix pT(I + TvvT)P is a 2 by 2 block diagonal with the identity in the lower
right corner and a dense upper left corner V,

(PTLP) (PT(I vT)P) = L11 L12 0

(L11V L12
0 L22 (8)

Til, elements of L21 correspond to the elements lij of L for which i 'P(k)" and
j E P(k). By Proposition 3.1, Cj C j C P(k) when j E P(k). Since P(k)" C 1j, it
follows that i c - when i G P(k)c. Hence, lij = 0 and L21 = 0. Note that in general
L12 \$ 0, which is why Algorithm 6 is column-oriented. When (8) is multiplied by the
Givens rotations computed by Algorithm 5, we obtain

(pTLP) ( L11V L12 GG2 0
0 L22 0 I
(LIuG L12
0 L22
(L11 L 12 (
0 L22 (9)

We apply P to the left and PT to the right in (9) to obtain L. "'. i1 r L12 nor L22
are modified.
When (7) corresponds to a downdate, the discussion is the same as for an
update except that P(k) replaces P(k), and the role of Lj and Lj are interchanged.
In summary, the sparse update or downdate of a C11.1, -.y factorization can be
accomplished by first evaluating v = L-lw whose nonzero entries are contained
in the path P(k) or P(k) where k is defined in (6). We apply the dense Algorithm 5
to the vector of (symbolically) nonzero entries of v. Tih ni we update the entries in L
in the rows and columns associated with indices in P(k) or P(k) using Algorithm 6.
As the following result indicates, Algorithm 6 does not have to be applied to all
of LI because of its specific structure.

Proposition 5.4 During symbolic downdate, LI is a lower triangular matrix with
a dense profile. T1,.' is, for each row i > 1, there is an integer p < i such that
[Lll]i = 0 for all 1 < j < p, and [Lll]ij 0 for all p < j < i.

Proof. Thi rows and columns columns of LI correspond to the nodes in the path
P(k). We assume that the nodes have been relabeled so that

P(k) {1,2, ... ,|P(k)|}.

It follows that [Lill]i,i- 0 for all i > 1. If [Llln]p 0 for some p, then i G p, and
by (1), if j G P(p) and j < i, then i G Cj. Consequently, [Ll]ij \$- 0 for all p < j < i.
Letting p be the smallest such index p, the proof is complete. O

Corollary 5.5 During symbolic update Ll is a lower triangular matrix with a dense
profile. Tl.1', is, for each row i > 1, there is an integer pi < i such that [Lll]j = 0
for all 1 < j < pi, and [ll]ij \$ 0 for all pi < j < i.

Proof. This follows immediately from Proposition 5.4, replacing P(k) with P(k).

As a consequence of Proposition 5.4, Corollary 5.5, and Proposition 3.2, we can
skip over any index i in Algorithm 6 for which j < pi. When Algorithm 6 is applied
to the sparse L, as opposed to the dense submatrix, the indices j and i take values
in P(k) and Cj respectively for a downdate, while they take values in P(k) and j,
respectively, for an update.

6 Arbitrary symbolic and numerical factors

TIi, methods we have developed for computing the modification to the C11.1 -l.y
factors of AAT corresponding to the addition or deletion of columns in A can be
used to determine the effect on the CI11 .I -Iy factors of a general symmetric positive
definite matrix M of any symmetric change of the form M + o-wwT that preserves
positive definiteness. We briefly describe how Algorithms 1 through 6 are modified
for the general case.
Let AMj denote the nonzero pattern of the lower triangular part of M:

Mj= {i: mij 0 and i < j}.

Ti, symbolic factorization of M [5, 6, 7, 8, 23] is obtained by replacing the union of
Ak terms in Algorithm 1 with the set MAj. With this change, Cj of Algorithm 1 is
given by

Li= {j} U cf }- uM .

This leads to a change in Algorithm 2 for computing the multiplicities. Til,
multiplicity of an index i in Lj becomes

m(i,j) = I{k G r-1(j) : i G Ck}l + (1 if i G AMj, or 0 otherwise).

TI, for loop involving the Ak terms in Algorithm 2 is replaced by the single statement
} = L + M4j. '-. i, re precisely, we have:

for each k where min Ak = j do

end for

Entries are removed or added symbolically from AAT by the deletion or addition
of columns of A, and numerical cancellation is ignored. i:' i 1, cancellation of
entries in M should not be ignored, however, because this is the only way that entries
can be dropped from M. When numerical cancellation is taken into account, neither

of the inclusions AMj C AMj or AMj C AMj may hold. We resolve this problem by using
a symbolic modification scheme with two steps: a symbolic update phase in which
new nonzero entries in M + afwwT are taken into account, followed by a separate
symbolic downdate phase to handle entries that become numerically zero. Since each
modification step now involves an update phase followed by a downdate phase, we
attach (in this section) a overbar to quantities associated with the update and an
underbar to quantities associated with the downdate.
Let W be the nonzero pattern of w, namely W = {i : .' 0}. In the first
symbolic phase, entries from W are symbolically added to AMj for each j E W. Th1,
is, if i M4j, but i,j E W with i > j, then we add i to Mj:

Mj = Mj U {i W: i >j}.

In the second symbolic phase, entries from W are symbolically deleted for each j E W:

Mj = \{i W:i > j, m &j+c. =0}. (10)

In practice, we need to introduce a drop tolerance t and replace the equality

m ij + (T,.' Wj = 0

in (10) by the inequality Imnj + (T.' .'" I < t. For a general matrix, the analogue of
Tl, .rem 4.1 is the following:

Theorem 6.1 If a is the first index for which Ma 4 Ma, then P(a) C P(a)
and 4 = C for all i G P(a).'- If 3 is the first index for which M MA,, then
P(;) C P(;3) and = < for all i G (;3).

In evaluating the modification in the symbolic factorization associated with
M + awwT, we start at the first index a where Ma Ma and we march up
the path P(a) making changes to 1 obtaining 12. In the second phase, we start at
the first index where M 4 M and we march up the path P(/3) making changes to
Cj, obtaining 12. T11 analogue of Algorithm 3 in the general case only differs in the
starting index (now a) and in the addition of the sets M4j \ Mj in each pass through
the j-loop:

Algorithm 7a (Symbolic update phase, general matrix)
Case 1: first node in the path
a = min {i : Mi Mi}
r(a = min a \ Ma
7(4G) = min 2a \{}

c =a
j = (c)
while j 0 do
C + M, j \ M
if j = r(c) then
Case 2: c is an old child of j, possibly changed

else
Case 3: c is a new child of j and a lost child of r(c)
;=;+ (L \ {c})
place c in lost-child-queue of r(c)
end if
Case 4: consider each lost child of j
for each c in lost-child-queue of j do
L = L (L' \ { c})
end for
W(j) = min Lj \ {j}
c=j
j = I (c)
end while
- = c. and T(j) = 7(j) for all E P(a)9
end Algorithm 7a

Similarly, the analogue of Algorithm 4 in the general case only differs in the
starting index (now /3), in the subtraction of the sets and AMj \ Mj in each pass
through the j-loop.

Algorithm 7b (Symbolic downdate phase, general matrix)
Case 1: first node in the path
f = min {t : Mi Mi}

_([3) = min LC2 \ {3}
c= 3
j w= 7(c)
while j \$ 0 do
4 = c-M\M
if j = 7(c) then
Case 2: c is an old child of j, possibly changed
/21 -- (1 L(C\ IL)

else
Case 3: c is a lost child of j and a new child of r(c)

place c in new-child-queue of r(c)
end if
Case 4: consider each new child of j
for each c in new-child-queue of j do
S= 4 + (L \ {c})
end for
_(j) = min L \ {j}
c=j
j = (c)
end while
4 = and 7(j) = (j) for all j P( C)
end Algorithm 7b

Algorithms 5 and 6 are completely unchanged in the general case. Ti can be
applied after the completion of Algorithm 7b so that we know the location of new
nonzero entries in the CI11 1 -l.y factor. Ti ,- process the submatrix associated with
rows and columns in P(k) where k is the index of the first nonzero element of w.
When M has the form AAT and when M is gotten by either adding or deleting a
column in A, then assuming no numerical cancellations, Algorithm 7b can be skipped
when we add a column to A since Mj = Mj for each j. Similarly, when a column
is removed from A, Algorithm 7a can be skipped since AMj = AMj for each j. Hence,
when Algorithm 7a followed by Algorithm 7b are applied to a matrix of the form
AAT, only Algorithm 7a takes effect during a update while only Algorithm 7b takes
effect during a downdate. Tii-. the approach we have presented in this section for an
arbitrary symmetric positive definite matrix generalizes the earlier approach where
we focus on matrices of the form AA .

7 Implementation issues

In this section, we discuss implementation issues in the context of updates and
downdates to a matrix of the form AAT. A similar discussion applies to a general
symmetric positive definite matrix. We assume that the columns of the matrix A
in the product AAT are all chosen from among the columns of some fixed matrix
B. Ti update and downdate algorithms can be used to compute the modification
to a C11i., -I.y factor corresponding to additions or deletions to the columns of A.
Thil CI11h,. -y factorization of the initial AAT (before any columns are added or
deleted) is often preceded by a fill reducing permutation P of the rows and columns.

For example, we could compute a permutation to reduce the fill for BBT since the
C11 ., -b.y factors of AAT will be at least as sparse as those of BBT by Proposition 3.2,
regardless of how the columns of A are chosen from the columns of B. Based on the
number of nonzeros in each column of the C 111 -I-ly factors of BBT, we could allocate
a static storage structure that will always contain the C 11 1 -y factors of each AAT.
On the other hand, this could lead to wasted space if the number of nonzeros in the
Cl i.1 -b.y factors of AAT are far less than the number of nonzeros in the C11 1 I -.y
factors of BBT. Alternatively, we could store the C11 .1, -l.y factor of the current AAT
in a smaller space, and reallocate storage during the updates and downdates, based
on the changes in the nonzero patterns.
TIi, time for the initial C!,,,L -ly factorization of AAT is given by (see [8]) the
following expression:

0 El j 2)
j=1 /
which is O(m3) if L is dense. Each update and downdate proceeds by first finding
the new symbolic factors and the required path (P(k) for updating, and P(k) for
downdating), using Algorithm 3 or 4 (modified to skip in constant time any column
4 that does not change). Algorithms 5 and 6 are then applied to the columns in the
path. Tli, lower triangular solve of the system Lv = w in Algorithm 5 can be done
in time

^EP(k) j
during downdating, and

(jEP(k)
during updating [14]. Tli remainder of Algorithm 5 takes time 0(|P(k)|) during
downdating, and O(|P(k)|) during updating. Our sparse form of Algorithm 6
discussed in 5.3, which computes the product L = LG, takes time

(jEP(k) )
during downdating, and

(Ep(k) /
during updating. Since 2j C Cj during an update, it follows that the asymptotic
time for the entire downdate or update, both symbolic and numeric, is equal to the

asymptotic time of Algorithm 6. This can be much less than the O(m2) time taken
by Algorithms 5 and 6 in the dense case.

8 Experimental results

We have developed '.1 .111 codes to experiment with all the algorithms presented in
this paper, including the algorithms of 6 for a general symmetric, positive definite
matrix. In this section, we present the results of a numerical experiment with a large
sparse optimization problem from :'1 !!il, [3]. Ti, computer used for this experiment
was a '.idel 170 UltraSparc, equipped with 25(6,'.1B of memory, and with '..111, I11
Version 4.2c.

8.1 Experimental design
We selected an optimization problem from airline scheduling (DFL001). Its constraint
matrix B is 6071-by-12,230 with 35,632 nonzeros. Ti, matrix BBT has 37,923
nonzeros in its strictly lower triangular part. Its CI11, -l.y factor LB has 1.49 million
nonzeros (with a fill-minimizing permutation PB of the rows of B, described below)
and requires 1.12 billion floating-point operations and 115 seconds to compute (the
LP Dual Active Set Algorithm does not require this matrix, however, as noted earlier,
this is an upper bound on the number of nonzeros that can occur during the execution
of the LP DASA). This high level of fill-in in LB is the result of the highly irregular
nonzero pattern of B. Ti, basis matrix Ao corresponding to an optimal solution of
the linear programming problem has 5,446 columns.
We wrote a set of '.iAil.il scripts that implements our complete C11.1, -l.y
update/downdate algorithm, discussed in 7. We first found PB, using 101 trials
of '.i,, 1,,l 's column multiple minimum degree ordering algorithm (colmmd [12]), 100
of them with a different random permutation of the rows of B. We then took the
best permutation found. With this permutation (PB), the factor L of AoAT has 831
thousand nonzeros, and took 481 million floating-point operations and 51 seconds to
compute (using '.1, l,,lW's chol). Following the method used in LP DASA, we added
10-12 to the diagonal to ensure positive definiteness. We used the same permutation
PB for the entire experiment. Ti, initial symbolic factorization took 15 seconds
(Algorithm 2). It is this matrix and its factor that are required by the LP DASA.
We did not use '..,1tI1,1's sparse matrix data structure, since '.iI1,.1Ii removes
explicit zeros. C11i i I ing the nonzero pattern by a single entry can cause '.i,, i II to
make a new copy of the entire matrix. This would defeat the asymptotic performance
of our algorithms. Instead, the column-oriented data structure we use for , Q, and
L consists of three arrays of length ILBI, an array of length m that contains indices
to the first entry in each column, and an array of length m holding the number of

nonzeros in each column. Tli, columns are allocated so that each column can hold as
many nonzeros as the corresponding column of LB, without reallocation.
Starting with the optimal basis of 5,446 columns, we added one column at a time
until the basis included all 12,230 columns, and then removed them one at a time
to obtain the optimal basis again. Tli, total time and work required to modify the
factors was 76,598 seconds and 61.5 billion floating point operations. This divides
into

(a) 441 seconds for bookkeeping to keep track of the basis set,
(b) 6,051 seconds for the symbolic updates and downdates (Algorithms 3 and 4),
(c) 21,167 seconds to solve Lv = w (in Algorithm 5),
(d) 4,977 seconds for the remainder of Algorithm 5, and
(e) 43,962 seconds for Algorithm 6.
Algorithm 6 clearly dominates the modification algorithm. Tli, symbolic updates and
downdates took very little time, even though Algorithms 3 and 4 are not well suited
to implementation in i-.1, iII By comparison, using the CI,.1, -I.y factorization to
solve a linear system LLTx = b with a dense right-hand-side b (using our column-
oriented data structure for L) at each step took a total of 93,418 seconds and 67.7
billion floating point operations (each solve takes O(||) time). Note that this is
much higher than the time taken to solve Lv = w in Algorithm 5, because v and w
are sparse. Tli, time taken for the entire update/downdate computation would be
much smaller if our code was written in a compiled language. Solving one system
LLTx = b with a dense right hand side (using the factorization of the optimal basis)
takes 5.53 seconds using our column-oriented data structure, 1.26 seconds using a
.1, 11,,1 sparse matrix for L, and 0.215 seconds in Fortran 77.

8.2 Numerical accuracy

In order to measure the error in the computed C11.1, -l.y factorization, we evaluated
the difference A.A' LL i where LL is the computed Ci'.1 -l.y factorization.
For the airline scheduling matrix of 8, L has up to 1.49 million nonzeros and it is
impractical to compute the product LL after each update. To obtain a quick and
i- T
accurate estimate for E i, where E = AA LL we applied the strategy presented
in [16] (see [17, p. 139] for a symbolic statement of the algorithm) to estimate the
1-norm of the inverse of a matrix. Tliii is, we used a gradient accent approach to
compute a local maximum for the following problem:

max{I E\xl : 'x = 1}.

Since L is used multiple times in the following algorithm, we copied our data structure
for L into a '.1i11, 1I1 sparse matrix.

Algorithm 8 (Estimate 1-norm of an m by m matrix E)
xi = 1/m for 1 < i < m
p = 0 (p is the current estimate for E )
while E\ i > p do
p= Ex|l
for i = 1 to m do
S= 1 if (Ex), > 0
yi = -1 if (Ex)i < 0
end for
z ETy
j = arg max { zi : i = 1 to m}
if Ijl < zTx return
xi = 0 fori =1 to n
Xj = 1
end while
end Algorithm 8

To improve the accuracy of the 1-norm estimate, we used Algorithm 8 three times.
In the second and third trials, a different starting vector x was used as described in
[16]. Observe that Algorithm 8 only makes use of the product between the matrix
E and a vector. This feature is important in the context of sparse matrices since E
contains the term LL and it is impractical to compute the product LL but it is
practical to multiply LL by a vector. For the airline scheduling matrix of 8 the
values for El initially, at step 6,784, and at the end were 2.49 x 10-13, 1.01 x 10-10
and 1.54x 10-10 respectively. Ti! estimates obtained using Algorithm 8 were identical
at the same three steps. On the other hand, the times to compute E| i at the initial
step and at step 6,784 were 137.9 and 279.5 seconds, while the times for three trials of
Algorithm 8 were 8.7 and 14.7 seconds, respectively (excluding the time to construct
the :. i,, ll 11 sparse matrix for L). Our methods were quite accurate for this problem.
After 6,784 updates and 6,784 downdates, or 13,568 changes in A, the 1-norm of E
increased by only a factor 618! Figure 1 shows the estimated value of E i computed
every 10 steps using Algorithm 8. Tli, 1-norm of the matrix AAT increases from
458.0 initially to 1107.0 at iteration 6,784, then returns to 458.0 at iteration 13,568.
Hence, the product of the computed C11 .1 -l.y factors agrees with the product AAT
to about 15 significant digits initially, while the products agree to about 12 significant
digits after 13,568 modifications of A.

10000 12000 14000

Figure 1: Estimated 1-norm of error in Cli ., -I.y factorization

8.3 Alternative permutations

Our methods are optimal in the sense that they take time proportional to the number
of nonzero entries in L that change at each step. However, they are not optimal with
respect to fill-in, since we assume a single initial permutation, and no subsequent
permutations. A fill-reducing ordering of BBT might not be the best ordering to use
for all basis matrices. A simple pathological example is the m-by-n matrix B, where
n = m(m 1)/2, and the nonzero pattern of each of the column of B is a unique
pair of integers from the set {1, 2, - m}. In this case, every element of BBT is
nonzero, while the nonzero pattern of AAT is arbitrary. As the matrix A changes, it
might be advantageous to compute a fill-reducing ordering of AAT if the size of its
factors grow "too large." A refactorization with the new permutation would then be
required.
We found a fill-reducing permutation PA of the optimal basis AoA( (again, the
best of 101 trials of colmmd). This results in a factor L with 381 thousand nonzeros,
requiring only 169 million floating point operations to compute. This is significantly
less than the number of nonzeros (831 thousand) and floating point operations (481
million) associated with the fill reducing permutation for BB We also computed
an ordering of AAT at each step, using colmmd just once, and then computed the
number of nonzeros in the factor if we were to factorize A using this permutation
(Ps). Although it only takes about 1 second to compute the ordering [12] and symbolic

Nonzeros in L for three different permutations

Figure 2: Nonzeros in L using three different permutations

factorization [13], it is not practical to use 101 random trials at each step.
Figure 2 depicts the nonzero counts of L for these three different permutations,
at each of the 13,568 steps. Tli, fixed permutation PB results in the smooth curve
starting at 831 thousand and peaking at 1.49 million. Tli, fixed permutation PA
results in a number of nonzeros in L that starts at 381 thousand and rises quickly,
leaving the figure at step 1,206 and peaking at 7.4 million in the middle. It surpasses
PB at step 267. Using a permutation, Ps, computed at each step s, gives the
erratic line in the figure, starting at 390 thousand and peaking at 1.9 million in the
permutation PA, use it for 267 steps, and then refactorize with the permutation P,
computed at step 267. This results in a new factor with only 463 thousand nonzeros.
" 11, the center of the figure, however, the basis A includes most of the columns in
B, and in this case the PB permutation should be used.

9 Summary

We have presented a new method for updating and downdating the factorization LLT
of a sparse symmetric positive definite matrix AAT. Our experimental results show
that the method should be fast and accurate in practice. Extensions to an arbitrary
sparse symmetric positive definite matrix, M, have been discussed. We mention here

an additional extension to our work that would be useful.
We do not make use of the supernodal form of the factorization, nor do we use
the related compressed pattern of L [8]. Any method can be used for the numerical
factorization of the first basis matrix, of course, but the factor would then be copied
into a simple column-oriented data structure. Keeping the supernodal form has the
potential of reducing time taken by the symbolic factorization (Algorithm 2), the
symbolic update and downdate (Algorithms 3 and 4), and the numerical update and
downdate (dense matrix kernels could be used in Algorithms 5 and 6). However,
supernodes would merge during update, and split during downdate, complicating the
supernodal form of the factorization.

References

[1] P. A: 1..- i Y, T. A. DAVIS, AND I. S. DUFF, An approximate minimum degree
ordering algorithm, SIA':. J. '-.:lii Anal. Applic., 17 (1996), pp. 886-905.

[2] T. H. Coi: Ii-:.., C. E. LEISERSON, AND R. L. RIr 1 '. Introduction to
Algorithms, :. l' Press, Cambridge, :.i~--,I I!i-its, and :.1 Graw-Hill, Y:' "York,
1990.

[3] J. J. DONGARRA AND E. GRO'-'I Distribution of mathematical software via
electronic mail, Comm. AC':i 30 (1987), pp. 403-407.

[4] I. S. DUFF, A. M. Ei:I -:1 ,, AND J. K. REID, Direct Methods for Sparse
Matrices, London: Oxford Univ. Press, 1986.

[5] S. C. EISENSTAT, M. C. GURSKY, M. H. SCHULTZ, AND A. H. Sii :[:! ,N,
Yale sparse matrix package, I: T7,, symmetric codes, Internat. J. :"ii, i
.1 11. 1i-. Eng., 18 (1982), pp. 1145-1151.

[6] A. GEORGE AND J. W. H. LIU, T/., design of a user interface for a sparse
matrix package, A' ".1 Trans. :--.it Softw., 5 (1979), pp. 139-162.

[7] An optimal algorithm for symbolic factorization of symmetric matrices,
SIA '. Journal on Computing, 9 (1980), pp. 583-593.

[8] Computer Solution of Large Sparse Positive Definite Systems, Englewood
Cliff- I ., Jersey: Prentice-Hall, 1981.

[9] A. GEOI:[:L, J. W. H. LIU, AND E. NG, A data structure for sparse QR and
LUfactorizations, SIA'-. J. Sci. Statist. Comp., 9 (1988), pp. 100-121.

[10] J. R. CLii-.;li:i, Predicting structure in sparse matrix computations, Tech.
Report CS-86-750, Computer Science Dept., Cornell Univ., 1986.

29

[11] Predicting structure in sparse matrix computations, SIA:'. J. :.i1 11ii Anal.
Applic., 15 (1994), pp. 62-79.

[12] J. R. Gl[.!ill: Ir, C. MO,[l. :, AND R. S(illi: IiIl I:: Sparse matrices in MATLAB:
design and implementation, SIA:'.I J. '.i 1,11 i Anal. Appl., 13 (1992), pp. 333-356.

[13] J. R. G I[.!;lr:i, E. G. NG, AND B. W. PEYTON, An f. .1 algorithm
to compute row and column counts for sparse CI, .'I I factorization, SIA'.I J.
'.Lini \i Anal. Applic., 15 (1994), pp. 1075-1091.

[14] J. R. Cl .!; I 1: AND T. PI. i .I:.-, Sparse partial pivoting in time proportional
to arithmetic operations, SIA'.I J. Sci. Statist. Comput., 9 (1988), pp. 862-874.

[15] P. E. GILL, G. H. GOLUB, W. MURRAY, AND M. A. SAUNI,.I:-, Methods
for modifying matrix factorizations, '..111 !1, it.1 i, s of Computation, 28 (1974),
pp. 505-535.

[16] W. W. II ,I:i : Condition estimates, SIA:'.I J. Sci. Comput., 5 (1984), pp. 311
316.

[17] Applied Ai, ,ical Linear Algebra, Prentice-Hall, Englewood Cliff- NJ,
1988.

[18] Updating the inverse of a matrix, SIA:'. Review, 31 (1989), pp. 221-239.

[19] W. W. II ,I(:l :, C.-L. SHI, AND E. O. LUNDIN, Active set strategies in the LP
dual active set algorithm, tech. report, Dept. of '.ii,!l, iiit i, -' Univ. of Florida,
Gainesville, FL, Aug. 1996. Available at http://www.math.ufl.edu/-hager.

[20] J. W. H. Liu, T7,, role of elimination trees in sparse factorization, SIA:.I J.
'.,it iii Anal. Applic., 11 (1990), pp. 134-172.

[21] D. J. P i, I; R. E. TARJAN, AND G. S. LI..1.I.. I:, Algorithmic aspects of vertex
elimination on ./,l,'.I' SIA '.I J. Comput., 5 (1976), pp. 266-283.

[22] R. S('i:ll!;lI: A new implementation of sparse Gaussian elimination, AC'.I
Trans. :'.1, !II Softw., 8 (1982), pp. 256-276.

[23] A. H. Siili:! ., On the f. ,1! solution of sparse systems of linear and
nonlinear equations, Tech. Report 46, Yale Univ. Dept. of Computer Science,
Dec. 1975.

[24] G. STRANG, Linear Algebra and Its Applications, Academic Press, "- "York,
1980.