Penalized Maximum Likelihood Methods in Emission Tomography


Copyright 2006


Jeffrey A. Zahnen

This dissertation is dedicated to my grandmother, Lena Zahnen, for always believing in me.


For his patience and advice through the years, I offer my greatest thanks

to my advisor, Dr. Bernard Mair. In addition, I would like to thank Dr. David

Gilland, Dr. William Hager, Dr. James Keesling, Dr. Scott McCullough and

Dr. David Wilson for taking the time to serve on my committee and providing

suggestions to improve the work. Finally, I would like to thank my parents for their

unwavering support of my studies.


ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT



1.1 Introduction To PET
1.2 Mathematical Model for PET
1.3 PET Probability Function
1.4 Reconstruction Techniques
1.4.1 Maximum Likelihood
1.4.2 Penalized Maximum Likelihood (PML)
1.5 Minimizing Function

2 PML RECONSTRUCTION METHODS

2.1 Green's One Step Late Algorithm
2.1.1 Lange's Improvement to Green's OSL
2.1.2 Mair-Gilland Method
2.2 Accelerated Penalized Maximum Likelihood Method (APML)

3 GENERAL PML ALGORITHM

3.1 General Algorithm Statement
3.1.1 General Algorithm Update
3.1.2 Penalty Function Criteria
3.2 Proof of Convergence
3.3 New PML Algorithm
3.3.1 Non-Uniform Step-Size Method
3.3.2 Line Search Methods
3.3.3 Stopping Criteria


4.1 Algorithm Description
4.1.1 RM Algorithm
4.1.2 Image-Matching Penalty
4.1.3 Modified RM Iteration
4.2 Convergence Results

5 RECONSTRUCTIONS

5.1 PML Reconstructions
5.2 Modified RM Reconstructions
5.3 Conclusions


REFERENCES

BIOGRAPHICAL SKETCH


Table

4-1 RM voxel neighborhood
5-1 PML reconstruction data
5-2 RM data, α = 0.02, β = 0.005


Figure

1-1 A 2-D PET tube registering an emission
1-2 Tube regions
1-3 Sample neighborhood system with associated weights
1-4 Derivatives of the penalty terms
3-1 Armijo rule
5-1 Phantom chest source image
5-2 Comparing the decrease in the objective function
5-3 Comparing the root-mean-square (RMS) error
5-4 Phantom image reconstruction using bisection rule
5-5 Phantom image reconstruction using Armijo rule
5-6 Line plot of heart region
5-7 Comparing reconstruction algorithms
5-8 Modified RM function plots
5-9 Quiver plot of Slice 20
5-10 RM reconstructions

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Jeffrey A. Zahnen

December 2006

Chair: Bernard Mair
Major Department: Mathematics

Positron emission tomography (or PET) has been utilized for years in medical

diagnosis. The PET reconstruction problem consists of recovering the image from

the data gathered by the detectors in the machine. Penalized maximum likelihood

techniques allow for the recovery of the image with a smoothing term to improve

the image. Previous methods using this technique forced the use of a weak penalty

term in order to gain convergence. The new method presented here removes this restriction

while retaining convergence. In addition, the method is extended to the problem of

recovering multiple PET images along with a vector describing the motion between

the images.


1.1 Introduction To PET

Emission tomography has been used for years to study metabolic processes

in the body. In this procedure, a patient is injected with a short-lived radioactive

isotope. While the substance is being absorbed by the tissues in the body, the

patient is placed in a scanner. As the isotope decays, it begins to emit positrons.

These emitted positrons travel only a short distance before interacting with

a nearby electron. In positron emission tomography (PET), the subsequent

annihilation produces a pair of photons traveling in opposite directions on a line

oriented in a uniformly random direction. A PET scanner consists of individual

detector crystals arranged around the patient in a cylindrical manner. Each pair

of detectors forms a virtual detection tube. Figure 1-1 displays a tube formed by

detectors α and β. A near-simultaneous arrival of two photons at opposite ends of




Figure 1-1: A 2-D PET tube registering an emission

the tube is then registered as a single detection. The tube shown in Figure 1-1 will

register the displayed emission at point B but not the emission at point A.

In single photon emission computed tomography (SPECT), a single photon is

emitted. In this method (more widely applied than PET due to the relative costs

of the scanning machines), the detectors are located only on one side of the patient

and a single detection results in a count. This scan is performed for different angles

between the patient and detector to provide a full image.

In both procedures, the aim is to determine the number of emissions from

the various locations in the patient. The resulting intensity images are used to

make medical diagnoses. This work will consider the problem of PET image

reconstruction using the maximum likelihood (ML) method with the addition

of a penalty term. Although developed for PET reconstruction, the method can

be extended for SPECT reconstruction. Unlike previous methods, the revised

algorithm presented uses a more general penalty term, allowing for a wider range

of settings. Proof of convergence and reconstruction results are obtained using this method.


1.2 Mathematical Model for PET

The 2-dimensional PET scanner consists of a circle of detectors surrounding

the region of interest, Ω, which can be taken to be rectangular. There are I

virtual detector tubes, each one formed by two of the detectors composing the

scanner ring. Ω is partitioned into J smaller rectangles, called voxels. An

emission in voxel j is detected by tube i if the emitted photons are detected

simultaneously in the two detectors forming the tube. Shepp and Vardi [16]

introduced a mathematical model for PET in 1982. In the model, the random

emissions from voxel j are modeled by a Poisson process with mean fj. Thus the

detector count in tube i, Yi, will also be governed by a Poisson process with mean

$\lambda_i = \sum_j p_{ij} f_j$


where pij is the geometric probability that an emission in voxel j is registered by

detector tube i. Using the properties of the Poisson distribution, the probability

that there are yi counts registered in detector i is given by

$P(Y_i = y_i) = \dfrac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}.$

To recover the emission intensity, it is necessary to estimate $f = [f_1, f_2, \dots, f_J]$ based

on a random sample from $Y = [Y_1, Y_2, \dots, Y_I]$.
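To make the Poisson model concrete, here is a small self-contained sketch (toy numbers, not from the dissertation) that computes the tube means $\lambda_i = \sum_j p_{ij} f_j$ and evaluates the resulting Poisson probabilities:

```python
import math

def expected_counts(p, f):
    """Mean tube counts: lambda_i = sum_j p[i][j] * f[j]."""
    return [sum(pij * fj for pij, fj in zip(row, f)) for row in p]

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

# Toy system: 2 tubes, 3 voxels (hypothetical detection probabilities).
p = [[0.5, 0.3, 0.0],
     [0.1, 0.4, 0.6]]
f = [10.0, 20.0, 5.0]          # emission intensities per voxel
lam = expected_counts(p, f)    # tube means lambda_1, lambda_2
```

The column sums of `p` need not be 1; $p_{ij}$ is the probability that an emission in voxel j is registered at all by tube i.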

1.3 PET Probability Function

The probability function pij for the 2-dimensional PET scanner can be

defined using the geometry of the scanner. Given any tube i formed by two

detectors, it can be divided into six regions as shown in Figure 1-2. The voxels





Figure 1-2: Tube regions

are assumed to be so small that the probability can be assumed approximately

constant on the entire voxel. If z is the center of voxel j, then pij is defined by the

probability that the path of a pair of photons emitted from the center of the voxel,

z, intersects both detectors defining the tube. This probability is proportional to

the angle-of-view of the tube from z. Thus, if CzD is defined to be the acute angle

formed by the lines passing through C and z and D and z, the following definition

results [12].
$\pi p_{ij} = \begin{cases} \angle CzD & \text{if } z \in R_1 \\ \pi - \angle BzC & \text{if } z \in R_2 \\ \angle AzB & \text{if } z \in R_3 \\ \pi - \angle DzA & \text{if } z \in R_4 \\ 0 & \text{otherwise} \end{cases} \qquad (1\text{-}2)$
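The angle-of-view computation behind $p_{ij}$ can be sketched as follows; the region bookkeeping of (1-2) is omitted, and the coordinates are hypothetical:

```python
import math

def angle_at(z, c, d):
    """Angle formed at z by the rays z->c and z->d (radians)."""
    v1 = (c[0] - z[0], c[1] - z[1])
    v2 = (d[0] - z[0], d[1] - z[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    # clamp to guard against round-off outside [-1, 1]
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

# Voxel centered at the origin; tube edge points C and D on a unit ring.
z = (0.0, 0.0)
C = (1.0, 0.0)
D = (0.0, 1.0)
p_contrib = angle_at(z, C, D) / math.pi   # probability ~ angle of view / pi
```

Here the 90-degree angle of view gives a contribution of one half, consistent with the probability being proportional to the angle subtended by the tube at z.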

1.4 Reconstruction Techniques

The reconstruction problem is to recover the emission intensity f given the

scanner data y, assumed to be Poisson-distributed. Several methods have been

used to accomplish this reconstruction. If the detectors are treated as points, the

model reduces to line integrals along detector lines. Filtered-back projection (FBP)

is a popular method utilized to recover the emission intensity, as it provides a quick

way to invert the Radon transform. However, it has been seen to produce images

with streaking artifacts [16]. In 1998, Carroll [3] decomposed the data, probability

matrix and intensity functions into basis functions, transforming the problem into

an infinite set of linear equations to solve. Our algorithm, though, is based on a

third method, maximum likelihood (ML) estimation.

1.4.1 Maximum Likelihood

In the ML method, a search is made for the emission intensity f that

maximizes the probability of obtaining the Poisson-distributed data. Each detector

count $y_i$ is assumed to be an independent sample from a Poisson distribution with

mean $\lambda_i = (Pf)_i = \sum_j p_{ij} f_j$. Thus the probability that there are


yi counts in detector i is given by

$P(Y_i = y_i) = \dfrac{e^{-(Pf)_i} (Pf)_i^{y_i}}{y_i!} \qquad (1\text{-}3)$

The detector counts are independent random variables, so these probabilities are

multiplied together to yield the probability of obtaining the data from an emission

intensity map of f. An ML estimate is an intensity f that maximizes the likelihood

$\prod_i \dfrac{e^{-(Pf)_i} (Pf)_i^{y_i}}{y_i!}.$

In addition, the intensity map f is constrained to be non-negative ($\{f : f_j \ge 0\}$).

An equivalent problem is to apply the negative logarithm and perform a

minimization of the term

$\sum_i \bigl( (Pf)_i - y_i \log (Pf)_i + \log y_i! \bigr).$

As $y_i!$ does not depend on the unknown intensity f, it can be safely dropped from

the term, resulting in the ML objective function

$L(f) = \sum_i \bigl( (Pf)_i - y_i \log (Pf)_i \bigr) \qquad (1\text{-}4)$

The Expectation-Maximization (or EM) algorithm developed by Shepp and Vardi

[16] can then be applied to find the minimizer of this term, resulting in the EMML iteration

$f_j^{n+1} = \dfrac{f_j^n}{\sum_i p_{ij}} \sum_i \dfrac{y_i p_{ij}}{(Pf^n)_i} \qquad (1\text{-}5)$

where $f^n$ refers to the nth step in the iteration.
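A direct transcription of the EMML update (1-5), written as a sketch with dense loops (a real reconstruction would use a sparse system matrix); the function name and toy dimensions are mine:

```python
def emml_step(f, p, y, eps=1e-12):
    """One EMML update: f_j <- (f_j / sum_i p_ij) * sum_i y_i p_ij / (Pf)_i."""
    I, J = len(p), len(f)
    pf = [sum(p[i][j] * f[j] for j in range(J)) for i in range(I)]
    new_f = []
    for j in range(J):
        sens = sum(p[i][j] for i in range(I))                  # sum_i p_ij
        back = sum(y[i] * p[i][j] / max(pf[i], eps) for i in range(I))
        new_f.append(f[j] / sens * back if sens > 0 else f[j])
    return new_f
```

Note that the update is multiplicative, so a strictly positive iterate stays strictly positive, and an f with y = Pf exactly is a fixed point.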

Alternately, the EMML iteration can be expressed as a descent algorithm:

$f_j^{n+1} = f_j^n - \dfrac{f_j^n}{\sum_i p_{ij}} \dfrac{\partial L}{\partial f_j}(f^n).$

Restated in matrix form, this yields

$f^{n+1} = f^n - \alpha D(f^n) \nabla L(f^n),$

where α = 1 is the step-size and D is the diagonal matrix whose diagonal

entries are $d_{jj} = f_j / \sum_i p_{ij}$. While the EM algorithm has been shown

to monotonically decrease the objective function and be globally convergent, a

well-known deficiency is a tendency to produce noisy images if the iteration is run

too long [8].

1.4.2 Penalized Maximum Likelihood (PML)

To help reduce the noise, a smoothing term can be added to the likelihood

term resulting in the penalized maximum likelihood (PML) method. Equivalently,

this method can be considered as an application of MAP (maximum a-posteriori),

in that the recovered image is assumed to have a known local Gibbs probability

distribution. This method, used by Geman and McClure [7], is employed by first

choosing a penalty function Q and then forming the term

$U(f) = \sum_i \sum_j w_{ij}\, \phi(f_i - f_j) \qquad (1\text{-}6)$

where $w_{ij}$ is a weighting factor that associates pixels. This can be used to link

neighboring pixels, forcing them to have similar intensities. For example, $N_i$

can be defined for each voxel i to be the set of pixels in a neighborhood lattice

of that voxel. Then the weight $w_{ij} = 0$ if $j \notin N_i$; if $j \in N_i$, the weight is

defined to be a nonzero constant. One possibility is to let $N_i$ consist of the eight

neighboring voxels, as shown in Figure 1-3 with the associated weights. In addition,

a constant value γ multiplies the penalty term to control the influence of

the penalty in the reconstruction. As γ → 0, the reconstructions approach those

of the EM algorithm. However, if γ is chosen too large, the images will be

over-smoothed.

Figure 1-3: Sample neighborhood system with associated weights
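As an illustrative sketch of (1-6), the following computes the penalty on a small image with an eight-neighbour system and the Figure 1-3 weights (1 for edge neighbours, $1/\sqrt{2}$ for diagonals). The quadratic $\phi(r) = r^2$ is an assumed example choice, and each neighbour pair is counted once (a fixed scale factor relative to the double sum):

```python
import math

def penalty_U(img, phi=lambda r: r * r):
    """U = sum over 8-neighbour pairs of w * phi(f_i - f_j),
    with w = 1 for edge neighbours and 1/sqrt(2) for diagonals."""
    H, W = len(img), len(img[0])
    total = 0.0
    # offsets chosen so each unordered pair is visited exactly once
    offsets = [(0, 1, 1.0), (1, 0, 1.0),
               (1, 1, 1.0 / math.sqrt(2)), (1, -1, 1.0 / math.sqrt(2))]
    for y in range(H):
        for x in range(W):
            for dy, dx, w in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    total += w * phi(img[y][x] - img[ny][nx])
    return total

flat = [[3.0] * 4 for _ in range(4)]   # constant image -> zero penalty
```

A constant image incurs no penalty, while any local intensity jump increases U, which is exactly the smoothing pressure described above.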

There exist many choices for the penalty function φ(r). A frequently used

choice is the penalty $\phi(r) = \log\cosh(r/\delta)$ [8], where δ is a previously chosen

constant that allows for some fine-tuning of the penalty. This constant determines

how the prior is scaled with respect to r, as the log cosh penalty will smooth all

differences above a certain cutoff value determined by δ. As δ → ∞, the method

will approach the results of the EM algorithm (as no smoothing will occur). This

penalty is useful as it approximates quadratic smoothing for small values of r, but

its derivative remains bounded.
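A numerically stable evaluation of the logcosh penalty and its bounded derivative, as a sketch: the rearrangement $\log\cosh x = x + \log(1 + e^{-2x}) - \log 2$ (for $x \ge 0$) avoids overflow for large arguments, and evenness handles negative r:

```python
import math

def logcosh(r, delta=1.0):
    """phi(r) = log cosh(r/delta): ~ (r/delta)^2 / 2 near 0, linear for large |r|."""
    x = abs(r) / delta                       # phi is even, so use |r|
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)

def logcosh_deriv(r, delta=1.0):
    """phi'(r) = (1/delta) tanh(r/delta); bounded in magnitude by 1/delta."""
    return math.tanh(r / delta) / delta
```

The bounded derivative is what makes the penalty edge-tolerant: a large intensity jump is penalized at most linearly rather than quadratically.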

In 1993, Lalush and Tsui [9] developed a new penalty function that allowed

for even greater control over the penalized range of intensities. Noting that current

PML methods did not require the penalty function to be calculated, they instead

built the function from properties of its derivative. Thus, they produced

$\dfrac{\partial \phi}{\partial r}(r) = \dfrac{r}{\delta^2} \left[ 1 + 2\alpha^2 \left( \dfrac{r^2}{\delta^2} - 1 \right) \right]^{-2} \qquad (1\text{-}7)$

where α and δ are constants that affect the spread and location, respectively, of the

peak of the derivative. As α increases, the peak becomes more narrow, and as δ


increases, the peak moves away from the origin. The plot of the derivative shows that the

majority of the smoothing occurs in the region under the peak of the derivative

[9]. Thus altering the values of α and δ will affect the range of smoothing in the

image. By comparing this function to the derivative of the logcosh function,

$\phi'(r) = \dfrac{1}{\delta} \tanh\left( \dfrac{r}{\delta} \right),$

it can be seen that while the logcosh function does not have a peaked derivative

Figure 1-4: Derivatives of the penalty terms. (a) Derivative of the Lalush-Tsui penalty; (b) derivative of the logcosh penalty.

(it increases to an asymptotic limit), it can be considered as an infinitely-wide

peak. Thus, Green's logcosh penalty can be approximated as a Lalush-Tsui penalty

by taking α close to zero (producing a near infinitely-wide peak). Integrating the

Lalush-Tsui penalty term results in their penalty function:

$\phi(r) = \dfrac{\delta^2}{2\alpha^2} \log\left[ 1 - 2\alpha^2\left(1 - \dfrac{r^2}{\delta^2}\right) + \alpha^4\left(1 + \dfrac{r^4}{\delta^4} - \dfrac{2r^2}{\delta^2}(1 - 2\alpha^2)\right) \right].$
While Green's original OSL algorithm does not require the calculation of this

term, other algorithms do require the frequent calculation of the objective function.

In that case, the Lalush-Tsui penalty will have a larger computation time than

other penalties. In addition, the general Lalush-Tsui penalty does not have the

strict convexity condition needed for convergence to a unique minimizer.

1.5 Minimizing Function

Given a choice of penalty term U and constant γ, the PML objective function
thus becomes

$E(f) = \sum_i \bigl( (Pf)_i - y_i \log (Pf)_i \bigr) + \gamma U(f) \qquad (1\text{-}8)$

In maximum likelihood, this objective function must be minimized over the set of

nonnegative intensity values $\{f : f_j \ge 0\}$. This minimum must satisfy the Kuhn-Tucker

optimality conditions for the problem. In this situation, those two conditions are

that for all j,

$\dfrac{\partial E}{\partial f_j}(f) \ge 0, \qquad f_j\, \dfrac{\partial E}{\partial f_j}(f) = 0. \qquad (1\text{-}9)$
It will be shown that the revised algorithm presented in Chapter 3 will meet these

conditions at convergence.
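The two conditions in (1-9) are easy to test numerically at a candidate solution; this is a sketch with function names of my own choosing, taking the gradient $\partial E/\partial f_j$ as a precomputed list:

```python
def kt_violation(f, grad):
    """Worst violation of the Kuhn-Tucker conditions for f_j >= 0:
       grad_j >= 0 (first condition) and f_j * grad_j == 0 (second)."""
    worst = 0.0
    for fj, gj in zip(f, grad):
        worst = max(worst, -gj)            # negative gradient component
        worst = max(worst, abs(fj * gj))   # complementary slackness
    return worst

def satisfies_kt(f, grad, tol=1e-8):
    return kt_violation(f, grad) <= tol
```

Either the gradient component vanishes (an interior voxel) or the intensity is pinned at zero with a nonnegative gradient (an active constraint); anything else signals that the iterate is not optimal.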


2.1 Green's One Step Late Algorithm

Applying the Maximum Likelihood method, a search is made to minimize the

penalized objective function:

$E(f) = \sum_i \bigl( (Pf)_i - y_i \log (Pf)_i \bigr) + \gamma U(f).$

Differentiating the objective function and substituting into the second Kuhn-Tucker

optimality condition (1-9) yields the condition that for all j:

$f_j \left[ \sum_i \left( p_{ij} - \dfrac{y_i p_{ij}}{(Pf)_i} \right) + \gamma \dfrac{\partial U}{\partial f_j}(f) \right] = 0.$

Regrouping terms in this sum produces

$f_j \left( \sum_i p_{ij} + \gamma \dfrac{\partial U}{\partial f_j}(f) \right) = f_j \sum_i \dfrac{y_i p_{ij}}{(Pf)_i}.$

Now define

$r_j(f) = \left( \sum_i p_{ij} + \gamma \dfrac{\partial U}{\partial f_j}(f) \right)^{-1}.$

With this definition, it can be seen that the second Kuhn-Tucker condition is

equivalent to

$f_j = f_j\, r_j(f) \sum_i \dfrac{y_i p_{ij}}{(Pf)_i} \qquad (2\text{-}1)$

This form for the optimality condition is useful as it reformulates the equation into

a fixed point condition.

In 1990, Green developed the One-Step-Late (OSL) algorithm [8] to apply the

EM algorithm to the Penalized Maximum Likelihood problem. The One-Step-Late

name comes from the choice to evaluate the penalty at the current iterate rather

than the (unknown) update, as it allowed the EM algorithm to be applied to the

PML problem. In Green's OSL, the update to iterate $f^n$ is given by

$f_j^{n+1} = f_j^n\, r_j(f^n) \sum_i \dfrac{y_i p_{ij}}{(Pf^n)_i}.$

Using the fact that $\dfrac{\partial E}{\partial f_j}(f) = \sum_i p_{ij} - \sum_i \dfrac{y_i p_{ij}}{(Pf)_i} + \gamma \dfrac{\partial U}{\partial f_j}(f)$, the update can be expressed in

the alternative form:

$f^{n+1} = f^n + v(f^n) \qquad (2\text{-}2)$

$v_j(f) = -f_j\, r_j(f)\, \dfrac{\partial E}{\partial f_j}(f). \qquad (2\text{-}3)$

Thus if Green's OSL algorithm converges, it will satisfy the same fixed point

condition as the second Kuhn-Tucker condition. Note that if γ = 0, Green's OSL

algorithm reduces to the EMML Algorithm (1-5). However, Green's OSL algorithm

does not guarantee positivity of the iterates or convergence.
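Green's OSL update can be sketched as below; the penalty gradient `dU` is evaluated at the current iterate (the "one step late" choice), and the dense toy loops are illustrative only:

```python
def osl_step(f, p, y, dU, gamma, eps=1e-12):
    """One One-Step-Late update:
       f_j <- f_j * (sum_i y_i p_ij / (Pf)_i) / (sum_i p_ij + gamma * dU_j(f)).
    dU(f) returns the penalty gradient at the *current* iterate."""
    I, J = len(p), len(f)
    pf = [sum(p[i][j] * f[j] for j in range(J)) for i in range(I)]
    g = dU(f)                                   # evaluated "one step late"
    out = []
    for j in range(J):
        denom = sum(p[i][j] for i in range(I)) + gamma * g[j]
        back = sum(y[i] * p[i][j] / max(pf[i], eps) for i in range(I))
        out.append(f[j] * back / denom)
    return out
```

With γ = 0 this is exactly the EMML step; with γ > 0 and a positive penalty gradient, the enlarged denominator pulls the voxel value down, which is the smoothing effect. Nothing here prevents `denom` from becoming negative for a strong penalty, which is precisely the failure mode noted in the text.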

2.1.1 Lange's Improvement to Green's OSL

In 1990, Lange showed that altering Green's algorithm to include a line search

could enforce convergence and positivity [10]. In Lange's algorithm, $r_j(f^n)$ is

required to be positive for all j. To meet this condition, the penalty parameter

γ must be chosen small enough to keep $r_j(f^n) > 0$ for all possible iterates $f^n$

and voxels j. In the method, Green's OSL update is altered to include a step-size

parameter s (taking s=1 reduces the method to the original OSL method). Thus

the update becomes
$f^{n+1} = f^n + s^n v(f^n) \qquad (2\text{-}4)$

where v is defined as before. The step-size s" is chosen to be a positive value that

gives both positivity of the new iterate (taking s" < 1 enforces this condition) and

a decrease in the objective function E. To decrease the objective function, define

$P(s) = E(f^n + s\, v(f^n)).$


If P'(0) < 0, the algorithm will be decreasing for small values of the step-size. With

the condition that $r_j(f^n) > 0$, it follows that $P'(0) = -\sum_j f_j^n\, r_j(f^n) \left( \dfrac{\partial E}{\partial f_j}(f^n) \right)^2 < 0$

if $\dfrac{\partial E}{\partial f_j}(f^n) \ne 0$ for some j. Since P'(0) < 0, the objective function will be

decreasing for small values of the step-size s. A line-search is then performed for

an appropriate step-size that will give a positive update and a decrease in the

objective function. In Lange's algorithm, the step-size that gives the greatest

decrease in the objective function is denoted $\bar{s}$, and a search is performed to find a

value $s \le \bar{s}$ within ε of $\bar{s}$. This condition requires an exact line search, eliminating

the possibility of faster inexact search algorithms.

2.1.2 Mair-Gilland Method

One way to remove the restriction of positivity of the r-term is to allow the

constant s to be a negative value. In this method (developed for use in the RM

Algorithm [14]), the step-size s" is allowed to be negative. The sign of s" depends

on the sign of P'(0). If P'(0) < 0, the objective function will be decreasing for

small positive values of the step-size. In this case, the step-size is chosen as in

Lange's method. A search is made for a value that ensures positivity of the iterate,

but is also within ε of the value that gives the greatest decrease. If P'(0) > 0,

then E will be increasing for small positive values of s. In this case, $s^n$ is allowed

to be negative. The search is made in the negative direction, again for a value that

ensures positivity and is within ε of the value that gives the greatest decrease in

the objective function. Although this method will decrease the objective function

and converge, the limit may not satisfy the second Kuhn-Tucker optimality condition.


2.2 Accelerated Penalized Maximum Likelihood Method (APML)

The accelerated penalized maximum likelihood method (APML) was developed

in 2004 [5]. In the APML method, surrogate functions are used to decrease

the objective function and provide convergence. In the method, a penalty function

φ(r) and an associated neighborhood system are chosen as described in Chapter 1. In

addition, the related function $\omega(r) = \phi'(r)/r$ is required to be non-increasing. In the
method, the trial iterate is formed by

$\hat{f}_j = \dfrac{-G_j(f) + \sqrt{G_j(f)^2 + 8 E_j(f) F_j(f)}}{4 F_j(f)} \qquad (2\text{-}6)$

with the defined functions given by

$E_j(f) = f_j \sum_i \dfrac{y_i p_{ij}}{(Pf)_i}, \qquad F_j(f) = 2\gamma \sum_{k \in N_j} w_{jk}\, \omega(f_j - f_k),$

$G_j(f) = \sum_i p_{ij} - 2\gamma \sum_{k \in N_j} w_{jk}\, \omega(f_j - f_k)(f_j + f_k).$

As this was seen to converge slowly, an acceleration factor (the APML Method)

was added to the algorithm [4]. As in Lange's method, a step-size factor is added
to the algorithm. In APML, the accelerated update is given by

$f^{n+1} = f^n + s\, v(f^n) \qquad (2\text{-}7)$

with $v(f^n) = \hat{f}^n - f^n$. In Lange's method, a search method is employed to find a

suitable choice for the step-size. Here, surrogate functions are used to find the best
possible step-size needed to decrease the objective function. Surrogate functions
were first applied to PET reconstruction by DePierro [6]. In order to minimize a
function f over a set S using the method of surrogate functions, another function Q

must be found satisfying

$f(x^n) = Q(x^n), \qquad f(x) \le Q(x) \text{ for all } x \in S.$
The difficulty in using this method lies in finding an appropriate function Q. In
APML, creating the surrogate function requires the calculation of four separate
functions related to the penalty function, possibly increasing the computation


time. Reconstructions gathered using the APML algorithm will be compared to

reconstructions using the revised algorithms presented in this paper.
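The surrogate (majorize-minimize) idea can be illustrated on a one-dimensional toy objective of my own choosing, not taken from APML: $f(x) = (x-2)^2 + \log(1+x^2)$, whose log term is majorized by its tangent in $x^2$ (concavity of $t \mapsto \log(1+t)$):

```python
import math

def f(x):
    return (x - 2.0) ** 2 + math.log(1.0 + x * x)

def mm_step(xn):
    """Minimize the quadratic surrogate
       Q_n(x) = (x-2)^2 + log(1+xn^2) + (x^2 - xn^2)/(1+xn^2),
    which touches f at xn and lies above f elsewhere.
    Setting Q_n'(x) = 0 gives the closed-form minimizer below."""
    return 2.0 / (1.0 + 1.0 / (1.0 + xn * xn))

x = 5.0
vals = [f(x)]
for _ in range(30):
    x = mm_step(x)
    vals.append(f(x))
# f(x_{n+1}) <= Q_n(x_{n+1}) <= Q_n(x_n) = f(x_n): monotone decrease
```

Because the surrogate both touches f at the current iterate and dominates it elsewhere, each surrogate minimization is guaranteed to decrease the true objective, which is the convergence mechanism APML exploits.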


3.1 General Algorithm Statement

In the maximum likelihood method, a search is made for an emission intensity

f that minimizes the objective function

$E(f) = L(f) + \gamma U(f)$

under the restriction that fj > 0 for all j. In this chapter, an iterative algorithm

to find this minimum will be presented. Under specific conditions, convergence to a

minimizer will be proven.

3.1.1 General Algorithm Update

The new algorithm is formed by taking Lange's modification to OSL, but

treating the step-size as a vector instead of a constant. To apply the algorithm,

initialize by taking $f_j^0 = \sum_i y_i / J$ for all j. Given a current iterate $f^n$, the next iterate
$f^{n+1}$ is constructed by the following update:

$f^{n+1} = f^n + s^n \odot v(f^n) \qquad (3\text{-}1)$

where $s^n$ is the step-size vector,

$r_j(f) = \left( \sum_i p_{ij} + \gamma \dfrac{\partial U}{\partial f_j}(f) \right)^{-1}, \qquad v_j(f) = -f_j\, r_j(f)\, \dfrac{\partial E}{\partial f_j}(f),$

and $(x \odot y)_j = x_j y_j$ (the Hadamard product).

Note that v(f) and r(f) are the same defined functions as in the OSL algorithm.

Given a current iterate f", the step-size vector s" = [s,, ..., s"] is chosen to

produce the next step in the algorithm. This vector must satisfy the conditions:

$0 < |s_j^n| \le K$ for some large constant K $\qquad (3\text{-}2)$

$s_j^n < \left( r_j(f^n)\, \dfrac{\partial E}{\partial f_j}(f^n) \right)^{-1} \qquad (3\text{-}3)$

$E(f^n + s^n \odot v(f^n)) < E(f^n)$ if $\|v(f^n)\| \ne 0 \qquad (3\text{-}4)$

$(s^n \odot r(f^n))_j > 0. \qquad (3\text{-}5)$

The first three conditions on the step-size vector enforce respectively boundedness

of the step-size, positivity of the iterate and decrease in the objective function.

The final condition requires that for each voxel j, the sign of sj matches the sign

of rj(f). Note that a unique s is not guaranteed. The revised algorithm presented

after the proof provides an example of one way to choose a suitable step-size

vector. Once a penalty function and method of constructing a step-size vector

(both matching the required criteria) have been chosen, the algorithm can proceed.

Continue iterating until a suitable stopping criterion has been reached. Convergence

of this algorithm will now be investigated.
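One iteration of the general update (3-1) can be sketched as below, with $r$ and $\partial E/\partial f$ supplied as precomputed lists, together with a componentwise check of conditions (3-2), (3-3) and (3-5); this is an illustrative sketch, not the author's implementation:

```python
def general_pml_step(f, r, dE, s):
    """f^{n+1} = f^n + s (.) v(f^n), with v_j = -f_j * r_j * dE_j
    and (.) the componentwise (Hadamard) product."""
    v = [-fj * rj * gj for fj, rj, gj in zip(f, r, dE)]
    return [fj + sj * vj for fj, sj, vj in zip(f, s, v)], v

def step_ok(f, r, dE, s, K=1e3):
    """Componentwise check of conditions (3-2), (3-3), (3-5)."""
    for fj, rj, gj, sj in zip(f, r, dE, s):
        if not (0 < abs(sj) <= K):
            return False   # (3-2) boundedness
        if sj * rj * gj >= 1.0:
            return False   # (3-3): keeps f_j * (1 - s_j r_j dE_j) positive
        if sj * rj <= 0:
            return False   # (3-5): sign of s_j matches sign of r_j
    return True
```

Note the check for (3-3) is written in product form, which is how it enters the positivity of the new iterate; the sign condition (3-5) is what lets $r_j$, and hence the penalty weight, be large or even negative without breaking the update.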

3.1.2 Penalty function criteria

The general penalty term U(f) is of the form

$U(f) = \sum_i \sum_j w_{ij}\, \phi(f_i - f_j) \qquad (3\text{-}6)$

where 0(r) is required to satisfy the following properties:

$\phi(0) = 0$

$\lim_{r \to \infty} \phi(r) = \infty$

$\phi(-r) = \phi(r)$ (φ is an even function)

$\phi \in C^1$ (φ has a continuous first derivative)

φ is a strictly convex function ($\phi''(r) > 0$).

The first three conditions are basic conditions for a penalty function, while the

latter two conditions are required for convergence. Functions that match these

conditions include the logcosh and quadratic penalties. Note that the general

Lalush-Tsui penalty does not meet these conditions due to its lack of convexity.

3.2 Proof of Convergence

Theorem 1. Let φ be a penalty function satisfying the conditions stated in Section

3.1.2. Let E be the PML objective function $E(f) = L(f) + \gamma U(f)$. Define

$f_j^0 = \sum_i y_i / J$ for all j (the algorithm is initialized with a flat emission intensity).

Let the algorithm update be given by (3-1) with the step-size vector chosen such

that conditions (3-2)-(3-5) are satisfied. The algorithm will then converge to a

unique minimizer.

The proof (extended from Lange's PML proof [10]) will demonstrate that the

algorithm converges to a unique vector that satisfies both Kuhn-Tucker optimality

conditions, hence is the minimizer of E. It consists of two parts. In the first,

Zangwill's convergence theorem is applied to prove that the objective function

converges and that all limit points will fulfil the second Kuhn-Tucker condition. In

the second part of the proof, it will be shown that there is a single limit point and

this point will also fulfil the first Kuhn-Tucker condition.

Definition 1. A stationary point is defined to be any nonnegative intensity f where for

all j, $f_j\, \dfrac{\partial E}{\partial f_j}(f) = 0$.

By this definition, the 2nd Kuhn-Tucker condition will be met at a stationary

point. Zangwill's Convergence Theorem [11] is now applied.

Theorem 2 (Zangwill's Convergence Theorem). Let T be a point-to-set function

defined on a metric space X. Generate a sequence $\{x^n\}$ such that $x^{n+1} \in T(x^n)$. Let

a solution set $S \subset X$ be given along with a continuous function E. Suppose that

1. the sequence $\{x^n\}$ is contained in a compact subset of X;

2. if $x \notin S$, $E(y) < E(x)$ for all $y \in T(x)$;

3. if $x \in S$, $E(y) \le E(x)$ for all $y \in T(x)$;

4. the mapping T is closed at points outside of S.

Then the limit of any convergent subsequence lies in S and $\{E(x^n)\}$ converges

monotonically to $E_S = E(y)$ for some $y \in S$.

To apply Zangwill, let $X = \mathbb{R}^J$, let S be the set of stationary points, and let E be the

PML objective function. For any emission intensity f, define the set $S_f$ to be the

set of all possible step-size vectors s fulfilling conditions (3-2)-(3-5). Then define the

point-to-set function T by the equation:

$T(f) = \{ f + s \odot v(f) : s \in S_f \}. \qquad (3\text{-}7)$

Note that if $f \in S$, $v(f) = 0$, and thus $T(f) = \{f\}$. In the PML algorithm, the next

iterate is created by taking $f^{n+1} = f^n + s \odot v(f^n)$ for a particular choice of $s \in S_{f^n}$,

hence $f^{n+1} \in T(f^n)$.

Lemma 1. The point-to-set mapping T is closed at all nonstationary points θ.

The closure definition requires that if $\theta^n \to \theta \notin S$, $\psi^n \in T(\theta^n)$, and $\psi^n \to \psi$,

then $\psi \in T(\theta)$. Hence it must be shown that there exists a step-size vector $s \in S_\theta$

such that $\psi = \theta + s \odot v(\theta)$. By the definition of the update, for

each n there is a step-size vector $s^n \in S_{\theta^n}$ such that

$\psi^n = \theta^n + s^n \odot v(\theta^n).$

Since $\dfrac{\partial U}{\partial f_j}$ is assumed to be continuous, $r_j(f)$ is continuous and hence $v_j(f)$ will also

be continuous. Let $W = \{ j : v_j(\theta) \ne 0 \}$. Since θ is not a stationary point, W

is nonempty. For each $j \in W$, there exists a constant $M_j > 0$ such that for all

$m \ge M_j$, $v_j(\theta^m) \ne 0$. Therefore, for all $j \in W$ and $m \ge \max_j(M_j)$,

$s_j^m = \dfrac{\psi_j^m - \theta_j^m}{v_j(\theta^m)},$

which converges to

$\dfrac{\psi_j - \theta_j}{v_j(\theta)}.$

For $j \in W$, define $s_j$ to be this limit. Therefore, for all $j \in W$, $\psi_j = \theta_j + s_j v_j(\theta)$. If

$j \notin W$, $v_j(\theta) = 0$ and hence $\psi_j = \theta_j$. Thus for all j, $\psi_j = \theta_j + s_j v_j(\theta)$, where $s_j = 1$
for $j \notin W$. By continuity of E and v, $s \in S_\theta$ and hence $\psi \in T(\theta)$.

Lemma 2. The iterates belong to a bounded and compact set.

It will be shown that if $\|f^n\| \to \infty$, then $E(f^n) \to \infty$ as well. The log-likelihood

term L is a function of the form $\sum_i (c_i - y_i \log c_i)$, where $c_i = (Pf)_i$ and $y_i$ are both

positive terms. If $\|f^n\| \to \infty$, then for some voxel j, $f_j^n \to \infty$ as well. For each voxel

j, there exists at least one tube i with $p_{ij} \ne 0$. Hence if the intensity at voxel j

becomes unbounded, $c_i = (Pf)_i \to \infty$, forcing $c_i - y_i \log c_i \to \infty$. Thus if the iterates

become unbounded, the log-likelihood function will also become unbounded. Hence

if $\|f^n\| \to \infty$, $E(f^n) = L(f^n) + \gamma U(f^n) \to \infty$. By construction, $E(f^{n+1}) \le E(f^n)$,

so $E(f^n) \le E(f^0)$ for all n. Thus, it cannot be true that $E(f^n) \to \infty$, and the iterates must

belong to a bounded set. Therefore the iterates belong to the set $\{\theta : E(\theta) \le E(f^0)\}$. Due to

the continuity of E, this set is closed. Since the set is closed and bounded, it is

compact.

Theorem 3. The limit of any convergent subsequence is a stationary point.

Zangwill's theorem is now applied. Let $X = \mathbb{R}^J$ and let S be the set of stationary

points. If $f^n \in S$, $v(f^n) = 0$ and hence $T(f^n) = \{f^n\}$. If $f^n \notin S$, by construction,

$E(f^n + s^n \odot v(f^n)) < E(f^n)$ for any step-size vector $s^n$ satisfying conditions

(3-2)-(3-5). By Lemma 1, the point-to-set mapping T is closed at non-stationary

points. Thus Zangwill's theorem applies and all limit points are stationary points.

In addition, $E(f^n) \to E_S = E(y)$ for some $y \in S$. It must be shown that $E_S$

is well-defined. If $x \in S$ is a limit point of the sequence, there is

a subsequence $\{f^{n_k}\}$ converging to x. Since $E(f^n) \to E(y)$, it must be true that

$E(x) = E(y)$.

Lemma 3. If $E(f^n) = E_S$ for some n, then $f^n \in S$.

In the proof of Zangwill's conditions, $E(f^{n+1}) < E(f^n)$ if $f^n \notin S$ and $E(f^{n+1}) \le$

$E(f^n)$ if $f^n \in S$. Thus if $f^n \notin S$, $E(f^{n+1}) < E(f^n) = E_S$. But $E(f^n)$ converges

monotonically to $E_S$, giving a contradiction. Thus $f^n \in S$.

Lemma 4. The set of limit points is finite.

The results of Zangwill's theorem state that all limit points of the sequence

are stationary points. A stationary point has been defined to be one where for each

index j, $f_j\, \dfrac{\partial E}{\partial f_j}(f) = 0$. With the requirement that U is a strictly convex (penalty)

term, E will be a strictly convex function [10]. Therefore, given any subset W of

$\{1, \dots, J\}$, E will remain strictly convex on the plane $\{f : f_j = 0,\ j \in W\}$. Hence
there is a unique point on this plane that is a minimizer for E. Denote this point

$f_W$. Thus for any subset W, there is only one point $f_W$ with $\dfrac{\partial E}{\partial f_j}(f_W) = 0$ for $j \notin W$

and $f_j = 0$ for $j \in W$. Since there are a finite number of possible subsets W, the

set of stationary points is finite.

Lemma 5. The distance between successive iterates tends to zero.

$\|f^{n+1} - f^n\| = \|s^n \odot v(f^n)\|$. Since $|s_j^n| \le K$ for all n and j, it suffices to

prove that $v_j(f^n)$ goes to zero. The function v is continuous and is zero at any

stationary point. Since the set of limit points is finite, given ε > 0, a single

δ > 0 can be found such that if $\|f^n - y\| < \delta$ for any $y \in S$, then $\|v(f^n)\| =$

$\|v(f^n) - v(y)\| < \varepsilon$. Let Z be the set of all sequence points within δ of a limit
point and let W be the remaining sequence points. If W is infinite, it contains a

convergent subsequence (the iterates are contained in a compact set). Thus there

are points in W arbitrarily close to a limit point, a contradiction. Hence W must

be finite. Let N be the largest sequence index in W. Thus if n > N, $\|f^n - y\| < \delta$

for some $y \in S$, and hence $\|v(f^n)\| < \varepsilon$. Therefore $v(f^n) \to 0$.

Lemma 6. The set of limit points consists of a single stationary point.

By Lemma 5, the distance between iterates goes to zero. Ostrowski proved

that the set of limit points of such a sequence is both compact and connected [15].

Since this set is finite, it must consist of a single point. Denote this intensity by f.

Theorem 4. The iterates converge to the minimizer of E.

There is a single limit point for the sequence, hence the sequence converges to f. Assume this function is not the minimizer of E. Since the limit is a stationary point, the second Kuhn-Tucker criterion f_j ∂E/∂f_j(f) = 0 is satisfied, so the first optimality condition must be violated. Therefore, an index j can be found such that f_j = 0 and ∂E/∂f_j(f) < 0, and hence there is an N such that ∂E/∂f_j(f^n) < 0 for all n > N. By construction, r_j(f^n) and s_j^n have the same sign and f_j^n > 0. Thus for n > N,

    f_j^{n+1} = f_j^n + s_j^n v_j(f^n) = f_j^n - s_j^n f_j^n r_j(f^n) ∂E/∂f_j(f^n) > f_j^n

and f_j^n does not converge to zero, contrary to the assumption. Therefore, both Kuhn-Tucker conditions hold at the convergence point and hence it is the minimizer of E.

If the penalty function lacks convexity (as in the generalized Lalush-Tsui penalty (1-7)), the proof of Theorem 4 will not apply, as it requires convexity. Thus,

as with the modified RM algorithm discussed in the following chapter, the only

result that can be proved is that all of the limit points generated from the sequence

will be minimizers.

3.3 New PML Algorithm

The general statement does not guarantee the existence of a method of choosing the step-size vector with the needed conditions. However, there exist several ways of meeting the necessary conditions. In Lange's algorithm [10], all entries in the step-size vector are the same positive value. In this method, conditions (3-2)-(3-5) are satisfied by assuming that r_j(f) is positive for all emission intensities f and voxels j. With this restriction, the objective function will be decreasing for small positive values of the step-size s. However, in order to ensure the positivity of r(f), the strength of the penalty term γ may need to be decreased, which alters the objective function to be minimized. The revised algorithm presented below has been designed to remove this strong condition on the penalty term by using a different method to choose the step-size, while still providing for the convergence of the iterates.
3.3.1 Non-Uniform Step-Size Method

In this algorithm, the step-size vector consists of two values, one positive and one negative. To find these values at the nth step in the iteration, form the two sets D+(f^n) := {j : r_j(f^n) > 0} and D-(f^n) := {j : r_j(f^n) < 0}. Since Σ_i p_{i,j} + γ ∂U/∂f_j(f^n) will be finite for each j, r(f^n) can never be zero. Therefore, D+(f^n) and D-(f^n) partition the set of voxels. These sets determine which value of the step-size is applied to each voxel. Define the equations

    g_j^n(t_1, t_2) = f_j^n + t_1 v_j(f^n)  if j ∈ D+(f^n)
    g_j^n(t_1, t_2) = f_j^n + t_2 v_j(f^n)  if j ∈ D-(f^n)     (3-8)

    Φ(t_1, t_2) = E(g^n(t_1, t_2)).     (3-9)

The current value of the objective function at step n is given by Φ(0, 0) = E(f^n). Since the goal is to decrease the value of E, a search is made for a new update in the direction of the negative gradient. This gives the direction of steepest descent of E. If a single value is used for the constant (as in Lange's method [10] or the Mair-Gilland method [14]), it corresponds to using (t, t) as the search direction in (3-8). This may not be the direction of greatest decrease. In order to achieve the greatest decrease in the objective function, the vector (t_1, t_2) is assumed to be in the direction of the negative gradient. Therefore, define (t_1, t_2) = -s∇Φ(0, 0) for some s > 0 to be chosen later. Calculating the two partial derivatives yields:

    ∂Φ/∂t_1(0, 0) = Σ_{j∈D+(f^n)} ∂E/∂f_j(f^n) v_j(f^n) = -Σ_{j∈D+(f^n)} f_j^n r_j(f^n) (∂E/∂f_j)²(f^n)

    ∂Φ/∂t_2(0, 0) = Σ_{j∈D-(f^n)} ∂E/∂f_j(f^n) v_j(f^n) = -Σ_{j∈D-(f^n)} f_j^n r_j(f^n) (∂E/∂f_j)²(f^n).

Therefore, by defining

    τ+(f) = Σ_{j∈D+(f)} f_j r_j(f) (∂E/∂f_j)²(f)     (3-10)
    τ-(f) = Σ_{j∈D-(f)} f_j r_j(f) (∂E/∂f_j)²(f)

it can be seen that (t_1, t_2) = s(τ+(f^n), τ-(f^n)). Since (τ+, τ-) provides a search direction only, it can be normalized at this stage to produce a unit vector (τ+(f), τ-(f)) (τ+ and τ- will now refer to the normalized vector). The new iterate is thus formed by:

    f^{n+1} = f^n + s τ(f^n) ⊙ v(f^n)     (3-11)

where τ_j(f) = τ+(f) if j ∈ D+(f) and τ_j(f) = τ-(f) if j ∈ D-(f). The problem has thus been reduced from a 2-dimensional search for (t_1, t_2) to a 1-dimensional search for s. Defining P(s) = E(f^n + s τ(f^n) ⊙ v(f^n)) yields

    P'(0) = ∇E(f^n + s τ(f^n) ⊙ v(f^n)) · d/ds (f^n + s τ(f^n) ⊙ v(f^n)) |_{s=0}
          = -Σ_j (∂E/∂f_j)²(f^n) τ_j(f^n) f_j^n r_j(f^n) ≤ 0.

The final inequality comes from the fact that τ_j has the same sign as r_j (and so s τ_j r_j > 0 for all j, proving the fourth criterion for the step-size vector). Thus for small positive values of the step-size coefficient s, E will be decreasing. It remains to be shown that the method of selecting the step-size coefficient s will result in a step-size vector, sτ, fulfilling the conditions stated in (3-2)-(3-5).
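The update above can be sketched in a few lines. This is an illustrative transcription, not the dissertation's implementation: here r holds the un-inverted denominators Σ_i p_{ij} + γ∂U/∂f_j (so v_j = -f_j dE_j / r_j, as in the appendix code), and all array names (f, dE, r) are hypothetical.

```python
import math

def nonuniform_update(f, dE, r, s):
    """One non-uniform step-size update f <- f + s*tau (.) v (3-11)."""
    n = len(f)
    v = [-f[j] * dE[j] / r[j] for j in range(n)]   # preconditioned direction
    # components of the negative gradient of Phi at (0,0), split over
    # D+ = {r_j > 0} and D- = {r_j < 0}
    t_plus = -sum(dE[j] * v[j] for j in range(n) if r[j] > 0)
    t_minus = -sum(dE[j] * v[j] for j in range(n) if r[j] < 0)
    norm = math.hypot(t_plus, t_minus) or 1.0      # normalize to a unit vector
    tau = [(t_plus if r[j] > 0 else t_minus) / norm for j in range(n)]
    return [f[j] + s * tau[j] * v[j] for j in range(n)]
```

Note that t_plus comes out positive and t_minus negative whenever both sets are nonempty, matching the "one positive, one negative" step-size pair described above.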

3.3.2 Line Search Methods

In the non-uniform step-size method discussed above, the step-size vector consists of sτ. Now, a search must be made for an appropriate step-size coefficient to ensure both the decrease in the objective function and the positivity of the iterate. First, a bounding value, s_max, must be found to ensure positivity. Using the definition of the update,

    f_j^{n+1} = f_j^n + s^n τ_j(f^n) v_j(f^n)
             = f_j^n - s^n τ_j(f^n) f_j^n r_j(f^n) ∂E/∂f_j(f^n)
             = f_j^n ( 1 - s^n τ_j(f^n) r_j(f^n) ∂E/∂f_j(f^n) ).

Therefore, by defining

    s_max = min_j ( |τ_j(f^n) r_j(f^n) ∂E/∂f_j(f^n)| )^{-1},     (3-12)

it can be seen that if s^n < s_max, then f^{n+1} > 0. If K < s_max, let s_max = K. Now a search is made for a step-size coefficient within the bounding value that decreases the objective function. If P(s) is defined to be E(f^{n+1}(s)) = E(f^n + s τ(f^n) ⊙ v(f^n)), then the algorithm will decrease the objective function if the constant s^n satisfies P'(s^n) < 0. Using the definition of P(s),

    P'(s) = ∇E(f^{n+1}(s)) · d/ds f^{n+1}(s)
          = -Σ_j ∂E/∂f_j(f^{n+1}(s)) τ_j(f^n) f_j^n r_j(f^n) ∂E/∂f_j(f^n).

In order to find a constant s^n producing a negative value, a search algorithm is employed. One possible choice is the bisection rule, which produces a result within a specified precision. The algorithm proceeded with this choice, but with slow computation speeds, as it requires multiple calculations of P'(s). In order to decrease the iteration time, the Armijo rule [1] was employed. The Armijo rule is an inexact search algorithm, here used to approximate where P'(s) = 0. The value returned by the Armijo rule will be a constant s that decreases the objective function as required. To apply the Armijo rule, the search begins with an initial guess (here taken to be the minimum of s_max and K). The initial value is decremented by a constant factor β (taken to be 1/3) until the following relationship holds (δ is a positive constant chosen to be close to 0):

    P(0) + sδP'(0) ≥ P(s).

Figure 3-1 illustrates one application of the Armijo rule and the value that would be returned by the method.

Figure 3-1: Armijo rule

The use of the Armijo rule produced much smaller computation times than the bisection method. The former requires the calculation of the derivative only at the origin and thereafter calculates values of P(s) to give a rough approximation to the solution. Even though it does not give the best value for the step-size coefficient, it was seen that the number of iterations needed to reach a stopping point did not significantly increase when the Armijo rule was applied. Hence, the savings in computation time caused it to be used in the reconstructions.
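The backtracking search just described can be sketched as follows. β = 1/3 mirrors the factor quoted in the text and δ is a small positive constant; the quadratic P below is only a stand-in objective for illustration.

```python
def armijo(P, dP0, s0, beta=1.0 / 3.0, delta=1e-4):
    """Shrink s by a factor beta until P(0) + delta*s*P'(0) >= P(s)."""
    s = s0
    while P(0.0) + delta * s * dP0 < P(s):
        s *= beta
    return s
```

For example, on P(s) = (s - 1)^2 with P'(0) = -2 and initial guess s0 = 4, one shrink to s = 4/3 already satisfies the Armijo condition, and P(4/3) < P(0) as required.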

Since the step-size coefficient s^n satisfies P'(s^n) < 0, the objective function decreases as needed. The update is then formed by

    f^{n+1} = f^n + s^n τ(f^n) ⊙ v(f^n)

and the iteration proceeds. It remains to be shown that this choice of step-size vector satisfies the required conditions for convergence. By construction, s τ_j(f) ≤ K, satisfying (3-2). By specifying that s τ_j(f) ≤ s_max, (3-3) is met. The vector sτ decreases the objective function, satisfying (3-4). Finally, by construction, τ_j(f) and r_j(f) have the same sign, proving (3-5). Since conditions (3-2)-(3-5) have been satisfied, the iteration will converge as proven in Theorem 1.

3.3.3 Stopping Criteria

The algorithm should terminate when the Kuhn-Tucker optimality conditions

(1-9) are realized. The projected gradient function 1 can be used to accomplish

this. This function is defined by

pgd(f) = max(fj (f)),0) fJ o. (3-13)

Theorem 5. pgd(f) = 0 if and only if the Kuhn-Tucker oi'nl,h,/,:li, conditions are


Assume the Kuhn-Tucker conditions are not met. If the first condition is not

true, then (f) < 0, resulting in pgd(f) > 0. If the second condition is not met,

fj and j-(f) are nonzero for some j. If fj > | (f) pgd(f) > |-(f)| > 0.
If the other inequality holds, then pgd(f) > fj > 0. Thus if the Kuhn-Tucker

conditions do not hold, the projected gradient function is nonzero. Now assume the

Kuhn-Tucker conditions are met. For each j, (f) > 0 and at least one of fj and
E(f) are zero. If fj > 1-(f), then E(f) 0 and |max(fj E(f),0) fj

f) 0. If E(f) > f,, then fj 0 and max(fj (f),0) 0.
-(f aO) fjf fj O.
Given some choice of small c > 0, the algorithm can be terminated when

pgd(f) < e.
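The stopping test (3-13) reduces to a few lines; here dE plays the role of the gradient ∂E/∂f_j, and the arrays are illustrative.

```python
def pgd(f, dE):
    """Projected gradient: zero exactly at a Kuhn-Tucker point."""
    return max(abs(max(f[j] - dE[j], 0.0) - f[j]) for j in range(len(f)))

def converged(f, dE, eps=1e-2):
    """Terminate the iteration once pgd falls under the cutoff eps."""
    return pgd(f, dE) < eps
```

At a voxel with f_j = 0 and negative gradient, pgd returns |∂E/∂f_j| > 0, so the violated first Kuhn-Tucker condition is detected.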

¹ as suggested by Prof. Hager


4.1 Algorithm Description

4.1.1 RM Algorithm

The Penalized Maximum Likelihood method discussed in Chapter 3 can be extended for use in the reconstruction of time-lapsed images. In this problem, a series of CT-generated images of an object is reconstructed along with any motion of the object between the scans. The RM algorithm, as developed by Gilland and Mair [13] [2] [14], can be used to reconstruct this series. However, its convergence to a minimizer has not been proven. A modification of their work allows for convergence of an altered objective function. As in their work, the algorithm will be described initially in the continuous case, then discretized to produce the iteration. In this reconstruction, a total of K time-lapsed scans (or frames) are analyzed. In the continuous case, the frame number will be denoted in the subscript, f_k(z). In the discrete case, the frame number will be the first subscript, with f_{k,j} referring to the jth voxel in frame k. Along with the unknown emission intensities in each frame k, f_k(z), the motion vector m_k(z) from frame k to frame k + 1 is also unknown. This motion vector describes the movement of each point z in frame k, as z moves to z + m_k(z) in frame k + 1. In the continuous case, the negative log-likelihood of the reconstruction is given by

    L(f) = Σ_k Σ_i [ Pf_k(i) - y_k(i) log Pf_k(i) ]     (4-1)

where Pf_k(i) is the continuous projection operator given by Pf_k(i) = ∫ p_i(z) f_k(z) dz and p_i(z) is the probability that an emission at z will be registered in detector i (compare with the discrete projection operator given by Pf_i = Σ_j p_{ij} f_j). The
penalty term U now measures how well the motion term matches up points between frames. Ideally, f_k(z) = f_{k+1}(z + m_k(z)), but it is to be expected that this will not occur with real data. Hence the image-matching term between frames k and k + 1 is given by

    U_k(f_k, f_{k+1}, m_k) = ∫ [ f_k(z) - f_{k+1}(z + m_k(z)) ]² dz     (4-2)

and the penalty term becomes the sum of these individual terms:

    U(f, m) = Σ_{k=1}^{K-1} U_k(f_k, f_{k+1}, m_k).     (4-3)

In addition, the objective function includes a third term Es = Σ_{k=1}^{K-1} Es_k modeling the elastic strain of the material:

    Es_k(m_k) = 1/2 ∫ λ(z)( u_{k,x} + v_{k,y} + w_{k,z} )² dz
              + ∫ μ(z)( u_{k,x}² + v_{k,y}² + w_{k,z}² ) dz
              + 1/2 ∫ μ(z)[ (u_{k,y} + v_{k,x})² + (u_{k,z} + w_{k,x})² + (v_{k,z} + w_{k,y})² ] dz

where λ and μ are material parameters, u_k, v_k, w_k are the components of m_k, and u_{k,x} is the partial derivative of u_k with respect to x.

Thus, after adding constants to balance the contribution of each term, the continuous objective function becomes

    E(f, m) = αL(f) + U(f, m) + βEs(m).     (4-4)

The reconstruction problem thus becomes a minimization of E(f, m) with the restriction that the intensities in each frame remain positive.

At each iteration of the RM Algorithm (the outer loop), two processes are run in succession: the R-Step and the M-Step. In the R-Step (an inner loop), the previous motion iterate is used to improve the reconstruction of the frames. Since Es does not depend on f, the R-Step is applied only to αL(f) + U(f, m) with the motion vector held constant. A PML reconstruction method can be applied (such as the non-uniform step-size method described in Chapter 3) to decrease this term and produce the next iterate. In the M-Step (an inner loop), the motion vector is updated using the frames generated from the R-Step. In this step, the conjugate gradient algorithm is applied to decrease U(f, m) + βEs(m) with f now held constant.


4.1.2 Image-Matching Penalty

In order to run the algorithm, the image space is partitioned into J equally sized voxels. However, after discretizing the image space, it cannot be assumed that the center of a voxel is mapped to the center of another voxel by the motion vector. Thus, it is necessary to distribute the motion over adjoining voxels. To accomplish this, let (u_{k,i}, v_{k,i}, w_{k,i}) be the components of the motion vector mapping voxel i in frame k into frame k + 1. Now define d_i, d_j and d_k to be the decimal portions of u, v and w respectively. If (a_i, b_i, c_i) are the coordinates of the center of voxel i, then the motion vector will map this voxel to the "voxel" in frame k + 1 centered at (a_i + u_{k,i}, b_i + v_{k,i}, c_i + w_{k,i}). The target voxel to be approximated will lie in at most eight real voxels. Define d_{000} to be the voxel located at ([a_i + u_{k,i}], [b_i + v_{k,i}], [c_i + w_{k,i}]). Then the target voxel will lie in a cube of dimension 2 voxels with d_{000} in the lower left corner. The emission intensity in the target voxel can then be found by a trilinear approximation using the voxels forming the cube, with the weights proportional to the distance from the center of each voxel to the center of the target voxel. Thus, the discrete version of the image-matching term (4-3) is given by:

    U_k(f_k, f_{k+1}, m_k) = Σ_i ( f_{k,i} - Σ_{j∈N_{k,i}} c_{k,i,j} f_{k+1,j} )²     (4-5)

where N_{k,i} is the 8-voxel cube formed by m_{k,i} and the weights, c_{k,i,j}, are given by Table 4-1.

    voxel   voxel center                                                       weight
    d000    ([a_i + u_{k,i}],     [b_i + v_{k,i}],     [c_i + w_{k,i}])        (1 - d_i)(1 - d_j)(1 - d_k)
    d100    ([a_i + u_{k,i}] + 1, [b_i + v_{k,i}],     [c_i + w_{k,i}])        d_i(1 - d_j)(1 - d_k)
    d010    ([a_i + u_{k,i}],     [b_i + v_{k,i}] + 1, [c_i + w_{k,i}])        (1 - d_i)d_j(1 - d_k)
    d110    ([a_i + u_{k,i}] + 1, [b_i + v_{k,i}] + 1, [c_i + w_{k,i}])        d_i d_j(1 - d_k)
    d001    ([a_i + u_{k,i}],     [b_i + v_{k,i}],     [c_i + w_{k,i}] + 1)    (1 - d_i)(1 - d_j)d_k
    d101    ([a_i + u_{k,i}] + 1, [b_i + v_{k,i}],     [c_i + w_{k,i}] + 1)    d_i(1 - d_j)d_k
    d011    ([a_i + u_{k,i}],     [b_i + v_{k,i}] + 1, [c_i + w_{k,i}] + 1)    (1 - d_i)d_j d_k
    d111    ([a_i + u_{k,i}] + 1, [b_i + v_{k,i}] + 1, [c_i + w_{k,i}] + 1)    d_i d_j d_k

Table 4-1: RM voxel neighborhood

This form for the image-matching term is currently implemented in the objective function used in the RM Algorithm. However, it has proven difficult to perform the conjugate gradient M-Step on this term, since the neighborhood system N_{k,i} depends on the motion vector. To solve this problem, the M-Step has instead been performed on a

linearization of the image-matching term [14] given by

    Q_k(m) = ∫ [ f_k(w) - f_{k+1}(w) - (m_k(w) - m̄(w)) · ∇f_{k+1}(w + m̄_k(w)) ]² dw

with the previous motion estimate given by m̄. However, it has not been proven

that a decrease in the linearized term Qk will result in a decrease in the original

image-matching term Uk as required. One possible way to solve this problem is

to perform the linearization at the objective function level rather than at each

inner loop. Hence, the original objective function is modified to replace U_k with a linearized version Ũ_k. With this change, the algorithm will now be referred

to as the modified RM algorithm. Since the objective functions are different, a

minimizer for the modified RM algorithm is not expected to be a minimizer for

the original RM algorithm. However, a decrease in the modified objective function

can be proven, allowing for convergence results. To describe the modified objective

function, define the term
    Ũ(f, m) = Σ_{k=1}^{K-1} Ũ_k(f_k, f_{k+1}, m_k)

    Ũ_k = ∫ ( f_k(z) - f_{k+1}(z) - m_k(z) · ∇f_{k+1}(z) )² dz     (4-6)

and update the objective function to become

    E(f, m) = αL(f) + Ũ(f, m) + βEs(m).     (4-7)

In addition, linearizing the image-matching term yields convexity in the objective function (the RM image-matching term is not convex in m). However, this

linearization is valid only for small displacements. If the motion between the

frames is large, the reconstruction may not be valid. Using this modified objective

function, the two inner loop sub-processes can now be described in detail.
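The trilinear distribution of a moved voxel over its 8-voxel neighborhood (Table 4-1) can be sketched as follows; the fractional parts are taken with a floor, and the names (u, v, w) stand in for the motion components of one voxel.

```python
import math

def trilinear_weights(u, v, w):
    """Return {(di,dj,dk): weight} over the corners of the 2-voxel cube."""
    du, dv, dw = u - math.floor(u), v - math.floor(v), w - math.floor(w)
    weights = {}
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # corner (i,j,k) picks d or (1-d) in each axis, as in Table 4-1
                weights[(i, j, k)] = ((du if i else 1.0 - du) *
                                      (dv if j else 1.0 - dv) *
                                      (dw if k else 1.0 - dw))
    return weights
```

By construction the eight weights sum to 1, so the interpolated intensity is a convex combination of the cube's voxels.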

4.1.3 Modified RM Iteration

The R-Step consists of minimizing αL(f) + U(f, m) over the set of positive intensities f. As this is in the same form as the objective function in Chapter 3, the non-uniform step-size algorithm (3-1) can be utilized. Thus, the continuous R-Step update for each frame k is given by

    f_k^{n+1} = f_k^n + s v(f^n)     (4-8)
             = f_k^n - s f_k^n r_k(f^n) D_k E(f^n, m)

with r_k(f)(z) = ( Σ_i p_i(z) + D_k U(f, m) )^{-1}. As derived in the Appendix of [14], the derivatives D_k E (taken with respect to f_k) are given by

    D_k E(f, m) = α Σ_i ( p_i(z) - y_k(i) p_i(z)/Pf_k(i) ) + D_k U(f, m)

    D_k U(f, m) = D_k U_k(f_k, f_{k+1}, m_k) + D_k U_{k-1}(f_{k-1}, f_k, m_{k-1})

    D_k U_k(f_k, f_{k+1}, m_k) = 2( f_k - f_{k+1} - m_k · ∇f_{k+1} )

    D_k U_{k-1}(f_{k-1}, f_k, m_{k-1}) = -2( f_{k-1} - f_k - m_{k-1} · ∇f_k )
        + 2( ∇f_{k-1} - ∇f_k - ∇(m_{k-1} · ∇f_k) ) · m_{k-1}
        + 2( f_{k-1} - f_k - m_{k-1} · ∇f_k )( ∇ · m_{k-1} ).

The last derivative can be seen (in the 2-frame case, without loss of generality) by using the formula

    D_2 Ũ(f_1, f_2, m_1) = d/dt Ũ( f_1(z), f_2(z) + t h(z), m_1(z) ) |_{t=0}

to obtain

    D_2 Ũ(f_1, f_2, m_1) = -2 ∫ ( f_1(z) - f_2(z) - m_1(z) · ∇f_2(z) ) h(z) dz
        - 2 ∫ ( f_1(z) - f_2(z) - m_1(z) · ∇f_2(z) )( m_1(z) · ∇h(z) ) dz.     (4-9)

The derivative D_2 Ũ(f_1, f_2, m_1) is identified once (4-9) can be written in the form ∫ ψ(z) h(z) dz. As the first term is already in this form, only the second needs to be simplified. This can be done by applying the identity

    ∫ g m · ∇h(z) dz = -∫ ∇ · (g m) h(z) dz = -∫ (∇g · m) h(z) dz - ∫ (g ∇ · m) h(z) dz

to yield the equality for the second term:

    2 ∫ ( ∇( f_1(z) - f_2(z) - m_1(z) · ∇f_2(z) ) · m_1 ) h(z) dz
    + 2 ∫ ( f_1(z) - f_2(z) - m_1(z) · ∇f_2(z) )( ∇ · m_1 ) h(z) dz.

Distributing the ∇ in the first integral and adding the results to the first term yields the desired derivative.

After discretizing the voxels (and approximating derivatives with central differences), (4-8) reduces to the discrete iteration:

    f_{k,j}^{n+1} = f_{k,j}^n + s^n v_{k,j}(f^n)     (4-10)

where v_{k,j}(f) = -f_{k,j} r_{k,j}(f) D_k E_j(f) and r_{k,j}(f) = ( Σ_i p_{i,j} + D_k U_j(f, m) )^{-1}. As before, the step-size vector s^n must be chosen so that (3-2)-(3-5) are satisfied. The method described in Chapter 3 can be used to find such a vector.

Once the emission intensity has been updated by the R-Step, the M-Step is run to update the motion vector. This step consists of minimizing U(f, m) + βEs(m) over all possible motion vectors m. The conjugate gradient algorithm (using the Polak-Ribiere variant) is then applied to the discretized voxels to update the motion vector [14].

In practice, at each iteration of the RM algorithm (the outer loop), one iteration each of the R-Step and M-Step occurs (the inner loop). The outer loop continues to run until a stopping criterion has been reached, such as the projected gradient (3-13) falling under a chosen critical value.
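The alternating outer loop just described can be sketched at a high level. The operators r_step, m_step and the stopping test pgd below are trivial stand-ins on a toy quadratic, not the dissertation's actual R-Step, M-Step or projected gradient.

```python
def rm_outer_loop(f, m, r_step, m_step, pgd, eps=1e-2, max_iter=1000):
    """Alternate one R-Step and one M-Step until the cutoff is met."""
    for _ in range(max_iter):
        f = r_step(f, m)       # update image with motion held constant
        m = m_step(f, m)       # update motion with image held constant
        if pgd(f, m) < eps:
            break
    return f, m

# toy usage: minimize (f - 3)^2 + (m - 1)^2 with exact Newton-like steps
f, m = rm_outer_loop(
    0.0, 0.0,
    r_step=lambda f, m: f - 0.5 * (2.0 * (f - 3.0)),
    m_step=lambda f, m: m - 0.5 * (2.0 * (m - 1.0)),
    pgd=lambda f, m: abs(2.0 * (f - 3.0)) + abs(2.0 * (m - 1.0)),
)
```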

4.2 Convergence Results

Although the new objective function E = αL + Ũ + βEs is convex in f, Lemma 4 does not apply in this situation (unless a single motion vector is fixed). Hence, convergence to a single minimizer is not guaranteed.

Theorem 6. If {f^n} is a sequence of iterates produced by (4-10), then each limit point of the sequence is a minimizer for the objective function E, and E(f^n) converges monotonically.
As before, Zangwill's Convergence Theorem is applied. The definition of the point-to-set function T is expanded to the following:

    T_1(f, m) = { (g, m_CG) : g ∈ T(f) }     (4-11)

where T is the previous point-to-set function defined in (3-7) (with a range of possible step-size vectors) and m_CG is the motion vector output by applying the conjugate gradient algorithm to Ũ(g, m) + βEs(m). Let E be the linearized objective function and define the set of stationary points S to be the set of all points (f, m) with v(f) = 0. If f ∈ S, then T(f) = {f} and the R-Step will not change the value of αL + Ũ. The M-Step (which does not affect the reconstructed image f) has been defined to decrease Ũ + βEs. Thus E(T_1(f, m)) ≤ E(f, m) if f ∈ S. If f ∉ S, then the R-Step will decrease αL + Ũ as desired. Following this, the M-Step will decrease the value of Ũ + βEs. Hence E(T_1(f, m)) < E(f, m) if f ∉ S. The closure condition is proved by noting that T_1 can be split into its two parts, thus allowing the critical closure condition on f to be proved as in Lemma 1.

Thus by Zangwill, all limit points of the sequence {(f^n, m^n)} lie in S and E(f^n, m^n) will monotonically converge to E(y, m) for some (y, m) ∈ S.

Now, to prove all limit points are minimizers, let y be a limit point for the algorithm. Consider the subsequence f^{n_j} where, for each j, ||f^{n_j} - y|| < 1/j (the definition of a limit point gives the existence of such points). Now, as in Theorem 4, assume y does not meet the first K-T condition for a minimizer (by Zangwill, y ∈ S, hence the second K-T condition is met). Therefore, an index k can be found such that y_k = 0 and ∂E/∂f_k(y) < 0, and hence there is a J such that ∂E/∂f_k(f^{n_j}) < 0 for all j > J. By construction, r_k(f^{n_j}) and s_k have the same sign and f_k^{n_j} > 0. Thus for j > J,

    f_k^{n_{j+1}} = f_k^{n_j} + s_k v_k(f^{n_j}) = f_k^{n_j} - s_k f_k^{n_j} r_k(f^{n_j}) ∂E/∂f_k(f^{n_j}) > f_k^{n_j}

and f_k^{n_j} does not converge to y_k = 0, contrary to the assumption. Therefore, both K-T conditions hold at the limit point.

If the modified M-Step is terminated after iteration K (locking in a specific

value for the motion vector) and the modified R-Step is used exclusively afterwards,

the convergence proof for penalized maximum likelihood presented in Chapter 3

will suffice to give convergence for the modified RM algorithm.

5.1 PML Reconstructions
To test the non-uniform step-size algorithm presented in C'!h pter 3, a series of
reconstructions was performed on a phantom source image 1 A = [A, ..., Aj], of
dimension J 128x128. As di-',11l -I below, the phantom image is that of a typical
cardiac scan, with lungs, heart, spine and other tissues. Higher intensity values
correspond to regions of greater metabolic activity (in particular the heart and
lung regions). The detector data, Y = [y, ..., yi], was created by calculating the


Figure 5-1: Phantom chest source image

probability matrix P using (1-2). The chest emission was then projected into the
detector space using yi = Z pijAj. Finally, random Poisson noise was added to the

1 created by Dr.Anderson

data. The PML objective function is formed by the sum

    E(f) = L(f) + γU(f).

The penalty term, U(f), was formed using the logcosh function with smoothing constant δ:

    U(f) = Σ_i Σ_{j∈N_i} w_{ij} log cosh( (f_i - f_j)/δ ).     (5-1)

The corresponding neighborhood system, N_i, is formed by the 8 closest voxels, with the diagonal elements having a reduced weight and the 4 axial neighbors a weight of 1, as seen in Figure 1-3. The strength of the penalty term, γ, ranged over a set of values to provide a comparison. The reconstructions were formed by the non-uniform step-size method (3-1). In separate trials, the search for the step-size coefficient, s, was performed with the Armijo rule and the bisection method. For each value of γ, the algorithm was run until the projected gradient (3-13) fell under 10⁻². The resulting reconstructions were then compared to the source image with the root-mean-square calculation

    RMS = ( Σ_j (f_j - λ_j)² / (128·128) )^{1/2}

to determine optimal values to use for the penalty strength.
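The penalty (5-1) and its per-voxel derivative can be sketched as follows, assuming unit weights for simplicity; `neighbors` maps each voxel index to its neighborhood N_i, and all names are illustrative.

```python
import math

def logcosh_penalty(f, delta, neighbors):
    """U(f) = sum over i, j in N_i of log cosh((f_i - f_j)/delta)."""
    return sum(math.log(math.cosh((f[j] - f[k]) / delta))
               for j, nbrs in neighbors.items() for k in nbrs)

def logcosh_penalty_grad(f, delta, neighbors, j):
    """d/df_j of log cosh((f_j - f_k)/delta) is tanh((f_j - f_k)/delta)/delta."""
    return sum(math.tanh((f[j] - f[k]) / delta) / delta for k in neighbors[j])
```

The tanh derivative saturates for large differences, which is what makes logcosh edge-preserving compared to a quadratic penalty.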

As a comparison, the same phantom scanner data was used in the APML algorithm (2-7). The reconstructions output by APML were visually identical to the reconstructions obtained using the non-uniform step-size algorithm. In all trials, the objective function decreased rapidly, then leveled off. It was seen that the image quality did not improve significantly after this leveling off of the objective function. As can be seen in Figure 5-2, initially the objective function decreases faster with the non-uniform step method than with APML, but eventually both fall to the same level. The computation times for APML were approximately the same as the computation time using the non-uniform step-size method with the Armijo rule. The APML algorithm was able to reduce the projected gradient factor in fewer iterations than the non-uniform step-size method.

Figure 5-2: Comparing the decrease in the objective function

Table 5-1 gives both the RMS error and the iterations required for algorithm termination, occurring when the projected gradient fell below 10⁻².

    γ      Method   Iterations   CPU time   RMS error
    0.015  Bisect   219          475        49.5
    0.020  Bisect   227          492        46.5
    0.025  Bisect   235          631        45.5
    0.030  Bisect   183          530        45.4
    0.015  Armijo   220          118        49.6
    0.020  Armijo   228          125        46.4
    0.025  Armijo   268          129        45.5
    0.030  Armijo   229          100        45.4

Table 5-1: PML reconstruction data

In Figure 5-3, the RMS error is plotted against the penalty parameter γ. Reconstruction using

the non-uniform step-size algorithm (applying either bisection or Armijo in the line search produced an identical RMS error) produced a smaller RMS error than reconstruction using the APML method. Thus, these methods were able to produce a reconstruction closer to the source phantom image. Due to its inexact search, the Armijo rule reduced the projected gradient criterion in a significantly shorter CPU time than the bisection method (although the iteration counts were approximately the same). The RMS plot in Figure 5-3 shows that a penalty parameter taken from [0.03, 0.04] resulted in the closest reconstruction to the original phantom.

Figure 5-3: Comparing the root-mean-square (RMS) error
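The root-mean-square comparison used above reduces to one line; the arrays here are illustrative stand-ins for the reconstruction f and the source phantom λ.

```python
import math

def rms_error(f, lam):
    """Root-mean-square difference over the J voxels of f and lam."""
    J = len(f)
    return math.sqrt(sum((f[j] - lam[j]) ** 2 for j in range(J)) / J)
```

In the experiments above, J = 128×128 and smaller values indicate a reconstruction closer to the phantom.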

The reconstructions seen in Figure 5-4 were formed by applying the bisection method. Although the iteration time was longer than with the Armijo method, the bisection method gave a visually better image for higher values of the penalty parameter, γ. The next set of reconstructions, found in Figure 5-5, was formed by applying the Armijo rule. In this case, as γ increased, the Armijo rule did not perform as well in high-intensity regions. This can be observed in the heart region of the later reconstructions. Figure 5-6 displays a plot formed by a vertical line through the heart region. In this region, the exact bisection rule was able to produce a smoother image in the high-intensity heart region than the inexact Armijo rule. Note that both methods produced similar intensities except in the heart region. With further iterations, it was seen that the non-smooth intensity in the heart region was reduced (using a smaller value for the projected gradient cutoff would accomplish this). A tradeoff has thus been made between reconstruction speed (the Armijo rule) and accuracy in the step-size choice (bisection). One possible solution is to combine these methods: begin with the Armijo method to approximate the solution, then use bisection for precision.

(a) γ = 0.020  (b) γ = 0.025  (c) γ = 0.030  (d) γ = 0.035

Figure 5-4: Phantom image reconstruction using bisection rule

(a) γ = 0.020  (b) γ = 0.025  (c) γ = 0.030  (d) γ = 0.035

Figure 5-5: Phantom image reconstruction using Armijo rule

Figure 5-6: Line plot of heart region

Originally, the new algorithm was designed to remove the restrictions on the penalty function imposed by Lange's algorithm (2-4). Figure 5-7 shows two reconstructions of the chest, the first using the non-uniform step-size method and the second using Lange's algorithm (with the Armijo rule). In this case, a quadratic penalty was used in place of the logcosh function. With a penalty parameter of γ = 0.03, the r(f) function becomes negative (a situation not covered by Lange's algorithm). As can be seen from the reconstructions, the non-uniform step-size method is able to reconstruct the image correctly, but Lange's algorithm (together with the Armijo rule) produces artifacts.

(a) Non-uniform step-size method  (b) Lange's method

Figure 5-7: Comparing reconstruction algorithms

5.2 Modified RM Reconstructions

The modified RM algorithm presented in C'! lpter 4 was tested on a 2-frame

NCAT phantom image of a heart. The algorithm reconstructs both images with the

assistance of the motion vector between the frames. The objective function used in

the modified RM algorithm is given by

E(f, m) = aL(f) + U(f, m) + 3Es(m)

with the constants taken to be a = 0.02 and 3 = 0.005 as in previous

reconstructions. The linearized penalty term U was used in the reconstruction

instead of the virtual voxel method used in the RM algorithm of Mair and

Gilland [14]. Although the use of the virtual voxels was seen to produce useful

reconstructions, the convergence of the RM objective function was unproven (but

was observed numerically).

In addition, the use of the linearized penalty produced a decreasing objective function, as seen in Figure 5-8. The second plot in Figure 5-8 shows the functions involved in the M-Step, Ũ and βEs. As can be seen, both Ũ and βEs are

(a) Modified RM objective function  (b) Ũ and βEs

Figure 5-8: Modified RM function plots

increasing for each outer loop. Despite this, the M-Step is reducing their sum at each M-Step, as can be seen in the sample RM data found in Table 5-2.

Typically, the R-Step increases the value of Ũ while reducing αL. Then the M-Step decreases Ũ and increases βEs to give a decrease in the objective function. The reconstructions formed using the modified RM algorithm (with the linearized penalty, Ũ) were comparable to those output using the original RM algorithm (using the approximated-voxels penalty method (4-1)). Figure 5-9 displays a quiver plot of the reconstructed Slice 20 combined with the motion vector. Figure 5-10 displays a comparison of the methods. The top reconstructions were

    Iteration   αL       Ũ       βEs      Objective
    1           2866.4   1.326   0.0203   2867.72
    2           2728.4   1.678   0.0228   2730.11
    3           2569.1   2.341   0.0750   2571.42
    4           2374.0   3.367   0.104    2377.42
    5           2116.4   5.638   0.277    2121.97

Table 5-2: RM data, α = 0.02, β = 0.005

Figure 5-9: Quiver plot of Slice 20

performed with the modified RM algorithm. The lower plots were reconstructed using the original RM algorithm, both using the parameters α = 0.02 and β = 0.005.

(a) Frame 1 Slice 20, Modified RM  (b) Frame 2 Slice 20, Modified RM
(c) Frame 1 Slice 20, Original RM  (d) Frame 2 Slice 20, Original RM

Figure 5-10: RM reconstructions

Although the original RM method gives a better estimate of the motion penalty, a decrease in the objective function (and hence convergence) remains unproven. The use of the linearized penalty in the modified RM algorithm produces a similar reconstruction, while in addition giving the needed decrease in the objective function.

5.3 Conclusions

Several methods exist to solve the problem of image recovery in Emission

Tomography. Green's OSL algorithm as extended by Lange provides for convergence

in a reduced set of penalty functions (which do not include the image-matching

penalty used in the RM algorithm). Although Lange's algorithm can be employ, 1

in larger settings, neither its convergence to a minimizer nor a decrease in the

objective function is guaranteed. If the step-size is allowed to become negative

[2], a decrease in the objective function and algorithm convergence is once

again guaranteed, but this minimizer may not satisfy the first Kuhn-Tucker

optimality condition. However, a further extrapolation to a step-size vector will

both guarantee convergence to a minimizer and allow for a larger set of penalty

functions. Unlike previous methods, the convergence proof of the generalized

method permits the use of faster, inexact line searches such as the Armijo rule.

The non-uniform step-size algorithm provides one way of constructing a step-size

vector that meets the conditions required for convergence. This generalized method

can be extended to other image recovery problems, such as multiple image-motion

recovery. In this case, the generalized algorithm is incorporated into the modified

RM algorithm for reconstruction. Future work in this area will concentrate on

refining the algorithm to increase speed without losing image quality.


APPENDIX: SAMPLE ALGORITHM FORTRAN CODE

This Fortran code can be utilized to run the non-uniform step method as

described in Chapter 3. Code to calculate the penalty term (and its derivative) is

required. The probability matrix p(i,j) is stored in a sparse matrix format consisting

of the arrays hrow, hcol and hv, preserving the non-zero entries of p.
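For readers more familiar with array languages, this triplet (coordinate) storage and the forward projection it supports can be sketched in Python. The function name and the toy matrix below are illustrative, not part of the dissertation code:

```python
# Sketch of the sparse coordinate (COO) storage used for the probability
# matrix: hrow[k], hcol[k], hv[k] hold the row index, column index and
# value of the k-th non-zero entry of p.
def forward_project(hrow, hcol, hv, f, ndet):
    """Compute AX = P*f, accumulating one non-zero entry at a time."""
    AX = [0.0] * ndet
    for i, j, p in zip(hrow, hcol, hv):
        AX[i] += p * f[j]
    return AX

# Toy 2x2 example: P = [[0.5, 0.0], [0.25, 1.0]] stored as triplets.
hrow = [0, 1, 1]
hcol = [0, 0, 1]
hv   = [0.5, 0.25, 1.0]
AX = forward_project(hrow, hcol, hv, f=[2.0, 4.0], ndet=2)
print(AX)  # [1.0, 4.5]
```

The same triplet arrays drive both the forward projection (accumulate into the row index) and the backprojection (accumulate into the column index), which is why the Fortran loops below all share this structure.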

The iteration loop consists of the following code (hsum, AX, AY and smax are
assumed to be reset to zero at the start of each pass):

      ! Column sums of the probability matrix: hsum(j) = sum_i p(i,j)
      do j=1,nhdim
         temp = hcol(j)
         hsum(temp) = hsum(temp) + hv(j)
      enddo

      ! Forward projection: AX = P*f
      do j=1,nhdim
         temp1 = hrow(j)
         temp2 = hcol(j)
         AX(temp1) = AX(temp1) + hv(j)*f(temp2)
      enddo

      ! Backprojection of the data ratio: AY = P^T (Y/AX)
      do j=1,nhdim
         temp1 = hrow(j)
         temp2 = hcol(j)
         AY(temp2) = AY(temp2) + hv(j)*Y(temp1)/AX(temp1)
      enddo

      ! Gradient of the objective E = L + gamma*U
      call findderivpenalty(f,dU)
      do j=1,NVOXX
         dE(j) = hsum(j) - AY(j) + gamma*dU(j)
      enddo

      ! Denominators r(j) and unscaled search direction
      ! v(j) = f(j)*AY(j)/r(j) - f(j) = -f(j)*dE(j)/r(j)
      tau(1) = 0
      tau(2) = 0
      do j=1,NVOXX
         r(j) = hsum(j) + gamma*dU(j)
         f1(j) = f(j)*AY(j)/r(j)
         v(j) = f1(j) - f(j)
      enddo

      ! temp1 = 1 if some denominator is non-positive
      temp1 = 0
      do j=1,NVOXX
         if(r(j).le.0) temp1 = 1
      enddo

      ! Two-component step direction: tau(1) weights the voxels with
      ! r(j) >= 0 and tau(2) those with r(j) < 0; if every denominator
      ! is positive, a single (uniform) step-size component suffices
      if(temp1.eq.1)then
         do j=1,NVOXX
            if(r(j).ge.0.)then
               tau(1) = tau(1) + f(j)*dE(j)**2/r(j)
            else
               tau(2) = tau(2) + f(j)*dE(j)**2/r(j)
            endif
         enddo
      else
         tau(1) = 1
      endif

      ! Normalize (tau(1),tau(2)) to a unit vector and scale v
      tau(3) = sqrt(tau(1)**2 + tau(2)**2)
      tau(1) = tau(1) / tau(3)
      tau(2) = tau(2) / tau(3)
      do j=1,NVOXX
         if(r(j).ge.0.)then
            v(j) = v(j)*tau(1)
         else
            v(j) = v(j)*tau(2)
         endif
      enddo

      ! Largest step-size that keeps the next iterate non-negative
      do j=1,NVOXX
         if(r(j).ge.0)then
            smax = max(smax, tau(1)*dE(j)/r(j))
         else
            smax = max(smax, tau(2)*dE(j)/r(j))
         endif
      enddo
      if(smax.eq.0.)then
         smax = 1
      else
         smax = 1/smax
      endif

      ! Line search for the step-size coefficient, then update the image
      sopt = finds(f,smax,v,hcol,hrow,hv,Y,dE,gamma)
      do j=1,NVOXX
         f(j) = f(j) + sopt*v(j)
      enddo
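As an informal cross-check of the gradient and search-direction computation above (not part of the dissertation code; the function and variable names here are illustrative, chosen to mirror the Fortran arrays), the per-voxel step can be sketched in Python:

```python
def gradient_and_direction(hsum, AY, dU, f, gamma):
    """Per-voxel gradient dE(j) = hsum(j) - AY(j) + gamma*dU(j) and the
    unscaled search direction v(j) = f(j)*AY(j)/r(j) - f(j), where
    r(j) = hsum(j) + gamma*dU(j).  Note the identity v = -f*dE/r."""
    dE, v = [], []
    for j in range(len(f)):
        r = hsum[j] + gamma * dU[j]
        dE.append(hsum[j] - AY[j] + gamma * dU[j])
        v.append(f[j] * AY[j] / r - f[j])
    return dE, v

# One-voxel example: hsum=1, AY=2, dU=0 gives r=1, dE=-1, v=-f*dE/r=3.
dE, v = gradient_and_direction([1.0], [2.0], [0.0], [3.0], gamma=0.1)
print(dE, v)  # [-1.0] [3.0]
```

The identity v = -f*dE/r is what makes the multiplicative update f + s*v equal to f*(1 - s*dE/r), so positivity of the iterate reduces to a bound on the step-size, which is exactly what the smax computation enforces.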


Utilizing the Armijo rule in the line-search can be accomplished by the code
below (the function finds called in the main loop; E0 holds the value of the
objective function at the current iterate, and AX is reset to zero before each
trial projection):

      ! Directional derivative P'(0) = dE . v (negative for a descent direction)
      dEdotv = 0
      do j=1,NVOXX
         dEdotv = dEdotv + dE(j)*v(j)
      enddo

      tol = 0.1
      k = 0
      rho = 3

 30   continue
      alpha = smax / (rho**k)

      ! Trial iterate f1 = f + alpha*v and its projection AX = P*f1
      do j=1,NVOXX
         f1(j) = f(j) + alpha*v(j)
      enddo
      do j=1,nhdim
         temp1 = hrow(j)
         temp2 = hcol(j)
         AX(temp1) = AX(temp1) + hv(j)*f1(temp2)
      enddo

      ! Objective value E = L(f1) + gamma*U(f1) at the trial point
      call findpenalty(f1,U)
      E = 0
      do j=1,ndet
         E = E + AX(j) - Y(j)*log(AX(j))
      enddo
      do j=1,NVOXX
         E = E + gamma*U(j)
      enddo

      ! Armijo condition: accept alpha once E <= E0 + alpha*tol*dEdotv
      flag = E0 + alpha*tol*dEdotv
      k = k + 1
      if(E.gt.flag) goto 30

      finds = alpha
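As a language-neutral illustration of the same backtracking scheme (the generic objective P, its derivative dP0, and the quadratic test function below are mine, not the dissertation's), the Armijo rule can be written as:

```python
def armijo(P, dP0, smax, tol=0.1, rho=3.0, max_iter=50):
    """Backtracking line search: shrink alpha = smax/rho**k until the
    sufficient-decrease condition P(alpha) <= P(0) + alpha*tol*dP0 holds,
    where dP0 = P'(0) < 0 for a descent direction."""
    P0 = P(0.0)
    alpha = smax
    for _ in range(max_iter):
        if P(alpha) <= P0 + alpha * tol * dP0:
            return alpha
        alpha /= rho
    return alpha

# Toy objective P(s) = (s - 1)**2 with P'(0) = -2 and initial step smax = 9:
# alpha = 9 and alpha = 3 fail the test, alpha = 1 is accepted.
step = armijo(lambda s: (s - 1.0)**2, dP0=-2.0, smax=9.0)
print(step)  # 1.0
```

With tol = 0.1 and rho = 3, this reproduces the divide-by-3 decrement used in the Fortran above: only values of P along the search ray are evaluated, never its derivative away from the origin.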


REFERENCES

[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear programming:
theory and algorithms. New York: J. Wiley & Sons, 2nd edition, 1993.

[2] Z. Cao, D. Gilland, B. Mair, and R. Jaszczak. Three-dimensional motion
estimation with image reconstruction for gated cardiac ECT. IEEE Trans.
Nuclear Science, pages 384-388.

[3] R. B. Carroll. An orthogonal series approach to positron emission tomography.
PhD thesis, University of Florida, Gainesville, FL, October 1998.

[4] J. Chang, J. M. M. Anderson, and B. Mair. An accelerated penalized maximum
likelihood algorithm for positron emission tomography. IEEE Trans. Image
Processing, forthcoming, 2006.

[5] J. Chang, J. M. M. Anderson, and J. Votaw. Regularized image reconstruction
algorithms for positron emission tomography. IEEE Trans. Med. Imag.,
23(9):1165-1175, 2004.

[6] A. R. De Pierro. A modified expectation maximization algorithm for penalized
likelihood estimation in emission tomography. IEEE Trans. Med. Imag.,
14(1):132-137, 1995.

[7] S. Geman and D. McClure. Bayesian image analysis: an application to single
photon emission tomography. Proc. American Statistical Society, pages 12-18.

[8] P. Green. On use of the EM algorithm for penalized likelihood estimation.
J. R. Statist. Soc., 52(3):443-452, 1990.

[9] D. Lalush and B. Tsui. A generalized Gibbs prior for maximum a posteriori
reconstruction in SPECT. Phys. Med. Biol., 38:729-741, 1993.

[10] K. Lange. Convergence of EM image reconstruction algorithms with Gibbs
smoothing. IEEE Trans. Med. Imag., 9(4):439-446, 1990.

[11] D. Luenberger. Introduction to linear and nonlinear programming. Reading:
Addison-Wesley, 1973.

[12] B. Mair. A mathematical model incorporating the effects of detector width in
2D PET. Inverse Problems, 16:223-224, 2000.

[13] B. Mair, D. Gilland, and Z. Cao. Simultaneous motion estimation and image
reconstruction from gated data. Proc. 2002 IEEE Internat. Symp. Biomed.
Imaging, pages 661-664, 2002.

[14] B. Mair, D. Gilland, and J. Sun. Estimation of images and nonrigid
deformations in gated emission CT. IEEE Trans. Med. Imag.,
25(9):1130-1144, 2006.

[15] A. M. Ostrowski. Solutions of equations in Euclidean and Banach spaces.
New York: Academic Press, 3rd edition, 1973.

[16] L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission
tomography. IEEE Trans. Med. Imag., 1:113-122, 1982.


BIOGRAPHICAL SKETCH

Jeff Zahnen was born on August 7, 1974, in Chicago, Illinois. He received

his Bachelor of Science degree in mathematics with a minor in history from the

University of Florida in 1996. Following this, he received his master's degree

in 1998. In 2000, he began to study medical imaging techniques under the

supervision of Dr. Bernard Mair. In 2006, he completed his Ph.D. work on

emission tomography methods. During this time, he also served as the coach of

the University of Florida College Bowl Team.

Full Text


Forhispatienceandadvicethroughtheyears,Ioermygreatestthankstomyadvisor,Dr.BernardMair.Inaddition,IwouldliketothankDr.DavidGilland,Dr.WilliamHager,Dr.JamesKeesling,Dr.ScottMcCulloughandDr.DavidWilsonfortakingthetimetoserveonmycommitteeandprovidingsuggestionstoimprovethework.Finally,Iwouldliketothankmyparentsfortheirunwaveringsupportofmystudies. iv


page ACKNOWLEDGMENTS ............................. iv LISTOFTABLES ................................. vii LISTOFFIGURES ................................ viii ABSTRACT .................................... ix CHAPTER 1MATHEMATICSOFPOSITRONEMISSIONTOMOGRAPHY(PET) 1 1.1IntroductionToPET .......................... 1 1.2MathematicalModelforPET ..................... 2 1.3PETProbabilityFunction ....................... 3 1.4ReconstructionTechniques ....................... 4 1.4.1MaximumLikelihood ...................... 4 1.4.2PenalizedMaximumLikelihood(PML) ............ 6 1.5MinimizingFunction .......................... 9 2PMLRECONSTRUCTIONMETHODS .................. 10 2.1Green'sOneStepLateAlgorithm ................... 10 2.1.1Lange'sImprovementtoGreen'sOSL ............. 11 2.1.2Mair-GillandMethod ...................... 12 2.2AcceleratedPenalizedMaximumLikelihoodMethod(APML) ... 12 3GENERALPMLALGORITHM ...................... 15 3.1GeneralAlgorithmStatement ..................... 15 3.1.1GeneralAlgorithmUpdate ................... 15 3.1.2Penaltyfunctioncriteria .................... 16 3.2ProofofConvergence .......................... 17 3.3NewPMLAlgorithm .......................... 21 3.3.1Non-UniformStep-SizeMethod ................ 22 3.3.2LineSearchMethods ...................... 24 3.3.3StoppingCriteria ........................ 26 4MULTIPLEIMAGE-MOTIONRECONSTRUCTION .......... 27 4.1AlgorithmDesription .......................... 27 v


.......................... 27 4.1.2Image-MatchingPenalty .................... 29 4.1.3ModiedRMIteration ..................... 31 4.2ConvergenceResults .......................... 33 5RECONSTRUCTIONS ............................ 36 5.1PMLReconstructions .......................... 36 5.2ModiedRMReconstructions ..................... 42 5.3Conclusions ............................... 45 APPENDIX:SAMPLEALGORITHMFORTRANCODE ........... 47 REFERENCES ................................... 53 BIOGRAPHICALSKETCH ............................ 54 vi


Table page 4{1RMvoxelneighborhood ........................... 30 5{1PMLreconstructiondata ........................... 38 5{2RMdata=0:02,=0:005 ........................ 44 vii


Figure page 1{1A2-DPETtuberegisteringanemission .................. 1 1{2Tuberegions ................................. 3 1{3Sampleneighborhoodsystemwithassociatedweights ........... 7 1{4Derivativesofthepenaltyterms ....................... 8 3{1Armijorule .................................. 25 5{1Phantomchestsourceimage ......................... 36 5{2Comparingthedecreaseintheobjectivefunction ............. 38 5{3Comparingtheroot-mean-square(RMS)error ............... 39 5{4Phantomimagereconstructionusingbisectionrule ............ 40 5{5PhantomimagereconstructionusingArmijorule ............. 41 5{6Lineplotofheartregion ........................... 41 5{7Comparingreconstructionalgorithms .................... 42 5{8ModiedRMfunctionplots ......................... 43 5{9QuiverplotofSlice20 ............................ 44 5{10RMreconstructions .............................. 45 viii


Positronemissiontomography(orPET)hasbeenutilizedforyearsinmedicaldiagnosis.ThePETreconstructionproblemconsistsofrecoveringtheimagefromthedatagatheredbythedetectorsinthemachine.Penalizedmaximumlikelihoodtechniquesallowfortherecoveryoftheimagewithasmoothingtermtoimprovetheimage.Previousmethodsusingthistechniqueforcedtheuseofaweakpenaltyterminordertogainconvergence.Usinganewmethod,thisrestrictionisremovedwhileretainingconvergence.Inaddition,themethodisextendedtotheproblemofrecoveringmultiplePETimagesalongwithavectordescribingthemotionbetweentheimages. ix


1{1 displaysatubeformedbydetectorsand.Anear-simultaneousarrivaloftwophotonsatoppositeendsof Figure1{1: A2-DPETtuberegisteringanemission 1


thetubeisthenregisteredasasingledetection.ThetubeshowninFigure 1{1 willregisterthedisplayedemissionatpointBbutnottheemissionatpointA. Insinglephotonemissioncomputedtomography(SPECT),asinglephotonisemitted.Inthismethod(morewidelyappliedthanPETduetotherelativecostsofthescanningmachines),thedetectorsarelocatedonlyononesideofthepatientandasingledetectionresultsinacount.Thisscanisperformedfordierentanglesbetweenthepatientanddetectortoprovideafullimage. Inbothprocedures,theaimistodeterminethenumberofemissionsfromthevariouslocationsinthepatient.Theresultingintensityimagesareusedtomakemedicaldiagnoses.ThisworkwillconsidertheproblemofPETimagereconstructionusingthemaximumlikelihood(ML)methodwiththeadditionofapenaltyterm.AlthoughdevelopedforPETreconstruction,themethodcanbeextendedforSPECTreconstruction.Unlikepreviousmethods,therevisedalgorithmpresentedusesamoregeneralpenaltyterm,allowingforawiderrangeofsettings.Proofofconvergenceandreconstructionresultsareobtainedusingthismethod. 16 ]introducedamathematicalmodelforPETin1982.Inthemodel,therandomemissionsfromvoxeljaremodeledbyaPoissonprocesswithmeanfj.Thusthe


detectorcountintubei,Yi,willalsobegovernedbyaPoissonprocesswithmean wherepi;jisthegeometricprobabilitythatanemissioninvoxeljisregisteredbydetectortubei.UsingthepropertiesofthePoissondistribution,theprobabilitythatthereareyicountsregisteredindetectoriisgivenbyeiyii 1{2 .Thevoxels Figure1{2: Tuberegions areassumedtobesosmallthattheprobabilitycanbeassumedapproximatelyconstantontheentirevoxel.Ifzisthecenterofvoxelj,thenpi;jisdenedbythe


probabilitythatthepathofapairofphotonsemittedfromthecenterofthevoxel,z,intersectsbothdetectorsdeningthetube.Thisprobabilityisproportionaltotheangle-of-viewofthetubefromz.Thus,ifC^zDisdenedtobetheacuteangleformedbythelinespassingthroughCandzandDandz,thefollowingdenitionresults[ 12 ]. 16 ].In1998,Carroll[ 3 ]decomposedthedata,probabilitymatrixandintensityfunctionsintobasisfunctions,transformingtheproblemintoaninnitesetoflinearequationstosolve.Ouralgorithmthoughisbasedonathirdmethod,Maximumlikelihoodestimation(ML).


Thedetectorcountsareindependentrandomvariables,sotheseprobabilitiesaremultipliedtogethertoyieldtheprobabilityofobtainingthedatafromanemissionintensitymapoff.AnMLestimateisanintensityfthatmaximizesthefollowingprobability.ieiyii Inaddition,theintensitymap,fisconstrainedtobenon-negative(ffjfj0g). AnequivalentproblemistoapplythenegativelogarithmandperformaminimizationofthetermXi(PfiyilogPfilogyi!): TheExpectation-Maximization(orEM)algorithmdevelopedbySheppandVardi[ 16 ]canthenappliedtondtheminimizerofthisterm,resultingintheEMMLalgorithm. wherefnreferstothenthstepintheiteration. Alternately,theEMMLiterationcanbeexpressedasadescentalgorithm.fn+1j=fnjfnj@L @fj(fn)=Xipi;j


Restatedinmatrixformyieldstheequationfn+1=fnD(fn)rL(fn): 8 ]. 7 ],isemployedbyrstchoosingapenaltyfunctionandthenformingtheterm where!isaweightingfactorthatassociatespixels.Thiscanbeusedtolinkneighboringpixels,forcingthemtohavesimilarintensities.Forexample,Ni,canbedenedoneachvoxelitobethesetofpixelsinaneighborhoodlatticeofthatvoxel.Thustheweight!i;j=0ifj=2N(i).Ifj2N(i),theweightisdenedtobeanonzeroconstant.OnepossibilityistoletNiconsistoftheeightneighboringvoxelsasshowninFigure 1{3 withassociatedweights.Inaddition,aconstantvalue,ismultipliedtothepenaltytermtocontroltheinuenceofthepenaltyinthereconstruction.As!0,thereconstructionsapproachthoseoftheEMalgorithm.However,ifischosentobetoolarge,theimageswillbeoversmoothed.


Figure1{3: Sampleneighborhoodsystemwithassociatedweights Thereexistmanychoicesforthepenaltyfunction(r).Afrequentlyusedchoiceisapenaltyoftheform(r)=logcoshr [ 8 ],whereisapreviouslychosenconstantthatallowsforsomene-tuningofthepenalty.Thisconstantdetermineshowthepriorisscaledwithrespecttor,asthelogcoshpenaltywillsmoothalldierencesaboveacertaincutovalue,determinedby.As!1,themethodwillapproachresultsoftheEMAlgorithm(asnosmoothingwilloccur).Thispenaltyisusefulasitapproximatesquadraticsmoothingforsmallvaluesofr,butitsderivativeremainsbounded. In1993,LalushandTsui[ 9 ]developedanewpenaltyfunctionthatallowedforevengreatercontroloverthepenalizedrangeofintensities.NotingthatcurrentPMLmethodsdidnotrequirethepenaltyfunctiontobecalculated,theyinsteadbuiltthefunctionfrompropertiesofitsderivative.Thus,theyproduced @r=r r)2]1=2(1{7) whereandareconstantsthataectthespreadandlocationrespectivelyofpeakofthederivative.Asincreases,thepeakbecomesmorenarrowandas


increases,thepeakawayfromtheorigin.Theplotofthederivativeshowsthemajorityofthesmoothingoccurringintheregionunderthepeakofthederivative[ 9 ].Thusalteringthevaluesofandwillaecttherangeofsmoothingintheimage.Bycomparingthisfunctiontothederivativeofthelogcoshfunction,0(r)=(1=)tanhr itcanbeseenthatwhilethelogcoshfunctiondoesnothaveapeakedderivative (b)Derivativeofthelogcoshpenalty Derivativesofthepenaltyterms (itincreasestoanasymptoticlimit),itcanbeconsideredasaninnitely-widepeak.Thus,Green'slogcoshpenaltycanbeapproximatedasaLalush-Tsuipenaltybytakingclosetozero(producinganearinnitely-widepeak).IntegratingtheLalush-Tsuipenaltytermresultsintheirpenaltyfunction:(r)= 2p


InMaximumLikelihood,thisobjectivefunctionmustbeminimizedoverthesetofpositiveintensityvaluesffjfj0g.ThisminimummustsatisfytheKuhn-Tuckeroptimalityconditionsfortheproblem.Inthissituation,thosetwoconditionsarethatforallj @fj(f)0(1{9)fj@E @fj(f)=0 ItwillbeshownthattherevisedalgorithmpresentedinChapter3willmeettheseconditionsatconvergence.


Dierentiatingtheobjectivefunctionandreplacingitinthe2ndKuhn-Tuckeroptimalitycondition( 1{9 )yieldstheconditionthatforallj:fj[Xi(pi;jyipi;j @fj(f)]=0: @fj(f)]=fjXiyipi;j @fj(f))1: Thisformfortheoptimalityconditionisusefulasitreformulatestheequationintoaxedpointcondition. In1990,GreendevelopedtheOne-Step-Late(OSL)algorithm[ 8 ]toapplytheEMalgorithmtothePenalizedMaximumLikelihoodproblem.TheOne-Step-Latenamecomesfromthechoicetoevaluatethepenaltyatthecurrentiteraterather 10


thanthe(unknown)update,asitallowedtheEMalgorithmtobeappliedtothePMLproblem.InGreen'sOSL,theupdatetoiteratefnisgivenbyfn+1j=fnjrj(fn)Xiyipi;j @fj(f)=Pipi;jyipi;j @fj(f),theupdatecanbeexpressedinthealternativeform: where @fj(f):(2{3) ThusifGreen'sOSLalgorithmconverges,itwillsatisfythesamexedpointconditionasthesecondKuhn-Tuckercondition.Notethatif=0,Green'sOSLalgorithmreducestotheEMMLAlgorithm( 1{5 ).However,Green'sOSLalgorithmdoesnotguaranteepositivityoftheiteratesorconvergence. 10 ].InLange'salgorithm,rj(fn)isrequiredtobebepositiveforallj.Tomeetthiscondition,thepenaltyparameterbemustbechosensmallenoughtokeeprj(fn)>0forallpossibleiteratesfnandvoxelsj.Inthemethod,Green'sOSLupdateisalteredtoincludeastep-sizeparameters(takings=1reducesthemethodtotheoriginalOSLmethod).Thustheupdatebecomes wherevisdenedasbefore.Thestep-sizesnischosentobeapositivevaluethatgivesbothpositivityofthenewiterate(takingsn<1enforcesthiscondition)andadecreaseintheobjectivefunctionE.Todecreasetheobjectivefunction,dene


IfP0(0)<0,thealgorithmwillbedecreasingforsmallvaluesofthestep-size.Withthisconditionthatrj(fn)>0,itfollowsthatP0(0)=Pjfnjrj(fn)(@E @fj)2(fn)<0iffnj@E @fj(fn)6=0forsomej.SinceP0(0)<0,theobjectivefunctionwillbedecreasingforsmallvaluesofthestep-sizes.Aline-searchisthenperformedforanappropriatestep-sizethatwillgiveapositiveupdateandadecreaseintheobjectivefunction.InLange'salgorithm,thestep-sizethatgivesthegreatestdecreaseintheobjectivefunctionisdenotedsandasearchisperformedtondavalues0,thenEwillbeincreasingforsmallpositivevaluesofs.Inthiscase,snisallowedtobenegative.Thesearchismadeinthenegativedirection,againforavaluethatensurespositivityandiswithinofthevaluethatgivesthegreatestdecreaseintheobjectivefunction.Althoughthismethodwilldecreasetheobjectivefunctionandconvergence,thelimitmaynotsatisfythesecondKuhn-Tuckeroptimalitycondition. 5 ].IntheAPMLmethod,surrogatefunctionstodecreasetheobjectivefunctionandprovideconvergence.Inthemethod,apenaltyfunction


~fn+1j=Gj(fn)+q withthedenedfunctionsgivenbyEj=yipi;jfnj Asthiswasseentoconvergeslowly,anaccelerationfactor(theAPMLMethod)wasaddedtothealgorithm[ 4 ].AsinLange'smethod,astep-sizefactorisaddedtothealgorithm.InAPML,theacceleratedupdateisgivenby withv(fn)=~fn+1fn.InLange'smethod,asearchmethodisemployedtondasuitablechoiceforthestep-size.Here,surrogatefunctionsareusedtondthebestpossiblestep-sizeneededtodecreasetheobjectivefunction.SurrogatefunctionswererstappliedtoPETreconstructionbyDePierro[ 6 ].InordertominimizeafunctionfoverasetSusingthemethodofsurrogatefunctions,anotherfunctionmustbefoundsatisfying Thedicultyinusingthismethodliesinndinganappropriatefunction.InAPML,creatingthesurrogatefunctionrequiresthecalculationoffourseparatefunctionsrelatedtothepenaltyfunction,possiblyincreasingthecomputation




undertherestrictionthatfj0forallj.Inthischapter,aniterativealgorithmtondthisminimumwillbepresented.Underspecicconditions,convergencetoaminimizerwillbeproven. where @fj(f))1 @fj(f) Notethatv(f)andr(f)arethesamedenedfunctionsasintheOSLalgorithm. 15


Givenacurrentiteratefn,thestep-sizevectorsn=[sn1;:::;snJ]ischosentoproducethenextstepinthealgorithm.Thisvectormustsatisfytheconditions: 00: Therstthreeconditionsonthestep-sizevectorenforcerespectivelyboundednessofthestep-size,positivityoftheiterateanddecreaseintheobjectivefunction.Thenalconditionrequiresthatforeachvoxelj,thesignofsjmatchesthesignofrj(f).Notethatauniquesisnotguaranteed.Therevisedalgorithmpresentedaftertheproofprovidesanexampleofonewaytochooseasuitablestep-sizevector.Onceapenaltyfunctionandmethodofconstructingastep-sizevector(bothmatchingtherequiredcriteria)havebeenchosen,thealgorithmcanproceed.Continueiteratinguntilasuitablestoppingcriteriahasbeenreached.Convergenceofthisalgorithmwillnowbeinvestigated. where(r)isrequiredtosatisfythefollowingproperties:


Therstthreeconditionsarebasicconditionsforapenaltyfunction,whilethelattertwoconditionsarerequiredforconvergence.Functionsthatmatchtheseconditionsincludethelogcoshandquadraticpenalties.NotethatthegeneralLalush-Tsuipenaltydoesnotmeettheseconditionsdueitslackofconvexity. Theorem1. 3.1.2 ).LetEbethePMLobjectivefunctionE(f)=L(f)+U(f).Denef0j=Piyi=Jforallj(thealgorithmisinitializedwithaatemissionintensity).Letthealgorithmupdatebegivenby( 3{1 )withthestep-sizevectorchosensuchthatconditions( 3{2 3{5 )aresatised.Thealgorithmwillthenconvergetoauniqueminimizer. 10 ])willdemonstratethatthealgorithmconvergestoauniquevectorthatsatisesbothKuhn-Tuckeroptimalityconditions,henceistheminimizerofE.Itconsistsoftwoparts.Intherst,Zangwill'sconvergencetheoremisappliedtoprovethattheobjectivefunctionconvergesandthatalllimitpointswillfullthesecondKuhn-Tuckercondition.Inthesecondpartoftheproof,itwillbeshownthatthereisasinglelimitpointandthispointwillalsofulltherstKuhn-Tuckercondition. 11 ]isnowapplied. 1. thesequencefxngiscontainedinacompactsubsetofX 2. ifx=2S,E(y)

ifx2S,E(y)E(x)forally2T(x) themappingTisclosedatpointsoutsideofS. ThenthelimitofanyconvergentsubsequenceliesinSandfE(xn)gconvergesmonotonicallytoES=E(y)forsomey2S. 3{2 3{5 ).Thendenethepoint-to-setfunctionTbytheequation: Notethatiff2S,v(f)=0,thusT(f)=ffg.InthePMLalgorithm,thenextiterateiscreatedbytakingfn+1=fn+sv(fn)foraparticularchoiceofs2Sfn,hencefn+12T(fn). @fjisassumedtobecontinuous,rj(f)iscontinuousandhencevj(f)willalsobecontinuous.LetW=fj:vj()6=0g.Sinceisnotastationarypoint,Wisnonempty.Foreachj2W,thereexistsaconstantMj>0suchthatforallmj>Mj,vj(m)6=0.Thereforeforallj2Wandm>maxj(Mj),smj=mjmj


whichconvergestojj 3{2 3{5 ).ByLemma 1 ,thepoint-to-setmappingTisclosedatnon-stationarypoints.ThusZangwill'stheoremappliesandalllimitpointsarestationarypoints.Inaddition,E(fn)!ES=E(y)forsomey2S.ItmustbeshownthatESiswell-dened.Ifx2S,thenxisalimitpointofthesequence.Hencethereis


asubsequenceffnjgconvergingtox.SinceE(fn)!E(y),itmustbetruethatE(x)=E(y). @fj(f)=0.WiththerequirementthatUisastrictlyconvex(penalty)term,Ewillbeastrictlyconvexfunction[ 10 ].Therefore,givenanysubsetWof[1;J],Ewillremainastrictlyconvexfunctionontheplaneffj=0jj2Wg.HencethereisauniquepointonthisplanethatisaminimizerforE.DenotethispointfW.ThusforanysubsetW,thereisonlyonepointfWwith@E @fj(fW)=0forj=2Wandfj=0forj2W.SincethereareanitenumberofpossiblesubsetsW,thesetofstationarypointsisnite.


5 ,thedistancebetweeniteratesgoestozero.Ostrowskiprovedthatthesetoflimitpointsofsuchasequenceisbothcompactandconnected[ 15 ].Sincethissetisnite,itmustconsistofasinglepoint.Denotethisintensitybyf. @fj(f)=0issatised,sotherstoptimalityconditionmustbeviolated.Therefore,anindexjcanbefoundsuchthatfj=0and@E @fj(f)<0andhencethereisanNsuchthat@E @fj(fn)<0foralln>N.Byconstruction,rj(fn)andsj(fn)havethesamesignandfnj>0.Thusforn>N,fn+1j=fnj+snjvj(fn)=fnjsnjfnjrj(fn)@E @fj(fn)>fnj Ifthepenaltyfunctionlacksconvexity(asinthegeneralizedLalush-Tsuipenalty( 1{7 ),theproofofTheorem 4 willnotapplyasitrequiresconvexity.Thus,aswiththemodiedRMalgorithmdiscussedinthefollowingchapter,theonlyresultthatcanbeprovedisthatallofthelimitpointsgeneratedfromthesequencewillbeminimizers. 10 ],allentriesinthestep-sizevectorarethesamepositivevalue.Inthismethod,conditions( 3{2 3{5 )aresatisedbyassumingthatrj(f)ispositiveforall


emissionintensitiesfandvoxelsj.Withthisrestriction,theobjectivefunctionwillbedecreasingforsmallpositivevaluesofthestep-sizes.However,inordertoensurethepositivityofr(f),thestrengthofthepenaltytermmayneedtobedecreased,whichalterstheobjectivefunctionneededtobeminimized.Therevisedalgorithmpresentedbelowhasbeendesignedtoremovethisstrongconditiononthepenaltytermbyusingadierentmethodtochoosethestep-size,whilestillprovidingfortheconvergenceoftheiterates. @fj(fn)willbeniteforeachj,r(fn)canneverbezero.Therefore,D+(fn)andD(fn)willpartitionthesetofvoxels.Thesesetswilldeterminewhichvalueofthestep-sizewillbeappliedtotheappropriatevoxel.Denetheequations Thecurrentvalueoftheobjectivefunctionatstepnisgivenby(0;0)=E(fn).SincethegoalistodecreasethevalueofE,asearchismadeforanewupdateinthedirectionofthenegativegradient.ThiswillgivethedirectionofthesteepestdescentofE.Ifasinglevalueisusedforourconstant(asinLange'smethod[ 10 ]ortheMair-Gillandmethod[ 14 ]),itcorrespondstousing(t,t)asthesearchdirectionin( 3{8 ).Thismaynotbethedirectionofthegreatestdecrease.Inordertoachievethegreatestdecreaseintheobjectivefunction,thevector(t1;t2)isassumedtobeinthedirectionofthenegativegradient.Therefore,dene(t1;t2)=sr(0;0)for


somes0tobechosenlater.Calculatingthetwopartialderivativesyields:@ @t1(0;0)=Xj2Dn+@E @fj(fn)vj(fn)=Xj2Dn+fnjrj(fn)(@E @fj)2(fn)@ @t2(0;0)=Xj2Dn@E @fj(fn)vj(fn)=Xj2Dnfnjrj(fn)(@E @fj)2(fn): @fj)2(f) (3{10) @fj)2(f) itcanbeseenthat(t1;t2)=s(+(fn);(fn)).Since(+;)providesasearchdirectiononly,itcanbenormalizedatthisstagetoproduceaunitvector(+(f);(f))(+andwillnowrefertothenormalizedvector).Thenewiterateisthusformedby: wherej(f)=+(f)ifj2D+(f)andj(f)=(f)ifj2D(f).Theproblemhasthusbeenreducedfroma2-dimensionalsearchfor(t1;t2)toa1-dimensionalsearchfors.DeningP(s)=E(fn+s(fn)v(fn)),yieldsP0(0)=rE(fn+s(fn)v(fn))d ds(fn+s(fn)v(fn))js=0=Xj(@E @fj)2(fn)j(fn)fnjrj(fn)0: 3{2 3{5 ).


@fj(fn)=fnj(1snj(fn)rj(fn)@E @fj(fn)): @fj(fn)j)1;(3{12) itcanbeseenthatifsn

ruleisaninexactsearchalgorithm,hereusedtoapproximatewhereP0(s)=0.ThevaluereturnedbytheArmijorulewillbeaconstantsthatwilldecreasetheobjectivefunctionasrequired.ToapplytheArmijorule,thesearchbeginswithaninitialguess(heretakentobetheminimumofsmaxandK).Theinitialvalueisdecrementedthisbyaconstantfactor(takentobe1/3)untilthefollowingrelationshipholds(isapositiveconstantchosentobecloseto0):P(0)+sP0(0)P(s): Figure3{1: Armijorule smallercomputationtimesthanthebisectionmethod.TheformerrequiresthecalculationofthederivativeonlyattheoriginandthereaftercalculatesvaluesofP(s)togivearoughapproximationtothesolution.Eventhoughitdoesnotgivethebestvalueforthestep-sizecoecient,itwasseenthatthenumberofiterationsneededtoreachastoppingpointdidnotsignicantlyincreasewhentheArmijoRulewasapplied.Hence,thesavingsincomputationtimecausedittobeusedinthereconstructions. Sincethestep-sizecoecientsnsatisesP0(sn)<0,theobjectivefunctiondecreasesasneeded.Theupdateisthenformedbyfn+1=fn+sn(fn)v(fn)


andtheiterationproceeds.Itremainstobeshownthatthischoiceofstep-sizevectorwillsatisfytherequiredconditionsforconvergence.Byconstruction,(sj(f))0.Ifthesecondconditionisnotmet,fjand@E @fj(f)arenonzeroforsomej.Iffjj@E @fj(f)j,pgd(f)j@E @fj(f)j>0.Iftheotherinequalityholds,thenpgd(f)fj0.ThusiftheKuhn-Tuckerconditionsdonothold,theprojectedgradientfunctionisnonzero.NowassumetheKuhn-Tuckerconditionsaremet.Foreachj,@E @fj(f)>0andatleastoneoffjand@E @fj(f)arezero.Iffj@E @fj(f),then@E @fj(f)=0andjmax(fj@E @fj(f);0)fjj=@E @fj(f)=0.If@E @fj(f)fj,thenfj=0andjmax(fj@E @fj(f);0)fjj=fj=0. Givensomechoiceofsmall>0,thealgorithmcanbeterminatedwhenpgd(f)<.


4.1.1RMAlgorithm 13 ][ 2 ][ 14 ],canbeusedtoreconstructthisseries.However,itsconvergencetoaminimizerhasnotbeenproven.Amodicationoftheirworkallowsforconvergenceofanalteredobjectivefunction.Asintheirwork,thealgorithmwillbedescribedinitiallyinthecontinuouscase,thendiscretizedtoproducetheiteration.Inthisreconstruction,atotalofKtime-lapsedscans(orframes)areanalyzed.Inthecontinuouscase,theframenumberwillbedenotedinthesubscript,fk(z).Inthediscretecase,theframenumberwillbetherstsubscript,fk;jreferringtothejthvoxelinframek.Alongwiththeunknownemissionintensitiesineachframek,fk(w),themotionvector,mk(z)fromframektoframek+1isalsounknown.Thismotionvectordescribesthemovementofeachpointzinframek,aszmovestoz+mk(z)inframek+1.Inthecontinuouscase,thenegativelog-likelihoodofthereconstructionisgivenby wherePfk(i)isthecontinuousprojectionoperatorgivenbyPfk(i)=Rpi(z)fk(z)dzandpi(z)istheprobabilitythatanemissionatwwillberegisteredindetectori


(comparewiththediscreteprojectionoperatorgivenbyPfi=Pipi;jfj).ThepenaltytermUnowmeasureshowwellthemotiontermmatchesuppointsbetweenframes.Ideally,fk(z)=fk+1(z+mk(z))butitistobeexpectedthatthiswillnotoccurwithrealdata.Hencetheimage-matchingtermbetweenframeskandk+1isgivenby andthepenaltytermbecomesthesumoftheseindividualterms. Inaddition,theobjectivefunctionincludesathirdtermES=PKk=1ESkmodelingtheelasticstrainofthematerial.ESk(mk)=1=2Z(z)(uk;x+vk;y+wk;z)2dz+Z(z)(u2k;x+v2k;y+w2k;z)dz+1=2Z(z)[(uk;y+vk;x)2+(uk;x+wk;x)2+(vk;z+wk;y)2]dz Thusafteraddingconstantstobalancethecontributionofeachterm,thecontinuousobjectivefunctionbecomes ThereconstructionproblemthusbecomesaminimizationofE(f;m)withtherestrictionthattheintensitiesineachframeremainpositive. AteachiterationoftheRMAlgorithm(theouterloop),twoprocessesareruninsuccession-theR-StepandtheM-Step.IntheR-Step(aninnerloop),the


previousmotioniterateisusedtoimprovethereconstructionoftheframes.SinceESdoesnotdependonf,theR-StepisappliedonlytoL(f)+U(f;m)withthemotionvectorheldconstant.APMLreconstructionmethodcanbeapplied(suchasthenon-uniformstep-sizemethoddescribedChapter3)todecreasethistermandproducethenextiterate.IntheM-Step(aninnerloop),themotionvectorisupdatedusingtheframesgeneratedfromtheR-Step.Inthisstep,theconjugategradientalgorithmisappliedtodecreaseU(f;m)+ES(m)withfnowheldconstant. 4{3 )isgivenby:


whereNk;iisthe8-voxelcubeformedbymk;iandtheweights,ck;i;j,aregivenbyTable 4{1 .Thisformfortheimage-matchingtermiscurrentlyimplementedinthe voxelvoxelcenterweight RMvoxelneighborhood objectivefunctionusedintheRMAlgorithm.However,itisdiculttoperformtheconjugategradient(M-Step)onthisterm.ithasprovendiculttoimplementtheM-SteponthethistermsincetheneighborhoodsystemNk;idependsonthemotionvector.Tosolvethisproblem,theM-Stephasinsteadbeenperformedonalinearizationoftheimage-matchingterm[ 14 ]givenbyQk(m)=Zfk(w)fk+1(w)(mk(w)~mk(w))rfk+1(w+~mk(w)dw


function,denetheterm ~U=KXk=1~Uk(fk;fk+1;mk)~Uk=Z(fk(z)fk+1(z)mk(z))rfk+1(z))2dz andupdatetheobjectivefunctiontobecome ~E(f;m)=L(f)+~U(f;m)+ES(m):(4{7) Inaddition,linearizingtheimage-matchingtermgainsconvexityintheobjectivefunction(theRMimage-matchingtermisnotconvexinm).However,thislinearizationisvalidonlyforsmalldisplacements.Ifthemotionbetweentheframesislarge,thereconstructionmaynotbevalid.Usingthismodiedobjectivefunction,thetwoinnerloopsub-processescannowbedescribedindetail. 3{1 )canbeutilized.Thus,thecontinuousR-Stepupdateforeachframekisgivenby (4{8) =fnksfnkrk(fn)DkE(f;m)


withrk(f)(z)=(Pipi(z)+DkU(f;m))1.AsderivedintheAppendixof[ 14 ],thederivativesDkE(takenwithrespecttofk)aregivenbyDkE(f;m)=Xi(pi(z)yk(i)pi(z) dtE(f1(z);f2(z)+th(z);m1(z))jt=0(4{9) toobtainD2~U1(f1;f2;m1)=2Z(f1(z)f2(z)m1(z)rf2(z))h(z)dz2Z(f1(z)f2(z)m1(z)rf2(z))(m1(z)rh(w))dz:D2~U1(f1;f2;m1)=ifequation( 4{9 )canbewrittenasR(z)h(z)dz.Asthersttermisalreadyinthisform,onlythesecondneedstobesimplied.ThiscanbedonebyapplyingtheidentityZgmrh(z)=Z(rgm)h(z)=Z(rgm)h(z)+(grm)h(z) toyieldtheequalityforthesecondterm:=2Z((r(f1(z)f2(z)m1(z)rf2(z))m1)h(z)dz+2Z(f1(z)f2(z)m1(z)rf2(z))(rm1)h(z)dz:


Distributingtherintherstintegralandaddingtheresultstothersttermyieldsthedesiredderivative. Afterdiscretizingthevoxels(andapproximatingderivativeswithcentraldierences),( 4{8 )reducestothediscreteiteration: wherevk;j=fk;jrk;j(f)@~E @fk;j(f)andrk;j(f)=(Pipk;i;j+Dk~U(f;m))1.Asbefore,thestep-sizesnvectormustbechosensothat( 3{2 3{5 )aresatised.ThemethoddescribedinChapter3canbeusedtondsuchavector. OncetheemissionintensityhasbeenupdatedbytheR-Step,theM-Stepisruntoupdatethemotionvector.Thisstepconsistsofminimizing~U(f;m)+ES(m)overallpossiblemotionvectorsm.Theconjugategradientalgorithm(usingthePolak-Ribierevariant)isthenappliedtothediscretizedvoxelstoupdatethemotionvector[ 14 ]. Inpractice,ateachiterationoftheRMalgorithm(theouterloop),oneiterationeachoftheR-StepandM-Stepoccurs(theinnerloop).Continuerunningtheouterloopunderastoppingcriteriahasbeenreached,suchastheprojectedgradient( 3{13 )fallingunderachosencriticalvalue. 4 doesnotapplyinthissituation(unlessasinglemotionvectorisxed).Hence,convergencetoasingleminimizerisnotguaranteed. 4{10 ),theneachlimitpointofthesequenceisaminimizerfortheobjectivefunction~Eand~E(fn)converges.


Asbefore,Zangwill'sConvergenceTheoremisapplied.Thedenitionofthepoint-to-setfunctionTisexpandedtothefollowing: whereTisthepreviouspoint-to-setfunctiondenedin( 3{7 )(witharangeofpossiblestep-sizevectors)andmCGisthemotionvectoroutputtedfromapplyingtheconjugategradientalgorithmto~U(g;m)+ES(m).Let~EbethelinearizedobjectivefunctionanddenethesetofstationarypointsStobethesetofallpoints(f;m)withv(f)=0.Iff2S,thenT(f)=ffgandtheR-StepwillnotchangethevalueofL+~U.TheM-Step(whichdoesnotaectthereconstructedimagef)hasbeendenedtodecrease~U+ES.Thus~E(T1(f;m))~E(f;m)iff2S.Iff=2S,thentheR-StepwilldecreaseL+~Uasdesired.Followingthis,theM-Stepwilldecreasethevalueof~U+ES.Hence~E(T1(f;m))<~E(f;m)iff=2S.TheclosureconditionisprovedbynotingthatT1canbesplitintoitstwoparts,thusallowingthecriticalclosureconditiononftobeprovedasinLemma 1 ThusbyZangwill,alllimitpointsofthesequencef(fn;mn)glieinSand~E(fn;mn)willmonotonicallyconvergeto~E(y;m)forsome(y;m)2S. Nowtoprovealllimitpointsareminimizers,letybealimitpointforthealgorithm.Considerthesubsequencefnjwhereforeachj,jjfnjyjj<1=j(thedenitionofalimitpointgivestheexistenceofsuchpoints).NowasinTheorem 4 ,assumeydoesnotmeettherstK-Tconditionforaminimizer(ByZangwill,y2S,hencethesecondK-TconditionminimizerofEismet).Therefore,anindexkcanbefoundsuchthatyk=0and@E @fk(y)<0andhencethereisanJsuchthat@E @fj(fnj)<0forallj>J.Byconstruction,rk(fnj)andskhavethesamesignandfnjk>0.Thusforj>J,fnj+1k=fnjk+snjkvk(fnj)=fnjksnjkfnjkrk(fnj)@E @fk(fnj)>fnjk


and f^{n_j}_k does not converge to y_k = 0, contrary to the assumption. Therefore, both K-T conditions hold at the limit point.

If the modified M-Step is terminated after iteration K (locking in a specific value for the motion vector) and the modified R-Step is used exclusively afterwards, the convergence proof for penalized maximum likelihood presented in Chapter 3 suffices to give convergence for the modified RM algorithm.


Figure 5–1: Phantom chest source image

probability matrix P using (1–2). The chest emission was then projected into the detector space using y_i = Σ_j p_{i,j} λ_j. Finally, random Poisson noise was added to the


data. The PML objective function is formed by the sum E(f) = L(f) + U(f). The corresponding neighborhood system N_i is formed by the 8 closest voxels, with the diagonal elements weighted as in (1–3). The strength of the penalty term, β, ranged over a set of values to provide a comparison. The reconstructions were formed by the non-uniform step-size method (3–1). In separate trials, the search for the step-size coefficient, s, was performed with the Armijo Rule and with the bisection method. For each value of β, the algorithm was run until the projected gradient (3–13) fell under 10⁻². The resulting reconstructions were then compared to the source image with a root-mean-square calculation,

    RMS = √( Σ_j (f_j − λ_j)² ),

to determine optimal values to use for the penalty strength.

As a comparison, the same phantom scanner data was used in the APML algorithm (2–7). The reconstructions output from APML were visually identical to the reconstructions obtained using the non-uniform step-size algorithm. In all trials, the objective function decreased rapidly, then leveled off. Image quality did not improve significantly after this leveling off of the objective function. As can be seen in Figure 5–2, initially the objective function decreases faster with the non-uniform step method than with APML, but eventually both fall to the same level. The computation times for APML were approximately the same as the computation time using the non-uniform


Figure 5–2: Comparing the decrease in the objective function

step-size method with the Armijo Rule. The APML algorithm was able to reduce the projected gradient factor in fewer iterations than the non-uniform step-size method.

Table 5–1 gives both the RMS error and the iterations required for algorithm termination, occurring when the projected gradient fell below 10⁻².

Table 5–1: PML reconstruction data

    β       Line search   Iterations   CPU time   RMS error
    0.015   Bisection     219          475        49.5
    0.020   Bisection     227          492        46.5
    0.025   Bisection     235          631        45.5
    0.030   Bisection     183          530        45.4
    0.015   Armijo        220          118        49.6
    0.020   Armijo        228          125        46.4
    0.025   Armijo        268          129        45.5
    0.030   Armijo        229          100        45.4

In Figure 5–3, the RMS error is plotted against the penalty parameter. Reconstruction using the non-uniform step-size algorithm (applying either bisection or Armijo in the line search produced an identical RMS error) produced a smaller RMS error than reconstruction using the APML method. Thus, these methods were able to produce a reconstruction closer to the source phantom image. Due to its inexact search, the Armijo rule reduced the projected gradient


Figure 5–3: Comparing the root-mean-square (RMS) error

criterion in a significantly shorter CPU time than the bisection method (although the iteration counts were approximately the same). The RMS plot in Figure 5–3 shows that a penalty parameter taken from [0.03, 0.04] resulted in the closest reconstruction to the original phantom.

The reconstructions seen in Figure 5–4 were formed by applying the bisection method. Although the iteration time was longer than with the Armijo method, the bisection method gave a visually better image at higher values of the penalty parameter, β. The next set of reconstructions, found in Figure 5–5, were formed by applying the Armijo Rule. In this case, as β increased, the Armijo rule did not perform as well in high-intensity regions. This can be observed in the heart region of the later reconstructions. Figure 5–6 displays a plot formed by a vertical line through the heart region. In this region, the exact bisection rule was able to produce a smoother image in the high-intensity heart region than the inexact Armijo rule. Note that both methods produced similar intensities except in the heart region. With further iterations, the non-smooth intensity in the heart region was reduced (using a smaller value for the projected gradient cutoff would accomplish this). A trade-off has thus been made between reconstruction speed (the Armijo rule) and accuracy in the step-size choice (bisection). One possible solution


Figure 5–4: Phantom image reconstruction using the bisection rule. Panels: (b) β = 0.025, (c) β = 0.030, (d) β = 0.035.
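The phantom-data generation described above, forward-projecting the source with y_i = Σ_j p_{i,j} λ_j, adding Poisson noise, and scoring reconstructions by the root-mean-square error, can be sketched as follows. The 6×4 system matrix and source vector here are made-up toy values, not the chest phantom of Figure 5–1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the probability matrix P and the source phantom lambda.
P = rng.random((6, 4))           # p[i, j]: prob. voxel j is detected in tube i
lam = np.array([4.0, 0.5, 3.0, 1.0])

# Noiseless projection y_i = sum_j p_{i,j} lambda_j, then Poisson noise.
y_bar = P @ lam
y = rng.poisson(y_bar)

def rms_error(f, lam):
    """Root-mean-square distance between a reconstruction f and the phantom."""
    return np.sqrt(np.sum((f - lam) ** 2))

perfect = rms_error(lam, lam)    # a perfect reconstruction scores 0
```

A reconstruction algorithm would then be handed only P and the noisy counts y, and its output compared to lam via rms_error, as done for Table 5–1.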


Figure 5–5: Phantom image reconstruction using the Armijo rule. Panels: (b) β = 0.025, (c) β = 0.030, (d) β = 0.035.

Figure 5–6: Line plot of heart region
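The Armijo rule used in the line searches above is an inexact backtracking search: it accepts the first shrunken step that achieves a sufficient decrease in the objective. A generic sketch follows; the constants sigma and shrink are typical textbook choices, assumed here rather than taken from the experiments.

```python
import numpy as np

def armijo_step(E, grad, f, v, s0=1.0, sigma=1e-4, shrink=0.5, max_backtracks=30):
    """Backtracking Armijo search along direction v from trial step s0.

    Accepts the first s with E(f + s*v) <= E(f) + sigma * s * <grad(f), v>.
    """
    g_dot_v = float(np.dot(grad(f), v))
    s = s0
    for _ in range(max_backtracks):
        if E(f + s * v) <= E(f) + sigma * s * g_dot_v:
            return s
        s *= shrink
    return s

# Toy quadratic: E(f) = 0.5*||f||^2, steepest-descent direction v = -grad(f).
E = lambda f: 0.5 * float(np.dot(f, f))
grad = lambda f: f
f = np.array([2.0, -1.0])
s = armijo_step(E, grad, f, -grad(f))
```

Because the test is only "sufficient decrease," Armijo usually needs far fewer objective evaluations per iteration than an exact search, which matches the much smaller CPU times in the Armijo rows of Table 5–1.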


is to combine these methods: begin with the Armijo method to approximate the solution, then use bisection for precision.

Originally, the new algorithm was designed to remove the restrictions on the penalty function imposed by Lange's algorithm (2–4). Figure 5–7 shows two reconstructions of the chest, the first using the non-uniform step-size method and the second using Lange's algorithm (with the Armijo Rule). In this case, a quadratic penalty was used in place of the log cosh function. With a penalty parameter of β = 0.03, the r(f) function becomes negative (a situation not covered in Lange's algorithm). As can be seen from the reconstructions, the non-uniform step-size method is able to reconstruct the image correctly, but Lange's algorithm (together with the Armijo Rule) produces artifacts.

Figure 5–7: Comparing reconstruction algorithms. Panel (b) shows Lange's method.
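The "exact" bisection rule discussed above, by contrast with Armijo, drives the directional derivative φ'(s) = ⟨∇E(f + s v), v⟩ to zero on a bracketing interval. A minimal sketch on a toy quadratic; the bracket [0, 4] is an assumption for this example, not a value from the dissertation.

```python
import numpy as np

def bisect_step(grad, f, v, lo=0.0, hi=4.0, tol=1e-8, max_iter=100):
    """Bisection on phi'(s) = <grad(f + s*v), v> for the minimizing step.

    Assumes the minimizer is bracketed, i.e. phi'(lo) < 0 < phi'(hi).
    """
    phi_prime = lambda s: float(np.dot(grad(f + s * v), v))
    for _ in range(max_iter):
        if hi - lo < tol:
            break
        mid = 0.5 * (lo + hi)
        if phi_prime(mid) < 0.0:
            lo = mid          # minimizer lies to the right of mid
        else:
            hi = mid          # minimizer lies to the left of mid
    return 0.5 * (lo + hi)

# Toy quadratic E(f) = 0.5*||f||^2: the exact step along v = -f is s = 1.
f = np.array([3.0, -2.0])
s = bisect_step(lambda f: f, f, -f)
```

Each bisection iteration halves the bracket, so the step is located to high precision at the cost of many gradient evaluations, consistent with the larger CPU times in the bisection rows of Table 5–1.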


with the two penalty constants taken to be 0.02 and 0.005, as in previous reconstructions. The linearized penalty term Ũ was used in the reconstruction instead of the virtual voxel method used in the RM algorithm of Mair and Gilland [14]. Although the use of the virtual voxels was seen to produce useful reconstructions, the convergence of the RM objective function was unproven (though it was observed numerically).

In addition, the use of the linearized penalty produced a decreasing objective function, as seen in Figure 5–8. The second plot in Figure 5–8 shows the functions involved in the M-Step, Ũ and E_S. As can be seen, both Ũ and E_S are

Figure 5–8: Modified RM function plots. Panel (b) shows Ũ and E_S.

increasing for each outer loop. Despite this, the M-Step reduces their sum at each pass, as can be seen in the sample RM data found in Table 5–2. Typically, the R-Step increases the value of Ũ while reducing L. Then the M-Step decreases Ũ and increases E_S to give a decrease in the objective function. The reconstructions formed using the modified RM algorithm (with the linearized penalty Ũ) were comparable to those output using the original RM algorithm (using the approximated voxels penalty method (4–1)). Figure 5–9 displays a quiver plot of the reconstructed Slice 20 combined with the motion vector. Figure 5–10 displays a comparison of the methods. The top reconstructions were


Table 5–2: RM data, penalty constants 0.02 and 0.005

    Iteration    L        Ũ       E_S      Objective
    1 R-Step     2866.4   1.326   —        2867.76
    1 M-Step     —        1.262   0.0203   2867.72
    2 R-Step     2728.4   1.678   —        2730.11
    2 M-Step     —        1.673   0.0228   2730.11
    3 R-Step     2569.1   2.341   —        2571.48
    3 M-Step     —        2.232   0.0750   2571.42
    4 R-Step     2374.0   3.367   —        2377.44
    4 M-Step     —        3.315   0.104    2377.42
    5 R-Step     2116.4   5.638   —        2122.11
    5 M-Step     —        5.330   0.277    2121.97

Figure 5–9: Quiver plot of Slice 20


performed with the modified RM algorithm. The lower plots were reconstructed using the original RM algorithm, both using the parameters 0.02 and 0.005. Although the original RM method gives a better estimate of the motion penalty,

Figure 5–10: RM reconstructions. Panels: (b) Frame 2, Slice 20, modified RM; (c) Frame 1, Slice 20, original RM; (d) Frame 2, Slice 20, original RM.

a decrease in the objective function (and hence convergence) remains unproven. The use of the linearized penalty in the modified RM algorithm produces a similar reconstruction, while in addition giving the needed decrease in the objective function.


penalty used in the RM algorithm). Although Lange's algorithm can be employed in larger settings, neither its convergence to a minimizer nor a decrease in the objective function is guaranteed. If the step-size is allowed to become negative [2], a decrease in the objective function and algorithm convergence are once again guaranteed, but the minimizer may not satisfy the first Kuhn-Tucker optimality condition. However, a further extrapolation to a step-size vector will both guarantee convergence to a minimizer and allow for a larger set of penalty functions. Unlike previous methods, the convergence proof of the generalized method allows for the use of inexact, faster line searches, such as the Armijo Rule. The non-uniform step-size algorithm provides one way of constructing a step-size vector that meets the conditions required for convergence. This generalized method can be extended to other image recovery problems, such as multiple image-motion recovery. In this case, the generalized algorithm is incorporated into the modified RM algorithm for reconstruction. Future work in this area will concentrate on refining the algorithm to increase speed without losing image quality.


This Fortran code can be utilized to run the non-uniform step method as described in Chapter 3. Code to calculate the penalty term (and its derivative) is required. The probability matrix p_{i,j} is stored in a sparse matrix format consisting of hrow, hcol and hv, preserving the non-zero entries of p.

The iteration loop consists of the following code:
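Since the Fortran listing itself is not reproduced here, the following Python sketch illustrates what one pass of the loop computes, assuming the sparse triplet format described above (hrow, hcol, hv holding the nonzero p_{i,j}). The Poisson-likelihood gradient formula is the standard one, and penalty_grad stands in for the user-supplied penalty-derivative routine the text says is required.

```python
import numpy as np

def forward_project(hrow, hcol, hv, f, n_tubes):
    """y_i = sum_j p_{i,j} f_j using the sparse triplets (hrow, hcol, hv)."""
    y = np.zeros(n_tubes)
    np.add.at(y, hrow, hv * f[hcol])   # unbuffered scatter-add over tubes
    return y

def pml_gradient(hrow, hcol, hv, f, data, n_tubes, penalty_grad):
    """Gradient of E = L + U for the Poisson log-likelihood term L."""
    y = forward_project(hrow, hcol, hv, f, n_tubes)
    ratio = np.where(y > 0, data / y, 0.0)
    g = np.zeros_like(f)
    # dL/df_j = sum_i p_{i,j} * (1 - data_i / y_i)
    np.add.at(g, hcol, hv * (1.0 - ratio[hrow]))
    return g + penalty_grad(f)

# Toy 2-tube, 2-voxel system with a zero penalty: when the projection of f
# matches the data exactly, the likelihood gradient vanishes.
hrow = np.array([0, 0, 1])
hcol = np.array([0, 1, 1])
hv = np.array([1.0, 1.0, 2.0])
f = np.array([1.0, 1.0])
g = pml_gradient(hrow, hcol, hv, f, data=np.array([2.0, 2.0]), n_tubes=2,
                 penalty_grad=lambda f: np.zeros_like(f))
```

In a full iteration this gradient would feed the direction v_k = −f_k r_k(f) ∂E/∂f_k and a line search from Chapter 3, repeated until the projected gradient falls below the stopping threshold.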




[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms. New York: J. Wiley & Sons, 2nd edition, 1993.

[2] Z. Cao, D. Gilland, B. Mair, and R. Jaszczak. Three-dimensional motion estimation with image reconstruction for gated cardiac ECT. IEEE Trans. Nuclear Science, pages 384–388.

[3] R. B. Carroll. An Orthogonal Series Approach to Positron Emission Tomography. PhD thesis, University of Florida, Gainesville, FL, October 1998.

[4] J. Chang, J. M. M. Anderson, and B. Mair. An accelerated penalized maximum likelihood algorithm for positron emission tomography. IEEE Trans. Image Processing, forthcoming, 2006.

[5] J. Chang, J. M. M. Anderson, and J. Votaw. Regularized image reconstruction algorithms for positron emission tomography. IEEE Trans. Med. Imag., 23(9):1165–1175, 2004.

[6] A. R. De Pierro. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans. Med. Imag., 14(1):132–137, 1995.

[7] S. Geman and D. McClure. Bayesian image analysis: an application to single photon emission tomography. Proc. American Statistical Society, pages 12–18, 1985.

[8] P. Green. On use of the EM algorithm for penalized likelihood estimation. J. R. Statist. Soc., 52(3):443–452, 1990.

[9] D. Lalush and B. Tsui. A generalized Gibbs prior for maximum a posteriori reconstruction in SPECT. Phys. Med. Biol., 38:729–741, 1993.

[10] K. Lange. Convergence of EM image reconstruction algorithms with Gibbs smoothing. IEEE Trans. Med. Imag., 9(4):439–446, 1990.

[11] D. Luenberger. Introduction to Linear and Nonlinear Programming. Reading: Addison-Wesley, 1973.

[12] B. Mair. A mathematical model incorporating the effects of detector width in 2D PET. Inverse Problems, 16:223–224, 2000.


[13] B. Mair, D. Gilland, and Z. Cao. Simultaneous motion estimation and image reconstruction from gated data. Proc. 2002 IEEE Internat. Symp. Biomed. Imaging, pages 661–664, 2002.

[14] B. Mair, D. Gilland, and J. Sun. Estimation of images and nonrigid deformations in gated emission CT. IEEE Trans. Med. Imag., 25(9):1130–1144, 2006.

[15] A. M. Ostrowski. Solutions of Equations in Euclidean and Banach Spaces. New York: Academic Press, 3rd edition, 1973.

[16] L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imag., 1:113–122, 1982.


Jeff Zahnen was born on August 7, 1974, in Chicago, Illinois. He received his Bachelor of Science degree in mathematics, with a minor in history, from the University of Florida in 1996. Following this, he received his master's degree in 1998. In 2000, he began to study medical imaging techniques under the supervision of Dr. Bernard Mair. In 2006, he completed his Ph.D. work on emission tomography methods. During this time, he also served as the coach of the University of Florida College Bowl Team.