UFDC Home  myUFDC Home  Help 



Full Text  
PENALIZED MAXIMUM LIKELIHOOD RECONSTRUCTION METHODS FOR EMISSION TOMOGRAPHY By JEFFREY A. ZAHNEN A DISSERATATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2006 Copyright 2006 by Jeffrey A. Zahnen This dissertation is dedicated to my grandmother, Lena Zahnen, for ahiv believing in me. ACKNOWLEDGMENTS For his patience and advice through the years, I offer my greatest thanks to my advisor, Dr. Bernard Mair. In addition, I would like to thank Dr. David Gilland, Dr. William Hager, Dr. James Keesling, Dr. Scott McCullough and Dr. David Wilson for taking the time to serve on my committee and providing sl... 1 i..o:, to improve the work. Finally, I would like to thank my parents for their unwavering support of my studies. TABLE OF CONTENTS page ACKNOWLEDGMENTS ................... ...... iv LIST OF TABLES ...................... ......... vii LIST OF FIGURES ................... ......... viii ABSTRACT ....................... ........... ix CHAPTER 1 MATHEMATICS OF POSITRON EMISSION TOMOGRAPHY (PET) 1 1.1 Introduction To PET ......... ................. 1 1.2 Mathematical Model for PET ....... ............. 2 1.3 PET Probability Function ..... .................. 3 1.4 Reconstruction Techniques ........ .............. 4 1.4.1 Maximum Likelihood ...... ... ......... .. 4 1.4.2 Penalized Maximum Likelihood (PML) ...... ... ... 6 1.5 Minimizing Function ........ .................. 9 2 PML RECONSTRUCTION METHODS .................. 10 2.1 Green's One Step Late Algorithm ...... ......... 10 2.1.1 Lange's Improvement to Green's OSL ..... ... .... 11 2.1.2 MairGilland Method ....... ......... .... 12 2.2 Accelerated Penalized Maximum Likelihood Method (APML) .12 3 GENERAL PML ALGORITHM .................. ..... .15 3.1 General Algorithm Statement .................. .. 15 3.1.1 General Algorithm Update .................. .. 15 3.1.2 Penalty function criteria ............. .. .. .16 3.2 Proof of Convergence .................. ....... .. 17 3.3 New PML Algorithm .................. ....... .. 21 3.3.1 NonUniform StepSize Method ............... .. 22 3.3.2 Line Search Methods .................. ..... 24 3.3.3 Stopping Criteria .................. ... .. 26 4 MULTIPLE IMAGEMOTION RECONSTRUCTION . ... 27 4.1 Algorithm Desription .................. ....... .. 27 4.1.1 RM Algorithm ...... 4.1.2 ImageMatching Penalty 4.1.3 Modified RM Iteration 4.2 Convergence Results ...... 5 RECONSTRUCTIONS. .................. ........ 36 5.1 PML Reconstructions .................. ....... 36 5.2 Modified RM Reconstructions .................. .. 42 5.3 Conclusions .................. ............ 45 APPENDIX: SAMPLE ALGORITHM FORTRAN CODE . .47 REFERENCES .............. ................... 53 BIOGRAPHICAL SKETCH .................. ......... 54 .................... .................... .................... .................... LIST OF TABLES Table page 41 RM voxel neighborhood .................. ........ .. 30 51 PML reconstruction data .................. ........ .. 38 52 RM data a = 0.02, 0 = 0.005 .................. ... .. 44 LIST OF FIGURES Figure page 11 A 2D PET tube registering an emission ............. 1 12 Tube regions ............... ............ .. 3 13 Sample neighborhood system with associated weights . . .. 7 14 Derivatives of the penalty terms ................ ..... 8 31 Armijo rule .................. ............... .. 25 51 Phantom chest source image .................. ..... .. 36 52 Comparing the decrease in the objective function . ..... 38 53 Comparing the rootmeansquare (RMS) error . . 39 54 Phantom image reconstruction using bisection rule . . 40 55 Phantom image reconstruction using Armijo rule . 41 56 Line plot of heart region ............... ....... .. 41 57 Comparing reconstruction algorithms ............. .. .. 42 58 Modified RM function plots ................ .... .... 43 59 Quiver plot of Slice 20 ............... ........ .. 44 510 RM reconstructions ............... .......... .. 45 Abstract of Disseratation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy PENALIZED MAXIMUM LIKELIHOOD RECONSTRUCTION METHODS FOR EMISSION TOMOGRAPHY By Jeffrey A. Zahnen December 2006 C'!h In: Bernard Mair Major Department: Mathematics Positron emission tomography (or PET) has been utilized for years in medical diagnosis. The PET reconstruction problem consists of recovering the image from the data gathered by the detectors in the machine. Penalized maximum likelihood techniques allow for the recovery of the image with a smoothing term to improve the image. Previous methods using this technique forced the use of a weak penalty term in order to gain convergence. Using a new method, this restriction is removed while retaining convergence. In addition, the method is extended to the problem of recovering multiple PET images along with a vector describing the motion between the images. CHAPTER 1 MATHEMATICS OF POSITRON EMISSION TOMOGRAPHY (PET) 1.1 Introduction To PET Emission tomography has been used for years to study metabolic processes in the body. In this procedure, a patient is injected with a shortlived radioactive isotope. While the substance is being absorbed by the tissues in the body, the patient is placed in a scanner. As the isotope decays, it begins to emit positrons. These emitted positrons travel only a short distance before interacting with a nearby electron. In positron emission tomography (PET), the subsequent annihilation produces a pair of photons traveling in opposite directions on a line oriented in a uniformly random direction. A PET scanner consists of individual detector crystals arranged around the patient in a cylindrical manner. Each pair of detectors forms a virtual detection tube. Figure 11 di 1ph, a tube formed by detectors a and 3. A nearsimultaneous arrival of two photons at opposite ends of a o j/ Figure 11: A 2D PET tube registering an emission the tube is then registered as a single detection. The tube shown in Figure 11 will register the dipl i '1, emission at point B but not the emission at point A. In single photon emission computed tomography (SPECT), a single photon is emitted. In this method (more widely applied than PET due to the relative costs of the scanning machines), the detectors are located only on one side of the patient and a single detection results in a count. This scan is performed for different angles between the patient and detector to provide a full image. In both procedures, the aim is to determine the number of emissions from the various locations in the patient. The resulting intensity images are used to make medical diagnoses. This work will consider the problem of PET image reconstruction using the maximum likelihood (\I b) method with the addition of a penalty term. Although developed for PET reconstruction, the method can be extended for SPECT reconstruction. Unlike previous methods, the revised algorithm presented uses a more general penalty term, allowing for a wider range of settings. Proof of convergence and reconstruction results are obtained using this method. 1.2 Mathematical Model for PET The 2dimensional PET scanner consists of a circle of detectors surrounding the region of interest, Q, which can be taken to be rectangular. There are I virtual detector tubes, each one formed by two of the detectors composing the scanner ring. f is partitioned into J smallersized rectangles, called voxels. An emission in voxel j is detected by tube i if the emitted positrons are detected simultaneously in the two detectors forming the tube. Shepp and Vardi [16] introduced a mathematical model for PET in 1982. In the model, the random emissions from voxel j are modeled by a Poisson process with mean fj. Thus the detector count in tube i, Yi, will also be governed by a Poisson process with mean Ai = pj fj (11) where pij is the geometric probability that an emission in voxel j is registered by detector tube i. Using the properties of the Poisson distribution, the probability that there are yi counts registered in detector i is given by eCx i y y'! To recover the emission intensity, it is necessary to estimate f = [f, f2, ..., fJ] based on a random sample from Y = [Y, Y2, ..., Y]. 1.3 PET Probability Function The probability function pij for the 2dimensional PET scanner can be defined using the geometry of the scanner. Given any tube i formed by two detectors, it can be divided into six regions as shown in Figure 12. The voxels B A SR22, F3 \R /AA Figure 12: Tube regions are assumed to be so small that the probability can be assumed approximately constant on the entire voxel. If z is the center of voxel j, then pij is defined by the probability that the path of a pair of photons emitted from the center of the voxel, z, intersects both detectors defining the tube. This probability is proportional to the angleofview of the tube from z. Thus, if CzD is defined to be the acute angle formed by the lines passing through C and z and D and z, the following definition results [12]. CzD if z e R1 7 BzC if z R2 7TPi, AzB if z e R3 (12) 7 DzA if z e R4 0 if z E R22 or 44 1.4 Reconstruction Techniques The reconstruction problem is to recover the emission intensity f given the scanner data y, assumed to be Poissondistributed. Several methods have been used to accomplish this reconstruction. If the detectors are treated as points, the model reduces to line integrals along detector lines. Filteredback projection (FBP) is a popular method utilized to recover the emission intensity, as it provides a quick way to invert the Radon transform. However, it has been seen to produce images with streaking artifacts [16]. In 1998, Carroll [3] decomposed the data, probability matrix and intensity functions into basis functions, transforming the problem into an infinite set of linear equations to solve. Our algorithm though is based on a third method, Maximum likelihood estimation (! I). 1.4.1 Maximum Likelihood In the ML method, a search is made for the emission intensity f that maximizes the probability of obtaining the Poissondistributed data. Each detector count yi is assumed to be an independent sample from a Poisson distribution with a mean corresponding to i = Pf i = j pijfj. Thus the probability that there are 5 yi counts in detector i is given by P(Y,= Uyt=A) (13) The detector counts are independent random variables, so these probabilities are multiplied together to yield the probability of obtaining the data from an emission intensity map of f. An ML estimate is an intensity f that maximizes the following probability. i Y In addition, the intensity map, f is constrained to be nonnegative ({f lf > 0}). An equivalent problem is to apply the negative logarithm and perform a minimization of the term (Pf, y log Pf, log yi!). As yi! does not depend on the unknown intensity f, it can be safety dropped from the term resulting in the ML objective function. L(f) = Pf, y, logPf, (1 4) The ExpectationMaximization (or EM) algorithm developed by Shepp and Vardi [16] can then applied to find the minimizer of this term, resulting in the EMML algorithm. f7+ fni If Z iIf (1 5) Z Pi,j where f" refers to the nth step in the iteration. Alternately, the EMML iteration can be expressed as a descent algorithm. fn+1 fn fin6 f )/VL,. JY i ] *3Qf ' Restated in matrix form yields the equation f"+l f" aD(f")VL(f"). where a = 1 is the stepsize and D is the diagonal matrix whose diagonal entries consist of dj,j = fj/ ipj. While the EM algorithm has been shown to monotonically decrease the objective function and be globally convergent, a wellknown deficiency is a tendency to produce noisy images if the iteration is run too long [8]. 1.4.2 Penalized Maximum Likelihood (PML) To help reduce the noise, a smoothing term can be added to the likelihood term resulting in the penalized maximum likelihood (PML) method. Equivalently, this method can be considered as an application of MAP (maximum aposteriori), in that the recovered image is assumed to have a known local Gibbs probability distribution. This method, used by Geman and McClure [7], is employ, 1l by first choosing a penalty function Q and then forming the term U(f) E '(f fJ) (16) i j where w is a weighting factor that associates pixels. This can be used to link neighboring pixels, forcing them to have similar intensities. For example, Ni, can be defined on each voxel i to be the set of pixels in a neighborhood lattice of that voxel. Thus the weight :.' i 0 if j N(i). If j N(i), the weight is defined to be a nonzero constant. One possibility is to let Ni consist of the eight neighboring voxels as shown in Figure 13 with associated weights. In addition, a constant value, 7 is multiplied to the penalty term to control the influence of the penalty in the reconstruction. As 7 0, the reconstructions approach those of the EM algorithm. However, if 7 is chosen to be too large, the images will be oversmoothed. Figure 13: Sample neighborhood system with associated weights There exist many choices for the penalty function 0(r). A frequently used choice is a penalty of the form 0(r) = log cosh [8],where 6 is a previously chosen constant that allows for some finetuning of the penalty. This constant determines how the prior is scaled with respect to r, as the log cosh penalty will smooth all differences above a certain cutoff value, determined by 6. As 6 + o, the method will approach results of the EM Algorithm (as no smoothing will occur). This penalty is useful as it approximates quadratic smoothing for small values of r, but its derivative remains bounded. In 1993, Lalush and Tsui [9] developed a new penalty function that allowed for even greater control over the penalized range of intensities. Noting that current PML methods did not require the penalty function to be calculated, they instead built the function from properties of its derivative. Thus, they produced S r [1 + 2( )2]2 (17) Or Irl 6 r where a and 6 are constants that affect the spread and location respectively of peak of the derivative. As a increases, the peak becomes more narrow and as 6 1/sqrt(2) 1 1/sqrt(2) 1 0 1 1/sqrt(2) 1 1/sqrt(2) 8 increases, the peak away from the origin. The plot of the derivative shows the in i i iiiy of the smoothing occurring in the region under the peak of the derivative [9]. Thus altering the values of a and 6 will affect the range of smoothing in the image. By comparing this function to the derivative of the logcosh function, '(r) = (1/6) tanh r it can be seen that while the logcosh function does not have a peaked derivative 002 09 S0018 08 0016 0016 07 0014 06 0012 05 001 04 0008 03 0006 02 0004 01 0002 0 0 0 20 40 60 80 100 20 2 40 60 80 100 (a) Derivative of the LalushTsui penalty (b) Derivative of the logcosh penalty Figure 14: Derivatives of the penalty terms (it increases to an .ivmptotic limit), it can be considered as an infinitelywide peak. Thus, Green's logcosh penalty can be approximated as a LalushTsui penalty by taking a close to zero (producing a near infinitelywide peak). Integrating the LalushTsui penalty term results in their penalty function: 6 T2 2a4 ( = log[1 2a2(1 )+ a2(6 + r4 2r2(1 2a2)]. 2a 62 62 While Green's original OSL algorithm does not require the calculation of this term, other algorithms do require the frequent calculation of the objective function. In that case, the LalushTsui penalty will have a larger computation time than other penalties. In addition, the general LalushTsui penalty does not have the strict convexity condition needed for convergence to a unique minimizer. 1.5 Minimizing Function Given a choice of penalty term U and constant 7, the PML objective function thus becomes E(f) = Pf, y logPf, + TU(f) (1 8) In Maximum Likelihood, this objective function must be minimized over the set of positive intensity values {f fj > 0}. This minimum must satisfy the KuhnTucker optimality conditions for the problem. In this situation, those two conditions are that for all j OE (f) > 0 (19) afj aE fj (f)= 0 t9afj It will be shown that the revised algorithm presented in C'i lpter 3 will meet these conditions at convergence. CHAPTER 2 PML RECONSTRUCTION METHODS 2.1 Green's One Step Late Algorithm Applying the Maximum Likelihood method, a search is made to minimize the penalized objective function: E(f) (Pf, ylogPfi) + TU(f) i Differentiating the objective function and replacing it in the 2nd KuhnTucker optimality condition (19) yields the condition that for all j: hfy [ (pi, ) + 7 a (f)] 0. Regrouping terms in this sum produces i 9U f ii Pfi Now define rj(f) (EPj + 7 (f)) a i 'afj With this definition, it can be seen that the second KuhnTucker condition is equivalent to fj fjri(f) Y YiPi,j (21) i Pfi This form for the optimality condition is useful as it reformulates the equation into a fixed point condition. In 1990, Green developed the OneStepLate (OSL) algorithm [8] to apply the EM algorithm to the Penalized Maximum Likelihood problem. The OneStepLate name comes from the choice to evaluate the penalty at the current iterate rather than the (unknown) update, as it allowed the EM algorithm to be applied to the PML problem. In Green's OSL, the update to iterate f" is given by f~+ = frj (ff) 7 ) iPij i i Using the fact that 1f(f) E pj y U (f), the update can be expressed in the alternative form: f"+l fn + v(fn) (22) where aE vj(f) fjrhrj (f). (23) Thus if Green's OSL algorithm converges, it will satisfy the same fixed point condition as the second KuhnTucker condition. Note that if 7 = 0, Green's OSL algorithm reduces to the EMML Algorithm (15). However, Green's OSL algorithm does not guarantee positivity of the iterates or convergence. 2.1.1 Lange's Improvement to Green's OSL In 1990, Lange showed that altering Green's algorithm to include a line search could enforce convergence and positivity [10]. In Lange's algorithm, rj(f") is required to be be positive for all j. To meet this condition, the penalty parameter 7 be must be chosen small enough to keep rj(fn) > 0 for all possible iterates f" and voxels j. In the method, Green's OSL update is altered to include a stepsize parameter s (taking s=1 reduces the method to the original OSL method). Thus the update becomes f"+l f" + s"v(f") (24) where v is defined as before. The stepsize s" is chosen to be a positive value that gives both positivity of the new iterate (taking s" < 1 enforces this condition) and a decrease in the objective function E. To decrease the objective function, define P(s) = E(f + sv(f")) (25) If P'(0) < 0, the algorithm will be decreasing for small values of the stepsize. With this condition that rj(f") > 0, it follows that P'(O) = j frj (f")(2)2 (f ) < 0 if fL (ff") / 0 for some j. Since P'(0) < 0, the objective function will be decreasing for small values of the stepsize s. A linesearch is then performed for an appropriate stepsize that will give a positive update and a decrease in the objective function. In Lange's algorithm, the stepsize that gives the greatest decrease in the objective function is denoted s and a search is performed to find a value s < s within c of s. This condition requires an exact line search, eliminating the possibility of faster inexact search algorithms. 2.1.2 MairGilland Method One way to remove the restriction of positivity of the rterm is to allow the constant s to be a negative value. In this method (developed for use in the RM Algorithm [14]), the stepsize s" is allowed to be negative. The sign of s" depends on the sign of P'(0). If P'(0) < 0, the objective function will be decreasing for small positive values of the stepsize. In this case, the stepsize is chosen as in Lange's method. A search is made for a value that ensures positivity of the iterate, but is also within c of the value that gives the greatest decrease. If P'(0) > 0, then E will be increasing for small positive values of s. In this case, s" is allowed to be negative. The search is made in the negative direction, again for a value that ensures positivity and is within c of the value that gives the greatest decrease in the objective function. Although this method will decrease the objective function and convergence, the limit may not satisfy the second KuhnTucker optimality condition. 2.2 Accelerated Penalized Maximum Likelihood Method (APML) The accelerated penalized maximum likelihood method (APML) was developed in 2004 by ('!ih ig et al [5]. In the APML method, surrogate functions to decrease the objective function and provide convergence. In the method, a penalty function 0(r) and associated neighborhood system is chosen as described in Chapter 1. In addition, the related function 7(r) = r(r)/r is required to be nonincreasing. In the method, the trial iterate is formed by f'1 Gj(f) + /Gj(f)2 8Ej(f)F()/4F ) (2 6) with the defined functions given by YiPi,j fj Z j Pij f4, F, 2/3 ,7(f, f) kENj G p 2 i(f fk)(f + fk) i kENj As this was seen to converge slowly, an acceleration factor (the APML Method) was added to the algorithm [4]. As in Lange's method, a stepsize factor is added to the algorithm. In APML, the accelerated update is given by f"+i f" + sv(f) (27) with v(f") = f" f". In Lange's method, a search method is employ, l1 to find a suitable choice for the stepsize. Here, surrogate functions are used to find the best possible stepsize needed to decrease the objective function. Surrogate functions were first applied to PET reconstruction by DePierro [6]. In order to minimize a function f over a set S using the method of surrogate functions, another function i must be found satisfying f(x") = (x") f(x) > Q(x) for all x e S. The difficulty in using this method lies in finding an appropriate function Q. In APML, creating the surrogate function requires the calculation of four separate functions related to the penalty function, possibly increasing the computation 14 time. Reconstructions gathered using the APML algorithm will be compared to reconstructions using the revised algorithms presented in this paper. CHAPTER 3 GENERAL PML ALGORITHM 3.1 General Algorithm Statement In the maximum likelihood method, a search is made for an emission intensity f that minimizes the objective function E(f) = L(f) + 7U(f) under the restriction that fj > 0 for all j. In this chapter, an iterative algorithm to find this minimum will be presented. Under specific conditions, convergence to a minimizer will be proven. 3.1.1 General Algorithm Update The new algorithm is formed by taking Lange's modification to OSL, but treating the stepsize as a vector instead of a constant. To apply the algorithm, initialize first by taking fo = Yi yi/J. Given a current iterate f", the next iterate, f"+l can be constructed by the following update: fn+l= f" + sn v(f") (31) where s" is the stepsize vector rj(f) (i pi,j + 7l(f)) 1 vj(f) fjrj(f) (f) (x 0 y)j = xjyj (the Hadamard product). Note that v(f) and r(f) are the same defined functions as in the OSL algorithm. Given a current iterate f", the stepsize vector s" = [s,, ..., s"] is chosen to produce the next step in the algorithm. This vector must satisfy the conditions: 0 < sj < K for some large constant K (32) QE l < ( rj(fn) E(fJ)) 1 (33) E(f + s'+ v(f)) < E(fn)if lv(fn)l / 0 (34) (s' r(f'))j > 0. (35) The first three conditions on the stepsize vector enforce respectively boundedness of the stepsize, positivity of the iterate and decrease in the objective function. The final condition requires that for each voxel j, the sign of sj matches the sign of rj(f). Note that a unique s is not guaranteed. The revised algorithm presented after the proof provides an example of one way to choose a suitable stepsize vector. Once a penalty function and method of constructing a stepsize vector (both matching the required criteria) have been chosen, the algorithm can proceed. Continue iterating until a suitable stopping criteria has been reached. Convergence of this algorithm will now be investigated. 3.1.2 Penalty function criteria The general penalty term U(f) is of the form U(f) E iy(f, fy) (36) i j where 0(r) is required to satisfy the following properties: S(0) 0 limroo (r) + o (r) = (r) (0 is an even function) c C1 (0 has a differentiable first derivative) b is a strictly convex function (0"(r) > 0). The first three conditions are basic conditions for a penalty function, while the latter two conditions are required for convergence. Functions that match these conditions include the logcosh and quadratic penalties. Note that the general LalushTsui penalty does not meet these conditions due its lack of convexity. 3.2 Proof of Convergence Theorem 1. Let < be a p ,'.l'//l; function .,/>;bi;/ the conditions stated in Section (3.1.2). Let E be the PML objective function E(f) = L(f) + 7U(f). DA i, fo = E yi/J for all j (the il1.,>rithm is initialized with a flat emission ';. iilu:). Let the ilj,.rithm update be given by (31) with the stepsize vector chosen such that conditions (32 35) are .,i/r.f ,. The il'ji .':thm will then converge to a unique minimizer. The proof (extended from Lange's PML proof [10]) will demonstrate that the algorithm converges to a unique vector that satisfies both KuhnTucker optimality conditions, hence is the minimizer of E. It consists of two parts. In the first, Zangwill's convergence theorem is applied to prove that the objective function converges and that all limit points will fulfil the second KuhnTucker condition. In the second part of the proof, it will be shown that there is a single limit point and this point will also fulfil the first KuhnTucker condition. Definition 1. A stat.':'r,:, r; point is 1. I;,,, l to be ,i ':', i' ,'Ilq function f where for allj, f VE(f) 0. By this definition, the 2nd KuhnTucker condition will be met at a stationary point. Zangwill's Convergence Theorem [11] is now applied. Theorem 2. Z.i: ;,.jI :'s Convergence Theorem. Let T be a pointtoset function /, /7;,., on a metric space X. Generate a sequence {x"} such that x"'+ E T(x"). Let a solution set, S C X, be given il. ,:i with a continuous function E. Suppose that 1. the sequence {x"} is contained in a compact subset of X 2. if x i S, E(y) < E(x) for all y c T(x) 3. if x e S, E(y) < E(x) for all y e T(x) 4. the i'nij'l:',j T is closed at points outside of S. Then the limit of i,,;, convergent subsequence lies in S and {E(x")} converges monotor,::. allii to Es = E(y) for some y E S. To apply Zangwill, let X = R", S be the set of stationary points and let E the PML objective function. For any emission intensity f, define the set Sf to be the set of all possible stepsize vectors s fulfilling condition (32 35). Then define the pointtoset function T by the equation: T(f) {f+s v(f) : s Sf}. (37) Note that if f e S, v(f) = 0, thus T(f) {f}. In the PML algorithm, the next iterate is created by taking f"+l = f" + s 0 v(f") for a particular choice of s E Sfn, hence f"+l e T(f'). Lemma 1. The pointset i,,. j'.,:I' T is closed at all nonstationary points 0. The closure definition requires that if 0" 0 S, ', E T(0O), and, , then E T(O). Hence it must be shown that there exists a stepsize vector s E So such that = 0 + s ( v(O) holds for all 0 S. By the definition of the update, for each n there is a stepsize vector s" E Son such that 0 + s" O v(08). Since 2 is assumed to be continuous, rj(f) is continuous and hence vj(f) will also be continuous. Let W = {j : vj(O) / 0}. Since 0 is not a stationary point, W is nonempty. For each j E W, there exists a constant Mj > 0 such that for all mj > My, vj(0") / 0. Therefore for all j e W and m > maxj(Mj), m j 0 i Vj (0m) which converges to bj 0j Sj(0) For j e W, define sj to be ^(. Therefore for all j W, j = + sjvj(0). If j ( W, vj(0) = 0 and hence j = Oj. Thus for all j, j = Oj + sjvj(0) where sj = 1 for j W. By the continuity of E, v, 0 and b, s E So and hence b E T(O). Lemma 2. The iterates belong to a bounded and compact set. It will be shown that if IIf"l  oc, E(f") + o as well. The loglikelihood term L is a function of the form Y ci yilog(ci), where ci = Pfi and yi are both positive terms. If IIf"ll oo, then for some i, f7 oo as well. For each voxel j, there exists at least one tube i with pij / 0. Hence if the intensity at voxel j becomes unbounded, c = Pfi + o forcing ci yilog(ci)  o0. Thus if the iterates become unbounded, the loglikelihood function will also become unbounded. Hence if IIf"ll o0, E(f") = L(f) + 7U(f") o0. By construction, E(f") < E(fO) for all n. Thus, it cannot be true that E(f")  oo. Therefore, the iterates must belong to a bounded set. By construction, E(f"+l) < E(f"). Thus E(f") < E(f0) for all n. Therefore the iterates belong to the set {OIE(O) < E(fo)}. Due to the the continuity of E, the set is closed. Since the set is closed and bounded, it is a compact set. Theorem 3. The limit of i,:1, convergent subsequence is a stationary point. Zangwill's theorem is now applied. Let X = R" and S =the set of stationary points. If f" e S, v(f') = 0 and hence T(f") = {f"}. If f" P S, by construction, E(f" + s" ( v(f")) < E(f") for any stepsize vector s" satisfying conditions (32 35). By Lemma 1, the pointtoset mapping T is closed at nonstationary points. Thus Zangwill's theorem applies and all limit points are stationary points. In addition, E(f") Es = E(y) for some y E S. It must be shown that Es is welldefined. If x E S, then x is a limit point of the sequence. Hence there is a subsequence {ff } converging to x. Since E(f") + E(y), it must be true that E(x) = E(y). Lemma 3. If E(f) = Es for some n, then f" e S. In the proof of Zangwill's conditions, E(f+1) < E(f') if f i S and E(f+1) < E(f') if f E S. Thus if f" S, E(f'+1) < E(f") = Es. But E(f") converges monotonically to Es, giving a contradiction. Thus f" e S. Lemma 4. The set of limit points is finite. The results of Zangwill's Theorem state that all limit points of the sequence are stationary points. A stationary point has been defined to be one where for each index j, f 2(f) = 0. With the requirement that U is a strictly convex (penalty) term, E will be a strictly convex function [10]. Therefore, given any subset W of [1, J], E will remain a strictly convex function on the plane {fj 0 j E W}. Hence there is a unique point on this plane that is a minimizer for E. Denote this point fw. Thus for any subset W, there is only one point fw with (fw) = 0 for j ( W and fj = 0 for j E W. Since there are a finite number of possible subsets W, the set of stationary points is finite. Lemma 5. The distance between successive iterates tends to zero. If+l fnl IIn Qs v(ff). Since Is' < K for all n, j, it suffices to prove that vj(f') goes to zero. The function v is continuous and is zero for any stationary point. Since the set of limit points is finite, given e > 0, a single 6 > 0 can be found such that if If" yll < 6 for any y E S, then lv(f")l = Ilv(f") v(y)  < Let Z be the set of all sequence points within 6 of a limit point and the W be the remaining sequence points. If W is infinite, it contains a convergent subsequence (the iterates are contained in a compact set). Thus there are points in W arbitrarily close to a limit point, a contradiction. Hence W must be finite. Let N be the largest sequence index in W. Thus if n > N, I f" y <6 for some y E S. Hence Ilv(f") I < c. Therefore v(f") 0. Lemma 6. The set of limit points consists of a single station point. By Lemma 5, the distance between iterates goes to zero. Ostrowski proved that the set of limit points of such a sequence is both compact and connected [15]. Since this set is finite, it must consist of a single point. Denote this intensity by f. Theorem 4. The iterates converge to the minimizer of E. There is a single limit point for the sequence, hence the sequence will converge to f. Assume this function is not the minimizer of E. Since the limit is a stationary point, the second KuhnTucker criterion fj (f) = 0 is satisfied, so the first optimality condition must be violated. Therefore, an index j can be found such that fj = 0 and f(f) < 0 and hence there is an N such that (f") < 0 for all n > N. By construction, rj(fn) and sj(fn) have the same sign and fj > 0. Thus for n > N, fj+ fj + s vj(f) ff sj fj r(f) (ff) > fl and fj does not converge to zero, contrary to the assumption. Therefore, both KuhnTucker conditions hold at the convergence point and hence it is the minimizer of E. If the penalty function lacks convexity (as in the generalized LalushTsui penalty (17), the proof of Theorem 4 will not apply as it requires convexity. Thus, as with the modified RM algorithm discussed in the following chapter, the only result that can be proved is that all of the limit points generated from the sequence will be minimizers. 3.3 New PML Algorithm The general statement does not guarantee the existence of a method of choosing the stepsize vector with the needed conditions. However, there do exist several vwi of meeting the necessary conditions. In Lange's algorithm [10], all entries in the stepsize vector are the same positive value. In this method, conditions (32 35) are satisfied by assuming that rj(f) is positive for all emission intensities f and voxels j. With this restriction, the objective function will be decreasing for small positive values of the stepsize s. However, in order to ensure the positivity of r(f), the strength of the penalty term 7 may need to be decreased, which alters the objective function needed to be minimized. The revised algorithm presented below has been designed to remove this strong condition on the penalty term by using a different method to choose the stepsize, while still providing for the convergence of the iterates. 3.3.1 NonUniform StepSize Method In this algorithm, the stepsize vector consists of two values, one positive and one negative. To find these values at the nth step in the iteration, form the two sets, D+(f") := {jlrj(fj) > 0} and D_(f") := {jrj(fj) < 0}. Since Spi,j + 724 (fn) will be finite for each j, r(f") can never be zero. Therefore, D+(f") and D_(f") will partition the set of voxels. These sets will determine which value of the stepsize will be applied to the appropriate voxel. Define the equations g"(tl, t2 { vj(fn) if j D(fP) 8 gilt1, = t2 (38) fj + t2 (fn) if j E D_(fF) (t1, 2) = E(g"(t1, t2)). (39) The current value of the objective function at step n is given by b(0, 0) = E(f"). Since the goal is to decrease the value of E, a search is made for a new update in the direction of the negative gradient. This will give the direction of the steepest descent of E. If a single value is used for our constant (as in Lange's method [10] or the MairGilland method [14]), it corresponds to using (t,t) as the search direction in (38). This may not be the direction of the greatest decrease. In order to achieve the greatest decrease in the objective function, the vector (tl, 2) is assumed to be in the direction of the negative gradient. Therefore, define (t, 2) = sVQ(0, 0) for some s > 0 to be chosen later. Calculating the two partial derivatives yields: S(0,0) E0 (f>)v(f") E frj (fn)( )2(fn) at, aF aft9 j6D" j6D" (0,0) (f")v (f") f rj(fn)( )2(fn). jCD" jCD" Therefore by defining f E (f) fjrj(f)( )2(f) (3 10) jED (f) '(f) E jrj(f)(j)2(f) jED (f) it can be seen that (t1,t2) s(r(f"+), Tr(f)). Since (T, r_) provides a search direction only, it can be normalized at this stage to produce a unit vector (7(f), 7_(f)) (7r and 7_ will now refer to the normalized vector). The new iterate is thus formed by: f+l = f" + Tr(f") v(f") (3 11) where Tj(f) = +(f) if j E D+(f) and Tj(f) r(f) if j c D_(f). The problem has thus been reduced from a 2dimensional search for (ti, t2) to a 1dimensional search for s. Defining P(s) = E(f + sT(f") e v(f)), yields P'(0) = VE(f" + s(ff) O v(f")) d(f" + sr(f") ) v(f")) I o ds E (T)2(fn)j(f1)frj(f')_< 0. 8E The final inequality comes from the fact that rj has the same sign as rj (and so (s 7jrj > 0 for all j, proving the fourth criteria for the stepsize vector). Thus for small positive values of the stepsize coefficient s, E will be decreasing. It remains to be shown that the method of selecting the stepsize coefficient s will result in a stepsize vector, st, fulfilling the conditions stated in (32 35). 3.3.2 Line Search Methods In the nonuniform stepsize method discussed above, the stepsize vector consists of sT. Now, a search must be made for an appropriate stepsize coefficient to ensure the decrease in the objective function and positivity of the iterate. First, a bounding value, smax must be found to ensure positivity. Using the definition of the update, f+' = fj + sn v(f)vj(f) aE sf l(f/)fi(f)lf(ff) f/( s"(f')rj(f") (f )). 9fj Therefore, by defining 8E s = minj(rj(fn)Tj(fn) (f )) (3 12) then fI 0.If (3t2) it can be seen that if s" < s~,, then n+l > 0. If K < sax, let sa, = K. Now a search is made for a stepsize coefficient within the bounding value that decreases the algorithm. If P(s) is defined to be E(fn+l(s)) = E(f+sr(f") v(f")), then the algorithm will decrease the objective function if the constant s" satisfies P'(s") < 0. Using the definition of P(s), df"+l P'(s) = VE(f+(s)) d ds OE El 0E (f"+ (s))rj(ff) fr(ff") (ff). In order to find a constant s' to produce a negative value, a search algorithm is emploi. 1 One possible choice is the bisection rule, which produces a result within a specified precision. The algorithm proceeded with this choice, but with slow computation speeds as it requires multiple calculations of P'(s). In order to decrease the iteration time, the Armijo rule [1] was emploi. 1 The Armijo rule is an inexact search algorithm, here used to approximate where P'(s) = 0. The value returned by the Armijo rule will be a constant s that will decrease the objective function as required. To apply the Armijo rule, the search begins with an initial guess (here taken to be the minimum of sma and K). The initial value is decremented this by a constant factor 3 (taken to be 1/3) until the following relationship holds (6 is a positive constant chosen to be close to 0): P(0) + s6P'(0) > P(s). The following figure illustrates one application of the Armijo rule and the value that would be returned by the method. The use of the Armijo rule produced much decrement acceptable value start point x P(s) Figure 31: Armijo rule smaller computation times than the bisection method. The former requires the calculation of the derivative only at the origin and thereafter calculates values of P(s) to give a rough approximation to the solution. Even though it does not give the best value for the stepsize coefficient, it was seen that the number of iterations needed to reach a stopping point did not significantly increase when the Armijo Rule was applied. Hence, the savings in computation time caused it to be used in the reconstructions. Since the stepsize coefficient s' satisfies P'(s") < 0, the objective function decreases as needed. The update is then formed by fn+1 fn + szn_(fr) o v(fr) and the iteration proceeds. It remains to be shown that this choice of stepsize vector will satisfy the required conditions for convergence. By construction, (sTj(f)) < K. By specifying that (sTj(f)) < smax, (33) is met. (sr) decreases the objective function, satisfying (34). Finally, by construction, Tj(f) and rj(f) have the same sign, proving (35). Since conditions (32 35) have been satisfied, the iteration will converge as proven in Theorem 1. 3.3.3 Stopping Criteria The algorithm should terminate when the KuhnTucker optimality conditions (19) are realized. The projected gradient function 1 can be used to accomplish this. This function is defined by aE pgd(f) = max(fj (f)),0) fJ o. (313) 9fj Theorem 5. pgd(f) = 0 if and only if the KuhnTucker oi'nl,h,/,:li, conditions are met. Assume the KuhnTucker conditions are not met. If the first condition is not true, then (f) < 0, resulting in pgd(f) > 0. If the second condition is not met, fj and j(f) are nonzero for some j. If fj >  (f) pgd(f) > (f) > 0. If the other inequality holds, then pgd(f) > fj > 0. Thus if the KuhnTucker conditions do not hold, the projected gradient function is nonzero. Now assume the KuhnTucker conditions are met. For each j, (f) > 0 and at least one of fj and E(f) are zero. If fj > 1(f), then E(f) 0 and max(fj E(f),0) fj f) 0. If E(f) > f,, then fj 0 and max(fj (f),0) 0. (f aO) fjf fj O. Given some choice of small c > 0, the algorithm can be terminated when pgd(f) < e. 1 as I'. 1 I by Prof. Hager CHAPTER 4 MULTIPLE IMAGEMOTION RECONSTRUCTION 4.1 Algorithm Desription 4.1.1 RM Algorithm The Penalized Maximum Likelihood method discussed in Chapter 3 can be extended for use in the reconstruction of timed, 1 1, images. In this problem, a series of CTgenerated images of an object are reconstructed along with any motion of the object between the scans. The RM algorithm, as developed by Gilland and Mair [13] [2] [14], can be used to reconstruct this series. However, its convergence to a minimizer has not been proven. A modification of their work allows for convergence of an altered objective function. As in their work, the algorithm will be described initially in the continuous case, then discretized to produce the iteration. In this reconstruction, a total of K timelapsed scans (or frames) are analyzed. In the continuous case, the frame number will be denoted in the subscript, fk(z). In the discrete case, the frame number will be the first subscript, fk,j referring to the jth voxel in frame k. Along with the unknown emission intensities in each frame k, fk(w), the motion vector, mk(z) from frame k to frame k + 1 is also unknown. This motion vector describes the movement of each point z in frame k, as z moves to z + mk(z) in frame k + 1. In the continuous case, the negative loglikelihood of the reconstruction is given by K L(f) [Pfk(i) k(i)logPf(i)] (4 1) k=0 i where Pfk(i) is the continuous projection operator given by Pfk(i) pi(z)fk(z)dz and pi(z) is the probability that an emission at w will be registered in detector i (compare with the discrete projection operator given by Pfi = iPijfj). The penalty term U now measures how well the motion term matches up points between frames. Ideally, fk(z) = fk+l(z + mk(z)) but it is to be expected that this will not occur with real data. Hence the imagematching term between frames k and k + 1 is given by Uk(fk, fk+, mk) fk() fk+(z + mk(z))]2dz (42) and the penalty term becomes the sum of these individual terms. K U(f, m) Uk(fk, fk+1,m) (4 3) k= 1 In addition, the objective function includes a third term Es = 1= Esk modeling the elastic strain of the material. Esk(mk) = 1/2 A(z)(uk,x +' + k, )2dz + (z) + ,z)dz +1/2 (z)[(uk,y + Vk,)2 + (Uk,x + Wk,x)2 + (Vk,z+ "' ,)2 d where A and p are material parameters,uk, 1, i, w are the components of mk and Uk,x is the partial derivative of Uk with respect to x. Thus after adding constants to balance the contribution of each term, the continuous objective function becomes E(f, m) a cL(f) + U(f, m) + 3Es(m). (44) The reconstruction problem thus becomes a minimization of E(f, m) with the restriction that the intensities in each frame remain positive. At each iteration of the RM Algorithm (the outer loop), two processes are run in succession the RStep and the MStep. In the RStep (an inner loop), the previous motion iterate is used to improve the reconstruction of the frames. Since Es does not depend on f, the RStep is applied only to aL(f) + U(f, m) with the motion vector held constant. A PML reconstruction method can be applied (such as the nonuniform stepsize method described C'!i pter 3) to decrease this term and produce the next iterate. In the MStep (an inner loop), the motion vector is updated using the frames generated from the RStep. In this step, the conjugate gradient algorithm is applied to decrease U(f, m) + 3Es(m) with f now held constant. 4.1.2 ImageMatching Penalty In order to run the algorithm, the image space is partitioned into J equally sized voxels. However, after discretizing the image space, it cannot be assumed that the center of a voxel is mapped to the center of another voxel by the motion vector. Thus, it is necessary to distribute the motion over adjoining voxels. To accomplish this, let (uk,i, Vk,i, Wk,i) be the components of the motion vector mapping voxel i in frame k into frame k + 1. Now define di, dj and dk to be the decimal portions of u, v and w respectively. If (ai, bi, ci) are the coordinates of the center of voxel i, then the motion vector will map this voxel to the "voxel" in frame k + 1 centered at (ai + Uk,i, bi + vk,i, Ci + Wk,i). The target voxel to be approximated will lie in at most eight real voxels. Define dooo to be the voxel located at ([ai + Uk,i], [bi + Vk,i], [ci + i,' ]). Then the target voxel will lie in a cube of dimension 2voxels with dooo in the lower left corner. The emission intensity in the target voxel can then be found by a trilinear approximation using the voxels forming the cube, with the weights proportional to the distance from center of each voxel to the center of the target voxel. Thus, the discrete version of the imagematching term (43) is given by: Ukcfk, fk+l,mk) fk,i Ck,i,jfk+l,j)2 (45) i jENk,i where Nk,i is the 8voxel cube formed by mk,i and the weights, k,ij, are given by Table 41. This form for the imagematching term is currently implemented in the voxel voxel center weight dooo ([ai + uk,i], [b + vk,i], [ci + ) (1 )( dj) dk) dloo ([ai + Uk,i] + 1, [bi + Vk,i], [Ci + ) ( dj)( dk) dolo ([ai + Uk,i], [bi + Vk,i] + 1, [ci + (1 di) d( dk) dllo ([ai + Uk,i] + 1, [bi + vk,i] + 1, [ci + ,, ) didj( dk) dool ([ai + Uk,i], [bi + Vk,i], [ci + wk,i] + 1) (1 d)( dj)dk dlol ([ai + Uk,i] + 1, [bi + vk,i], [Ci + Wk,i] + 1) d dj)dk doll ([ai + Uk,i], [bi + k,i] + 1, [ci + wk,i] + 1) ( di)djdk d111 ([ai + Uk,i] + 1, [bi + vk,i] + 1, [ci + wk,i] + 1) didjdk Table 41: RM voxel neighborhood objective function used in the RM Algorithm. However, it is difficult to perform the conjugate gradient (I Step) on this term. it has proven difficult to implement the MStep on the this term since the neighborhood system Nk,i depends on the motion vector. To solve this problem, the MStep has instead been performed on a linearization of the imagematching term [14] given by Qk(m) fk(w) fk+(W) (mk(w) m(w)). Vfk+(w + imk(w)dw with the previous motion estimate given by m. However, it has not been proven that a decrease in the linearized term Qk will result in a decrease in the original imagematching term Uk as required. One possible way to solve this problem is to perform the linearization at the objective function level rather than at each inner loop. Hence, the original objective function is modified to replace Uk with a linearized version Uk. With this change, the algorithm will now be referred to as the modified RM algorithm. Since the objective functions are different, a minimizer for the modified RM algorithm is not expected to be a minimizer for the original RM algorithm. However, a decrease in the modified objective function can be proven, allowing for convergence results. To describe the modified objective function, define the term K U = YUk(fkfk+1,mk) k= 1 Uk (fk() fk+1(z) mk(z)) Vfk+l(Z))2dz (46) and update the objective function to become E(f, m) = aL(f) + U(f, m) + 3Es(m). (47) In addition, linearizing the imagematching term gains convexity in the objective function (the RM imagematching term is not convex in m). However, this linearization is valid only for small displacements. If the motion between the frames is large, the reconstruction may not be valid. Using this modified objective function, the two inner loop subprocesses can now be described in detail. 4.1.3 Modified RM Iteration The RStep consists of minimizing aL(f) + U(f, m) over the set of positive intensities f. As this in the same form as the objective function in ('!i Ilter 3, the nonuniform stepsize algorithm (31) can be utilized. Thus, the continuous RStep update for each frame k is given by f+ = f + sv(f") (48) f s frk(fn)DkE(f, m) with rk(f)(z) (= EiPi(z)+DkU(f,m))'. As derived in the Appendix of [14], the derivatives DkE (taken with respect to fk) are given by DkE(f, m) DkU(f, m) DkUk(fk, fk+I, mk) Dk k I(fk1, fk, mk1) a (pi~z k (ipi ()z+ Dki(fZ, M) N y ((z) Pf(i)" ) + DnU(f, m) Pfk(i) Dkk (fk, fk+, mk) + DkUk (f1, fk, k m1) 2(fk fk+l mk Vfk+l) 2(fkl fk mki Vfk) +2(Vfk 1 V7 V(mk Vfk)) mk_ +2(f_1 f i m_ Vfk)(V mkl). The last derivative can be seen (in the 2frame case without loss of generality) by using the formula d D2 U (fl, f2, ml) E(fI(z), f2(z) + th(z), mi(z))It=o dt (49) to obtain D2U (fl, f2, mi) 2 (fl(z) f2(z) mi(z) Vf2(z))h(z)dz 2 (fi(z) f2(z) mi(z) Vf2(z))(mi(z) Vh(w))dz. D2U(fl f, m) = if equation (49) can be written as f tb(z)h(z)dz. As the first term is already in this form, only the second needs to be simplified. This can be done by applying the identity Sgm Vh(z) S(V gm)h(z) S(Vg m)h(z) + (gV. m)h(z) to yield the equality for the second term: 2 J((V(fi(z) f2(z) mi(z) Vf2(z)) mi)h(z)dz +2 (fi(z) f2(z) mi(z) Vf2(z))(V mi)h(z)dz. Distributing the V in the first integral and adding the results to the first term yields the desired derivative. After discretizing the voxels (and approximating derivatives with central differences), (48) reduces to the discrete iteration: fJ = ft + s",jk,j (410) where fkjrk,(f) (f) and rk,(f) ( Ca k,i,j +DkU(f, m))1. As before, the stepsize s" vector must be chosen so that (32 35) are satisfied. The method described in C'! Ilpter 3 can be used to find such a vector. Once the emission intensity has been updated by the RStep, the MStep is run to update the motion vector. This step consists of minimizing U(f, m) + 3Es(m) over all possible motion vectors m. The conjugate gradient algorithm (using the PolakRibiere variant) is then applied to the discretized voxels to update the motion vector [14]. In practice, at each iteration of the RM algorithm (the outer loop), one iteration each of the RStep and MStep occurs (the inner loop). Continue running the outer loop under a stopping criteria has been reached, such as the projected gradient (313) falling under a chosen critical value. 4.2 Convergence Results Although the new objective function E aL + U + 3Es is convex in f, Lemma 4 does not apply in this situation (unless a single motion vector is fixed). Hence, convergence to a single minimizer is not guaranteed. Theorem 6. If {ff} is a sequence of iterates produced by (410), then each limit point of the sequence is a minimizer for the objective function E and E(f") converges. As before, Zangwill's Convergence Theorem is applied. The definition of the pointtoset function T is expanded to the following: Ti(f,m) {(g, mcG)g E T(f)} (411) where T is the previous pointtoset function defined in (37) (with a range of possible stepsize vectors) and mce is the motion vector outputted from applying the conjugate gradient algorithm to U(g, m) + 3Es(m). Let E be the linearized objective function and define the set of stationary points S to be the set of all points (f, m) with v(f) = 0. If f e S, then T(f) = {f} and the RStep will not change the value of L + yU. The MStep (which does not affect the reconstructed image f) has been defined to decrease yU + Es Thus E(Ti(f, m)) < E(f, m) if f e S. If f S, then the RStep will decrease L + yU as desired. Following this, the MStep will decrease the value of 7U + Es. Hence E(TI(f, m)) < E(f, m) if f S. The closure condition is proved by noting that T1 can be split into its two parts, thus allowing the critical closure condition on f to be proved as in Lemma 1. Thus by Zangwill, all limit points of the sequence {(f", m"n)} lie in S and E(f", m") will monotonically converge to E(y, m) for some (y, m) e S. Now to prove all limit points are minimizers, let y be a limit point for the algorithm. Consider the subsequence f" where for each j, II f" y < 1/j (the definition of a limit point gives the existence of such points). Now as in Theorem 4, assume y does not meet the first KT condition for a minimizer (By Zangwill, y e S, hence the second KT condition minimizer of E is met). Therefore, an index k can be found such that yk = 0 and OJ(y) < 0 and hence there is an J such that E(f j) < 0 for all j > J. By construction, rk(fnd) and Sk have the same sign and fj > 0. Thus for j > J, f' = f + "' (f) f fsf3kk f fkk9 35 and fk, does not converge to yk = 0, contrary to the assumption. Therefore, both KT conditions hold at the limit point. If the modified MStep is terminated after iteration K (locking in a specific value for the motion vector) and the modified RStep is used exclusively afterwards, the convergence proof for penalized maximum likelihood presented in Chapter 3 will suffice to give convergence for the modified RM algorithm. CHAPTER 5 RECONSTRUCTIONS 5.1 PML Reconstructions To test the nonuniform stepsize algorithm presented in C'!h pter 3, a series of reconstructions was performed on a phantom source image 1 A = [A, ..., Aj], of dimension J 128x128. As di',11l I below, the phantom image is that of a typical cardiac scan, with lungs, heart, spine and other tissues. Higher intensity values correspond to regions of greater metabolic activity (in particular the heart and lung regions). The detector data, Y = [y, ..., yi], was created by calculating the 3 Figure 51: Phantom chest source image probability matrix P using (12). The chest emission was then projected into the detector space using yi = Z pijAj. Finally, random Poisson noise was added to the 1 created by Dr.Anderson data. The PML objective function is formed by the sum E(f) L(f) + U(f). The penalty term, U(f) was formed using the logcosh function with 6 taken to be 50. U(f) E log cosh f (5 1) i jENi The corresponding neighborhood system, N, is formed by the 8 closest voxels, with the diagonal elements having a weight of and the 4 axial neighbors a weight of 1, as seen in Figure 13. The strength of the penalty term, 7, ranged over a set of values to provide a comparison. The reconstructions were formed by the nonuniform stepsize method (31). In separate trials, the search for the stepsize coefficient, s, was performed with the Armijo Rule and the bisection method. For each value of 7, the algorithm was run until the projected gradient (313) fell under 102. The resulting reconstructions were then compared to the source image with a rootmeansquare calculation Zj(fj Aj)2 128 128 to determine optimal values to use for the penalty strength. As a comparison, the same phantom scanner data was used in the APML algorithm (27). The reconstructions outputted from APML were visually identical to the reconstructions gained using the nonuniform stepsize algorithm. In all trials, the objective function decreased rapidly, then leveled off. It was seen that the image quality did not improve significantly after this leveling off of the objective function. As can be seen in Figure 52, initially the objective function decreases faster with the nonuniform step method than with APML, but eventually both fall to the same level. The computation times for APML were approximately the same as the computation time using the nonuniform S APML Non Unif L 3 5 Iteration Figure 52: Comparing the decrease in the objective function stepsize method with the Armijo Rule. The APML algorithm was able to reduce the projected gradient factor in less iterations than with the nonuniform stepsize method. Table 51 gives both the RMS error and the iterations required for algorithm termination, occurring when the projected gradient fell below 102. In Figure 53, 7 Method Iterations CPU time RMS error 0.015 Bisect 219 475 49.5 0.020 Bisect 227 492 46.5 0.025 Bisect 235 631 45.5 0.030 Bisect 183 530 45.4 0.015 Armijo 220 118 49.6 0.020 Armijo 228 125 46.4 0.025 Armijo 268 129 45.5 0.030 Armijo 229 100 45.4 Table 51: PML reconstruction data the RMS error is plotted against the penalty paramter 7. Reconstruction using the nonuniform stepsize algorithm (applying either bisection or Armijo in the line search produced an identical RMS error) produced a smaller RMS error than reconstruction of the image than using the APML reconstruction method. Thus, these methods were able to produce a reconstruction closer to the source phantom image. Due to its inexact search, the Armijo rule reduced the projected gradient 50 .. APML 49 Non Unit S485 0 0 48 S475 S47 465 46 45 5 451 3 4 5 Gamma Figure 53: Comparing the rootmeansquare (RMS) error criteria in a significantly shorter cputime than the bisection method (although the iterations were approximately the same). The RMS plot in Figure 53 shows that a penalty parameter taken from [0.03,0.04] resulted in the closest reconstruction to the original phantom. The reconstructions seen in Figure 54 were formed by applying the bisection method. Although the iteration time was longer than using the Armijo method, the bisection method gave a visually better image with higher values of the penalty parameter, 7. The next set of reconstructions found in Figure 55 were formed by applying the Armijo Rule. In this case, as 7 increased, the Armijo rule did not perform as well in highintensity regions. This can be observed in the heart region of the later reconstructions. Figure 56 di l1' a plot formed by a vertical line thru the heart region. In this region, the exact bisection rule was able to produce a smoother image in the highintensity heart region than the nonexact Armijo rule. Note that both methods produced similar intensities except in the heart region. With further iterations, it was seen that the nonsmooth intensity in the heart region was reduced (using a smaller value for the projected gradient cutoff would accomplish this). A tradeoff has thus been made between reconstruction speed (the Armijo rule) and accuracy in the stepsize choice (bisection). One possible solution (a) 7 = 0.020 (b) 7 = 0.025 (c) 7 = 0.030 (d) 7 = 0.035 Figure 54: Phantom image reconstruction using bisection rule (a) 7 0.020 (c) 7 = 0.030 (d) 7 = 0.035 Figure 55: Phantom image reconstruction using Armijo rule Armijo \, Bisect Figure 56: Line plot of heart region (b) 7 = 0.025 is to combine these methods begin with the Armijo method to approximate the solution, then use bisection for precision. Originally, the new algorithm was designed to remove the restrictions on the penalty function imposed by Lange's algorithm (24). Figure 57 shows two reconstructions of the chest, the first using the nonuniform stepsize method and the second using Lange's algorithm (and the Armijo Rule). In this case, a quadratic penalty was used in place of the logcosh function. With a penalty parameter of 7 = 0.03, the r(f) function becomes negative (a situation not covered in Lange's algorithm). As can be seen from the reconstructions, the nonuniform stepsize method is able to reconstruct the image correctly, but Lange's algorithm (together with the Armijo Rule) produces artifacts. (a) Nonuniform stepsize method (b) Lange's method Figure 57: Comparing reconstruction algorithms 5.2 Modified RM Reconstructions The modified RM algorithm presented in C'! lpter 4 was tested on a 2frame NCAT phantom image of a heart. The algorithm reconstructs both images with the assistance of the motion vector between the frames. The objective function used in the modified RM algorithm is given by E(f, m) = aL(f) + U(f, m) + 3Es(m) with the constants taken to be a = 0.02 and 3 = 0.005 as in previous reconstructions. The linearized penalty term U was used in the reconstruction instead of the virtual voxel method used in the RM algorithm of Mair and Gilland [14]. Although the use of the virtual voxels was seen to produce useful reconstructions, the convergence of the RM objective function was unproven (but was observed numerically). In addition, the use of the linearized penalty produced a decreasing objective function as seen in Figure 58. The second plot in Figure 58 shows the functions involved in the MStep, U and sEs. As can be seen, both U and 3Es are 15 2200 2000 u 0 1800 10 E > != C 3 1600 0 LL. 1400  1200 1000 0 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Iteration Iteration (a) Modified RM objective function (b) U and 3Es Figure 58: Modified RM function plots increasing for each outer loop. Despite this, the MStep is reducing their sum for each MStep as can be seen in the sample RM data found in Table 52. Typically, the Rstep increases the value of U while reducing aL. Then the MStep decreases U and increases 3Es to give a decrease in the objective function. The reconstructions formed using the modified RM algorithm (with the linearized penalty, U), were comparable to those outputted using the original RM algorithm (using the approximated voxels penalty method (41)). Figure 59 dipliv' a quiver plot of the reconstructed Slice 20 combined with the motion vector. Figure 510 dipl i1 a comparison of the methods. The top reconstructions were Iteration 1 RStep 1 MStep 2 RStep 2 MStep 3 RStep 3 MStep 4 RStep 4 MStep 5 RStep 5 MStep aL U 2866.4 1.326 1.262 2728.4 1.678 1.673 2569.1 2.341 2.232 2374.0 3.367 3.315 2116.4 5.638 5.330 3Es Objective 2867.76 0.0203 2867.72 2730.11 0.0228 2730.11 2571.48 0.0750 2571.42 2377.44 0.104 2377.42 2122.11 0.277 2121.97 Table 52: RM data a 0.02, 3 = 0.005 :..M 10 20 30 40  50 60 * U 10 20 30 40 50 60 Figure 59: Quiver plot of Slice 20 performed with the modified RM algorithm. The lower plots were reconstructed using the original RM algorithm, both using the parameters a 0.02 and 3 = 0.005. Although the original RM method gives a better estimate of the motion penalty, (a) Frame 1 Slice 20 Modified RM (c) Frame 1 Slice 20 Original RM Figure 510: RM reconstructions (b) Frame 2 Slice 20 Modified RM (d) Frame 2 Slice 20 Original RM a decrease in the objective function (and hence convergence) remains unproven. The use of the linearized penalty in the modified RM algorithm produces a similar reconstruction, while in addition giving the needed decrease in the objective function. 5.3 Conclusions Several methods exist to solve the problem of image recovery in Emission Tomography. Green's OSL algorithm as extended by Lange provides for convergence in a reduced set of penalty functions (which do not include the imagematching penalty used in the RM algorithm). Although Lange's algorithm can be employ, 1 in larger settings, neither its convergence to a minimizer nor a decrease in the objective function is guaranteed. If the stepsize is allowed to become negative [2], a decrease in the objective function and algorithm convergence is once again guaranteed, but this minimizer may not satisfy the first KuhnTucker optimality condition. However, a further extrapolation to a stepsize vector will both guarantee convergence to a minimizer and allow for a larger set of penalty functions. Unlike previous methods, the convergence proof of the generalized method allows for the use of inexact faster line searches, such as the Armijo Rule. The nonuniform stepsize algorithm provides one way of constructing a stepsize vector that meets the conditions required for convergence. This generalized method can be extended to other image recovery problems, such as multiple imagemotion recovery. In this case, the generalized algorithm is incorporated into the modified RM algorithm for reconstruction. Future work in this area will concentrate on refining the algorithm to increase speed without losing image quality. APPENDIX SAMPLE ALGORITHM FORTRAN CODE This Fortran code can be utilized to run the nonuniform step method as described in ('!Ch pter 3. Code to calculate the penalty term (and its derivative) is required. The probability matrix pi,j is stored in a sparce matrix format consisting of hrow, hcol and hv preserving the nonzero entries of p. The iteration loop consists of the following code: do j=l,nhdim temp = hcol(j) hsum(temp) = hsum(temp) + hv(j) enddo do j=l,nhdim tempi = hrow(j) temp2 = hcol(j) AX(templ) = AX(templ) + hv(j)*f(temp2) enddo do j=l,nhdim tempi = hrow(j) temp2 = hcol(j) if(AX(templ).ne.0)then AY(temp2) = AY(temp2) + hv(j)*Y(templ)/AX(templ) endif enddo call findderivpenalty(f,dU) do j=l,NVOXX dE(j) = hsum(j) AY(j) + gamma*dU(j) enddo tau(1) = 0 tau(2) = 0 do j=l,NVOXX r(j) = sA(j) + gamma*dU(j) fl(j) = f(j)*AY(j)/r(j) v(j) = fl(j) f(j) enddo tempi = 0 do j=l,NVOXX if(r(j).le.0)templ = 1 enddo if (templ.eq. )then do j=l,NVOXX if(r(j) .ge.0.)then tau(l)=tau(l)+(f(j)*dE(j)**2)/r(j) else tau(2)=tau(2)+(f(j)*dE(j)**2)/r(j) endif enddo else tau(1) = 1 endif tau(3) = sqrt(tau(1)**2 + tau(2)**2) tau(1) = tau(1) / tau(3) tau(2) = tau(2) / tau(3) do j=l,NVOXX if (r(j).ge.0.)then v(j) = v(j)*tau(1) else v(j) = v(j)*tau(2) endif enddo smax=0 do j=l,NVOXX if(r(j) .ge.0)then smax = max(smax, tau(1)*dE(j)/r(j)) else smax = max(smax, tau(2)*dE(j)/r(j)) endif enddo if (smax.eq.0.)then smax = 1 else smax = 1/smax endif sopt =finds(f,smax,v,hcol,hrow, hv,Y,dE,gamma) do j=l,NVOXX f(j) = f(j) + sopt*v(j) enddo Utilizing the Armijo rule in the linesearch can be accomplished by the code (called by finds in the main loop): dEdotv = 0 do j=l,NVOXX dEdotv = dEdotv + dE(j)*v(j) enddo tol = 0.1 k= 0 rho = 3 30 continue alpha = smax / (rho**k) do j=l,NVOXX f1(j) = f(j) + alpha*v(j) enddo do j=l,nhdim temple = hrow(j) temp2 = hcol(j) AX(templ) = AX(templ) + hv(j)*fl(temp2) enddo call findpenalty(fl,U) E= 0 do j=l,ndet if(AX(j).ne.0)then E = E + AX(j) Y(j)*log(AX(j)) endif enddo 51 do j=l,NVOXX E = E + gamma*U(j) enddo flag = EO + alpha tol dEdotv k = k+1 if(E.gt.flag)goto 30 finds = alpha REFERENCES [1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, 1993. Nonlinear r'ii'i"'.r .: ':r theory and l..', .:thms. New York : J. Wiley & Sons, 2nd edition. [2] Z. Cao, D. Gilland, B. Mair, and R. Jaszczak. Threedimensional motion estimation with image reconstruction for gated cadiac ECT. IEEE Trans. Nuclear Science, pages 384388. [3] R. B. Carroll. An or'/,. '.,, ,';l series approach to positron emission '.i,,,.'/,,/',; PhD thesis, University of Florida, Gainesville, FL, October 1998. [4] J. C'I ,i:;, J.M.M. Anderson, and B.Mair. An accelerated penalized maximum likelihood algorithm for positron emission tomography. IEEE Trans. Image Processing, forthcoming, 2006. [5] J. C'!I in,; J.M.M. Anderson, and J. Votaw. Regularized image reconstruction algorithms for positron emission tomography. IEEE Trans. Med. Imag., 23(9):11651175, 2004. [6] A.R. DePierro. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans. Med. Imag., 14(1):132137, 1995. [7] S. Geman and D. McClure. B i ,'i image analysis: an application to single photon emission tomography. Proc. American Statistical S .. I:, pages 1218, 1985. [8] P. Green. On use of the EM algorithm for penalized likelihood estimation. J.R. Statist. Soc, 52(3):443452, 1990. [9] D. Lalush and B. Tsui. A generalized Gibbs prior for maximum a posterioro reconstruction in SPECT. Phys. Med. Biol., 38:729741, 1993. [10] K. Lange. Convergence of EM image reconstruction algorithms with Gibbs smoothing. IEEE Trans. Med. Imag., 9(4):439446, 1990. [11] D. Luenberger, 1973. Introduction to linear and nonlinear "'"','i'rn.':', Reading : AddisonWesley. [12] B. Mair. A mathematical model incorporating the effects of detector width in 2D PET. Inverse Problems, 16:223 224, 2000. 53 [13] B. Mair, D. Gilland, and Z. Cao. Simultaneous motion estimation and image reconstruction from gated data. Proc. 2002 IEEE Internat. Symp. Biomed. Imaging, pages 661664, 2002. [14] B. Mair, D. Gilland, and J.Sun. Estimation of images and nonrigid deformations in gated emission CT. IEEE Trans. Med. Imag., 25(9):11301144, 2006. [15] A. M. Ostrowski, 1973. Solutions of equations in euclidean and banach spaces. New York : Academic Press, 3rd edition. [16] L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imag., 1:113122, 1982. BIOGRAPHICAL SKETCH Jeff Zahnen was born on August 7, 1974, in Chicago, Illinois. He received his Bachelor of Science degree in mathematics with a minor in history from the University of Florida in 1996. Following this, he received his master's degree in 1998. In 2000, he began to study medical imaging techniques under the supervision of Dr. Bernard Mair. In 2006, he completed his Ph.D. work on emission tomography methods. During this time, he also served as the coach of the University of Florida College Bowl Team. 