
Citation 
 Permanent Link:
 http://ufdc.ufl.edu/AA00039148/00001
Material Information
 Title:
 Piecewise Gompertz model on solving cure rate problem
 Creator:
 Chen, Haoyi, 1972
 Publication Date:
 2001
 Language:
 English
 Physical Description:
 viii, 118 leaves : ill. ; 29 cm.
Subjects
 Subjects / Keywords:
 Dissertations, Academic -- Statistics -- UF ( lcsh )
 Statistics thesis, Ph. D ( lcsh )
 Genre:
 bibliography ( marcgt )
 theses ( marcgt )
 nonfiction ( marcgt )
Notes
 Thesis:
 Thesis (Ph. D.)--University of Florida, 2001.
 Bibliography:
 Includes bibliographical references (leaves 115-117).
 General Note:
 Printout.
 General Note:
 Vita.
 Statement of Responsibility:
 by Haoyi Chen.
Record Information
 Source Institution:
 University of Florida
 Holding Location:
 University of Florida
 Resource Identifier:
 028196784 ( ALEPH )
49281155 ( OCLC )

PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM
By

Haoyi Chen
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2001
Copyright 2001
by
Haoyi Chen
I dedicate this work to Hua.
ACKNOWLEDGMENTS
I would like to take this opportunity to express my deepest gratitude to Professor Myron N. Chang, my advisor, for his guidance, tremendous help, and encouragement during my doctoral research. This dissertation would not have been possible without his forceful guidance and excellent advisement.
I also would like to thank Professor Randolph R. Carter, my graduate
research supervisor, for his guidance and support during the various epidemiological research projects with which I have been involved as a graduate research assistant. My deepest appreciation is extended to the other committee members, Dr. Ronald Randles, Dr. Samuel S. Wu, and Dr. Michael B. Resnick, for their advice on my doctoral research.
Lastly, I would like to extend my gratitude to my husband Hua and my parents for their never-ending support, understanding, and love.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT

CHAPTERS

1 INTRODUCTION
1.1 Problem and Common Practice
1.1.1 Mixture Model
1.1.2 Binary Regression Model
1.1.3 Nonparametric Method
1.1.4 Model with Bounded Cumulative Hazard Rate
1.2 Computation Difficulty of the Semiparametric Model
1.3 Piecewise Exponential Model

2 TOPIC OF RESEARCH
2.1 Data Structure and Model
2.2 Assumptions
2.3 Hypothesis of Interest
2.4 Test of Hypothesis

3 ESTIMATION OF PARAMETERS
3.1 Likelihood Function and Information Matrices
3.2 Concavity of Likelihood Function

4 PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES
4.1 Introduction
4.2 Preliminary Proofs
4.3 Existence and Consistency of the MLE
4.4 Asymptotic Normality of the MLE
4.5 Goodness-of-Fit Test

5 SIMULATIONS
5.1 Introduction
5.2 Generation of Random Samples
5.2.1 Parameter Settings
5.2.2 Generation of Censored Observations
5.2.3 Generation of Failure Observations
5.3 Type I Error and Power

6 EXAMPLE
6.1 Introduction
6.2 Methodology
6.3 Results
6.3.1 Pairwise Comparisons on Cure Proportions
6.3.2 Cure Proportion Estimates
6.3.3 Goodness-of-Fit

7 SUMMARY

APPENDICES

A DERIVATION OF MARGINAL LIKELIHOOD
B PROOF OF LEMMA 4.1
C PROOF OF LEMMA 4.2
D PROOF OF LEMMA 4.3
E PROOF OF LEMMA 4.4
F PROOF OF LEMMA 4.5
G PROOF OF LEMMA 4.6
H PROOF OF LEMMA 4.7
I PROOF OF LEMMA 4.8
J SIMPSON'S FORMULA

REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM

By
Haoyi Chen
December 2001
Chair: Myron N. Chang
Major Department: Statistics
In cancer research, some of the patients are disease-free after certain treatments. It is interesting to compare treatment efficacy with respect to long-term survival rates. One commonly used approach in analyzing this type of data is to compare the Kaplan-Meier estimates of the cure rate. Another approach is to apply the mixture model proposed by Farewell. However, the Kaplan-Meier estimates are unstable toward the end point, while the mixture model is computationally too complex.

To overcome these difficulties, we propose a test based on a piecewise Gompertz model to compare drug efficacy on the cure rate. The proposed test also accommodates the situation where patients display different hazard patterns during different treatment stages. In this work, we have derived the strict concavity of the log-likelihood function and the existence, consistency, and asymptotic normality of the maximum likelihood estimates of the parameters. In addition, our Monte Carlo simulation study shows that the proposed test is more computationally feasible and more powerful than the test based on Farewell's mixture model. Moreover, an example is given to show the utility of our proposed test. A goodness-of-fit test of our proposed model is also discussed.
CHAPTER 1
INTRODUCTION
1.1 Problem and Common Practice
In survival analysis, one is interested in comparing the efficacy of various drugs or treatments. The survival probability is generally assumed to converge to zero as the observation time becomes large. However, it is not uncommon in cancer research that a proportion of patients are disease-free (relapse-free) after certain treatments. These patients are called "long-term survivors" or "cured." For instance, in clinical trials dealing with childhood leukemia, some treatments do cure a proportion of patients permanently. This type of long-term survival pattern is different from what is dealt with in traditional survival analysis. In addition, it is sometimes necessary to assume that hazard functions are different on different intervals. A piecewise exponential model proposed by Friedman (1981) accommodates this assumption. This model was considered semiparametric by Aitkin et al. (1989, pp. 287-292). They stated that the piecewise exponential model has the same flexibility as the proportional hazards model, in which the baseline hazard function is unspecified. In this chapter, we first introduce the common methodologies in the literature that deal with cure rate problems. We also discuss the piecewise exponential model proposed by Friedman, which assumes different hazard functions on different time intervals.
In the literature, there are several methods dealing with the long-term survival problem. They are as follows:
* Using a mixture model which assumes the underlying population is a mixture of two populations: patients who are cured and patients who are not cured.

* Applying a modified binary regression model in which the main interest is to compare the cure rates among patient groups.

* Using nonparametric methods to compare the survival curves or to compare cure proportions among treatment arms.

* Assuming a bounded cumulative hazard rate instead of a cumulative hazard rate with an infinite limit.
In the next four sections, we describe each method in more detail. Assume that there are n patients in the study. For the ith patient, let T_i denote the time to relapse (incidence), U_i the censoring time, T_i* = min(T_i, U_i) the observed time, and δ_i = I[T_i ≤ U_i] the failure indicator.
Let Z be a binary variable indicating whether a patient belongs to the cured or the non-cured population:

Z = 1, if the patient is non-cured;
Z = 0, if the patient is cured.
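The observability pattern of Z just described can be illustrated with a small simulation sketch (not part of the dissertation; the distributions, rates, and seed are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
Z = rng.binomial(1, 0.6, size=n)          # 1 = non-cured (hypothetical rate)
# cured patients never relapse, so their latent failure time is infinite
T = np.where(Z == 1, rng.exponential(2.0, size=n), np.inf)
U = rng.uniform(0, 5, size=n)             # censoring times
T_star = np.minimum(T, U)                 # observed time T_i*
delta = (T <= U).astype(int)              # failure indicator delta_i
# Z is fully determined only when the incidence is observed (delta = 1)
observed_noncured = Z[delta == 1]
```

Every patient with an observed incidence is necessarily non-cured, while a censored patient may belong to either population; this is exactly why Z is only partially observable.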
Note that Z is not fully observable in the presence of censoring. Clearly, if the incidence is observed, then δ = 1 and Z = 1; if the time to incidence is censored, then δ = 0 and Z is not observable. Assume that the probability density function of the time to incidence for the non-cured population is f*(t | Z = 1), and let S*(t | Z = 1) be the corresponding survival function. If δ_i = 1, then the contribution of the ith patient to the likelihood is

P(Z_i = 1) f*(t_i* | Z_i = 1).

If δ_i = 0, then the contribution of the patient to the likelihood is

P(Z_i = 0) + P(Z_i = 1) S*(t_i* | Z_i = 1).  (1.1)
Therefore, the overall likelihood function is

L = ∏_{i=1}^n {P(Z_i = 1) f*(t_i* | Z_i = 1)}^{δ_i} {P(Z_i = 0) + P(Z_i = 1) S*(t_i* | Z_i = 1)}^{1-δ_i}.  (1.2)
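As an illustration (not from the dissertation), the log of the likelihood in Equation (1.2) can be evaluated numerically; here p stands for P(Z_i = 1), and an exponential latency distribution with a made-up rate is assumed purely for concreteness:

```python
import numpy as np

def mixture_log_likelihood(t, delta, p, surv, dens):
    """Log of Equation (1.2): sum over patients of
    delta*log{p f*(t|Z=1)} + (1-delta)*log{(1-p) + p S*(t|Z=1)}."""
    f = dens(t)   # f*(t | Z = 1)
    S = surv(t)   # S*(t | Z = 1)
    return np.sum(delta * np.log(p * f) + (1 - delta) * np.log((1 - p) + p * S))

# hypothetical exponential latency with rate lam, hypothetical data
lam, p = 0.5, 0.7
t = np.array([0.2, 1.5, 3.0, 4.0])
delta = np.array([1, 1, 0, 0])
ll = mixture_log_likelihood(t, delta, p,
                            surv=lambda u: np.exp(-lam * u),
                            dens=lambda u: lam * np.exp(-lam * u))
```

Passing the survival and density functions as arguments keeps the sketch agnostic to the parametric family chosen for the non-cured population.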
Two influential articles addressing this method were written by Farewell in 1977 and 1982. In the first article, he expressed the binary variable Z using a logistic regression model

P(Z = 1) = e^{β^T x} / (1 + e^{β^T x}),  (1.3)
where β is a vector of coefficients associated with the vector of covariates x. Farewell (1977) also assumed the time to incidence T follows an exponential distribution with mean 1/λ, where λ is assumed to be known. Thus, when δ_i = 1, the contribution of the ith patient to the likelihood is

f*(T = t_i; λ | Z_i = 1) P(Z_i = 1) = λ e^{-λ t_i} e^{β^T x_i} / (1 + e^{β^T x_i}),  t_i > 0.

When δ_i = 0, the contribution of the ith patient to the likelihood is

P(Z_i = 0) + P(Z_i = 1) S*(T = t_i; λ | Z_i = 1) = 1 / (1 + e^{β^T x_i}) + e^{β^T x_i} e^{-λ t_i} / (1 + e^{β^T x_i}),  t_i > 0.

Type I censoring was assumed in Farewell (1977). The author also assessed the efficiency of the logistic-exponential mixture model proposed in his paper relative to a regular logistic model, which assumes Z = 0 if the patient does not have a relapse by T_0, defined as the end point of the observation time. He found that the efficiency of the mixture model increases as the follow-up time T_0 increases and as P(Z = 1) decreases. When the true underlying distribution is a mixture model, the estimate of the cure rate from the regular logistic model is biased.
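For concreteness, the two likelihood contributions above can be sketched in code (a hypothetical illustration: the covariate values and coefficients are invented, and λ is treated as known, as in Farewell 1977):

```python
import numpy as np

def farewell_contribution(t, delta, x, beta, lam):
    """Likelihood contribution of one patient under Farewell's (1977)
    logistic-exponential mixture model (lam treated as known)."""
    p1 = np.exp(beta @ x) / (1.0 + np.exp(beta @ x))   # P(Z = 1), Eq. (1.3)
    if delta == 1:                        # incidence observed
        return p1 * lam * np.exp(-lam * t)
    # censored: either cured, or non-cured but surviving past t
    return (1.0 - p1) + p1 * np.exp(-lam * t)

beta = np.array([0.3, -0.5])   # hypothetical coefficients
x = np.array([1.0, 2.0])       # intercept term plus one covariate
c_fail = farewell_contribution(2.0, 1, x, beta, lam=0.4)
c_cens = farewell_contribution(2.0, 0, x, beta, lam=0.4)
```

The censored contribution is always at least the cured probability 1 - P(Z = 1), which is why censored observations carry information about the cure rate.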
Farewell (1982) proposed two more general models for the time to incidence T. He assumed that T follows a Weibull distribution in his first model. The probability density function and the survival function of T, given that the patient is non-cured, are written as

f*(T = t; λ, ζ | Z = 1, x) = λζ(λt)^{ζ-1} exp{-(λt)^ζ},  t > 0, and  (1.4)

S*(T = t | Z = 1, x) = exp{-(λt)^ζ},  t > 0,  (1.5)

where ζ is an unknown shape parameter, λ = exp(-γ^T x) is a function of the covariates, and
γ is a vector of unknown parameters associated with covariates x. The likelihood function is obtained by substituting P(Z = 1) defined in Equation (1.3), the probability density function defined in Equation (1.4), and the survival function defined in Equation (1.5) into the likelihood function in Equation (1.2).
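A quick numerical check (not from the dissertation) that Equations (1.4) and (1.5) are mutually consistent, i.e., that f* = -dS*/dt, using hypothetical values of λ and ζ:

```python
import numpy as np

def weibull_dens(t, lam, zeta):
    """Equation (1.4): f*(t) = lam*zeta*(lam*t)^{zeta-1} exp{-(lam*t)^zeta}."""
    return lam * zeta * (lam * t) ** (zeta - 1) * np.exp(-(lam * t) ** zeta)

def weibull_surv(t, lam, zeta):
    """Equation (1.5): S*(t) = exp{-(lam*t)^zeta}."""
    return np.exp(-(lam * t) ** zeta)

lam, zeta = 0.8, 1.5        # hypothetical parameter values
t, h = 2.0, 1e-6
# central-difference approximation of -dS*/dt at t
num_dens = (weibull_surv(t - h, lam, zeta) - weibull_surv(t + h, lam, zeta)) / (2 * h)
```

The numerical derivative should agree with the closed-form density to within the finite-difference error.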
Farewell (1982) proposed the second model for grouped data on survival time. Assume there are k failure time groups (t_i, t_{i+1}], i = 1, 2, ..., k. The conditional probability of failure at t_i, given survival to t_i, is

P(T = t_i | Z = 1, T ≥ t_i, x) = e^{a_i + θ^T x} / (1 + e^{a_i + θ^T x}),  (1.6)

where a_i (i = 1, 2, ..., k) are group-specific parameters. The vector θ is associated with covariates x and is common to all time groups. By combining Equations (1.2), (1.3), and (1.6), the likelihood function is obtained.
Farewell (1982) commented that the mixture model is appealing from both biological and statistical points of view. It has been used frequently in cancer studies, especially in the study of breast cancer. Strong evidence of the existence of individuals coming from different populations, however, is required before using the mixture model. Farewell also pointed out that the likelihood function might be flat for some data sets, which would cause possible nonidentifiability of the parameters.
Farewell (1986) discussed some problems related to the mixture model by applying the model to two data sets. The first data set was from a breast cancer trial with 139 patients. He applied a logistic-Weibull mixture model to the data and then computed the profile likelihood function for β_0, the intercept parameter in the logistic portion of the mixture model. The profile likelihood function for β_0 was obtained by replacing the other parameters in the model by their maximum likelihood estimates (MLEs). He found that the profile likelihood function had a strongly non-quadratic shape, indicating that significant non-identifiability existed with respect to the estimation of β_0. Therefore, this logistic-Weibull mixture model presented an unreliable estimate of the cure proportion. Farewell also applied a multinomial-Weibull mixture model to a data set from Risch (1983) on the time to the onset of a major affective illness in siblings of affected individuals. Risch's data were grouped into six time intervals, and the onset probabilities were modelled by a multinomial distribution. The mixture model fitted the data well. Therefore, Farewell concluded that one should be cautious when using mixture models: the cure proportion estimate might be very imprecise in some cases, and strong evidence for the existence of two populations is required to make a reliable inference.
A significant amount of literature has discussed the mixture model. The parametric distributions for the time to incidence for the non-cured population can take many forms: exponential (Berkson & Gage 1952, Bithell & Upton 1977, Farewell 1977, Goldman 1984); lognormal (Boag 1949); Weibull (Farewell 1982); Gompertz (Gordon 1990); generalized gamma (Yamaguchi 1992); and generalized F-distribution (Peng et al. 1998). The distribution for the time to incidence can also take nonparametric or semiparametric forms, such as Kaplan-Meier (KM) estimates (Laska & Meisner 1992, Taylor 1995) and the proportional hazards model (Kuk & Chen 1992). The probability P(Z = 1) can be treated as a parameter in a Bernoulli trial, or it can be written as a function of a vector of covariates x through a logistic regression model, such as the one defined in Equation (1.3).
To conclude the previous discussion, it is generally plausible to use the mixture model when there exists strong scientific evidence of a mixture of different populations. However, as both Farewell (1977, 1982, 1986) and Cantor and Shuster (1992) pointed out, the likelihood function is sometimes non-quadratic or flat, which can cause problems for both computation and inference. In addition, parameter estimates can fall outside the parameter space when the data do not support a nonzero surviving fraction. The asymptotic properties of the MLE from a logistic-parametric mixture model were developed systematically by Maller and Zhou (1996). The asymptotic properties of the parameter estimates in both logistic-semiparametric and logistic-nonparametric models have not been explored in the literature.
1.1.2 Binary Regression Model
Jung (1996) proposed an interesting method to estimate the cure rate, which is defined as the probability that the time to incidence T is greater than a fixed time τ. In his paper, the standard inference procedure for parameters in a binary regression model is modified to accommodate censored observations. Let φ(π) = β^T x, where φ is a link function which is monotonic and differentiable on [0, 1]. Also let π* = φ^{-1}, the inverse function of φ, and let π̇*(φ) = ∂π*(φ)/∂φ. The MLE of β in a standard binary regression model without censoring is obtained by solving

n^{-1/2} Σ_{i=1}^n {I(t_i > τ) - π*(β^T x_i)} π̇*(β^T x_i) x_i / [π*(β^T x_i){1 - π*(β^T x_i)}] = 0.

In the case with censoring, denote T_i* = min(T_i, U_i). Assume independent censoring, i.e., T_i and U_i are independent given x_i for i = 1, 2, ..., n. Also assume that U_1, U_2, ..., U_n are independently, identically distributed (iid) with cumulative distribution function 1 - G. Let Ĝ be the KM estimate of G. Then the MLE β̂ of β can be obtained by solving

n^{-1/2} Σ_{i=1}^n {I(t_i* > τ)/Ĝ(τ) - π*(β^T x_i)} π̇*(β^T x_i) x_i / [π*(β^T x_i){1 - π*(β^T x_i)}] = 0.

Numerically, β̂ can be obtained by applying the Newton-Raphson method. Under mild conditions, it is proved that β̂ is a unique and consistent estimate of β and that n^{1/2}(β̂ - β) follows an asymptotic normal distribution with zero mean. A problem with this method is that τ is very difficult to define: the result is not convincing if τ is too small, while on the other hand it is not cost-effective for τ to be too long.
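A sketch of such a censoring-adjusted estimating function for the special case of a logit link, for which π̇*/[π*(1 - π*)] ≡ 1 and the summand simplifies; Ĝ(τ) is taken as given, and all data here are simulated placeholders:

```python
import numpy as np

def ipcw_score(beta, X, t_star, tau, G_hat_tau):
    """Censoring-adjusted score for a binary cure-rate regression with a
    logit link pi*(u) = exp(u)/(1+exp(u)).  The indicator I(t* > tau) is
    inverse-weighted by the KM estimate of the censoring survival at tau."""
    u = X @ beta
    pi = 1.0 / (1.0 + np.exp(-u))
    # for the logit link, pi_dot / [pi (1 - pi)] = 1, so the score reduces to
    resid = (t_star > tau).astype(float) / G_hat_tau - pi
    return X.T @ resid

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # intercept + covariate
t_star = rng.exponential(2.0, size=200)                    # placeholder data
U = ipcw_score(np.zeros(2), X, t_star, tau=1.0, G_hat_tau=0.9)
```

Solving U(β) = 0, e.g., by Newton-Raphson on this function, would give the censoring-adjusted estimate; the root-finding step is omitted here.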
Farewell (1977) compared the efficiency of a logistic-exponential model to a binary logistic regression model that treats any individual with t_i* > τ as a long-term survivor. He also pointed out that the estimate of the cure rate would be biased using a logistic model when the underlying true distribution is a mixture distribution. The logistic model discussed by Farewell (1977) did not consider censoring; however, he emphasized that a similar conclusion can be reached by expanding the model to include censored individuals. Farewell also noted that it is especially unreasonable for the logistic model to ignore the possible censoring times when T_i* exceeds the largest observation time τ.
1.1.3 Nonparametric Method
Nonparametric tests are widely applied both for comparing survival curves in the presence of cure and for comparing cure proportions between treatment arms. When the main interest is to compare survival curves of several populations in the presence of cure, some of the nonparametric tests are considered standard, such as the log-rank test and the generalized Wilcoxon test. Halpern and Brown (1987) studied the power of these two tests by simulation when the mixture cure rate model of Goldman (1984) is appropriate. They compared the power of these two tests for detecting the difference between two treatment arms for selected values of cure rates, instant failure rates, and follow-up times. Each of the two treatment arms had 50 participating individuals, and 200 replications were used for each set of simulations. The simulation results are somewhat inconclusive. The log-rank test is more powerful than the Wilcoxon test in general. However, the log-rank test does not show any advantage over the Wilcoxon test in the following situations: 1) cure rates are high and are at similar levels in both arms; 2) there is a large difference in the time to the end point for both arms; and 3) the follow-up is long relative to the time to failure.
In the comparison of cure proportions, Sposto et al. (1992) studied several tests based on simulated samples. These tests are the likelihood ratio, Wald, and score tests based on two different mixture models, and a test based on the differences of the KM estimates of the cure rates. The two mixture models are the logistic-exponential and logistic-Weibull models proposed by Farewell (1977, 1982). The time point in the KM estimate of the cure rate is defined as the last observed failure time. Simulated samples under the assumed mixture models were generated in order to check the results in an ideal case. In addition, samples under some different models were generated in order to study the robustness of the tests to model misspecification. The authors also generated samples based on various levels of cure proportions. The test statistics calculated from the mixture models sometimes cannot be obtained because of nonconvergence during the maximization procedure. The authors pointed out two reasons for the nonconvergence: 1) the maximum is attained on the boundary; and 2) the algorithm terminates abnormally with a jump to a nonnegative definite region. Test statistics with convergence problems are not presented in the article. In conclusion, the authors noted that the KM estimate of the cure rate is a reasonable alternative to the parametric estimate in general. When the underlying model includes unequal failure rates between treatment arms, the proportion test based on KM estimates is more powerful than the test based on the complicated parametric model. However, the KM estimate introduces more bias than the parametric estimate of the cure rate when the sample is small, when failure rates are equal, and when not all failures are observed.
Gray and Tsiatis (1989) proposed a linear rank test to compare survival curves between two treatment arms over the entire time span, with power focused against late differences between the survival curves. This linear rank test is evaluated under a mixture distribution consisting of a proportion representing the cure and a failure distribution of unspecified form. They assumed that the failure distribution functions for both treatment arms are constant over time. They specified a weight function in the two-sample nonparametric censored-data test proposed by Tarone and Ware (1977) in order to obtain their test. Note that the log-rank test can also be derived from Tarone and Ware's test with a different weight function. However, the difference between Gray and Tsiatis' test and the log-rank test is that the former puts more weight toward late differences while the latter puts equal weight throughout the entire observation period.

Gray and Tsiatis (1989) calculated relative efficiencies under equal censoring for their test, the log-rank test, and the proportion test based on KM estimates of the cure rates. When there is no censoring, their test gains efficiency over the log-rank test for a small cure rate. However, the efficiency gain is smaller for a cure rate greater than 50%. These results can be explained by observing that the weight functions for their test and for the log-rank test become closer as the cure rate increases. In addition, under the no-censoring assumption, their test is asymptotically equivalent to the proportion test. When there is censoring, and as the censoring proportion increases, their test becomes increasingly more powerful than the proportion test. But note that their test is not a pure test of the equality of the cure rates; therefore, it has power against alternatives with unequal cure rates as well as alternatives with equal cure rates but unequal survival curves.
In this section, two types of statistical tests were discussed: 1) comparing survival curves under the assumption of the existence of cure rates; and 2) comparing cure rates. Some of the nonparametric tests, such as the log-rank test, are quite standard in oncology for comparing survival curves. When the problem narrows down to the comparison of cure rates, the proportion test based on KM estimates is quite popular due to the computational difficulties of tests based on the mixture model structure. However, there are problems associated with KM estimates of cure rates, as pointed out by Cantor and Shuster (1992). The standard error of the KM estimate increases at the right-hand extreme because of the small sample size at the end of the study. The KM estimate can also produce counterintuitive results: for example, the time-to-relapse curve plateau could be higher than the time-to-death curve plateau because of large downward steps at the end of the study. Therefore, one should be cautious in relying on the KM estimate of the cure rate. As an alternative to the KM estimate of the cure rate, Cantor and Shuster proposed a parametric model with a bounded cumulative hazard rate to accommodate the cure proportion.
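The instability described above is easy to reproduce. The following minimal Kaplan-Meier sketch (hypothetical data with distinct event times; not from the dissertation) shows a single late failure driving the plateau, and hence the "cure rate" estimate, all the way to zero:

```python
import numpy as np

def kaplan_meier(t, delta):
    """Kaplan-Meier estimate evaluated at the observed failure times.
    Assumes distinct times; the last value serves as a naive cure estimate."""
    order = np.argsort(t)
    t, delta = t[order], delta[order]
    n = len(t)
    s = 1.0
    times, probs = [], []
    for i in range(n):
        if delta[i] == 1:
            at_risk = n - i                 # subjects still at risk
            s *= (at_risk - 1) / at_risk    # multiplicative KM step
            times.append(t[i])
            probs.append(s)
    return np.array(times), np.array(probs)

# hypothetical sample: the final observation is a failure with one at risk
t = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
delta = np.array([1, 1, 0, 1, 0, 0, 1, 0, 0, 1])
times, probs = kaplan_meier(t, delta)
cure_hat = probs[-1]   # plateau value used as the cure-rate estimate
```

With only one subject at risk, the last failure multiplies the estimate by zero, illustrating the large downward steps Cantor and Shuster (1992) warn about.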
1.1.4 Model with Bounded Cumulative Hazard Rate
Parametric Model
Cantor and Shuster (1992) proposed a Gompertz model to incorporate long-term survivors. The survival function is written as

S(t) = exp[β^{-1} α (1 - e^{βt})],  α > 0,  (1.7)

where α is the initial hazard. When β > 0, the survival function S(t) → 0 as t → ∞; when β < 0, the survival function S(t) → exp(α/β) as t → ∞. The cure rate is therefore exp(α/β). The authors maximized the likelihood function based on the Gompertz model using the Newton-Raphson method. Both Var(α̂) and Var(β̂) are obtained from the observed information matrix,
and the variance of the cure rate estimate exp(α̂/β̂) is obtained using the Delta method. In the article, they also compared the Gompertz model with the mixture model. When there is a cure proportion, both models produce similar curves and cure rate estimates. However, the Gompertz model is preferred due to the divergence problems associated with the mixture model. The authors also noted the instability of the KM estimate toward the end of the study. Some other nice properties of the proposed Gompertz model are also provided in the article: the cure probability given that the patient has survived through time t_0 is exp(β^{-1}α e^{βt_0}); the log-likelihood function is strictly concave, so one can easily estimate the parameters and the cure rate; and finally, a power S(t)^c of a Gompertz survival function is again a Gompertz survival function.
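The closed forms above are simple to compute. The sketch below (with arbitrary illustrative parameter values, not taken from the article) also verifies that the conditional cure probability equals S(∞)/S(t_0):

```python
import numpy as np

def gompertz_surv(t, alpha, beta):
    """Equation (1.7): S(t) = exp[(alpha/beta)(1 - e^{beta t})]."""
    return np.exp((alpha / beta) * (1.0 - np.exp(beta * t)))

def gompertz_cure_rate(alpha, beta):
    """For beta < 0, S(t) -> exp(alpha/beta) as t -> infinity."""
    assert beta < 0, "a cure fraction exists only when beta < 0"
    return np.exp(alpha / beta)

alpha, beta = 0.5, -0.25           # hypothetical parameter values
pi_cure = gompertz_cure_rate(alpha, beta)
# conditional cure probability given survival through t0 (Cantor & Shuster)
t0 = 3.0
p_cond = np.exp((alpha / beta) * np.exp(beta * t0))
```

The identity p_cond = S(∞)/S(t_0) follows directly from the exponential form of Equation (1.7).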
Gieser et al. (1998) incorporated covariate information into the model proposed by Cantor and Shuster (1992). The treatment-specific survival function for an individual is written as

S(t; i, x) = {S_i(t)}^{e^{γ^T x}} = exp{e^{γ^T x + α_i*} β_i^{-1} (1 - e^{β_i t})},  (1.8)

where α* is defined as log α for α in Equation (1.7), i is the treatment index, x is a vector of covariates, and γ is a vector of parameters associated with x. Note that S(t; i, x) in Equation (1.8) is also a Gompertz survival function, as noted by Cantor and Shuster (1992). The cure rate in this case is

π(i, x) = S(∞; i, x) = exp{β_i^{-1} e^{γ^T x + α_i*}}.

The obtained log-likelihood function is concave with respect to the parameters. The MLEs of the parameters are efficient and asymptotically normal when the assumed model is correct and the sample size is large. The authors compared the treatment effects on cure rates for any given covariate vector x. The ratio of the logs of the cure rates for two different treatment arms is expressed as

θ_{ii'} = log π(i, x) / log π(i', x) = β_{i'} e^{α_i*} / (β_i e^{α_{i'}*}),

which does not depend on x. The null hypothesis of equal cure rates is written as

H_0 : log θ_{ii'} = 0,

and the test statistic is expressed as

Z = [α̂_i* + log(β̂_{i'} / β̂_i) - α̂_{i'}*] / SE,

where the standard error SE in the denominator is obtained by the Delta method.
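The claim that the ratio of log cure rates is free of x can be checked directly; this is a sketch under Equation (1.8) with entirely hypothetical parameter values:

```python
import numpy as np

def log_cure_rate(alpha_star, beta, gamma, x):
    """log pi(i, x) = beta^{-1} exp(gamma^T x + alpha*) under Eq. (1.8)."""
    return np.exp(gamma @ x + alpha_star) / beta

gamma = np.array([0.2, -0.1])      # hypothetical covariate effects
x1 = np.array([1.0, 3.0])
x2 = np.array([0.0, 1.0])
# two treatment arms with hypothetical (alpha*, beta) pairs; beta < 0 gives a cure
ratio_x1 = log_cure_rate(-0.3, -0.5, gamma, x1) / log_cure_rate(0.1, -0.8, gamma, x1)
ratio_x2 = log_cure_rate(-0.3, -0.5, gamma, x2) / log_cure_rate(0.1, -0.8, gamma, x2)
```

Because the covariate factor e^{γ^T x} is common to both arms, it cancels in the ratio, which is what makes the treatment comparison covariate-free.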
Gieser et al. (1998) applied their model to 763 patients from a multicenter prospective study performed in 1981 by the Pediatric Oncology Group, which studied standard-risk acute lymphocytic leukemia in children. They compared two treatment effects on the cure rate, with children's age and sex considered as covariates. The Newton-Raphson algorithm was used to obtain the MLEs of the parameters and converged after nine iterations. Based on this study, they compared the estimates and the variances of the cure rates from the Gompertz model, the logistic-exponential mixture model, and the KM method. They found that the Gompertz model produces a cure rate estimate similar to the KM estimate but with a much smaller variance. They also provided a generalized model in which the treatment indicator is included as one of the covariates. One important advantage of this generalized model is the log-linear structure of the parameters, since the log-linear structure implies the concavity of the log-likelihood function and the existence, uniqueness, and asymptotic normality of the parameter estimates.
Semiparametric Model
Chen et al. (1999) proposed a semiparametric model for survival data with a cure fraction. They assumed two quantities: a) the number of carcinogenic cells, N*, in the body of a patient; and b) the incubation time W_i of the ith carcinogenic cell (i = 1, 2, ..., N*). The event N* = 0 means that there is no such cell in the patient's body. Assume that N* follows a Poisson distribution with mean θ, that the W_i for i = 1, 2, ..., N* are iid with distribution function F(w), and that the W_i and N* are independent. The time to incidence is then T = min(W_i, 0 ≤ i ≤ N*), where P(W_0 = ∞) = 1. The survival function for T is

S(t) = P(no incidence by time t)
     = P(N* = 0) + P(T > t, N* ≥ 1)
     = P(N* = 0) + P(W_1 > t, W_2 > t, ..., W_{N*} > t, N* ≥ 1)
     = e^{-θ} + Σ_{k=1}^∞ [1 - F(t)]^k (θ^k / k!) e^{-θ}
     = e^{-θ} + e^{-θ} [e^{θ(1 - F(t))} - 1]
     = e^{-θF(t)}.  (1.9)

Note that S(t) is an improper survival function because S(t) → e^{-θ} > 0 as t → ∞. The surviving fraction e^{-θ} is precisely the probability that there exists no carcinogenic cell in a patient's body. The corresponding (improper) probability density function and hazard function are

f_T(t) = θ f(t) exp{-θF(t)},
λ(t) = θ f(t),

where f(t) is the density corresponding to F(t).
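A Monte Carlo sketch of the latent-cell construction (with F taken to be exponential(1) purely for illustration; none of these values come from the dissertation) confirms the closed form S(t) = e^{-θF(t)} in Equation (1.9):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n = 1.5, 100_000

N = rng.poisson(theta, size=n)        # number of carcinogenic cells N*
T = np.full(n, np.inf)                # N* = 0  =>  cured, T = infinity
has_cells = N > 0
# min of k iid Exp(1) incubation times is Exp(k), i.e. exponential with scale 1/k
T[has_cells] = rng.exponential(1.0 / N[has_cells])

t0 = 1.0
emp = (T > t0).mean()                           # empirical P(T > t0)
theo = np.exp(-theta * (1.0 - np.exp(-t0)))     # e^{-theta F(t0)}, Eq. (1.9)
cure_emp = np.isinf(T).mean()                   # empirical surviving fraction
```

The empirical survival at t_0 should match e^{-θF(t_0)} up to Monte Carlo error, and the fraction of infinite times should approach e^{-θ}.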
Let S*(t) denote the survival function for the non-cured proportion of patients; then

S*(t) = P(T > t | N* ≥ 1) = (e^{-θF(t)} - e^{-θ}) / (1 - e^{-θ}).  (1.10)

The corresponding probability density function is

f*(t) = -dS*(t)/dt = θ f(t) e^{-θF(t)} / (1 - e^{-θ}),

and the hazard function is

λ*(t) = f*(t) / S*(t) = θ f(t) e^{-θF(t)} / (e^{-θF(t)} - e^{-θ}) = λ(t) / P(T < ∞ | T > t).
The likelihood function is derived as follows:

L = ∏_{i=1}^n f_T(t_i*)^{δ_i} S(t_i*)^{1-δ_i}
  = ∏_{i=1}^n λ(t_i*)^{δ_i} S(t_i*)
  = ∏_{i=1}^n [θ f(t_i*)]^{δ_i} e^{-θF(t_i*)},  (1.11)

where f_T denotes the (improper) density corresponding to S.
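The log of the likelihood in Equation (1.11) is straightforward to evaluate once F is fixed; this sketch (not from the dissertation) assumes, hypothetically, a known exponential(1) form for F:

```python
import numpy as np

def promotion_log_likelihood(theta, t_star, delta, F, f):
    """Log of Equation (1.11): sum over i of
    delta_i * log(theta * f(t_i*)) - theta * F(t_i*)."""
    return np.sum(delta * np.log(theta * f(t_star)) - theta * F(t_star))

# illustration with F exponential(1); theta and the data are hypothetical
F = lambda t: 1.0 - np.exp(-t)     # incubation-time distribution function
f = lambda t: np.exp(-t)           # its density
t_star = np.array([0.5, 1.0, 2.5])
delta = np.array([1, 0, 1])
ll = promotion_log_likelihood(1.5, t_star, delta, F, f)
```

In the semiparametric setting F is unknown, so this direct evaluation applies only when a parametric form (Weibull in the Chen et al. article) is substituted for F.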
The model is semiparametric in the sense that the form of F(t) is unknown. It has a strong biological meaning, gathering contributions from two characteristics of tumor growth: the initial number of carcinogenic cells and the rate of their growth. Even without any biological interpretation, the model still fits any data with a surviving fraction, since it can always be viewed as being generated by an unknown number N* of latent competing risks. In this model, S(t), f_T(t), and λ(t) are improper, while S*(t), f*(t), and λ*(t) are a proper survival function, density function, and hazard function, respectively. One advantage of this model over the mixture model is that the overall hazard function λ(t) has a proportional hazards structure if the covariates enter the model through θ. In the mixture model, however, the overall hazard function does not have a proportional hazards structure if the covariates enter the model through the cure rate π. The hazard function λ*(t) for the failure proportion is magnified by the factor 1/P(T < ∞ | T > t) ≥ 1 compared to λ(t). As t goes to infinity, λ*(t) converges to λ(t). In addition, the authors pointed out the close relationship between the model presented in Equation (1.9)
and the mixture model presented in Equation (1.1) by observing that S(t) defined in Equation (1.9) can also be written as
S(t) = e^{−θ} + S*(t)(1 − e^{−θ}).
If we let P(Z_i = 0) in Equation (1.1) take the value e^{−θ} and let S*(t | Z_i = 1) take the form of S*(t) defined in Equation (1.10), then S(t) has the same form as in Equation (1.1). Conversely, the model in Equation (1.1) also corresponds to a model of the form in Equation (1.9) for some θ and F(t). Note that F(t) is assumed to have a Weibull form for convenience in the article. The parameter θ is a function of covariates of the form θ = exp(β^T x).
Tsodikov (1998) proposed the same semiparametric model from a different angle. As above, let θ be a function of covariates, denoted by θ(x, β). Assume that n is the number of failure observations and θ_i is the parameter of interest for the ith failure observation. Let m_i be the number of censored observations in the time interval (t_i, t_{i+1}), where t_i is the failure time associated with the ith failure observation. Let t_{i1}, t_{i2}, ..., t_{im_i} be the censoring times for the observations censored within the interval (t_i, t_{i+1}). Let θ_{ij} (i = 1, 2, ..., n; j = 1, 2, ..., m_i) be the parameters of interest for the censored observations. Also let Θ_i be the sum of the θs for all individuals at risk on the interval [t_i, t_{i+1}), and let
S_{a,b} = Σ_{k=a}^{b} Θ_k,
P_{a,b} = S_{a,b} S_{a+1,b} ··· S_{b,b}, and
p_{a,b} = S_{a,b} S_{a,b−1} ··· S_{a,a},
for 0 ≤ a ≤ b ≤ n. Note that S_{a,b} = 0 and P_{a,b} = p_{a,b} = 1 if a > b. The form of F(t) is completely unspecified.
In order to draw inference on the cure proportion in the presence of an unknown form of F(t), the author applied the partial likelihood, the marginal likelihood, and the profile likelihood function of θ. The partial likelihood function is written in the form
L_p = (∏_{i=1}^n θ_i) / (S_{1,n} S_{2,n} ··· S_{n,n}).
The partial likelihood is valid due to the proportional hazards structure, and it provides consistent estimates. However, it can only be used to estimate the ratio of θ to the baseline value θ_b. It is therefore inconclusive about θ_b itself. When there is no covariate, i.e., when all patients are homogeneous, the cure rate cannot be estimated.
Both the marginal and the profile likelihood are derived from the full likelihood function, which is written as
L = ∏_{i=1}^n { θ_i f(t_i) e^{−θ_i F(t_i)} Q(t_i) ∏_{j=1}^{m_i} e^{−θ_{ij} F(t_{ij})} ΔQ(t_{ij}) }, (1.12)
where Q(t) is the survival function for the censoring time. A Type I censoring mechanism with a fixed end point t_c is assumed throughout the article. Then Q is a single-step function such that Q(t_i) = 1 for i = 1, 2, ..., n, ΔQ(t_c) = 1, and t_{nj} = t_c for j = 1, 2, ..., m_n. The marginal likelihood function is derived as
a product of the partial-likelihood factor (∏_{i=1}^n θ_i) / (S_{1,n} S_{2,n} ··· S_{n,n}) and a sum of terms involving the quantities P_{n−i,n} and p_{n−i,n} defined above.
The profile likelihood function is derived from Equation (1.12) by maximizing it with respect to F. The nonparametric maximum likelihood estimator (NPMLE) is used to estimate F.
Both Chen et al. (1999) and Tsodikov (1998) proposed the same semiparametric model while using different estimation techniques. The former assumed independent and noninformative censoring as well as a Weibull distribution for F(t) in the model. Therefore, their model is a parametric model, although F(t) could in principle take a nonparametric form. The latter assumed a Type I censoring mechanism, which is restrictive.
Both the parametric and the semiparametric models with a bounded cumulative hazard rate are appealing because of their simplicity and intuitive meaning. However, there are limitations to the more flexible semiparametric model. It is computationally difficult. In addition, the asymptotic properties of this model have not been explored in the literature. In the next section, we will reexamine the semiparametric model proposed by Chen et al. (1999) and Tsodikov (1998). In this examination, we make no parametric assumption for F(t). In addition, we do not limit the censoring to a Type I mechanism. By using simulated data, we show the computational barrier raised by this model.
1.2 Computation Difficulty of the Semiparametric Model
We assume random and noninformative censoring. The notation of Tsodikov (1998) is used in this section, except that some changes are made in the definitions of P_{a,b} and p_{a,b}. We define them as follows:
P_{a,b} = S_{a,b} S_{a+1,b} ··· S_{b,b}, and
p_{a,b} = S_{a,b} S_{a,b−1} ··· S_{a,a}.
The likelihood function takes the following form:
L = ∏_{i=1}^n θ_i^{δ_i} f(t_i)^{δ_i} e^{−θ_i F(t_i)}, (1.13)
where f(t_i) is the density function corresponding to F(t_i). Define
I_i* = (−1)^i e^{−S_{n−i+1,n}} / (p_{n−i+1,n} P_{1,n−i})
and
J_i* = ∂ log I_i* / ∂β = −S′_{n−i+1,n} − Σ_{k=n−i+1}^{n} S′_{n−i+1,k}/S_{n−i+1,k} − Σ_{k=1}^{n−i} S′_{k,n−i}/S_{k,n−i}.
Define I_n = Σ_{i=0}^n I_i*, J_n = Σ_{i=0}^n J_i*, and S′_{j,k} = ∂S_{j,k}/∂β for any j, k = 1, 2, ..., n.
Let L_m denote the marginal likelihood function. As derived in Appendix A,
L_m ∝ (∏_{i=1}^n θ_i) I_n. (1.14)
Without loss of generality, assume that there is only one covariate x in the study, such as the treatment indicator. Also assume that θ is a function of the covariate x expressed in exponential form as exp(α + βx). The first and second derivatives of S_{i,j} with respect to β are denoted by S′_{i,j} and S″_{i,j}. Since S_{i,j} is a sum of terms θ_{kl} = exp(α + βx_{kl}), they are derived as follows:
S′_{i,j} = ∂S_{i,j}/∂β = Σ_k Σ_{l=1}^{m_k} x_{kl} θ_{kl},
S″_{i,j} = ∂²S_{i,j}/∂β² = Σ_k Σ_{l=1}^{m_k} x_{kl}² θ_{kl},
where the sums run over the individuals contributing to S_{i,j}.
The score function for α is obtained as
U(α) = −(1/I_n) Σ_{i=0}^n I_i* S_{n−i+1,n},
and the score function for β is written as
U(β) = Σ_{i=1}^n x_i + (1/I_n) Σ_{i=0}^n I_i* J_i*.
The Hessian matrix is
[ ∂²log(L_m)/∂α²   ∂²log(L_m)/∂α∂β ]
[ ∂²log(L_m)/∂α∂β  ∂²log(L_m)/∂β²  ],
where
∂²log(L_m)/∂α² = −(1/I_n) Σ_{i=0}^n I_i* S_{n−i+1,n}(S_{n−i+1,n} + 1) + [(1/I_n) Σ_{i=0}^n I_i* S_{n−i+1,n}]²,
∂²log(L_m)/∂α∂β = −(1/I_n) Σ_{i=0}^n I_i* (J_i* S_{n−i+1,n} + S′_{n−i+1,n}) + [(1/I_n) Σ_{i=0}^n I_i* J_i*][(1/I_n) Σ_{i=0}^n I_i* S_{n−i+1,n}],
and
∂²log(L_m)/∂β² = (1/I_n) Σ_{i=0}^n I_i* (J_i*² + J_i*′) − [(1/I_n) Σ_{i=0}^n I_i* J_i*]²,
where J_i*′ is the first derivative of J_i* with respect to β and is expressed as
J_i*′ = −S″_{n−i+1,n} − Σ_{k=n−i+1}^{n} [S″_{n−i+1,k} S_{n−i+1,k} − (S′_{n−i+1,k})²] / S_{n−i+1,k}² − Σ_{k=1}^{n−i} [S″_{k,n−i} S_{k,n−i} − (S′_{k,n−i})²] / S_{k,n−i}².
Suppose that the covariate x is the treatment indicator, i.e., x = 1 if the patient is in the treatment group and x = 0 if the patient is in the control group. We are now interested in testing the hypothesis
H0: β = 0. (1.15)
The score test (Rao, 1973) is used to test the hypothesis in Equation (1.15). The score statistic for β under H0 is denoted by U(β)|_{H0}, and the asymptotic variance of the score statistic under the null hypothesis is −∂²log(L_m)/∂β²|_{H0}. If α is assumed known, then the standardized test statistic is
T = U(β)|_{H0} / √(−∂²log(L_m)/∂β²|_{H0}). (1.16)
In order to explore the feasibility of computation, we first generate 10000 samples under the assumption that F is an exponential distribution with mean 1. The survival function is then exp(−θF(t)) = exp(−(1 − e^{−t}) e^{α+βx}). Assume α = 1.5 and β = 0. Each random sample contains 30 observations in the control group and 30 observations in the treatment group. The censoring proportion is around 50%. The standardized score statistic for β is calculated for all random samples generated. The plot of the 10000 standardized score statistics is presented in Figure 1-1. In this bar plot, we observe that the score statistic behaves well and is close to a normal shape given that α is known.
The problem becomes difficult to handle, however, when α is unknown. It is common to use a profile likelihood method to handle the nuisance parameter α when our primary parameter of interest is β. The profile marginal likelihood function for β is obtained from the marginal likelihood function by replacing α by its MLE under H0. The test statistic based on the profile marginal likelihood function of β is similar to the one defined in Equation (1.16), given that some adjustment of the denominator of Equation (1.16) is required. Therefore, our primary task is to solve for the MLE of α under H0. Using the same parameter setting as for Figure 1-1, we generate a sample with 30 patients in the treatment group and 30 in the control group. We show in Figure 1-2 the plot of the log marginal likelihood function as a function of α on the interval [−2, 6] when β is fixed at 0. The marginal likelihood for α under H0 is poorly behaved. In order to further investigate the behavior of the marginal likelihood function in α and β, we show in Figure 1-3 a three-dimensional plot of the log marginal likelihood function, based on the same random sample, as a function of α on the interval [−2, 2] and β on the interval [−3, 3]. The poorly behaved log marginal likelihood as a function of α and β explains the behavior of the log marginal likelihood function in Figure 1-2.
1.3 Piecewise Exponential Model
As we mentioned earlier in the chapter, it is sometimes necessary to assume different hazard functions for the time to failure on different time intervals. Friedman (1981) proposed a piecewise exponential model to handle this problem. Let the time scale be divided into I(n) intervals (0, τ_1], (τ_1, τ_2], ..., (τ_{I(n)−1}, τ_{I(n)}]. The hazard rate of each individual is assumed to be constant over any given interval. Denote the hazard rate of the jth individual in the ith interval by λ_ij, and assume λ_ij > 0 for all (i, j). Let l_ij = log λ_ij and let t_ij be the survival time of the jth individual in the ith interval, defined as
t_ij = max{0, min(τ_i − τ_{i−1}, t_j − τ_{i−1})}. (1.17)
Let
Δ_ij = 1 when the jth individual fails in [τ_{i−1}, τ_i), and Δ_ij = 0 otherwise.
Then the log-likelihood function is
log L(l) = Σ_{i,j} Δ_ij l_ij − Σ_{i,j} t_ij exp(l_ij). (1.18)
One of the forms l_ij can take is l_ij = α_i + Σ_k β_k x_{jk}.
The author proved the existence and uniqueness of the MLE of the vector l. The consistency and asymptotic properties of the MLE were also explored. Karrison (1987) also considered a piecewise exponential model in his study.
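The quantities in Equations (1.17) and (1.18) are straightforward to compute. The following sketch (Python; function names are ours, not Friedman's, and covariates are omitted for simplicity) evaluates the interval exposures t_ij and the piecewise exponential log-likelihood; with a single unit-rate interval it reduces to the ordinary exponential log-likelihood:

```python
import math

def interval_exposure(t, boundaries):
    """Exposure t_ij of Eq. (1.17) for a subject with survival time t:
    max(0, min(tau_i - tau_{i-1}, t - tau_{i-1})) for each interval."""
    exposures, prev = [], 0.0
    for tau in boundaries:
        exposures.append(max(0.0, min(tau - prev, t - prev)))
        prev = tau
    return exposures

def loglik_piecewise_exp(times, events, boundaries, log_rates):
    """Log-likelihood (1.18): sum_ij Delta_ij * l_ij - sum_ij t_ij * exp(l_ij).
    boundaries = (tau_1, ..., tau_I) with tau_I = inf; log_rates holds the
    l_ij, here shared by all subjects (no covariates, an assumption)."""
    ll = 0.0
    for t, d in zip(times, events):
        exposures = interval_exposure(t, boundaries)
        # interval containing the observed time (Delta_ij is 1 only there)
        fail_idx = next(i for i, tau in enumerate(boundaries) if t <= tau)
        for i, (tij, lij) in enumerate(zip(exposures, log_rates)):
            delta_ij = d if i == fail_idx else 0
            ll += delta_ij * lij - tij * math.exp(lij)
    return ll

# One failure at t = 1.5 with unit rate everywhere: log f(1.5) = -1.5.
assert abs(loglik_piecewise_exp([1.5], [1], [1.0, 2.0, math.inf],
                                [0.0, 0.0, 0.0]) + 1.5) < 1e-12
```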
Figure 1-1. Standardized score under H0 (β = 0) for 10000 random samples.
Figure 1-2. Log marginal likelihood function of α when β = 0.
Figure 1-3. Log marginal likelihood function of α and β.
CHAPTER 2
TOPIC OF RESEARCH
The model we propose in this dissertation is a piecewise Gompertz model. It incorporates the cure proportion by assuming a bounded cumulative hazard rate. Moreover, our model offers the flexibility that the survival times on different time intervals may have different hazard functions. As noted by Aitkin et al. (1989), a piecewise exponential model has the same flexibility as a proportional hazards model under the assumption that each interval is defined by one failure observation in the piecewise model. The choice of a Gompertz model over an exponential model is made because of the possible plateau of the survival curve from a Gompertz model. We would speculate that under a piecewise Gompertz model, the more intervals we define, the more flexible the model is. Eventually, when there is only one failure observation on each interval, the flexibility of this model would be the same as that of the semiparametric model we discussed in Chapter 1. Therefore, our model preserves the flexibility of a semiparametric model to some extent. The degree of flexibility depends on the number of intervals we define for our model. Meanwhile, the computation for our model is much easier to handle than for a semiparametric model, due to our model's parametric structure. However, when the number of intervals in our model increases, the number of individuals on each interval decreases. We would anticipate a large variance for the estimate of the cure rate, which is similar to the situation we discussed earlier for the KM estimate of the cure rate. An appropriate number of intervals needs to be assessed in order to reach both maximum flexibility and computational feasibility of the model. In this chapter, we describe our model in detail. In Chapters 3 and 4, asymptotic
properties of the MLEs of the parameters in the model are proved. In addition, a goodness-of-fit test is derived that can be applied to determine the appropriate number of time intervals in the model. In Chapter 5, a Monte Carlo simulation study is carried out to show the advantages of our model. Chapter 6 gives a real example and demonstrates the application of our model to it.
2.1 Data Structure and Model
Assume there are n patients in the study. Consider the jth patient, who is in treatment group g and is associated with a vector of covariates x_j, where x_j is a p-dimensional vector. Let T_j and U_j be the survival time and the censoring time of the jth patient, respectively. The observed data on the jth patient are (t_j*, δ_j, x_j), where t_j* is the observation on
T_j* = min(T_j, U_j),
and δ_j is the observation on
Δ_j = I(T_j ≤ U_j).
Assume there are G treatment groups and the interval [0, ∞) is divided into I intervals [τ_{i−1}, τ_i), i = 1, 2, ..., I, where τ_0 = 0 and τ_I = ∞. Let θ^T = (β^T, λ^T), where β is p-dimensional and λ^T = (λ_1^T, λ_2^T, ..., λ_G^T). Note that λ_g^T = (λ_{1g}, λ_{2g}, ..., λ_{Ig}). Clearly, θ is a (p + IG)-dimensional vector. It is assumed that the hazard function for the jth patient in treatment group g has the form
e^{λ_{ig} t} e^{β^T x_j}, t ∈ [τ_{i−1}, τ_i). (2.1)
Define an indicator function κ_g(j) by
κ_g(j) = 1 if the jth patient is in group g, and κ_g(j) = 0 otherwise.
Let κ(j) be the vector of κ_g(j) for g = 1, 2, ..., G, i.e., κ(j) = (κ_1(j), κ_2(j), ..., κ_G(j))^T.
Let us also define
J_i(t) = 1 if t ∈ [τ_{i−1}, τ_i), and J_i(t) = 0 otherwise,
for i = 1, 2, ..., I, and J(t) = (J_1(t), J_2(t), ..., J_I(t))^T. Let
Q(t, j) = (κ_1(j)J(t)^T, κ_2(j)J(t)^T, ..., κ_G(j)J(t)^T)^T.
Note that Q(t, j) is a GI-dimensional vector. The hazard function in Equation (2.1) can also be written as
h(t; λ, κ(j), J(t)) e^{β^T x_j} = e^{λ_{ig} t} e^{β^T x_j}, if κ_g(j) = 1 and J_i(t) = 1.
The cumulative hazard function H_j(t; θ), the survival function S_j(t; θ), and the density function f_j(t; θ) can be derived as follows:
H_j(t; θ) = e^{β^T x_j} ∫_0^t h(u; λ, κ(j), J(u)) du,
S_j(t; θ) = exp{ −e^{β^T x_j} ∫_0^t h(u; λ, κ(j), J(u)) du },
f_j(t; θ) = h(t; λ, κ(j), J(t)) e^{β^T x_j} S_j(t; θ).
The likelihood function contributed by (t_j*, δ_j, x_j) is expressed as
L_j = [h(t_j*; λ, κ(j), J(t_j*)) e^{β^T x_j}]^{δ_j} S_j(t_j*; θ).
The corresponding log-likelihood function is written as
l_j = log L_j = δ_j[log h(t_j*; λ, κ(j), J(t_j*)) + β^T x_j] − e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) du. (2.2)
Consequently, the overall likelihood function is defined as
L_n = ∏_{j=1}^n L_j,
and the corresponding log-likelihood function is
l_n = Σ_{j=1}^n l_j.
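For a single treatment group, the hazard e^{λ_i t} e^{β^T x_j} integrates in closed form on each interval, so the log-likelihood (2.2) can be computed piece by piece. A minimal sketch (Python; function and argument names are ours, not the dissertation's):

```python
import math

def cum_hazard_piece(lam, a, b):
    """Integral of e^{lam*u} du over [a, b]; the lam -> 0 limit is b - a."""
    if abs(lam) < 1e-12:
        return b - a
    return (math.exp(lam * b) - math.exp(lam * a)) / lam

def gompertz_loglik(times, events, x_beta, boundaries, lambdas):
    """Sum of the contributions (2.2) for one treatment group:
    l_j = delta_j (lambda_i t_j + b_j) - e^{b_j} H(t_j), with b_j = beta'x_j,
    lambdas[i] the Gompertz parameter on [tau_{i-1}, tau_i), and
    boundaries = (tau_1, ..., tau_I) with tau_I = inf."""
    ll = 0.0
    for t, d, bx in zip(times, events, x_beta):
        H, prev, fail_log_h = 0.0, 0.0, 0.0
        for tau, lam in zip(boundaries, lambdas):
            hi = min(t, tau)
            if hi > prev:
                H += cum_hazard_piece(lam, prev, hi)  # baseline cumulative hazard
            if prev <= t < tau:
                fail_log_h = lam * t  # log of the baseline hazard at t
            prev = tau
        ll += d * (fail_log_h + bx) - math.exp(bx) * H
    return ll

# With one interval, lambda = 0 and beta'x = 0 this is a unit exponential:
assert abs(gompertz_loglik([2.0], [1], [0.0], [math.inf], [0.0]) + 2.0) < 1e-9
```

Extending this to G groups only requires selecting the group-specific λ vector via the indicators κ_g(j).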
2.2 Assumptions
Our derivation of asymptotic properties is based on the following three assumptions:
Assumption 1. There is a B > 0 such that ||x_j|| ≤ B for j = 1, 2, ..., n.
Assumption 2. There exist c > 0, δ > 0, and an integer n_0 such that when n > n_0, there exists a subset A of the overall individuals with n_A/n > δ, where n_A is the size of the subset A, such that for any unit vector a,
|a^T x_j| > c for j ∈ A.
Assumption 3. Let n_g be the number of individuals in treatment group g, where g = 1, 2, ..., G. There exists a δ > 0 such that min_{1≤g≤G} n_g/n > δ when n is sufficiently large.
Assumption 4. The censoring variables U_1, U_2, ..., U_n are iid with a survival function G_U(u). Assume that there exists an ε > 0 such that
G_U(τ_{I−1}) − G_U(∞) > ε.
The first assumption assumes that all covariates are uniformly bounded. It is reasonable in practice because the majority of the covariates in a cancer study are patients' characteristics, which are bounded uniformly. Note that we are primarily interested in timeindependent covariates. The second assumption assumes that at least a set A of patients' covariates are dispersed to some extent. Assumption 3 means that the number of observations in each treatment group must increase proportionally to the sample size. Assumption 4 assumes that not all the censoring observations occur very early on the time span.
2.3 Hypothesis of Interest
We are interested in comparing cure rates among treatment groups. Let π_g(x) be the cure rate of patients who are under treatment group g and who are associated with covariate vector x. Considering the case with two treatment groups, the null hypothesis is
H0: π_1(x) − π_2(x) = 0, (2.3)
for any covariate x. The test statistic will be derived in the following section.
2.4 Test of Hypothesis
The cure rate corresponding to patients in treatment group g who are associated with a specific setting of covariates x is
π_g(x) = S(∞; θ) = exp{ −e^{β^T x} ∫_0^∞ h(u; λ, κ(j), J(u)) du }
       = exp{ −e^{β^T x} [ Σ_{i=1}^{I−1} (e^{λ_ig τ_i} − e^{λ_ig τ_{i−1}})/λ_ig − e^{λ_Ig τ_{I−1}}/λ_Ig ] }, (2.4)
where λ_Ig < 0 so that the integral over the last interval converges. The hypothesis defined in Equation (2.3) is equivalent to
H0: Σ_{i=1}^{I−1} (e^{λ_i1 τ_i} − e^{λ_i1 τ_{i−1}})/λ_i1 − e^{λ_I1 τ_{I−1}}/λ_I1 = Σ_{i=1}^{I−1} (e^{λ_i2 τ_i} − e^{λ_i2 τ_{i−1}})/λ_i2 − e^{λ_I2 τ_{I−1}}/λ_I2.
Let the MLE of θ = (λ^T, β^T)^T be denoted by θ̂ = (λ̂^T, β̂^T)^T. The test statistic is derived as
T = Σ_{i=1}^{I−1} (e^{λ̂_i1 τ_i} − e^{λ̂_i1 τ_{i−1}})/λ̂_i1 − e^{λ̂_I1 τ_{I−1}}/λ̂_I1 − Σ_{i=1}^{I−1} (e^{λ̂_i2 τ_i} − e^{λ̂_i2 τ_{i−1}})/λ̂_i2 + e^{λ̂_I2 τ_{I−1}}/λ̂_I2. (2.5)
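The cure rate in Equation (2.4) can be evaluated directly once the MLEs are available. A sketch (Python, for a single treatment group; function and argument names are ours), assuming all λ_ig ≠ 0 and λ_Ig < 0 so that the survival plateau exists:

```python
import math

def cure_rate(bx, lambdas, boundaries):
    """pi_g(x) of Eq. (2.4): exp(-e^{beta'x} * H(infinity)) for one group.
    lambdas = (lam_1, ..., lam_I), all nonzero, with lam_I < 0 so the hazard
    integral over the last interval [tau_{I-1}, infinity) converges;
    boundaries = (tau_1, ..., tau_{I-1})."""
    total, prev = 0.0, 0.0
    for lam, tau in zip(lambdas[:-1], boundaries):
        total += (math.exp(lam * tau) - math.exp(lam * prev)) / lam
        prev = tau
    lam_last = lambdas[-1]
    assert lam_last < 0, "last Gompertz parameter must be negative for a cure fraction"
    total += -math.exp(lam_last * prev) / lam_last  # integral over the last interval
    return math.exp(-math.exp(bx) * total)

# Single-interval check: lambda = -1, beta'x = 0 gives pi = exp(-1).
assert abs(cure_rate(0.0, [-1.0], []) - math.exp(-1.0)) < 1e-12
```

The statistic (2.5) is then the difference of the bracketed cumulative-hazard terms for the two groups.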
The variance of the above test statistic can be derived by the delta method. The test statistic given in Equation (2.5) is a function of the estimates of λ. We will show in later chapters that θ̂ is asymptotically normal with mean θ and variance-covariance matrix Σ. The variance estimate of the test statistic can be derived as
V̂(T) = (∂T/∂λ)|_{λ̂}^T Σ̂* (∂T/∂λ)|_{λ̂},
where (∂T/∂λ)|_{λ̂} represents the derivative of the test statistic with respect to λ evaluated at the MLE of λ, Σ* is the variance-covariance matrix of λ̂, and Σ̂* is the estimate of Σ*. The components of ∂T/∂λ are derived as follows:
∂T/∂λ_i1 = [ (τ_i e^{λ_i1 τ_i} − τ_{i−1} e^{λ_i1 τ_{i−1}}) λ_i1 − (e^{λ_i1 τ_i} − e^{λ_i1 τ_{i−1}}) ] / λ_i1²,
∂T/∂λ_i2 = −[ (τ_i e^{λ_i2 τ_i} − τ_{i−1} e^{λ_i2 τ_{i−1}}) λ_i2 − (e^{λ_i2 τ_i} − e^{λ_i2 τ_{i−1}}) ] / λ_i2²,
for i = 1, 2, ..., I−1, and
∂T/∂λ_I1 = −(λ_I1 τ_{I−1} − 1) e^{λ_I1 τ_{I−1}} / λ_I1²,
∂T/∂λ_I2 = (λ_I2 τ_{I−1} − 1) e^{λ_I2 τ_{I−1}} / λ_I2².
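The delta-method variance can be checked numerically by replacing the analytic derivatives ∂T/∂λ_ig with finite differences and forming the quadratic form with an estimated covariance matrix. A sketch (Python; the statistic and covariance below are toy stand-ins, not the dissertation's):

```python
def numeric_gradient(f, lam, h=1e-6):
    """Central finite-difference gradient of a scalar statistic T(lambda),
    a stand-in for the analytic derivatives dT/dlambda_ig in the text."""
    g = []
    for i in range(len(lam)):
        up, dn = lam[:], lam[:]
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2.0 * h))
    return g

def delta_method_var(grad, cov):
    """V(T) ~ grad^T * Sigma * grad for a scalar statistic."""
    n = len(grad)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))

# Toy check: T(lam) = lam_1 - lam_2 with identity covariance gives variance 2.
T = lambda lam: lam[0] - lam[1]
grad = numeric_gradient(T, [0.3, -0.1])
var = delta_method_var(grad, [[1.0, 0.0], [0.0, 1.0]])
assert abs(var - 2.0) < 1e-6
```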
The normalized test statistic is
T / √(V̂(T)); (2.6)
it follows an asymptotic standard normal distribution.
CHAPTER 3
ESTIMATION OF PARAMETERS
3.1 Likelihood Function and Information Matrices
Based on the log-likelihood function defined in Equation (2.2), the score statistic S_{nj}(θ) for the jth individual can be expressed as
S_{nj}(θ) = ( S_{nj}^{(1)}(θ)^T, S_{nj}^{(2)}(θ)^T )^T, (3.1)
where
S_{nj}^{(1)}(θ) = ∂l_j/∂β = δ_j x_j − x_j e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) du (3.2)
and
S_{nj}^{(2)}(θ) = ∂l_j/∂λ = δ_j t_j* Q(t_j*, j) − e^{β^T x_j} ∫_0^{t_j*} Q(u, j) u h(u; λ, κ(j), J(u)) du. (3.3)
Note that S_{nj}^{(1)}(θ), S_{nj}^{(2)}(θ), and S_{nj}(θ) are p-, IG-, and (p + IG)-dimensional vectors, respectively.
Furthermore, the matrices formed by the second-order derivatives of l_j can be derived as
F_{nj}^{(1,1)}(θ) = −∂²l_j/∂β∂β^T = x_j x_j^T e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) du,
F_{nj}^{(1,2)}(θ) = −∂²l_j/∂β∂λ^T = x_j e^{β^T x_j} ∫_0^{t_j*} Q^T(u, j) u h(u; λ, κ(j), J(u)) du,
F_{nj}^{(2,1)}(θ) = −∂²l_j/∂λ∂β^T = e^{β^T x_j} ∫_0^{t_j*} Q(u, j) x_j^T u h(u; λ, κ(j), J(u)) du,
and
F_{nj}^{(2,2)}(θ) = −∂²l_j/∂λ∂λ^T = e^{β^T x_j} ∫_0^{t_j*} Q(u, j) Q^T(u, j) u² h(u; λ, κ(j), J(u)) du.
Then the observed Fisher information matrix is written as
F_{nj}(θ) = [ F_{nj}^{(1,1)}(θ)  F_{nj}^{(1,2)}(θ) ]
            [ F_{nj}^{(2,1)}(θ)  F_{nj}^{(2,2)}(θ) ], (3.4)
which is a (p + IG) × (p + IG) matrix. The Fisher information matrix is the expectation of F_{nj}(θ), which is written as
D_{nj} = E(F_{nj}(θ)) = E{ e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) (x_j^T, u Q^T(u, j))^T (x_j^T, u Q^T(u, j)) du }. (3.5)
3.2 Concavity of Likelihood Function
The log-likelihood function for n individuals is written as
l_n(θ) = Σ_{j=1}^n { δ_j[log h(t_j*; λ, κ(j), J(t_j*)) + β^T x_j] − e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) du }. (3.6)
The score statistic, the observed Fisher information matrix, and the Fisher information matrix for the n individuals in the sample are, respectively,
S_n(θ) = Σ_{j=1}^n S_{nj}(θ), (3.7)
F_n(θ) = Σ_{j=1}^n F_{nj}(θ), (3.8)
and
D_n = Σ_{j=1}^n D_{nj}. (3.9)
Let c^T = (a^T, b^T), where a is a p-dimensional vector and b = (b_{11}, ..., b_{I1}, ..., b_{1G}, ..., b_{IG})^T is an IG-dimensional vector. We will show that F_n(θ) is positive definite and, therefore, that the log-likelihood function defined in Equation (3.6) is strictly
concave. To show that F_n(θ) is positive definite, it suffices to show that c^T F_n(θ) c > 0 for any non-null vector c of dimension (p + IG). In order to prove that F_n(θ) is positive definite, we assume that there is at least one observation on each interval for each treatment group. Note that
c^T F_n(θ) c = (a^T, b^T) F_n(θ) (a^T, b^T)^T
             = Σ_{j=1}^n e^{β^T x_j} ∫_0^{t_j*} h(u; λ, κ(j), J(u)) [a^T x_j + u b^T Q(u, j)]² du
             ≥ 0.
This implies that F_n(θ) is positive semidefinite. Since h(u; λ, κ(j), J(u)) > 0, c^T F_n(θ) c = 0 implies that for any u ∈ (0, t_j*), j = 1, 2, ..., n,
a^T x_j + b^T Q(u, j) u = 0. (3.10)
For any integers i and g with 1 ≤ i ≤ I and 1 ≤ g ≤ G, there exists an individual j (say) in group g with t_j* ∈ (τ_{i−1}, τ_i). Then Equation (3.10) implies that a^T x_j + u b_{ig} = 0 for u ∈ (τ_{i−1}, t_j*). Consequently,
a^T x_j = 0
and
b_{ig} = 0. (3.11)
Since Equation (3.11) holds for any integers i and g with 1 ≤ i ≤ I and 1 ≤ g ≤ G, it follows that
b = 0.
Therefore, Equation (3.10) implies that
a^T x_j = 0 for all j = 1, 2, ..., n,
which contradicts Assumption 2.
Therefore, F_n(θ) is positive definite. The strict concavity of the log-likelihood function as a function of the unknown parameters θ guarantees the uniqueness of the MLEs of the parameters, provided that the MLEs exist. Given the existence of the MLEs and the concavity of the log-likelihood function, the Newton-Raphson algorithm converges.
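Because the log-likelihood is strictly concave, Newton-Raphson iteration on the score converges to the unique MLE. A minimal one-dimensional sketch (Python; the exponential toy model below is an assumption for illustration, not the dissertation's model):

```python
import math

def newton_raphson(score, neg_hessian, theta0, tol=1e-10, max_iter=50):
    """1-D Newton-Raphson for a strictly concave log-likelihood:
    theta <- theta + S(theta) / F(theta), where F = -d2l/dtheta2 > 0."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta) / neg_hessian(theta)
        theta += step
        if abs(step) < tol:
            break
    return theta

# Toy model: exponential log-likelihood in alpha = log(rate) for data t with
# n failures: l(alpha) = n*alpha - e^{alpha} * sum(t); MLE alpha = log(n/sum(t)).
t = [0.5, 1.2, 2.0, 0.8]
n, total = len(t), sum(t)
score = lambda a: n - math.exp(a) * total
neg_hess = lambda a: math.exp(a) * total
alpha_hat = newton_raphson(score, neg_hess, 0.0)
assert abs(alpha_hat - math.log(n / total)) < 1e-8
```

In the piecewise Gompertz model the same iteration runs on the (p + IG)-dimensional score S_n(θ) and observed information F_n(θ).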
CHAPTER 4
PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES
4.1 Introduction
In the previous chapter, we proved the strict concavity of the log-likelihood function and pointed out that we obtain unique MLEs of the parameters if we can prove that these MLEs exist. In this chapter, we first provide some preliminary results as preparation for later derivations. We then show the existence, the consistency, and the asymptotic normality of the MLEs. The method applied to prove the properties of the MLEs follows that of Maller and Zhou (1996, pp. 211-222), who proved the asymptotic properties of MLEs in a survival model which assumes that the failure time follows an exponential distribution with covariate information. The exponential model with covariates in their book is a special case of the Gompertz model. By studying the piecewise Gompertz model, we move one step further through the incorporation of different hazard functions on different time intervals. In our model, if we set λ = 0, the hazard function for the failure time T is h(t) = e^{β^T x}. The hazard is then constant, which implies that T follows an exponential distribution.
In the next section, we state several lemmas which will be applied to prove the existence, the consistency, and the asymptotic normality of the MLEs. The proofs of the lemmas are given in detail in the Appendices. We prove the existence and the consistency of the MLEs in Section 4.3. The proof of asymptotic normality is given in Section 4.4. In Section 4.5, a goodness-of-fit test which compares models with different numbers of pieces is derived. We also show that the corresponding test statistic follows an asymptotic χ² distribution.
4.2 Preliminary Proofs
Lemma 4.1. Let E(·) and V(·) represent the expected value and the variance of a random variable, respectively. Then the following two equations hold:
E{S_n(θ_0)} = 0 and V{S_n(θ_0)} = D_n.
Proof. See Appendix B. □
Lemma 4.2. Let θ_0 = (β_0^T, λ_0^T)^T be the true parameter vector, where λ_0 = (λ_{11}^{(0)}, ..., λ_{I1}^{(0)}, ..., λ_{1G}^{(0)}, ..., λ_{IG}^{(0)})^T. For any positive integer m and for l = 1, 2, there exist an ε > 0 and a Γ > 0 such that for any λ′ satisfying ||λ′ − λ_0|| < ε,
E_{θ_0}{ ∫_0^{t_j*} u^m h^l(u; λ′, κ(j), J(u)) du } < Γ, for j = 1, 2, ..., n.
Note that Γ is a constant which depends on θ_0 and ε. Proof. See Appendix C. □
Lemma 4.3. Define θ_0 as in Lemma 4.2. For any positive integers l and m, there exists a γ > 0 such that
E_{θ_0}{ ∫_0^{t_j*} u^m h^l(u; λ_0, κ(j), J(u)) du } > γ, (4.1)
for j = 1, 2, ..., n. Note that γ is a constant depending on θ_0. Proof. See Appendix D. □
Lemma 4.4. Under Assumptions 2, 3, and 4, there exist a γ > 0 and an integer n_0 > 0 such that when n > n_0,
λ_min(D_n)/n ≥ γ,
where λ_min(D_n) is the smallest eigenvalue of D_n. Proof. See Appendix E. □
4.3 Existence and Consistency of the MLE
For each A > 1, define an elliptical neighborhood of the true value θ_0 of the unknown parameters by
N_n(θ_0) = {θ : (θ − θ_0)^T D_n (θ − θ_0) ≤ A}. (4.2)
As shown in Lemma 4.4, λ_min(D_n) ≥ nγ when n > n_0. Hereafter we assume that the condition n > n_0 is satisfied. We observe that for any θ ∈ N_n(θ_0),
A ≥ (θ − θ_0)^T D_n (θ − θ_0) ≥ ||θ − θ_0||² λ_min(D_n) ≥ ||θ − θ_0||² nγ,
i.e.,
||θ − θ_0|| ≤ √(A/(nγ)). (4.3)
Let us consider an arbitrary vector θ on the boundary ∂N_n(θ_0) of N_n(θ_0); then θ satisfies the following equation:
(θ − θ_0)^T D_n (θ − θ_0) = A.
If we can show that l_n(θ) < l_n(θ_0) for all θ ∈ ∂N_n(θ_0) with probability approaching 1, then the log-likelihood function has a local maximum in N_n(θ_0). In addition, the global maximum exists and lies in N_n(θ_0) since the log-likelihood function is strictly concave. The consistency of θ̂_n is then proved by observing that Equation (4.3) holds.
Theorem 4.1. There exists a unique and consistent θ̂_n which maximizes l_n(θ).
Proof. For any θ_n ∈ ∂N_n(θ_0), the Taylor expansion of the log-likelihood function about θ_0 is written as
l_n(θ_n) − l_n(θ_0) = (θ_n − θ_0)^T S_n(θ_0) − (1/2)(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0), (4.4)
where θ̃_n is a random vector on the line segment between θ_n and θ_0. Hence, θ̃_n ∈ N_n(θ_0). Recall that S_n(θ_0) is the vector of first-order derivatives of l_n(θ) evaluated at the true parameter θ_0, and F_n(θ̃_n) is the negative of the matrix of second derivatives of l_n(θ) evaluated at θ̃_n. We have
P{l_n(θ_n) − l_n(θ_0) < 0}
≥ P{(θ_n − θ_0)^T S_n(θ_0) < (1/2)(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0)}
≥ P{(θ_n − θ_0)^T S_n(θ_0) ≤ A/4, (θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) > A/2}
≥ P{(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) > A/2} − P{(θ_n − θ_0)^T S_n(θ_0) > A/4}.
We proved in Lemma 4.1 that
E(S_n(θ_0)) = 0
and
V(S_n(θ_0)) = D_n.
Since (θ_n − θ_0)^T S_n(θ_0) is a sum of independent random variables with mean zero, we obtain the following inequality by applying Chebyshev's inequality:
P{(θ_n − θ_0)^T S_n(θ_0) > A/4} ≤ 16(θ_n − θ_0)^T D_n (θ_n − θ_0)/A² = 16/A, (4.5)
since θ_n ∈ ∂N_n(θ_0). The probability in Equation (4.5) can be made arbitrarily small by choosing A sufficiently large. Next we need to show that
P{(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) > A/2} → 1, (4.6)
as n → ∞.
Let I be the (p + IG) × (p + IG) identity matrix. If we can show that, uniformly in θ̃_n ∈ N_n(θ_0),
D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2} → I in probability, as n → ∞, (4.7)
then we obtain the following:
(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0)
= (θ_n − θ_0)^T D_n^{1/2} (D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2}) D_n^{1/2}(θ_n − θ_0)
≥ (θ_n − θ_0)^T D_n (θ_n − θ_0) λ_min(D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2})
= A λ_min(D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2})
→ A in probability,
where θ_n ∈ ∂N_n(θ_0) and θ̃_n ∈ N_n(θ_0). Therefore, Equation (4.6) holds as n → ∞. Consequently, if we let both n and A become large, P{l_n(θ_n) < l_n(θ_0)} → 1 for θ_n ∈ ∂N_n(θ_0). Unique MLEs of the parameters in the model exist by the strict concavity of the log-likelihood function and are consistent by Equation (4.3). □
We shall now prove Equation (4.7) through the following three lemmas. Note that Equation (4.7) indeed shows that the information matrix D_n is consistently estimated by F_n(θ̂_n). Let c^T = (a^T, b^T) be any (p + IG)-dimensional unit vector, where a and b are p- and IG-dimensional, respectively.
Lemma 4.5. For any (p + IG)-dimensional unit vector c,
c^T D_n^{−1/2}[F_n(θ_0) − D_n] D_n^{−1/2} c → 0 in probability, (4.8)
where θ_0 = (β_0^T, λ_0^T)^T is the true parameter. Proof. See Appendix F. □
Lemma 4.6. For any (p + IG)-dimensional unit vector c and any θ ∈ N_n(θ_0),
c^T D_n^{−1/2}(F_n(θ) − F_n(θ_0)) D_n^{−1/2} c → 0 in probability.
Proof. See Appendix G. □
Lemma 4.7. For any θ ∈ N_n(θ_0),
D_n^{−1/2} F_n(θ) D_n^{−1/2} → I_{(p+IG)×(p+IG)} in probability. (4.9)
Proof. See Appendix H. □
4.4 Asymptotic Normality of the MLE
Lemma 4.8. Let l be an arbitrary positive integer; then E_{θ_0}||S_{nj}(θ_0)||^l is uniformly bounded for all j = 1, 2, ..., n. Note that S_{nj}(θ) is defined in Equation (3.1). Proof. See Appendix I. □
Lemma 4.9. For any (p + IG)-dimensional unit vector c,
c^T D_n^{−1/2} S_n(θ_0) → N(0, 1) in distribution. (4.10)
Proof. Denote
Z_n = c^T D_n^{−1/2} S_n(θ_0) = c^T D_n^{−1/2} Σ_{j=1}^n S_{nj}(θ_0),
where S_{nj}(θ_0), j = 1, 2, ..., n, are independent random variables. By Lemma 4.1,
E(Z_n) = 0
and
V(Z_n) = 1.
By Liapounov's condition, it suffices to show that
Σ_{j=1}^n E_{θ_0}|c^T D_n^{−1/2} S_{nj}(θ_0)|³ → 0 (4.11)
as n → ∞. Note that
|c^T D_n^{−1/2} S_{nj}(θ_0)|³ ≤ [λ_max(D_n^{−1/2})]³ ||S_{nj}(θ_0)||³ ≤ (nγ)^{−3/2} ||S_{nj}(θ_0)||³.
Since E_{θ_0}||S_{nj}(θ_0)||³ is uniformly bounded for all individuals by Lemma 4.8, Equation (4.11) holds and the proof is accomplished by the Central Limit Theorem. □
Theorem 4.2. The MLE θ̂_n is asymptotically normal, i.e., D_n^{1/2}(θ̂_n − θ_0) → N(0, I) in distribution as n → ∞.
Proof. By Taylor expansion, we obtain that for some θ̃ on the line segment between θ̂_n and θ_0 and for any unit vector u,
u^T D_n^{−1/2} S_n(θ̂_n) = u^T D_n^{−1/2} S_n(θ_0) − u^T D_n^{−1/2} F_n(θ̃)(θ̂_n − θ_0).
Because u^T D_n^{−1/2} S_n(θ̂_n) = 0, since θ̂_n is the MLE,
u^T D_n^{−1/2} S_n(θ_0) = u^T D_n^{−1/2} F_n(θ̃)(θ̂_n − θ_0)
= u^T D_n^{−1/2}(F_n(θ̃) − D_n) D_n^{−1/2} D_n^{1/2}(θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0)
= u^T [D_n^{−1/2}(F_n(θ̃) − D_n) D_n^{−1/2}] D_n^{1/2}(θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0).
Since θ̃ ∈ N_n(θ_0), Equation (4.9) holds by Lemma 4.7. Therefore, the sum of the absolute values of all elements of [D_n^{−1/2}(F_n(θ̃) − D_n) D_n^{−1/2}] goes to 0 in probability. Hence,
[D_n^{−1/2}(F_n(θ̃) − D_n) D_n^{−1/2}] = o_P(1),
where o_P(1) is a matrix with all elements converging to 0 in probability as n → ∞. Then we obtain that
u^T D_n^{−1/2} S_n(θ_0) = [u^T o_P(1)] D_n^{1/2}(θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0).
Therefore,
u^T D_n^{1/2}(θ̂_n − θ_0) = u^T D_n^{−1/2} S_n(θ_0) + o_P(1).
Since we have proved the normality of u^T D_n^{−1/2} S_n(θ_0) in Lemma 4.9, the normality of u^T D_n^{1/2}(θ̂_n − θ_0) follows. Hence, Theorem 4.2 is proved. □
4.5 GoodnessofFit Test
In practice, one might be interested in knowing how many intervals are in fact necessary in a specific study. Therefore, a goodness-of-fit test can be performed to determine how many time intervals are adequate. Assume that we have a saturated model with I_1 + I_2 intervals and an unsaturated model with I_1 intervals. We are interested in testing whether there is a significant difference between the two models. We assume that the unsaturated model is obtained by combining intervals I_1 + 1, I_1 + 2, ..., I_1 + I_2 of the saturated model with interval I_1. The null hypothesis can be written as
H0: λ_{(I_1+1)g} = λ_{(I_1+2)g} = ··· = λ_{(I_1+I_2)g} = λ_{I_1 g}, for all g = 1, 2, ..., G. (4.12)
Since θ^T = (β^T, λ^T), the null hypothesis defined above can also be written as
H0: Hθ = 0, (4.13)
where H is a design matrix of full row rank I_2 G. Each row of H contrasts one of the parameters λ_{(I_1+1)g}, ..., λ_{(I_1+I_2)g} with λ_{I_1 g}, carrying a 1 and a −1 in the corresponding positions and null vectors (0s in bold in the original display) elsewhere. Note that the first p columns of the matrix H put no restriction on β. Columns (p + 1) through (p + I_1 − 1) represent the nonrestrictiveness of λ for treatment group 1 corresponding to the first I_1 − 1 intervals. Similar patterns are presented for the rest of the columns. We will now construct a likelihood ratio test for the hypothesis in Equation (4.13). Maximizing the log-likelihood function l_n(θ) under H0 is equivalent to maximizing
l_n(θ) − θ^T H^T γ
unrestrictedly in θ ∈ R^{p+(I_1+I_2)G} and γ ∈ R^{I_2 G}, where γ is a vector of Lagrange multipliers.
If we let θ_0 be the true parameters, θ̂_n be the parameters estimated under H0, and θ̃_n be the estimated parameters under the saturated model, then θ̂_n satisfies the following equation:
S_n(θ̂_n) − H^T γ̂ = 0, (4.14)
where γ̂ is the vector of estimated Lagrange multipliers under H0. Also, since Hθ̂_n = 0 and Hθ_0 = 0 under the null hypothesis,
H(θ̂_n − θ_0) = 0. (4.15)
Now we need to solve for θ̂_n from Equations (4.14) and (4.15). By Taylor expansion,
D_n^{−1/2} S_n(θ̂_n) = D_n^{−1/2} S_n(θ_0) − D_n^{1/2}(θ̂_n − θ_0) + o_P(1);
hence
D_n^{1/2}(θ̂_n − θ_0) + D_n^{−1/2} S_n(θ̂_n) = D_n^{−1/2} S_n(θ_0) + o_P(1).
By Equation (4.14), the above equation can be written as
D_n^{1/2}(θ̂_n − θ_0) + D_n^{−1/2} H^T γ̂ = D_n^{−1/2} S_n(θ_0) + o_P(1). (4.16)
We can rewrite Equation (4.15) as
(H D_n^{−1/2}) D_n^{1/2}(θ̂_n − θ_0) = 0. (4.17)
Now express Equations (4.16) and (4.17) in matrix form:
[ I             D_n^{−1/2} H^T ] [ D_n^{1/2}(θ̂_n − θ_0) ]   [ D_n^{−1/2} S_n(θ_0) + o_P(1) ]
[ H D_n^{−1/2}  0              ] [ γ̂                    ] = [ 0                             ]. (4.18)
Since H is of full row rank I_2 G and D_n is nonsingular, H D_n^{−1} H^T is of full rank I_2 G. Hence, the inverse of the coefficient matrix exists and can be written as
[ I             D_n^{−1/2} H^T ]^{−1}   [ I − P  Q ]
[ H D_n^{−1/2}  0              ]      = [ Q^T    R ],
where
P = D_n^{−1/2} H^T (H D_n^{−1} H^T)^{−1} H D_n^{−1/2},
Q = D_n^{−1/2} H^T (H D_n^{−1} H^T)^{−1}, and
R = −(H D_n^{−1} H^T)^{−1}.
Therefore, we can solve Equation (4.18) as
[ D_n^{1/2}(θ̂_n − θ_0) ]   [ I − P  Q ] [ D_n^{−1/2} S_n(θ_0) + o_P(1) ]
[ γ̂                    ] = [ Q^T    R ] [ 0                             ];
hence,
D_n^{1/2}(θ̂_n − θ_0) = (I − P) D_n^{−1/2} S_n(θ_0) + o_P(1).
Similarly, for θ̃_n we have
D_n^{1/2}(θ̃_n − θ_0) = D_n^{−1/2} S_n(θ_0) + o_P(1),
by Taylor expansion and by observing that S_n(θ̃_n) = 0. Therefore,
D_n^{1/2}(θ̃_n − θ̂_n) = D_n^{1/2}(θ̃_n − θ_0) − D_n^{1/2}(θ̂_n − θ_0) = P D_n^{−1/2} S_n(θ_0) + o_P(1). (4.19)
Note that P P = P, i.e., P is idempotent, with rank I_2 G; therefore, P can be written as
P = A^T [ I_{I_2 G}  0 ]
        [ 0          0 ] A,
where A is an orthogonal matrix. Let Z_n = A D_n^{−1/2} S_n(θ_0); then
Z_n → N(0, I) in distribution,
by Lemma 4.9. From Equation (4.19), we obtain that
(θ̃_n − θ̂_n)^T D_n (θ̃_n − θ̂_n)
= S_n^T(θ_0) D_n^{−1/2} P D_n^{−1/2} S_n(θ_0) + o_P(1)
= S_n^T(θ_0) D_n^{−1/2} A^T [ I_{I_2 G}  0 ; 0  0 ] A D_n^{−1/2} S_n(θ_0) + o_P(1)
= Z_n^T [ I_{I_2 G}  0 ; 0  0 ] Z_n + o_P(1)
→ χ²_{I_2 G} in distribution.
The likelihood ratio test statistic for the hypothesis in Equation (4.12) can be written as
Λ = sup_{θ ∈ Θ_0} L_n(θ) / sup_{θ ∈ Θ} L_n(θ) = L_n(θ̂_n) / L_n(θ̃_n), (4.20)
where Θ represents the overall parameter space and Θ_0 the parameter space under H0. By Taylor expansion, we can express −2 log(Λ) as
−2 log(Λ) = 2[l_n(θ̃_n) − l_n(θ̂_n)] = −2(θ̂_n − θ̃_n)^T S_n(θ̃_n) + (θ̂_n − θ̃_n)^T F_n(θ̄)(θ̂_n − θ̃_n),
where θ̄ lies between θ̂_n and θ̃_n and therefore θ̄ ∈ N_n(θ_0). Note that S_n(θ̃_n) = 0 and that F_n(θ̄) can be replaced by D_n by Lemma 4.7. Hence, −2 log(Λ) can be written as
−2 log(Λ) = (θ̃_n − θ̂_n)^T D_n (θ̃_n − θ̂_n) + o_P(1),
which converges in distribution to χ²_{I_2 G} by the quadratic-form result derived from Equation (4.19).
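In practice the goodness-of-fit comparison reduces to computing −2 log Λ from the two fitted log-likelihoods and referring it to a χ² distribution with I_2 G degrees of freedom. A minimal sketch (Python; the log-likelihood values below are hypothetical, and the closed-form survival function applies only to even degrees of freedom):

```python
import math

def lrt_statistic(loglik_reduced, loglik_full):
    """-2 log Lambda for nested piecewise models, cf. Eq. (4.20)."""
    return -2.0 * (loglik_reduced - loglik_full)

def chi2_sf_even_df(x, df):
    """Chi-square survival function for even df (closed form):
    P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!"""
    assert df % 2 == 0 and df > 0
    s = sum((x / 2.0) ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-x / 2.0) * s

# Collapsing I_2 = 1 interval in each of G = 2 groups gives df = I_2 * G = 2.
T = lrt_statistic(-105.3, -102.1)  # hypothetical fitted log-likelihoods
p = chi2_sf_even_df(T, df=2)
assert p < 0.05  # the extra interval improves the fit significantly here
```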
CHAPTER 5
SIMULATIONS
5.1 Introduction
The results discussed in the previous chapters showed the asymptotic properties of the parameter MLEs in the piecewise Gompertz model. These results enable us to carry out the hypothesis test whose null hypothesis is proposed in Equation (2.3). In this chapter, Monte Carlo simulation studies are performed in order to evaluate the normal approximation of the proposed test statistic in Equation (2.6) and the power of the test under different settings based on the amount of location shift and the sample size. Due to the complexity of the piecewise model, the procedure for generating the random samples is quite involved and is discussed in Section 5.2. Results are given on observed Type I errors and powers for both the piecewise Gompertz model and the exponential-logistic mixture model. All simulations were completed on a Sun Ultra Enterprise 450 computer using Ox 2.0 (Doornik 1998). Random number generators in the Ox 2.0 environment were used to generate uniformly distributed random numbers.
5.2 Generation of Random Samples
In this section we describe the procedure for generating the random sample when the sample follows a piecewise Gompertz model based on certain parameter settings. In order to generate a random sample of size n with a specified proportion of censoring, we need to generate a sample of failure observations of size n and a sample of censored observations of size n. The sample of failure observations follows the piecewise Gompertz model. The sample of censored observations
is assumed to follow a uniform (0, M) distribution, where M is defined by the censoring proportion. The final sample is obtained by comparing the two samples generated. A subject is considered failed if its observation from the first sample is smaller than the one obtained from the second sample. The subject is censored if its observation from the first sample is greater than the one from the second sample.
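The comparison step above can be sketched as follows; `t_fail` and `u_cens` stand in for the two generated latent samples (the names are ours), and the output is the usual pair of observed times and failure indicators:

```python
# Combine a sample of latent failure times with a sample of latent
# censoring times: the observed time is the smaller of the two, and
# delta = 1 marks a failure (failure time smaller than censoring time).

def observe(t_fail, u_cens):
    times = [min(t, u) for t, u in zip(t_fail, u_cens)]
    delta = [1 if t < u else 0 for t, u in zip(t_fail, u_cens)]
    return times, delta

# toy latent samples (illustrative values, not from the dissertation)
t_fail = [1.2, 5.0, 0.7, 9.3]
u_cens = [3.1, 2.5, 4.0, 8.8]
times, delta = observe(t_fail, u_cens)
# times -> [1.2, 2.5, 0.7, 8.8], delta -> [1, 0, 1, 0]
```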
5.2.1 Parameter Settings
We consider five different parameter settings in our Monte Carlo simulations. There is no treatment difference in random samples generated under Settings 1 and 2; therefore we can compare Type I errors between tests based on random samples generated from these two settings. In parameter Settings 3, 4, and 5, treatment effects exist between groups. Settings 3 to 5 present the smallest, the intermediate, and the largest difference between treatment groups, respectively. The comparison of powers of tests is based on random samples generated from these three parameter settings. All five parameter settings are listed as follows:

Setting 1:
* Two treatment groups
* Three intervals: [0, 3), [3, 7), and [7, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values
* Censoring rate: 30%
* $\lambda_{10} = \lambda_{11} = -4.2$
* $\lambda_{20} = \lambda_{21} = -0.52$
* $\lambda_{30} = \lambda_{31} = -0.2$
* $\beta_0 = -0.1$, $\beta_1 = -0.2$
Setting 2:
* Two treatment groups
* Three intervals: [0, 4), [4, 10), and [10, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values
* Censoring rate: 30%
* $\lambda_{10} = \lambda_{11} = -3.5$
* $\lambda_{20} = \lambda_{21} = -0.45$
* $\lambda_{30} = \lambda_{31} = -0.15$
* $\beta_0 = -0.2$, $\beta_1 = -0.15$

Setting 3:
" Two treatment groups " Three intervals: [0, 3), [3, 7), and [7, oc) " One covariate with value 0 or 1 " Cure proportion varied with covariate values and treatment groups " Censoring rates: 30% " A10 = 3.5 and A11 = 4.2 " A20 = 0.45 and A21 = 0.52 " A30 = 0.21 and A31 = 0.2 " 0o = 0.1, 01 = 0.2 Setting 4:
" Two treatment groups " Three intervals: [0, 4), [4, 10), and [10, cc) " One covariate with value 0 or 1 " Cure proportion varied with covariate values and treatment groups " Censoring rates: 30% " A10 = 3.9 and All = 4.2 " A20 = 0.45 and A21 = 0.4 " A30 = 0.165 and A31 = 0.18
* 0 =  0.2, 31 = 0.15 Setting 5:
" Two treatment groups " Three intervals: [0, 4), [4, 10), and [10, oc) " One covariate with value 0 or 1 " Cure proportion varied with covariate values and treatment groups " Censoring rates: 30% " A10 = 3.5 and A11 = 4.2 " A20 = 0.45 and A21 = 0.4 " A30 = 0.15 and A31 = 0.18 " 30 = 0.2, 01 = 0.15
The survival curves for the above five parameter settings are shown in Figures (5-1) through (5-5). The survival probabilities are calculated by applying
$$S(t) = \exp\Big\{e^{\beta^T x}\Big(\sum_{m=1}^{I-1}\frac{e^{\lambda_{mg}\tau_{m-1}} - e^{\lambda_{mg}\tau_m}}{\lambda_{mg}} + \frac{e^{\lambda_{Ig}\tau_{I-1}} - e^{\lambda_{Ig}t}}{\lambda_{Ig}}\Big)\Big\} \quad (5.1)$$
for $t \in [\tau_{I-1}, \tau_I)$. There are two separate survival curves in the first two figures for individuals with covariate values of 0 and 1. For Figures (5-3), (5-4), and (5-5), we draw four survival curves for all four possible combinations of the covariate and the treatment group indicator.
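Equation (5.1) can be sketched directly in code. The following is a minimal implementation of our own, using the parameter values of setting 1 with covariate $x = 0$ (function and variable names are assumptions for illustration); it reproduces the survival probabilities at the cut points:

```python
import math

# Piecewise Gompertz survival function of Equation (5.1) for one group.
# taus holds the interior cut points (the last interval is open-ended),
# lams the per-interval Gompertz parameters, and beta the linear
# predictor beta0 + beta1*x. Values below: setting 1 with x = 0.

def surv(t, taus, lams, beta):
    total, lo = 0.0, 0.0
    for tau, lam in zip(taus + [None], lams):
        hi = t if tau is None or t < tau else tau
        total += (math.exp(lam * lo) - math.exp(lam * hi)) / lam
        if tau is None or t < tau:
            break
        lo = tau
    return math.exp(math.exp(beta) * total)

taus, lams, beta = [3.0, 7.0], [-4.2, -0.52, -0.2], -0.1
print(round(surv(3.0, taus, lams, beta), 2))  # 0.81
print(round(surv(7.0, taus, lams, beta), 2))  # 0.59
```

The two printed values match the branch boundaries used later when inverting the survival function.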
5.2.2 Generation of Censored Observations
From the assumption, the sample of censored observations follows a uniform (0, M) distribution. Therefore, we can obtain a sample from a uniform (0, 1) distribution and then transform it into a sample that follows a uniform (0, M) distribution. The primary task is to obtain an appropriate value for M. Let U and T denote the random variables for censored and failure observations, respectively. Taking parameter setting 1 as an example, we illustrate the derivation of the
Figure 5-1. Survival curves for patients with parameter setting 1.
Figure 5-2. Survival curves for patients with parameter setting 2.
Figure 5-3. Survival curves for patients with parameter setting 3.
Figure 5-4. Survival curves for patients with parameter setting 4.
Figure 5-5. Survival curves for patients with parameter setting 5.
censoring proportion $P(U < T)$ as follows. For $M \in [0, 3)$,
$$P(U < T) = \frac{1}{M}\int_0^M \exp\Big\{e^{\beta_0+\beta_1 x}\,\frac{1 - e^{\lambda_{1g}c}}{\lambda_{1g}}\Big\}\,dc;$$
for $M \in [3, 7)$,
$$P(U < T) = \frac{1}{M}\int_0^3 \exp\Big\{e^{\beta_0+\beta_1 x}\,\frac{1 - e^{\lambda_{1g}c}}{\lambda_{1g}}\Big\}\,dc + \frac{1}{M}\int_3^M \exp\Big\{e^{\beta_0+\beta_1 x}\Big(\frac{1 - e^{3\lambda_{1g}}}{\lambda_{1g}} + \frac{e^{3\lambda_{2g}} - e^{\lambda_{2g}c}}{\lambda_{2g}}\Big)\Big\}\,dc;$$
and for $M \in [7, \infty)$,
$$P(U < T) = \frac{1}{M}\int_0^3 \exp\Big\{e^{\beta_0+\beta_1 x}\,\frac{1 - e^{\lambda_{1g}c}}{\lambda_{1g}}\Big\}\,dc + \frac{1}{M}\int_3^7 \exp\Big\{e^{\beta_0+\beta_1 x}\Big(\frac{1 - e^{3\lambda_{1g}}}{\lambda_{1g}} + \frac{e^{3\lambda_{2g}} - e^{\lambda_{2g}c}}{\lambda_{2g}}\Big)\Big\}\,dc$$
$$+ \frac{1}{M}\int_7^M \exp\Big\{e^{\beta_0+\beta_1 x}\Big(\frac{1 - e^{3\lambda_{1g}}}{\lambda_{1g}} + \frac{e^{3\lambda_{2g}} - e^{7\lambda_{2g}}}{\lambda_{2g}} + \frac{e^{7\lambda_{3g}} - e^{\lambda_{3g}c}}{\lambda_{3g}}\Big)\Big\}\,dc.$$
The integrals in the above equations are calculated approximately using Simpson's rule; the details of Simpson's rule are provided in Appendix J. We then search for the value of $M$ at which $P(U < T)$ reaches the required censoring proportion.
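The search for M can be sketched as follows. This is our own illustration (helper names are assumptions), combining composite Simpson quadrature with bisection, for parameter setting 1 with x = 0:

```python
import math

# Search for the censoring bound M such that the censoring proportion
# P(U < T) = (1/M) * integral_0^M S(c) dc hits a target (here 30%).
# The integral is evaluated by composite Simpson's rule.

def surv(t):
    # piecewise Gompertz survival, setting 1 with x = 0
    taus, lams, beta = [3.0, 7.0], [-4.2, -0.52, -0.2], -0.1
    total, lo = 0.0, 0.0
    for tau, lam in zip(taus + [None], lams):
        hi = t if tau is None or t < tau else tau
        total += (math.exp(lam * lo) - math.exp(lam * hi)) / lam
        if tau is None or t < tau:
            break
        lo = tau
    return math.exp(math.exp(beta) * total)

def simpson(f, a, b, n=400):                  # n must be even
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h)
                          for k in range(1, n))
    return s * h / 3.0

def cens_prop(M):
    return simpson(surv, 0.0, M) / M

def find_M(target, lo=1.0, hi=500.0, tol=1e-6):
    # the average of a decreasing survival curve decreases in M: bisect
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cens_prop(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

M = find_M(0.30)
```

Bisection applies because the average survival over [0, M] decreases monotonically toward the cure fraction as M grows, so there is a unique M giving any target censoring proportion between those limits.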
5.2.3 Generation of Failure Observations
In order to generate a sample of failure observations, we first generate a sample of uniform (0, 1) observations. Each uniform value is then treated as the survival probability of the failure time T. By inverting the survival function defined in Equation (5.1), we obtain the value of the failure time. Note that special attention is required in the inversion because we are dealing with a piecewise distribution. Examples are given for observations with covariate x = 0 in the first parameter setting.
From Figure (5-1), we observe that each survival curve consists of three pieces based on the three intervals we consider in the study. Therefore, the inversion function for generating the failure time observation depends on the value of the generated uniform random number $u^*$. The survival probability when $t = 0$ is 1; the survival probability at $\tau_1 = 3$ is
$$P(T > 3) = \exp\Big\{e^{-0.1}\,\frac{1 - e^{-4.2\times 3}}{-4.2}\Big\} \approx 0.81;$$
and
$$P(T > 7) = \exp\Big\{e^{-0.1}\Big(\frac{1 - e^{-4.2\times 3}}{-4.2} + \frac{e^{-0.52\times 3} - e^{-0.52\times 7}}{-0.52}\Big)\Big\} \approx 0.59.$$
Therefore, if $u^* \in [0.81, 1]$, then
$$T = \frac{1}{\lambda_{1g}}\log\big[e^{\lambda_{1g}\tau_0} - \lambda_{1g}\,e^{-\beta_0}\log(u^*)\big];$$
if $u^* \in [0.59, 0.81)$, then
$$T = \frac{1}{\lambda_{2g}}\log\Big\{e^{\lambda_{2g}\tau_1} - \lambda_{2g}\Big[e^{-\beta_0}\log(u^*) - \frac{e^{\lambda_{1g}\tau_0} - e^{\lambda_{1g}\tau_1}}{\lambda_{1g}}\Big]\Big\};$$
and if $u^* \in [0, 0.59)$, then
$$T = \frac{1}{\lambda_{3g}}\log\Big\{e^{\lambda_{3g}\tau_2} - \lambda_{3g}\Big[e^{-\beta_0}\log(u^*) - \frac{e^{\lambda_{1g}\tau_0} - e^{\lambda_{1g}\tau_1}}{\lambda_{1g}} - \frac{e^{\lambda_{2g}\tau_1} - e^{\lambda_{2g}\tau_2}}{\lambda_{2g}}\Big]\Big\}.$$
Note that it is necessary that there is at least one failure observation in each interval for each treatment group in order to make inferences on all parameters.
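The three inversion branches can be sketched in code. This is our own illustration for setting 1 with x = 0 (cut points 3 and 7, $\beta_0 = -0.1$); a round trip through the cut points checks the branch formulas:

```python
import math

# Inverse-transform sampling of a failure time, parameter setting 1, x = 0.
# The branch is chosen by comparing u* with the survival probabilities at
# the cut points, S(3) ~ 0.81 and S(7) ~ 0.59.

l1, l2, l3 = -4.2, -0.52, -0.2
b0 = -0.1
t1, t2 = 3.0, 7.0

# cumulative-hazard contributions of the first two intervals
h1 = (1.0 - math.exp(l1 * t1)) / l1
h2 = (math.exp(l2 * t1) - math.exp(l2 * t2)) / l2
s3 = math.exp(math.exp(b0) * h1)          # S(3)
s7 = math.exp(math.exp(b0) * (h1 + h2))   # S(7)

def fail_time(u):
    w = math.exp(-b0) * math.log(u)       # e^{-beta0} log u*
    if u >= s3:
        return math.log(1.0 - l1 * w) / l1
    if u >= s7:
        return math.log(math.exp(l2 * t1) - l2 * (w - h1)) / l2
    return math.log(math.exp(l3 * t2) - l3 * (w - h1 - h2)) / l3

# round trip: each cut-point survival level maps back to its cut point
print(round(fail_time(s3), 6))  # 3.0
print(round(fail_time(s7), 6))  # 7.0
```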
5.3 Type I Error and Power
Type I errors and powers of tests are calculated based on the piecewise Gompertz model and on the exponential-logistic mixture model for all five sets of parameters. The test statistic based on the piecewise Gompertz model proposed in this dissertation, as we recall, was presented in Equation (2.6). For the exponential-logistic mixture model, we assume that there is one covariate $x$ in the model. The time to failure is assumed to follow an exponential distribution with mean $1/\lambda$, where $\lambda$ is modelled as a function of the covariate $x$ and the treatment indicator $g$: $\lambda = \exp(\gamma_0 + \gamma_1 x + \gamma_2 g)$. The cure rate is defined as
$$P(Z = 0) = \frac{1}{1 + e^{\beta_0 + \beta_1 x + \beta_2 g}}.$$
Therefore, for a fixed covariate value, the test of the equality of cure rates between the two treatment groups is equivalent to the test of $H_0: \beta_2 = 0$. Hence, the standardized test statistic for this null hypothesis is
$$TS = \frac{\hat\beta_2}{\sqrt{\hat V(\hat\beta_2)}}.$$
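A standardized statistic of this form is referred to the standard normal distribution. The following sketch (our own; the numeric inputs are illustrative, not estimates from the dissertation) computes the two-sided p-value:

```python
import math

# Two-sided p-value for a standardized Wald statistic TS = beta2_hat / se,
# compared against the standard normal reference distribution.
# 2 * (1 - Phi(|ts|)) = erfc(|ts| / sqrt(2)).

def wald_p(beta_hat, se):
    ts = beta_hat / se
    return math.erfc(abs(ts) / math.sqrt(2.0))

p = wald_p(0.40, 0.15)   # illustrative estimate and standard error
print(p < 0.05)           # True
```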
Type I errors based on the first two parameter settings for both models are shown in Table (5-1). Powers for tests based on parameter settings 3 through 5 for both models are presented in Table (5-2). All tests are carried out for 10000 generated random samples and for three different sample sizes: 400, 1000, and 2000. The numbers of unobtainable test statistics (# NaNs) for the exponential-logistic mixture model are also shown with each table. The unobtainable test statistics occur because the likelihood function for the mixture model is nearly flat at the point where the MLEs are obtained. When the sample size becomes larger, however, unobtainable or unidentifiable test statistics are eliminated.
We present bar plots of all standardized test statistics obtained for all three sample sizes based on those five parameter settings. These plots help to visualize the advantages of the piecewise Gompertz model in terms of the normal approximation of the test under the null hypothesis and the power of the test under the alternative hypothesis. Figures (5-6), (5-7), and (5-8) show the distributions of the test statistics under parameter setting 1 for both the mixture model and the piecewise Gompertz model. These three figures use sample sizes of 400, 1000, and
Table 5-1. Type I Errors for the Test of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter Setting   Sample Size     Type I Error (Mixture)   Type I Error (Gompertz)
1                   (200, 200)      0.0305                   0.0375
1                   (500, 500)      0.0171                   0.0412
1                   (1000, 1000)    0.0365                   0.0412
2                   (200, 200)      0.0203                   0.0373
2                   (500, 500)      0.0356                   0.0360
2                   (1000, 1000)    0.0402                   0.0408

# NaNs (Mixture): 220; 1.
Table 5-2. Powers for Tests of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter Setting   Sample Size     Power (Mixture)   Power (Gompertz)
3                   (200, 200)      0.0260            0.1016
3                   (500, 500)      0.0328            0.2024
3                   (1000, 1000)    0.0679            0.3703
4                   (200, 200)      0.0616            0.1896
4                   (500, 500)      0.0357            0.4351
4                   (1000, 1000)    0.2721            0.7379
5                   (200, 200)      0.2706            0.5572
5                   (500, 500)      0.7180            0.9331
5                   (1000, 1000)    0.9601            0.9981

# NaNs (Mixture): 127; 5; 28.
2000, respectively. Similarly, Figures (5-9), (5-10), and (5-11) are for parameter setting 2; Figures (5-12), (5-13), and (5-14) are for parameter setting 3; Figures (5-15), (5-16), and (5-17) are for parameter setting 4; and Figures (5-18), (5-19), and (5-20) are for parameter setting 5.
From these tables and plots, we observe that Type I errors obtained from the three-piece Gompertz model are closer to 0.05 than those from the mixture model. Powers presented in Table (5-2) for the three-piece Gompertz model are much larger than those for the mixture model for all sample sizes. The estimation procedure for obtaining MLEs is also easier to carry out for the three-piece Gompertz model than for the exponential-logistic model. In addition, the algorithm for the three-piece Gompertz model converges much faster.
Figure 5-6. Standardized test statistics for parameter setting 1 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-7. Standardized test statistics for parameter setting 1 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-8. Standardized test statistics for parameter setting 1 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-9. Standardized test statistics for parameter setting 2 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-10. Standardized test statistics for parameter setting 2 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-11. Standardized test statistics for parameter setting 2 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-12. Standardized test statistics for parameter setting 3 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-13. Standardized test statistics for parameter setting 3 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-14. Standardized test statistics for parameter setting 3 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-15. Standardized test statistics for parameter setting 4 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-16. Standardized test statistics for parameter setting 4 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-17. Standardized test statistics for parameter setting 4 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-18. Standardized test statistics for parameter setting 5 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-19. Standardized test statistics for parameter setting 5 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.
Figure 5-20. Standardized test statistics for parameter setting 5 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
CHAPTER 6
EXAMPLE
6.1 Introduction
In Chapter 2 we proposed a test statistic that addresses the hypothesis presented in Equation (2.3). Subsequently, in Chapters 3 and 4, we established the asymptotic normality of the parameter estimates in the piecewise Gompertz model, which validated the test of hypothesis proposed in Chapter 2. Chapter 5 reported Monte Carlo Type I error and power calculations, which established the strength of our test statistic in comparing cure rates under various parameter settings. In this chapter, we demonstrate the applicability of our test statistic with a data set.
The data set comes from a study by the Pediatric Oncology Group (POG) at the University of Florida (Ravindranath et al. 1996). The motivation of this study was to compare cure proportions among patients with acute myeloid leukemia (AML) under different treatments. Three treatments were studied: allogeneic bone marrow transplantation (allo. BMT), autologous bone marrow transplantation (auto. BMT), and intensive chemotherapy. In order to register in the study, all patients were required to be in remission after the first two courses of induced chemotherapy. The POG evaluated 666 patients whose ages were under 21 from June 1988 to March 1993. A total of 552 patients entered remission and were therefore eligible for the study. Of the eligible patients, 89 received allo. BMT; 117 were assigned to an additional six-course intensive chemotherapy; and 115 were assigned to receive auto. BMT. There were 231 patients who did not participate in the study for various reasons.
6.2 Methodology
Our objective based on this data set is to perform pairwise comparisons
on cure rates between treatment groups after controlling for patient's gender. We compare tests for three parametric models and a proportion test based on KM estimates of cure. The three parametric models are the logistic-exponential mixture model, the one-piece Gompertz model, and the three-piece Gompertz model. In addition, we present cure rates and their standard deviations for every treatment and gender combination based on estimates from the parametric models and on KM estimates. In the end, we compare the standard one-piece Gompertz model with the three-piece Gompertz model by applying the goodness-of-fit test we derived earlier. In this section, we briefly describe 1) the three parametric models and their corresponding test statistics; and 2) the proportion test based on KM estimates of cure proportions.
In the exponential-logistic mixture model, we assume an exponential distribution with parameter $\lambda$ for the failure time and a logistic regression model for the cure proportion. The expression $\log(\lambda)$ is defined as a linear function of the covariate $x$ and the treatment indicator. We define two dummy variables to represent the three-level treatment indicator: $G_1 = 1$[patient in treatment 1] and $G_2 = 1$[patient in treatment 2]. The survival function for the failure time given that the observation is not a long-term survivor ($Z = 1$) is
$$S^*(t \mid Z = 1) = e^{-\lambda t} = \exp\{-t\,e^{\gamma_0 + \gamma_1 x + \gamma_2 g_1 + \gamma_3 g_2}\}. \quad (6.1)$$
The cure probability is defined as
$$\pi_g(x) = \frac{1}{1 + e^{\beta_0 + \beta_1 x + \beta_2 g_1 + \beta_3 g_2}}. \quad (6.2)$$
Therefore the hypothesis of testing the equality of cure rates in the three treatment groups can be written as
$$H_0: \beta_2 = \beta_3 = 0. \quad (6.3)$$
Pairwise comparisons between two treatments are based on the following standardized test statistics:
$$T_1 = \frac{\hat\beta_2 - \hat\beta_3}{\sqrt{\hat V(\hat\beta_2 - \hat\beta_3)}}, \quad (6.4)$$
$$T_2 = \frac{\hat\beta_2}{\sqrt{\hat V(\hat\beta_2)}}, \text{ and} \quad (6.5)$$
$$T_3 = \frac{\hat\beta_3}{\sqrt{\hat V(\hat\beta_3)}}. \quad (6.6)$$
Note that T1 is the test statistic that compares cure proportions between Treatments 1 and 2; T2 compares cure proportions between Treatments 1 and 3; and T3 compares cure proportions between Treatments 2 and 3. The variances are obtained by inverting the observed information matrix obtained by taking the second derivatives of the loglikelihood function with respect to parameters.
The one-piece Gompertz model is a special case of the piecewise Gompertz model. Let $\lambda_g$ be the parameter associated with treatment $g$ and $\beta_1$ the coefficient associated with the covariate $x$. For $\lambda_g < 0$, the cure proportion for patients with treatment $g$ and covariate $x$ is written as
$$\pi_g(x) = \exp\{e^{\beta_0 + \beta_1 x}/\lambda_g\}. \quad (6.7)$$
If $\lambda_g < 0$, there is a positive cure proportion for patients under the $g$th treatment. If $\lambda_g < 0$ for all $g$, then the hypothesis of testing the equality of cure proportions is written as
$$H_0: \lambda_1 = \lambda_2 = \lambda_3. \quad (6.8)$$
Standardized test statistics for pairwise comparisons of cure proportions are presented as
$$T_1 = \frac{\hat\lambda_1 - \hat\lambda_2}{\sqrt{\hat V(\hat\lambda_1 - \hat\lambda_2)}}, \quad (6.9)$$
$$T_2 = \frac{\hat\lambda_1 - \hat\lambda_3}{\sqrt{\hat V(\hat\lambda_1 - \hat\lambda_3)}}, \text{ and} \quad (6.10)$$
$$T_3 = \frac{\hat\lambda_2 - \hat\lambda_3}{\sqrt{\hat V(\hat\lambda_2 - \hat\lambda_3)}}. \quad (6.11)$$
Similarly, these three standardized test statistics are for the comparison of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. The variances are obtained from the inverse of the observed information matrix.
Finally, as we have discussed in Chapter 2, the cure proportion in a piecewise Gompertz model as a function of the covariate $x$ and the treatment $g$ is expressed in Equation (2.4). We consider a three-interval case in this example. Denote the three intervals by $[\tau_0, \tau_1)$, $[\tau_1, \tau_2)$, and $[\tau_2, \tau_3)$, where $\tau_0 = 0$ and $\tau_3 = \infty$. When $\lambda_{3g} < 0$ for all $g$, testing the equality of cure proportions for any given value of $x$ is equivalent to testing
$$H_0: \sum_{i=1}^{3}\frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} = \sum_{i=1}^{3}\frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}} = \sum_{i=1}^{3}\frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}, \quad (6.12)$$
with the convention $e^{\lambda_{3g}\tau_3} = 0$, since $\tau_3 = \infty$ and $\lambda_{3g} < 0$.
Standardized test statistics for pairwise comparisons between treatment groups are presented as follows:
$$T_1 = \frac{\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i1}\tau_{i-1}} - e^{\hat\lambda_{i1}\tau_i}}{\hat\lambda_{i1}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i2}\tau_{i-1}} - e^{\hat\lambda_{i2}\tau_i}}{\hat\lambda_{i2}}}{\sqrt{\hat V\Big(\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i1}\tau_{i-1}} - e^{\hat\lambda_{i1}\tau_i}}{\hat\lambda_{i1}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i2}\tau_{i-1}} - e^{\hat\lambda_{i2}\tau_i}}{\hat\lambda_{i2}}\Big)}}, \quad (6.13)$$
$$T_2 = \frac{\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i1}\tau_{i-1}} - e^{\hat\lambda_{i1}\tau_i}}{\hat\lambda_{i1}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i3}\tau_{i-1}} - e^{\hat\lambda_{i3}\tau_i}}{\hat\lambda_{i3}}}{\sqrt{\hat V\Big(\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i1}\tau_{i-1}} - e^{\hat\lambda_{i1}\tau_i}}{\hat\lambda_{i1}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i3}\tau_{i-1}} - e^{\hat\lambda_{i3}\tau_i}}{\hat\lambda_{i3}}\Big)}}, \quad (6.14)$$
and
$$T_3 = \frac{\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i2}\tau_{i-1}} - e^{\hat\lambda_{i2}\tau_i}}{\hat\lambda_{i2}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i3}\tau_{i-1}} - e^{\hat\lambda_{i3}\tau_i}}{\hat\lambda_{i3}}}{\sqrt{\hat V\Big(\sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i2}\tau_{i-1}} - e^{\hat\lambda_{i2}\tau_i}}{\hat\lambda_{i2}} - \sum_{i=1}^{3}\dfrac{e^{\hat\lambda_{i3}\tau_{i-1}} - e^{\hat\lambda_{i3}\tau_i}}{\hat\lambda_{i3}}\Big)}}. \quad (6.15)$$
These test statistics $T_1$, $T_2$, and $T_3$ are for comparisons of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. Variance estimates are not as straightforward as for tests based on the mixture model and on the one-piece Gompertz model; they have already been presented in Chapter 2 and will not be repeated here.
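The common building block of these statistics is the cure-proportion functional itself. The following is a minimal sketch of our own of the sum appearing in Equation (6.12) and the implied cure proportion; the parameter values are illustrative (taken from simulation setting 1 with x = 0), not estimates from the leukemia data:

```python
import math

# The quantity tested in Equation (6.12): for one treatment group g, the
# sum over intervals of (e^{lam_i tau_{i-1}} - e^{lam_i tau_i}) / lam_i,
# with tau_0 = 0 and tau_3 = infinity (so e^{lam tau_3} = 0 for lam < 0).
# The cure proportion is then exp{ e^{linear predictor} * cure_sum }.

def cure_sum(lams, taus):
    pts = [0.0] + list(taus)            # tau_0, tau_1, tau_2
    total = 0.0
    for i, lam in enumerate(lams):
        upper = 0.0 if i == len(lams) - 1 else math.exp(lam * pts[i + 1])
        total += (math.exp(lam * pts[i]) - upper) / lam
    return total

def cure_prop(lams, taus, lin_pred):
    return math.exp(math.exp(lin_pred) * cure_sum(lams, taus))

pi = cure_prop([-4.2, -0.52, -0.2], [3.0, 7.0], -0.1)
print(round(pi, 3))  # 0.192
```

Since the covariate factor $e^{\beta_0+\beta_1 x}$ is common to all groups, equal `cure_sum` values across groups imply equal cure proportions for every fixed $x$, which is why (6.12) compares the sums directly.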
Proportion tests of KM estimates of cure for pairwise comparisons between treatments are generated based on KM estimates of cure proportions and standard errors of the estimated cure proportions.
P-values for tests derived from the parametric models and for the proportion test are presented in the next section. In addition, we present survival curves estimated from both the parametric models and the KM estimates, as well as the cure proportion and the corresponding standard deviation for each treatment and gender combination based on parametric estimates and on KM estimates. The result of comparing the one-piece Gompertz model with the three-piece model by the goodness-of-fit test derived earlier is also presented.
6.3 Results
6.3.1 Pairwise Comparisons on Cure Proportions
In Table (6-1), we present p-values for tests based on the three parametric models and for the proportion test based on KM estimates. The three intervals for the three-piece Gompertz model are defined as [0, 50), [50, 140), and [140, ∞); all numbers presented here are in weeks of follow-up. Tests based on the exponential-logistic mixture model, the three-piece Gompertz model, and the KM estimates detect the difference in cure proportions between allo. BMT and auto. BMT and the difference between allo. BMT and intensive chemotherapy. The allo. BMT increases the cure proportion significantly more than the other two treatments. There is no significant difference between auto. BMT and chemotherapy. A test based on the one-piece Gompertz model does not show any significant difference between any two of these three treatments. The maximization procedures for obtaining the MLEs of the parameters are smooth for both the one-piece and the three-piece Gompertz models; the algorithm converges within a few steps. However, there is a convergence problem with the mixture model: different starting values for the parameters had to be chosen in order to obtain convergence. We have discussed the convergence problem related to the mixture model in both Chapter 1 and Chapter 5, and the problem we encounter here confirms our earlier assertion.
Table 6-1. Comparison of Tests from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Proportion Test on Cure Proportions.

                              P-values for Treatment Comparisons
Comparisons                   Mixture   1-Piece Gompertz   3-Piece Gompertz   Proportion Test
Allo. BMT vs auto. BMT        0.016     0.384              0.015              0.011
Allo. BMT vs chemotherapy     0.012     0.101              0.003              0.007
Auto. BMT vs chemotherapy     0.690     0.516              0.897              0.741
In Figures (6-1) to (6-4), we show estimated survival curves from the different parametric models and from KM estimates. The three curves in Figure (6-1)a are survival curves for the three treatment groups for male patients based on the mixture model; Figure (6-1)b presents the three treatment survival curves for female patients based on the mixture model. Figure (6-2) presents survival curves estimated from the one-piece Gompertz model for male and female patients in the three treatment groups. Survival curves estimated from the three-piece Gompertz model are presented in Figure (6-3). The KM estimates of the survival curves are presented in Figure (6-4).
6.3.2 Cure Proportion Estimates
Estimates of cure proportions are presented in Table (6-2). We present estimates of cure rates and their standard deviations from the mixture model, the one-piece Gompertz model, the three-piece Gompertz model, and the KM estimate for all combinations of patient's gender and the three treatment groups. Treatment 1 represents allo. BMT, Treatment 2 represents auto. BMT, and Treatment 3 represents intensive chemotherapy. Note that a cure proportion exists for each of the combinations. Female patients show a higher cure probability than male patients. Standard errors for cure proportions estimated from both the three-piece Gompertz model and the mixture model are smaller than those from the one-piece Gompertz model and the KM estimates.
Table 6-2. Estimates and Standard Deviations of Cure Proportions Generated from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Kaplan-Meier Estimates.

                        Cure Proportions (Standard Deviations)
Gender  Treatment   Mixture         1-Piece Gompertz   3-Piece Gompertz   KM
M       1           0.558 (0.059)   0.504 (0.062)      0.568 (0.056)      0.648 (0.076)
M       2           0.386 (0.050)   0.445 (0.060)      0.397 (0.050)      0.387 (0.065)
M       3           0.407 (0.033)   0.409 (0.050)      0.405 (0.033)      0.388 (0.036)
F       1           0.597 (0.056)   0.533 (0.062)      0.596 (0.054)      0.531 (0.071)
F       2           0.424 (0.051)   0.475 (0.061)      0.429 (0.051)      0.421 (0.065)
F       3           0.445 (0.036)   0.440 (0.051)      0.437 (0.035)      0.466 (0.041)
6.3.3 Goodness-of-Fit
From the above discussion, we observe that the three-piece Gompertz model fits better than the one-piece Gompertz model. It is natural to compare these two models by applying the goodness-of-fit test we derived in this dissertation. The null
Figure 6-1. Survival curves estimated from the mixture model for three treatments. a) Male; b) Female.
Figure 6-2. Survival curves estimated from the one-piece Gompertz model for three treatments. a) Male; b) Female.
Figure 6-3. Survival curves estimated from the three-piece Gompertz model for three treatments. a) Male; b) Female.
Figure 6-4. Survival curves from Kaplan-Meier estimates for three treatments. a) Male; b) Female.
hypothesis is expressed as
$$H_0: \lambda_{1g} = \lambda_{2g} = \lambda_{3g}, \quad g = 1, 2, 3,$$
where $\lambda_{2g}$ and $\lambda_{3g}$ are the parameters for the second and third intervals in the three-piece Gompertz model, respectively. We showed that under $H_0$,
$$-2(l_1 - l_2) \xrightarrow{d} \chi^2_6,$$
where $l_1$ and $l_2$ are the maximized log-likelihoods for the one-piece and the three-piece Gompertz models, respectively. From the computation, we obtain $l_1 = -1977.289$ and $l_2 = -1950.979$, so $-2(l_1 - l_2) = 52.62$. Therefore, the test is highly significant, implying that the three-piece model fits significantly better than the one-piece Gompertz model.
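The likelihood ratio computation above can be reproduced in a few lines. The sketch below uses the reported log-likelihood values; the chi-square tail probability for 6 degrees of freedom is computed from its closed form for even degrees of freedom:

```python
import math

# Likelihood ratio comparison of the one-piece and three-piece models:
# -2 (l1 - l2) referred to a chi-square distribution with 6 degrees of
# freedom (two equality constraints per treatment group, three groups).

def chi2_sf_df6(x):
    # P(X > x) for X ~ chi-square(6): e^{-x/2} * sum_{k=0}^{2} (x/2)^k / k!
    h = x / 2.0
    return math.exp(-h) * (1.0 + h + h * h / 2.0)

l1, l2 = -1977.289, -1950.979
stat = -2.0 * (l1 - l2)
print(round(stat, 2))            # 52.62
print(chi2_sf_df6(stat) < 1e-8)  # True: highly significant
```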
CHAPTER 7
SUMMARY
In this dissertation, we presented a piecewise Gompertz model to address the long-term survival problem. We constructed a standardized test statistic based on our model which compares cure rates between treatment groups. We established the strict concavity of the likelihood function based on the model. In addition, we derived the asymptotic properties of the MLEs of the parameters in the model. A goodness-of-fit test was also derived to assess an appropriate number of intervals for the model.

In oncology studies, when there are long-term survivors and different hazard patterns over different time intervals, one might consider using our piecewise Gompertz model. Our model is more computationally feasible than both the mixture model and the semiparametric model with a bounded cumulative hazard function. In addition, our model is more flexible due to its piecewise structure. Our model also provides a more reliable estimate of the cure rate than the KM estimate when the sample size is small at the end of the study period.

In the future, we plan to find an optimal partition of the time span by deriving a test to compare different partitions. Sample size determination will also be assessed. Since we notice that there are jumps on the survival curve defined by the piecewise Gompertz model, we might consider using a log-spline model to achieve smoothness of the curve. In addition, we plan to perform more detailed simulation studies on the semiparametric model which assumes a bounded cumulative hazard function and to study the asymptotic behavior of the semiparametric model.
APPENDIX A
DERIVATION OF MARGINAL LIKELIHOOD
Let
$$I_n = \int_{t_0}^{\infty}\int_{t_1}^{\infty}\cdots\int_{t_{n-1}}^{\infty}\prod_{i=1}^{n}f(t_i)\,e^{-F(t_i)\theta_i}\,dt_n\,dt_{n-1}\cdots dt_1,$$
with $t_0 = 0$, and define
$$I_k = \int_{t_{n-k}}^{\infty}\cdots\int_{t_{n-1}}^{\infty}\prod_{i=n-k+1}^{n}f(t_i)\,e^{-F(t_i)\theta_i}\,dt_n\,dt_{n-1}\cdots dt_{n-k+1}, \quad k = 1, 2, \ldots, n.$$
When $k = 1$,
$$I_1 = \int_{t_{n-1}}^{\infty}f(t_n)\,e^{-F(t_n)\theta_n}\,dt_n = \frac{1}{\theta_n}\big[e^{-F(t_{n-1})\theta_n} - e^{-\theta_n}\big], \quad (A.1)$$
and when $k = 2$,
$$I_2 = \int_{t_{n-2}}^{\infty}\int_{t_{n-1}}^{\infty}\prod_{i=n-1}^{n}f(t_i)\,e^{-F(t_i)\theta_i}\,dt_n\,dt_{n-1} = \int_{t_{n-2}}^{\infty}f(t_{n-1})\,e^{-F(t_{n-1})\theta_{n-1}}\Big\{\int_{t_{n-1}}^{\infty}f(t_n)\,e^{-F(t_n)\theta_n}\,dt_n\Big\}\,dt_{n-1}$$
$$= \frac{1}{\theta_n}\int_{t_{n-2}}^{\infty}f(t_{n-1})\,e^{-F(t_{n-1})\theta_{n-1}}\big\{e^{-F(t_{n-1})\theta_n} - e^{-\theta_n}\big\}\,dt_{n-1}$$
$$= \frac{1}{\theta_n(\theta_{n-1}+\theta_n)}\big[e^{-(\theta_{n-1}+\theta_n)F(t_{n-2})} - e^{-(\theta_{n-1}+\theta_n)}\big] - \frac{e^{-\theta_n}}{\theta_n\theta_{n-1}}\big[e^{-\theta_{n-1}F(t_{n-2})} - e^{-\theta_{n-1}}\big].$$
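Equation (A.1) can be sanity-checked numerically for a concrete choice of F. The sketch below uses the hypothetical distribution function $F(u) = 1 - e^{-u}$ (our choice, not from the text) with $f = F'$, and compares composite Simpson quadrature against the closed form:

```python
import math

# Numerical check of Equation (A.1):
#   integral_{t}^{infinity} f(u) e^{-F(u) theta} du
#     = (e^{-theta F(t)} - e^{-theta}) / theta,
# where F is a distribution function with F(inf) = 1 and f = F'.
# Hypothetical choice: F(u) = 1 - e^{-u}, so f(u) = e^{-u}.

theta, t = 2.5, 0.7
F = lambda u: 1.0 - math.exp(-u)
f = lambda u: math.exp(-u)

# composite Simpson's rule on [t, B] with a large cutoff B
B, n = 40.0, 4000
h = (B - t) / n
g = lambda u: f(u) * math.exp(-F(u) * theta)
num = (g(t) + g(B) + sum((4 if k % 2 else 2) * g(t + k * h)
                         for k in range(1, n))) * h / 3.0

closed = (math.exp(-theta * F(t)) - math.exp(-theta)) / theta
assert abs(num - closed) < 1e-6
```

The agreement follows from the substitution $v = F(u)$, which turns the integral into $\int_{F(t)}^{1} e^{-\theta v}\,dv$.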
By a similar derivation, we obtain that when $k = 3$,
$$I_3 = \int_{t_{n-3}}^{\infty}I_2\,f(t_{n-2})\,e^{-F(t_{n-2})\theta_{n-2}}\,dt_{n-2}$$
$$= \frac{1}{\theta_n(\theta_n+\theta_{n-1})(\theta_n+\theta_{n-1}+\theta_{n-2})}\big[e^{-(\theta_n+\theta_{n-1}+\theta_{n-2})F(t_{n-3})} - e^{-(\theta_n+\theta_{n-1}+\theta_{n-2})}\big]$$
$$- \frac{e^{-\theta_n}}{\theta_n\theta_{n-1}(\theta_{n-1}+\theta_{n-2})}\big[e^{-(\theta_{n-1}+\theta_{n-2})F(t_{n-3})} - e^{-(\theta_{n-1}+\theta_{n-2})}\big]$$
$$+ \frac{e^{-(\theta_{n-1}+\theta_n)}}{\theta_{n-1}\theta_{n-2}(\theta_n+\theta_{n-1})}\big[e^{-\theta_{n-2}F(t_{n-3})} - e^{-\theta_{n-2}}\big].$$
Let
$$I_k = \sum_{i=1}^{k}(-1)^{i-1}A_{i,k}\big[e^{-S_{n-k+1,n-i+1}F(t_{n-k})} - e^{-S_{n-k+1,n-i+1}}\big]; \quad (A.2)$$
then
$$I_{k+1} = \int_{t_{n-k-1}}^{\infty}I_k\,f(t_{n-k})\,e^{-F(t_{n-k})\theta_{n-k}}\,dt_{n-k}$$
$$= \sum_{i=1}^{k}(-1)^{i-1}A_{i,k}\,\frac{e^{-S_{n-k,n-i+1}F(t_{n-k-1})} - e^{-S_{n-k,n-i+1}}}{S_{n-k,n-i+1}} + \sum_{i=1}^{k}(-1)^{i}A_{i,k}\,e^{-S_{n-k+1,n-i+1}}\,\frac{e^{-\theta_{n-k}F(t_{n-k-1})} - e^{-\theta_{n-k}}}{\theta_{n-k}}. \quad (A.3)$$
Also we know that
$$I_{k+1} = \sum_{i=1}^{k+1}(-1)^{i-1}A_{i,k+1}\big[e^{-S_{n-k,n-i+1}F(t_{n-k-1})} - e^{-S_{n-k,n-i+1}}\big]$$
$$= \sum_{i=1}^{k}(-1)^{i-1}A_{i,k+1}\big[e^{-S_{n-k,n-i+1}F(t_{n-k-1})} - e^{-S_{n-k,n-i+1}}\big] + (-1)^{k}A_{k+1,k+1}\big[e^{-S_{n-k,n-k}F(t_{n-k-1})} - e^{-S_{n-k,n-k}}\big]. \quad (A.4)$$
Equating (A.3) and (A.4), we obtain that
$$A_{i,k+1} = \frac{A_{i,k}}{S_{n-k,n-i+1}} \quad (A.5)$$
and
$$A_{k+1,k+1} = \sum_{i=1}^{k}(-1)^{i-k}A_{i,k}\,\frac{e^{-S_{n-k+1,n-i+1}}}{\theta_{n-k}}. \quad (A.6)$$
Therefore, by Equation (A.5),
$$A_{i,k} = \frac{A_{i,i}}{S_{n-i,n-i+1}S_{n-i-1,n-i+1}\cdots S_{n-k+1,n-i+1}} \quad (A.7)$$
for $i < k$, where the product in the denominator can be expressed through $P_{n-k+1,n-i+1}$, with $P_{a,b}$ as defined in Chapter 1. In addition, by Equation (A.6),
$$A_{k+1,k+1} = \sum_{i=1}^{k-1}(-1)^{i-k}A_{i,k}\,\frac{e^{-S_{n-k+1,n-i+1}}}{\theta_{n-k}} + A_{k,k}\,\frac{e^{-S_{n-k+1,n-k+1}}}{\theta_{n-k}}. \quad (A.8)$$
Also we observe that
$$I_1 = A_{1,1}\big[e^{-S_{n,n}F(t_{n-1})} - e^{-S_{n,n}}\big] = \frac{1}{\theta_n}\big[e^{-\theta_nF(t_{n-1})} - e^{-\theta_n}\big],$$
where the first equality is true by Equation (A.2) and the second from Equation (A.1). Therefore
$$A_{1,1} = \frac{1}{S_{n,n}} = \frac{1}{\theta_n}.$$
Similarly, through Equation (A.8) we obtain
$$A_{2,2} = \frac{e^{-S_{n,n}}}{S_{n,n}S_{n-1,n-1}}$$
and
$$A_{3,3} = \frac{e^{-S_{n-1,n}}}{S_{n-1,n-1}S_{n-1,n}S_{n-2,n-2}}.$$
If we let
$$A_{k,k} = \frac{e^{-S_{n-k+2,n}}}{S_{n-k+1,n-k+1}\,p_{n-k+2,n}}, \quad (A.9)$$
where $p_{n-k+2,n}$ was defined earlier in Chapter 1 as $\prod_{j=n-k+2}^{n}S_{n-k+2,j}$. In addition, we observe from Equation (A.8) that
Ak,k
k2 s
S~~~n ii (  k+lA z ni+l,n i+1e  k+2, i+1
 Sn k+l, k+l .=  pnk+2,ni+l
j=1
qAkl,k leS.k+2,.i+l
eSnk+2,n k1 1
Sk+l,nk+l {Z(1)k+1pni+2,npnk+2,ni+l (A.10)
Hence from equating Equations (A.9) and (A.10), the following equation holds.
1Y 1 1 (A.11)
(pni+l,npnk+l,ni  pnk+l,n i=0
Now we can derive Ak+x,k+l as follows:
Ak+l,k+l
k1 nSnk+l ni+l
"(p nki+1),n + + Ak,ke Snk+lnk+l1
Snk,nk t_ / , pnk+l,ni+l"r"Zkk J
i=1
e Snk+2,n k1 1 1
Snk,nk {= ()ikpni+2,npnk +1ni+1 +Snk+l,nk+lPnt+2n} i1
ef k lnk 1 1 1
Snk,nk {pnk+l,n pnk+2,nP k+,nkï¿½1 ï¿½ Snk+l,nk+lPnk+2,n
eSnkn2,n (A.12)
Snk,,nk pnk +l,,n A.2
by Equation (A.11). Therefore, Sni+l,ni+l (A.13) Ai,k = Pnk+l,ni+lPni+2,n,
for i < k. Ik, as defined in Equation (A.2), therefore can be written as
k 1e i+2,nSnk+1"nj+ F(t k)  e Sk+(14)
Ik pni+2,nPk+l,,i+l
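As a numerical sanity check on the closed form (A.2), the case k = 2 can be verified against direct numerical integration. The sketch below uses hypothetical θ values; the substitution u = F(t) removes the particular choice of F from the problem, leaving a one-dimensional integral on [F(t_{n-2}), 1]:

```python
import math

def I2_closed(Fa, th_nm1, th_n):
    # Closed form of I_2 from (A.2), with S_{n,n} = th_n,
    # S_{n-1,n} = th_nm1 + th_n, and S_{n-1,n-1} = th_nm1.
    S_nn, S_n1n, S_n1n1 = th_n, th_nm1 + th_n, th_nm1
    t1 = (math.exp(-S_n1n * Fa) - math.exp(-S_n1n)) / (S_nn * S_n1n)
    t2 = math.exp(-S_nn) * (math.exp(-S_n1n1 * Fa) - math.exp(-S_n1n1)) / (S_nn * S_n1n1)
    return t1 - t2

def I2_numeric(Fa, th_nm1, th_n, m=20000):
    # After substituting u = F(t), I_2 reduces to a one-dimensional
    # integral over [F(t_{n-2}), 1]; the inner integral is done exactly,
    # the outer one by the midpoint rule with m panels.
    h = (1.0 - Fa) / m
    total = 0.0
    for j in range(m):
        u = Fa + (j + 0.5) * h
        inner = (math.exp(-th_n * u) - math.exp(-th_n)) / th_n
        total += math.exp(-th_nm1 * u) * inner * h
    return total
```

With, say, F(t_{n-2}) = 0.3, θ_{n-1} = 1.2, and θ_n = 0.7, the two computations agree to several decimal places, which is consistent with the derivation above.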

PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM

By

HAOYI CHEN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2001
Copyright 2001 by Haoyi Chen
I dedicate this work to Hua.
ACKNOWLEDGMENTS

I would like to take this opportunity to express my deepest gratitude to Professor Myron N. Chang, my advisor, for his guidance, tremendous help, and encouragement during my doctoral research. This dissertation would not have been possible without his forceful guidance and excellent advisement. I also would like to thank Professor Randolph R. Carter, my graduate research supervisor, for his guidance and support during the various epidemiological research projects with which I have been involved as a graduate research assistant. My deepest appreciation is extended to the other committee members, Dr. Ronald Randles, Dr. Samuel S. Wu, and Dr. Michael B. Resnick, for their advice on my doctoral research. Lastly, I would like to extend my gratitude to my husband Hua and my parents for their never-ending support, understanding, and love.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
CHAPTERS
1 INTRODUCTION
  1.1 Problem and Common Practice
    1.1.1 Mixture Model
    1.1.2 Binary Regression Model
    1.1.3 Nonparametric Method
    1.1.4 Model with Bounded Cumulative Hazard Rate
  1.2 Computation Difficulty of the Semiparametric Model
  1.3 Piecewise Exponential Model
2 TOPIC OF RESEARCH
  2.1 Data Structure and Model
  2.2 Assumptions
  2.3 Hypothesis of Interest
  2.4 Test of Hypothesis
3 ESTIMATION OF PARAMETERS
  3.1 Likelihood Function and Information Matrices
  3.2 Concavity of Likelihood Function
4 PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES
  4.1 Introduction
  4.2 Preliminary Proofs
  4.3 Existence and Consistency of the MLE
  4.4 Asymptotic Normality of the MLE
  4.5 Goodness-of-Fit Test
5 SIMULATIONS
  5.1 Introduction
  5.2 Generation of Random Samples
    5.2.1 Parameter Settings
    5.2.2 Generation of Censored Observations
    5.2.3 Generation of Failure Observations
  5.3 Type I Error and Power
6 EXAMPLE
  6.1 Introduction
  6.2 Methodology
  6.3 Results
    6.3.1 Pairwise Comparisons on Cure Proportions
    6.3.2 Cure Proportion Estimates
    6.3.3 Goodness-of-Fit
7 SUMMARY
APPENDICES
A DERIVATION OF MARGINAL LIKELIHOOD
B PROOF OF LEMMA 4.1
C PROOF OF LEMMA 4.2
D PROOF OF LEMMA 4.3
E PROOF OF LEMMA 4.4
F PROOF OF LEMMA 4.5
G PROOF OF LEMMA 4.6
H PROOF OF LEMMA 4.7
I PROOF OF LEMMA 4.8
J SIMPSON'S FORMULA
REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM

By Haoyi Chen
December 2001
Chair: Myron N. Chang
Major Department: Statistics

In cancer research, some of the patients are disease-free after certain treatments. It is of interest to compare treatment efficacy with respect to long-term survival rates. One commonly used approach in analyzing this type of data is to compare the Kaplan-Meier estimates of the cure rate. Another approach is to apply the mixture model proposed by Farewell. However, the Kaplan-Meier estimates are unstable toward the end point, while the mixture model is computationally too complex. To overcome these difficulties, we propose a test based on a piecewise Gompertz model to compare drug efficacy on the cure rate. The proposed test also accommodates the situation where patients display different hazard patterns during different treatment stages. In this work, we have derived the strict concavity of the log-likelihood function and the existence, consistency, and asymptotic normality of the maximum likelihood estimates of the parameters. In addition, our Monte Carlo simulation study shows that the proposed test is more computationally feasible and more powerful than the test based on Farewell's mixture model. Moreover, an example is given to show the utility of our proposed test. A goodness-of-fit test of our proposed model is also discussed.
CHAPTER 1
INTRODUCTION

1.1 Problem and Common Practice

In survival analysis, one is interested in comparing the efficacy of various drugs or treatments. The survival probability is generally assumed to converge to zero as the observation time becomes large. However, it is not uncommon in cancer research that a proportion of patients are disease-free (relapse-free) after certain treatments. These patients are called "long-term survivors" or "cured." For instance, in clinical trials dealing with childhood leukemia, some treatments do cure a proportion of patients permanently. This type of long-term survival pattern is different from what is dealt with under traditional survival analysis. In addition, it is sometimes necessary to assume that hazard functions are different on different intervals. A piecewise exponential model proposed by Friedman (1981) accommodates this assumption. This model was considered semiparametric by Aitkin et al. (1989, pp. 287-292). They stated that the piecewise exponential model has the same flexibility as the proportional hazards model in which the baseline hazard function is unspecified. In this chapter, we first introduce the common methodologies in the literature that deal with cure rate problems. We also discuss the piecewise exponential model proposed by Friedman, which assumes different hazard functions on different time intervals. In the literature, there are several methods dealing with the long-term survival problem. They are as follows:

• Using a mixture model which assumes the underlying population is a mixture of two populations, patients who are cured and patients who are not cured.
• Applying a modified binary regression model in which the main interest is to compare the cure rates among patient groups.

• Using nonparametric methods to compare the survival curves or to compare cure proportions among treatment arms.

• Assuming a bounded cumulative hazard rate instead of a cumulative hazard rate with an infinite limit.

In the next four sections, we describe each method in more detail. Assume that there are n patients in the study. For the i-th patient, let $T_i$ denote the time to relapse (incidence), $U_i$ the censoring time, $T_i^* = \min(T_i, U_i)$, and $\delta_i = I[T_i \le U_i]$ the failure indicator.

1.1.1 Mixture Model

The mixture model assumes that each patient is either cured, $Z_i = 0$, or susceptible to failure, $Z_i = 1$, so that the overall survival function is
$$ S(t) = P(Z_i = 0) + P(Z_i = 1)\,S^*(t \mid Z_i = 1), \qquad (1.1) $$
where $S^*$ denotes the survival function of the noncured patients.
Therefore, the overall likelihood function is
$$ L = \prod_{i=1}^{n}\left\{P(Z_i = 1)\,f^*(t_i^* \mid Z_i = 1)\right\}^{\delta_i}\left\{P(Z_i = 0) + P(Z_i = 1)\,S^*(t_i^* \mid Z_i = 1)\right\}^{1-\delta_i}. \qquad (1.2) $$
Two influential articles addressing this method were written by Farewell in 1977 and 1982. In the first article, he expressed the binary variable Z using a logistic regression model,
$$ P(Z_i = 1) = \frac{e^{\beta^T x_i}}{1 + e^{\beta^T x_i}}, \qquad (1.3) $$
where β is a vector of coefficients associated with the vector of covariates x. Farewell (1977) also assumed that the time to incidence T follows an exponential distribution with mean 1/λ, and λ is assumed to be known. Thus, when $\delta_i = 1$, the contribution of the i-th patient to the likelihood is
$$ f^*(T_i = t_i \mid Z_i = 1)\,P(Z_i = 1) = \frac{e^{\beta^T x_i}}{1 + e^{\beta^T x_i}}\,\lambda e^{-\lambda t_i}, \qquad t_i > 0. $$
When $\delta_i = 0$, the contribution of the i-th patient to the likelihood is
$$ P(Z_i = 0) + P(Z_i = 1)\,S^*(T_i = t_i \mid Z_i = 1) = \frac{1}{1 + e^{\beta^T x_i}} + \frac{e^{\beta^T x_i}}{1 + e^{\beta^T x_i}}\,e^{-\lambda t_i}, \qquad t_i > 0. $$
Type I censoring was assumed in Farewell (1977). The author also assessed the efficiency of the logistic-exponential mixture model proposed in his paper relative to a regular logistic model, which assumes Z = 0 if the patient does not have a relapse until T⁰, defined as the end point of the observation time. He found that the efficiency of the mixture model increases as the follow-up time T⁰ increases and as P(Z = 1) decreases. When the true underlying distribution is a mixture model, the estimate of the cure rate from the regular logistic model is biased. Farewell (1982) proposed two more general models for the time to incidence T. He assumed that T follows a Weibull distribution in his first model. The probability density function and the survival function of T given that the patient is
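The per-patient contributions to the likelihood (1.2) under the logistic-exponential specification can be coded directly. This is a minimal sketch (the function name and data layout are ours, not Farewell's); the rate λ is passed in rather than estimated:

```python
import math

def loglik_logistic_exponential(beta0, beta1, lam, data):
    """Log of likelihood (1.2) with logistic P(Z=1|x) and T|Z=1 ~ Exp(lam).

    data is a list of (t, x, delta) triples, where delta = 1 marks an
    observed failure and delta = 0 a censored observation (hypothetical layout).
    """
    ll = 0.0
    for t, x, delta in data:
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))  # P(Z = 1 | x)
        if delta == 1:
            # failure: cured patients cannot fail, so only Z = 1 contributes
            ll += math.log(p * lam * math.exp(-lam * t))
        else:
            # censored: either cured, or susceptible and surviving past t
            ll += math.log((1.0 - p) + p * math.exp(-lam * t))
    return ll
```

For example, with beta0 = beta1 = 0 every patient is susceptible with probability 1/2, so a single failure at t = 1 with λ = 1 contributes log(1/2) - 1 to the log-likelihood.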
noncured are written as
$$ f^*(T = t \mid Z = 1, x) = C\lambda(\lambda t)^{C-1}\exp\{-(\lambda t)^C\}, \qquad t > 0, \qquad (1.4) $$
and
$$ S^*(T = t \mid Z = 1, x) = \exp\{-(\lambda t)^C\}, \qquad t > 0, \qquad (1.5) $$
where C is an unknown parameter, λ = exp(-γ^T x) is a function of covariates, and γ is a vector of unknown parameters associated with covariates x. The likelihood function is obtained by plugging P(Z = 1) defined in Equation (1.3), the probability density function defined in Equation (1.4), and the survival function defined in Equation (1.5) into the likelihood function in Equation (1.2). Farewell (1982) proposed the second model for grouped data on survival time. Assume there are k failure time groups $(t_i, t_{i+1}]$, i = 1, 2, ..., k. The conditional probability of $T = t_i$ given that $T \ge t_i$ is
$$ P(T = t_i \mid Z = 1, T \ge t_i, x) = \frac{e^{\alpha_i + \theta^T x}}{1 + e^{\alpha_i + \theta^T x}}, \qquad (1.6) $$
where $\alpha_i$ (i = 1, 2, ..., k) are group-specific parameters. The vector θ is associated with covariates x and is common to all time groups. By combining Equations (1.2), (1.3), and (1.6), the likelihood function is obtained. Farewell (1982) commented that the mixture model is appealing from both biological and statistical points of view. It has been used frequently in cancer studies, especially in the study of breast cancer. Strong evidence of the existence of individuals coming from different populations, however, is required before using the mixture model. Farewell also pointed out that the likelihood function might be flat for some data sets, which would cause possible nonidentifiability of the parameters. Farewell (1986) discussed some problems related to the mixture model by applying the model to two data sets. The first data set was from a breast cancer trial with 139 patients. He applied a logistic-Weibull mixture model to the data. He then computed the profile likelihood function for β₀, the intercept parameter
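The Weibull pieces (1.4) and (1.5) are straightforward to evaluate; a small sketch (names are ours) that also exhibits the C = 1 exponential special case used in Farewell (1977):

```python
import math

def weibull_surv(t, lam, C):
    # S*(t) = exp{-(lam*t)^C}, the survival function (1.5) for the noncured group
    return math.exp(-((lam * t) ** C))

def weibull_dens(t, lam, C):
    # f*(t) = C*lam*(lam*t)^(C-1) * exp{-(lam*t)^C}, the density (1.4)
    return C * lam * (lam * t) ** (C - 1) * math.exp(-((lam * t) ** C))
```

With C = 1 these reduce to the exponential distribution with rate λ, and the density is minus the derivative of the survival function, which gives an easy numerical check.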
in the logistic portion of the mixture model. The profile likelihood function for β₀ was obtained by replacing the other parameters in the model by their maximum likelihood estimates (MLE). He found that the profile likelihood function had a strongly nonquadratic shape, indicating that significant nonidentifiability existed with respect to the estimation of β₀. Therefore, this logistic-Weibull mixture model presented an unreliable estimate of the cure proportion. Farewell also applied a multinomial-Weibull mixture model to a data set from Risch (1983) on the time to the onset of a major affective illness in siblings of affected individuals. Risch's data were grouped into six time intervals. The onset probabilities were modelled by a multinomial distribution. The mixture model fitted the data well. Therefore, Farewell concluded that one should be cautious when using mixture models. The cure proportion might be very imprecise in some cases, and strong evidence for the existence of two populations is required to make a reliable inference. A significant amount of literature has discussed the mixture model. The common parametric distributions for the time to incidence for the noncured population can take many forms. For example, they can be exponential (Berkson & Gage 1952, Bithell & Upton 1977, Farewell 1977, Goldman 1984); lognormal (Boag 1949); Weibull (Farewell 1982); Gompertz (Gordon 1990); generalized gamma (Yamaguchi 1992); and generalized F-distribution (Peng et al. 1998). The distribution for the time to incidence can also take nonparametric or semiparametric forms, such as Kaplan-Meier (KM) estimates (Laska & Meisner 1992, Taylor 1995) and the proportional hazards model (Kuk & Chen 1992). The probability P(Z = 1) can be treated as a parameter in a Bernoulli trial. It can also be written as a function of a vector of covariates x in a logistic regression model, such as the model defined in Equation (1.3).
To conclude the previous discussion, it is generally plausible to use the mixture model when there exists strong scientific evidence of the mixture of
different populations. However, as both Farewell (1977, 1982, & 1986) and Cantor and Shuster (1992) pointed out, the likelihood function is sometimes nonquadratic or flat, which could produce problems for both computation and inference. In addition, parameter estimates could fall outside the parameter space when the data do not support a nonzero surviving fraction. The asymptotic properties of the MLE from a logistic-parametric mixture model were developed systematically by Maller and Zhou (1996). The asymptotic properties related to the model parameter estimates in both logistic-semiparametric and logistic-nonparametric models have not been explored in the literature.

1.1.2 Binary Regression Model

Jung (1996) proposed an interesting method to estimate the cure rate, which is defined to be the probability that the time to incidence T is greater than a fixed time τ. In his paper, the standard procedure for inference on parameters in a binary regression model is modified to accommodate censored observations. Let $\phi(\pi_i) = \beta^T x_i$, where φ is a link function which is monotonic and differentiable on [0, 1]. Also let π* denote the inverse function of φ, let $\bar\pi^* = 1 - \pi^*$, and let $\pi^{*\prime}(\phi) = d\pi^*(\phi)/d\phi$. The estimate of β in a standard binary regression model without censoring is obtained by solving
$$ n^{-1/2}\sum_{i=1}^{n}\frac{\left\{I(t_i > \tau) - \pi^*(\beta^T x_i)\right\}\pi^{*\prime}(\beta^T x_i)}{\pi^*(\beta^T x_i)\,\bar\pi^*(\beta^T x_i)}\,x_i = 0. $$
In the case with censoring, denote $T_i^* = \min(T_i, U_i)$. Assume independent censoring, i.e., $T_i$ and $U_i$ are independent given $x_i$ for i = 1, 2, ..., n. Also assume that $U_1, U_2, \ldots, U_n$ are independently, identically distributed (iid) with a cumulative distribution function 1 - G. Let Ĝ be the KM estimate of G. Then the estimate $\hat\beta_{MLE}$ can be obtained by solving
$$ n^{-1/2}\sum_{i=1}^{n}\frac{\left\{I(t_i^* > \tau)/\hat G(\tau) - \pi^*(\beta^T x_i)\right\}\pi^{*\prime}(\beta^T x_i)}{\pi^*(\beta^T x_i)\,\bar\pi^*(\beta^T x_i)}\,x_i = 0, $$
where the indicator is weighted by Ĝ(τ) to account for censoring.
Numerically, $\hat\beta_{MLE}$ can be obtained by applying the Newton-Raphson method. Under mild conditions, it is proved that $\hat\beta_{MLE}$ is a unique and consistent estimate of β and that $n^{1/2}(\hat\beta_{MLE} - \beta)$ follows an asymptotic normal distribution with a zero mean. The problem with this method is that τ is very difficult to define. The result is not convincing if τ is too small. On the other hand, it is not cost effective for τ to be too long. Farewell (1977) compared the efficiency of a logistic-exponential model to a binary logistic regression model that treats any individual with t* > τ as a long-term survivor. He also pointed out that the estimate of the cure rate would be biased using a logistic model when the underlying true distribution is a mixture distribution. The logistic model discussed by Farewell (1977) did not consider censoring. However, he emphasized that a similar conclusion can be reached by expanding the model to include censored individuals. Farewell also noted that it is especially unreasonable for the logistic model to ignore the possible censoring times when T* exceeds the largest observing time τ.

1.1.3 Nonparametric Method

Nonparametric tests are widely applied both for comparing survival curves in the presence of cure and for comparing cure proportions between treatment arms. When the main interest is to compare survival curves of several populations in the presence of cure, some of the nonparametric tests are considered standard, such as the logrank test and the generalized Wilcoxon test. Halpern and Brown (1987) studied the power of these two tests by simulation when the mixture cure rate model of Goldman (1984) is appropriate. They compared the power of these two tests for detecting the difference between two treatment arms for selected values of cure rates, instant failure rates, and follow-up time. Each of the two treatment arms had 50 participating individuals. They used 200 replications for each set of simulations.
The simulation results are somewhat inconclusive. The logrank test is
more powerful than the Wilcoxon test in general. However, the logrank test does not show any advantage over the Wilcoxon test in the following situations: 1) cure rates are high and are at similar levels in both arms; 2) there is a large difference in the time to the end point for both arms; and 3) the follow-up is long relative to the time to failure. In the comparison of cure proportions, Sposto et al. (1992) studied several tests based on simulated samples. These tests are the likelihood ratio, the Wald, and the score tests based on two different mixture models, and a test based on the differences of the KM estimates of the cure rates. The two mixture models are the logistic-exponential and logistic-Weibull models proposed by Farewell (1977 & 1982). The time point in the KM estimate of the cure rate is defined as the last observed failure time. Simulated samples under the assumed mixture models are generated in order to check the results in an ideal case. In addition, samples under some different models are generated in order to study the robustness of the tests to model misspecification. The authors also generated samples based on various levels of cure proportions. The test statistics calculated from the mixture models sometimes cannot be obtained because of nonconvergence during the maximization procedure. The authors pointed out two reasons for the nonconvergence: 1) the maximization is obtained on the boundary; and 2) the algorithm terminates abnormally with a jump to a nonnegative definite region. Test statistics with convergence problems are not presented in the article. In conclusion, the authors noted that the KM estimate of the cure rate is a reasonable alternative to the parametric estimate in general. When the underlying model includes unequal failure rates between treatment arms, the proportion test based on KM estimates is more powerful than the test based on the complicated parametric model. However, the KM estimate introduces more bias than the parametric
estimate of the cure rate when the sample is small, when failure rates are equal, and when not all failures are observed. Gray and Tsiatis (1989) proposed a linear rank test to compare survival curves between two treatment arms over the entire time span, but their test focused power against the later difference in survival curves. This linear rank test is evaluated on a mixture distribution of a proportion representing the cure and a failure distribution with an unspecified form. They assumed that the failure distribution functions for both treatment arms are constant over time. They specified a weight function in a two-sample nonparametric censored test proposed by Tarone and Ware (1977) in order to obtain their own test. Note that the logrank test can also be derived from Tarone and Ware's test with a different weight function. However, the difference between Gray and Tsiatis' test and the logrank test is that the former puts more weight toward the later difference while the latter puts equal weight throughout the entire observation period. Gray and Tsiatis (1989) calculated relative efficiencies under equal censoring for their test, the logrank test, and the proportion test based on KM estimates of cure rates. When there is no censoring, their test gains efficiency over the logrank test for a small cure rate. However, the efficiency gain is less for a cure rate greater than 50%. These results can be explained by observing that the weight functions for their test and for the logrank test become closer as the cure rate increases. In addition, under the noncensoring assumption, their test is asymptotically equivalent to the proportion test. When there is censoring and when the censoring proportion increases, their test is increasingly more powerful than the proportion test. But note that their test is not a pure test of the equality of the cure rates.
Therefore, it has power against alternatives with unequal cure rates and alternatives with equal cure rates and unequal survival curves.
In this section, two types of statistical tests are discussed: 1) comparing survival curves under the assumption of the existence of cure rates; and 2) comparing cure rates. Some nonparametric tests, such as the logrank test, are quite standard in oncology for comparing survival curves. When the problem narrows down to the comparison of cure rates, the proportion test based on KM estimates is quite popular due to the computational difficulties related to the test based on the mixture model structure. However, there are problems associated with KM estimates of cure rates, as pointed out by Cantor and Shuster (1992). The standard error of the KM estimate increases in the right-hand extremes because of the small sample size at the end of the study. The KM estimate can sometimes produce counterintuitive results. For example, the time-to-relapse curve plateau could be higher than the time-to-death curve plateau because of the large downward steps at the end of the study. Therefore, one should be cautious relying on the KM estimate of the cure rate. As an alternative to the KM estimate of the cure rate, Cantor and Shuster proposed a parametric model with a bounded cumulative hazard rate to accommodate the cure proportion.

1.1.4 Model with Bounded Cumulative Hazard Rate

Parametric Model. Cantor and Shuster (1992) proposed a Gompertz model to incorporate long-term survivors. The survival function is written as
$$ S(t) = \exp\left[\beta^{-1}\alpha\left(1 - e^{\beta t}\right)\right], \qquad \alpha > 0, \qquad (1.7) $$
where α is the initial hazard. When β > 0, the survival function S(t) → 0 as t → ∞; and when β < 0, the survival function S(t) → exp(α/β) as t → ∞. The cure rate is presented as exp(α/β). The authors then maximized the likelihood function based on the Gompertz model using the Newton-Raphson method. Both Var(α̂) and Var(β̂) are obtained from the observed information matrix
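The behavior of the Gompertz survival function (1.7) in the two regimes β > 0 and β < 0 is easy to check numerically. A minimal sketch (function names are ours):

```python
import math

def gompertz_surv(t, alpha, beta):
    # Equation (1.7): S(t) = exp{alpha * (1 - exp(beta*t)) / beta}
    return math.exp(alpha * (1.0 - math.exp(beta * t)) / beta)

def gompertz_cure_rate(alpha, beta):
    # Limit of S(t) as t -> infinity: exp(alpha/beta) when beta < 0, else 0
    return math.exp(alpha / beta) if beta < 0.0 else 0.0
```

For instance, with α = 0.5 and β = -0.3 the survival curve plateaus at exp(-5/3), roughly 0.19, while with β = +0.3 it decays to zero.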
and the variance of the cure rate exp(α/β) is obtained using the delta method. In the article, they also compared the Gompertz model with the mixture model. When there is a cure proportion, both models produce similar curves and cure rate estimates. However, the Gompertz model is preferred due to the divergence problem associated with the mixture model. The authors also noted the instability of the KM estimate toward the end. Some other nice properties of the proposed Gompertz model are also provided in the article. For example, the cure probability given that the patient has survived through time t₀ is $\exp(\beta^{-1}\alpha e^{\beta t_0})$; the log-likelihood function is strictly concave, so one can easily estimate the parameters and the cure rate; and finally, a power $S(t)^c$ of a Gompertz survival function S(t) is also a Gompertz survival function. Gieser et al. (1998) incorporated covariate information into the model proposed by Cantor and Shuster (1992). The treatment-specific survival function for an individual is written as
$$ S(t; i, x) = \{S_i(t)\}^{\exp(\gamma^T x)} = \exp\left\{e^{\gamma^T x + \alpha_i^*}\,\beta_i^{-1}\left(1 - e^{\beta_i t}\right)\right\}, \qquad (1.8) $$
where α* is defined as log α for α in Equation (1.7), i is the treatment index, x is a vector of covariates, and γ is a vector of parameters associated with x. Note that S(t; i, x) in Equation (1.8) is also a Gompertz survival function, as noted by Cantor and Shuster (1992). The cure rate in this case is
$$ \pi(i, x) = S(\infty; i, x) = \exp\left\{\beta_i^{-1}e^{\gamma^T x + \alpha_i^*}\right\}. $$
The obtained log-likelihood function is concave with respect to the parameters. The MLEs of the parameters are efficient and asymptotically normal when the assumed model is correct and the sample size is large. The authors compared the treatment effects on cure rates for any given covariate vector x. The ratio of the
logs of the cure rates for two different treatment arms is expressed as
$$ d_{i,i'} = \frac{\log\pi(i, x)}{\log\pi(i', x)}, $$
which does not depend on x. The null hypothesis for testing the equality of cure rates is written as $H_0: \log d_{i,i'} = 0$, and the test statistic is expressed as
$$ Z = \left[\hat\alpha_i^* + \log(\hat\beta_{i'}) - \log(\hat\beta_i) - \hat\alpha_{i'}^*\right]/SE, $$
where the standard error in the denominator is obtained by the delta method. Gieser et al. (1998) applied their model to 763 patients from a multicenter prospective study performed in 1981 by the Pediatric Oncology Group. This study analyzed the standard risk of acute lymphocytic leukemia in children. They compared two treatment effects on the cure rate. Children's age and sex were considered as covariates. The Newton-Raphson algorithm was used to yield MLEs of the parameters. The algorithm converged after nine iterations. Based on this study, they compared the estimates and the variances of the cure rates based on the Gompertz model, the logistic-exponential mixture model, and the KM method. They found that the Gompertz model produces a similar estimate of the cure rate to the KM estimate but with much smaller variance. They also provided a generalized model in which the treatment indicator is included as one of the covariates. One important advantage of this generalized model is the log-linear structure of the parameters, since the log-linear structure implies the concavity of the log-likelihood function and the existence, the uniqueness, and the asymptotic normality of the parameter estimates.
Semiparametric Model. Chen et al. (1999) proposed a semiparametric model for survival data with a cure fraction. They assumed that there are two variables: a) the number of carcinogenic cells N* in the body of a patient; and b) the incubation time $W_i$ for the i-th carcinogenic cell (i = 1, 2, ..., N*). The value N* = 0 implies that there is no such cell in the patient's body. Assume that N* follows a Poisson distribution with mean θ and that the $W_i$ for i = 1, 2, ..., N* are iid with distribution function F(w). The variables $W_i$ and N* are independent. Therefore, the time to incidence is $T = \min(W_i, 0 \le i \le N^*)$, where $P(W_0 = \infty) = 1$. Then the survival function for T is presented as
$$ S(t) = P(\text{no incidence by time } t) $$
$$ = P(N^* = 0) + P(T > t, N^* \ge 1) $$
$$ = P(N^* = 0) + P(W_1 > t, W_2 > t, \ldots, W_{N^*} > t, N^* \ge 1) $$
$$ = e^{-\theta} + \sum_{k=1}^{\infty}\{1 - F(t)\}^k\,\frac{\theta^k e^{-\theta}}{k!} $$
$$ = e^{-\theta F(t)}. \qquad (1.9) $$
Note that S(t) is an improper survival function because $S(t) \to e^{-\theta} > 0$ as t → ∞. The survival fraction $e^{-\theta}$ is indeed the probability that there exists no carcinogenic cell in a patient's body. The corresponding probability density function and hazard function are
$$ f(t) = \theta f(t)\exp(-\theta F(t)), \qquad \lambda(t) = \theta f(t), $$
where, with a slight abuse of notation, the f on the right-hand side denotes the density corresponding to F. Let S*(t) denote the survival function for the noncured proportion of patients; then
$$ S^*(t) = P(T > t \mid N^* \ge 1) = \frac{e^{-\theta F(t)} - e^{-\theta}}{1 - e^{-\theta}}. \qquad (1.10) $$
The corresponding probability density function is
$$ f^*(t) = -\frac{d}{dt}S^*(t) = \frac{\theta f(t)\,e^{-\theta F(t)}}{1 - e^{-\theta}}, $$
and the hazard function is
$$ \lambda^*(t) = \frac{f^*(t)}{S^*(t)} = \frac{\theta f(t)\,e^{-\theta F(t)}}{e^{-\theta F(t)} - e^{-\theta}}. $$
The likelihood function is derived as follows:
$$ L = \prod_{i=1}^{n}\left\{\theta f(t_i^*)\,e^{-\theta F(t_i^*)}\right\}^{\delta_i}\left\{e^{-\theta F(t_i^*)}\right\}^{1-\delta_i}. \qquad (1.11) $$
The model is semiparametric in the sense that the form of F(t) is unknown. It has a strong biological meaning, gathering contributions from two characteristics of tumor growth: the initial number of carcinogenic cells and the rate of their growth. Even without any biological meaning, the model still fits any data with a survival fraction; it can always be treated as being generated by an unknown number N* of latent competing risks. In this model, S(t), f(t), and λ(t) are improper, while S*(t), f*(t), and λ*(t) are proper survival, density, and hazard functions, respectively. One advantage of this model over the mixture model is that the overall hazard function λ(t) has a proportional hazards structure if the covariates enter the model through θ. In the mixture model, however, the overall hazard function does not have a proportional hazards structure if the covariates enter the model through the cure rate π. The hazard function λ*(t) for the failure proportion is magnified compared with the hazard $h(t) = f(t)/\{1 - F(t)\}$ of the incubation time distribution; as t goes to infinity, λ*(t) converges to h(t). In addition, the authors pointed out the close relationship between the model presented in Equation (1.9)
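The improper survival function (1.9) and its proper counterpart (1.10) for the susceptible fraction can be sketched as follows (names are ours; F is any distribution function supplied by the caller):

```python
import math

def S_improper(t, theta, F):
    # Equation (1.9): S(t) = exp{-theta * F(t)}; tends to exp(-theta) > 0
    return math.exp(-theta * F(t))

def S_susceptible(t, theta, F):
    # Equation (1.10): proper survival function for the noncured patients
    return (math.exp(-theta * F(t)) - math.exp(-theta)) / (1.0 - math.exp(-theta))
```

With an exponential F, for example, S_improper plateaus at the cure fraction $e^{-\theta}$ while S_susceptible decays to zero, illustrating the improper/proper distinction in the text.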
and the mixture model presented in Equation (1.1), by observing that S(t) defined in Equation (1.9) can also be written as
$$ S(t) = e^{-\theta} + (1 - e^{-\theta})\,S^*(t). $$
If we let $P(Z_i = 0)$ in Equation (1.1) take the value $e^{-\theta}$ and let $S^*(t \mid Z_i = 1)$ take the form of S*(t) defined in Equation (1.10), then S(t) has the same form as in Equation (1.1). The model in Equation (1.1) also corresponds to some model of the form in Equation (1.9) for some θ and F(t). Note that F(t) is assumed to have a Weibull form for convenience in the article, and the parameter θ is a function of covariates of the form $\theta = \exp(\beta^T x)$. Tsodikov (1998) proposed the same semiparametric model from a different angle. As above, let θ be a function of covariates, denoted by θ(x, β). Assume that n is the number of failure observations and that $\theta_i$ is the parameter of interest for the i-th failure observation. Let $m_i$ be the number of censored observations in the time interval $(t_i, t_{i+1})$, where $t_i$ is the failure time associated with the i-th failure observation. Let $u_{i1}, \ldots, u_{im_i}$ be the censoring times for the observations censored within the interval $(t_i, t_{i+1})$. Let $\theta_{ij}$ (i = 1, 2, ..., n, j = 1, 2, ..., $m_i$) be the parameters of interest for the censored observations. Also let $s_i$ be the sum of the θs for all individuals on the i-th interval, and let $S_{a,b} = \sum_{k=a}^{b} s_k$ for $0 < a \le b \le n$, with the conventions $\sum_{k=a}^{b} = 0$ and $\prod_{k=a}^{b} = 1$ if a > b. The form of F(t) is completely unspecified.
In order to draw inference on the cure proportion in the presence of an unknown form of F(t), the author applied the partial likelihood, the marginal likelihood, and the profile likelihood function of θ. The partial likelihood is valid due to the proportional hazards structure, and it provides consistent estimates. However, it can only be used to estimate the ratio of θ to $\theta_b$, the baseline value. It is therefore inconclusive in the estimation of $\theta_b$. When there is no covariate, i.e., when all patients are homogeneous, the cure rate cannot be estimated. Both the marginal and the profile likelihood are derived from the full likelihood function, which is written as
$$ L = \prod_{i=1}^{n}\left[\theta_i f(t_i)\,e^{-\theta_i F(t_i)}\,Q(t_i)\prod_{j=1}^{m_i} e^{-\theta_{ij}F(u_{ij})}\,\Delta Q(u_{ij})\right], \qquad (1.12) $$
where Q(t) is the survival function for the censoring time. A Type I censoring mechanism with a fixed end point $t_c$ is assumed throughout the article. Then Q is a single-step function such that $Q(t_i) = 1$ for i = 1, 2, ..., n, $\Delta Q(t_c) = 1$, and $u_{nj} = t_c$ for j = 1, 2, ..., $m_n$. The marginal likelihood function is derived in closed form, and the profile likelihood function is derived from Equation (1.12) by maximizing it with respect to F. The nonparametric maximum likelihood estimator (NPMLE) is used to estimate F. Both Chen et al. (1999) and Tsodikov (1998) proposed the same semiparametric model, while using different estimation techniques. The former assumed independent and noninformative censoring as well as a Weibull distribution for F(t) in the model. Therefore, their model is a parametric model, although F(t) can take
a nonparametric form. The latter assumed a Type I censoring mechanism, which is restrictive. Both the parametric and the semiparametric models with a bounded cumulative hazard rate are appealing because of their simplicity and intuitive meaning. However, there are limitations to the more flexible semiparametric model. It is difficult in terms of computation. In addition, the asymptotic properties of this model have not been explored in the literature. In the next section, we will reexamine the semiparametric model proposed by Chen et al. (1999) and Tsodikov (1998). In this examination, we make no parametric assumption for F(t). In addition, we do not limit the censoring to a Type I mechanism. Using simulated data, we show the computational barrier raised by this model.

1.2 Computation Difficulty of the Semiparametric Model

We assume random and noninformative censoring. The notation used in Tsodikov (1998) is used in this section, except that some changes are made in the definitions of $P_{a,b}$ and $p_{a,b}$. We define them as follows:
$$ P_{a,b} = S_{a,b}S_{a+1,b}\cdots S_{b,b}, \qquad p_{a,b} = S_{a,b}S_{a,b-1}\cdots S_{a,a}. $$
The likelihood function takes the following form,
$$ L = \prod_{i=1}^{n} f(t_i)\,\theta_i\,e^{-\theta_i F(t_i)}, $$
where $f(t_i)$ is the density function corresponding to $F(t_i)$. Define
$$ J_i^* = (-1)^i\,\frac{e^{-S_{n-i+1,n}}}{p_{n-i+1,n}\,P_{1,n-i}} \qquad (1.13) $$
Let L_m denote the marginal likelihood function, whose form is derived in Appendix A. Without loss of generality, assume that there is only one covariate x in the study, such as the treatment indicator. Also assume that θ is a function of the covariate x and is expressed in exponential form as θ = exp(α + βx). The first and second derivatives of the s_{i,n} terms with respect to θ, denoted s′ and s″ (Equation (1.14)), yield the score function U(α) = ∂log(L_m)/∂α for α and the score function U(β) = ∂log(L_m)/∂β for β. The Hessian matrix is

[ ∂²log(L_m)/∂α²   ∂²log(L_m)/∂α∂β ]
[ ∂²log(L_m)/∂α∂β  ∂²log(L_m)/∂β²  ],

whose entries involve J*_i and its first derivative with respect to β.

Suppose that the covariate x is the treatment indicator, i.e., x = 1 if the patient is in the treatment group and x = 0 if the patient is in the control group. We are interested in testing the hypothesis

H₀ : β = 0. (1.15)

The score test (Rao, 1973) is used to test the hypothesis in Equation (1.15). The score statistic for β under H₀ is denoted U(β)|_{H₀}, and the asymptotic variance of the score statistic is −∂²log(L_m)/∂β² under the null hypothesis. If α is assumed known, the standardized test statistic is

U(β)|_{H₀} / [−∂²log(L_m)/∂β²|_{H₀}]^{1/2}. (1.16)

In order to explore the computational feasibility, we first generate 10000 samples under the assumption that F is an exponential distribution with a mean of 1; the survival function is then exp(−θF) = exp(−θ(1 − e^{−t})). We set α = 1.5 and β = 0. Each random sample contains 30 observations in the control group and 30 observations in the treatment group, and the censoring proportion is around 50%. The standardized score statistic for β is calculated for each random sample, and the 10000 standardized score statistics are plotted in Figure (1-1). In this bar plot, we observe that the score statistic behaves well and is close to a normal shape, given that α is known.
The problem becomes difficult to handle, however, when α is unknown. It is common to use a profile likelihood method to handle the nuisance parameter α when the primary parameter of interest is β. The profile marginal likelihood function for β is obtained from the marginal likelihood function by replacing α with its MLE under H₀. The test statistic based on the profile marginal likelihood function of β is similar to the one defined in Equation (1.16), although some adjustment of the denominator of Equation (1.16) is required. Therefore, our primary task is to solve for the MLE of α under H₀. Using the same parameter setting as for Figure (1-1), we generate a sample with 30 patients in the treatment group and 30 in the control group. Figure (1-2) plots the log of the marginal likelihood function as a function of α on the interval [−2, 6] when β is fixed at 0. The marginal likelihood for α under H₀ is poorly behaved. To investigate the behavior of the marginal likelihood function in α and β further, Figure (1-3) shows a three-dimensional plot of the log marginal likelihood function, based on the same random sample, as a function of α on the interval [−2, 2] and β on the interval [−3, 3]. The poorly behaved log marginal likelihood as a function of α and β explains the behavior of the log marginal likelihood function in Figure (1-2).

1.3 Piecewise Exponential Model

As mentioned earlier in the chapter, it is sometimes necessary to assume different hazard functions for the time to failure on different time intervals. Friedman (1981) proposed a piecewise exponential model to handle this problem. Let the time scale be divided into I(n) intervals (0, T₁], (T₁, T₂], . . ., (T_{I(n)−1}, T_{I(n)}]. The hazard rate of each individual is assumed to be constant over any given interval. Denote the hazard rate of the i-th individual in the j-th interval by λ_{ij}, and assume λ_{ij} > 0 for all (i, j). Let l_{ij} = log λ_{ij}, and let t_{ij} be the survival time of the i-th individual in the j-th interval, defined as

t_{ij} = max{0, min(t_i, T_j) − T_{j−1}}. (1.17)

Let δ_{ij} = 1 when the i-th individual fails in (T_{j−1}, T_j] and δ_{ij} = 0 otherwise. Then the log-likelihood function is

log L(l) = Σ_{ij} δ_{ij} l_{ij} − Σ_{ij} t_{ij} exp(l_{ij}). (1.18)

One of the forms l_{ij} can take is l_{ij} = α_j + β^T x_i. The author proved the existence and uniqueness of the MLE of the vector l. The consistency and asymptotic properties of the MLE were also explored. Karrison (1987) also considered a piecewise exponential model in his study.
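Equation (1.18) is straightforward to evaluate directly. The sketch below computes the piecewise exponential log-likelihood for a given vector of log hazard rates; it is a minimal illustration, not Friedman's original code, the function name and interface are hypothetical, and covariates are omitted for simplicity.

```python
import numpy as np

def piecewise_exp_loglik(log_lam, times, status, cuts):
    """Friedman's piecewise exponential log-likelihood, Equation (1.18).

    log_lam : (J,) log hazard rate l_j on interval (T_{j-1}, T_j]
    times   : observed times t_i;  status : failure indicators delta_i
    cuts    : endpoints T_1 < ... < T_J, with T_J covering all times
    """
    edges = np.concatenate(([0.0], cuts))
    ll = 0.0
    for t, d in zip(times, status):
        for j, lj in enumerate(log_lam):
            # exposure time t_ij = max(0, min(t_i, T_j) - T_{j-1})
            t_ij = max(0.0, min(t, edges[j + 1]) - edges[j])
            d_ij = d * (edges[j] < t <= edges[j + 1])  # failed in interval j
            ll += d_ij * lj - t_ij * np.exp(lj)
    return ll
```

With a single interval this reduces to the ordinary exponential log-likelihood Σ_i δ_i log λ − λ Σ_i t_i.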
Figure 1-1. Standardized score under H₀ (β = 0) for 10000 random samples.
Figure 1-2. Log marginal likelihood function of α when β = 0.
Figure 1-3. Log marginal likelihood function of α and β.
CHAPTER 2
TOPIC OF RESEARCH

The model we propose in this dissertation is a piecewise Gompertz model. It incorporates the cure proportion by assuming a bounded cumulative hazard rate. Moreover, our model offers the flexibility that the survival times on different time intervals may have different hazard functions. As noted by Aitkin et al. (1989), a piecewise exponential model has the same flexibility as a proportional hazards model under the assumption that each interval is defined by one failure observation in the piecewise model. We choose a Gompertz model over an exponential model because of the possible plateau of the survival curve from a Gompertz model. We speculate that under a piecewise Gompertz model, the more intervals we define, the more flexible the model is. Eventually, when there is only one failure observation on each interval, the flexibility of this model would match that of the semiparametric model we discussed in Chapter 1. Therefore, our model preserves the flexibility of a semiparametric model to some extent; the degree of flexibility depends on the number of intervals we define. Meanwhile, the computation for our model is much easier to handle than for a semiparametric model because of our model's parametric structure. However, when the number of intervals in our model increases, the number of individuals on each interval decreases. We would then anticipate a large variance for the estimate of the cure rate, similar to the situation we discussed earlier for the KM estimate of the cure rate. An appropriate number of intervals needs to be assessed in order to reach maximum flexibility and computational feasibility of the model.

In this chapter, we describe our model in detail. In Chapters 3 and 4, asymptotic properties of the MLEs of parameters in the model are proved. In addition, a goodness-of-fit test is derived which can be applied to determine the appropriate number of time intervals in the model. In Chapter 5, a Monte Carlo simulation study is carried out to show the advantages of our model. Chapter 6 gives a real example and demonstrates the application of our model to the example.

2.1 Data Structure and Model

Assume there are n patients in the study. Consider the j-th patient, who is in treatment group g and is associated with a p-dimensional vector of covariates x_j. Let T_j and U_j be the survival time and the censoring time of the j-th patient, respectively. The observed data on the j-th patient are (t*_j, δ_j, x_j), where t*_j is the observation on T*_j = min(T_j, U_j) and δ_j is the observation on Δ_j = I(T_j ≤ U_j). Assume there are G treatment groups and the interval [0, ∞) is divided into I intervals [τ_{i−1}, τ_i), i = 1, 2, . . . , I, where τ₀ = 0 and τ_I = ∞. Let θ^T = (β^T, λ^T), where β is p-dimensional and λ = (λ₁^T, λ₂^T, . . . , λ_G^T)^T with λ_g = (λ_{1g}, λ_{2g}, . . . , λ_{Ig})^T. Clearly, θ is a (p + IG)-dimensional vector. It is assumed that the hazard function for the j-th patient in treatment group g has the form

h_j(t) = exp(β^T x_j + λ_{ig} t), t ∈ [τ_{i−1}, τ_i). (2.1)

Define an indicator function κ_g(j), where κ_g(j) = 1 if the j-th patient is in group g and 0 otherwise.
Let κ(j) be the vector of κ_g(j) for g = 1, 2, . . . , G, i.e.,

κ(j) = (κ₁(j), κ₂(j), . . . , κ_G(j))^T.

Let us also define, for i = 1, 2, . . . , I,

J_i(t) = 1 if t ∈ [τ_{i−1}, τ_i) and 0 otherwise,

and let J(t) = (J₁(t), J₂(t), . . . , J_I(t))^T. Let

Q(t, j) = κ(j) ⊗ J(t) = (κ₁(j)J(t)^T, κ₂(j)J(t)^T, . . . , κ_G(j)J(t)^T)^T.

Note that Q(t, j) is a GI-dimensional vector. Writing h(t; λ, κ(j), J(t)) for the group- and interval-specific factor exp(λ_{ig} t) when κ_g(j) = 1 and J_i(t) = 1, the hazard function in Equation (2.1) can also be written as exp(β^T x_j) h(t; λ, κ(j), J(t)). The cumulative hazard function H_j(t; θ), the survival function S_j(t; θ), and the density function f_j(t; θ) are derived as follows:

H_j(t; θ) = e^{β^T x_j} ∫₀^t h(u; λ, κ(j), J(u)) du,
S_j(t; θ) = exp{−H_j(t; θ)},
f_j(t; θ) = e^{β^T x_j} h(t; λ, κ(j), J(t)) S_j(t; θ).

The likelihood function contributed by (t*_j, δ_j, x_j) is expressed as

L_nj = f_j(t*_j; θ)^{δ_j} S_j(t*_j; θ)^{1−δ_j}.
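To make the piecewise structure concrete, here is a minimal sketch of the survival function S_j(t; θ) under the hazard exp(β^T x_j + λ_{ig} t); the function name and interface are hypothetical, and the finite cut points are passed explicitly.

```python
import numpy as np

def gompertz_pw_survival(t, beta, x, lam, taus):
    """S_j(t) = exp(-H_j(t)) for the piecewise Gompertz model.

    lam  : (I,) piecewise shape lambda_ig for this patient's group
    taus : finite cut points tau_1 < ... < tau_{I-1} (tau_0 = 0, tau_I = inf)
    Hazard on [tau_{i-1}, tau_i): exp(beta'x) * exp(lam_i * u).
    """
    edges = np.concatenate(([0.0], taus, [np.inf]))
    eta = np.exp(beta @ x)
    H = 0.0
    for i, li in enumerate(lam):
        a, b = edges[i], min(edges[i + 1], t)
        if b <= a:
            break
        # integral of exp(lam_i * u) over [a, b]
        H += (np.exp(li * b) - np.exp(li * a)) / li
    return np.exp(-eta * H)
```

Evaluating it at t = ∞, with a negative λ on the last interval, gives a strictly positive limit, which is the cure proportion implied by the bounded cumulative hazard.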
The corresponding log-likelihood function is written as

l_nj = log L_nj = δ_j[log(h(t*_j; λ, κ(j), J(t*_j))) + β^T x_j] − e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) du. (2.2)

Consequently, the overall likelihood function is defined as

L_n = ∏_{j=1}^{n} L_nj,

and the corresponding log-likelihood function is l_n = Σ_{j=1}^{n} l_nj.

2.2 Assumptions

Our derivation of asymptotic properties is based on the following assumptions:

Assumption 1. There is a B > 0 such that ||x_j|| ≤ B for j = 1, 2, . . . , n.

Assumption 2. There exist ε > 0, δ > 0, and an integer n₀ such that when n > n₀, there exists a subset A of the overall individuals with n_A/n ≥ δ, where n_A is the size of the subset A, such that for any unit vector a, |a^T x_j| ≥ ε for j ∈ A.

Assumption 3. Let n_g be the number of individuals in treatment group g, where g = 1, 2, . . . , G. There exists a δ > 0 such that min_g n_g/n ≥ δ when n is sufficiently large.

Assumption 4. The censoring variables U₁, U₂, . . . , U_n are iid with a survival function G_U(u). Assume that there exists an ε > 0 such that G_U(τ_{I−1}) − G_U(∞) ≥ ε.
The first assumption states that all covariates are uniformly bounded. This is reasonable in practice because the majority of the covariates in a cancer study are patients' characteristics, which are uniformly bounded. Note that we are primarily interested in time-independent covariates. The second assumption requires that the covariates of at least a subset A of patients are dispersed to some extent. Assumption 3 means that the number of observations in each treatment group must increase proportionally to the sample size. Assumption 4 states that not all the censoring observations occur very early on the time span.

2.3 Hypothesis of Interest

We are interested in comparing cure rates among treatment groups. Let π_g(x) be the cure rate of patients who are under treatment group g and who are associated with the covariate vector x. Considering the case with two treatment groups, the null hypothesis is

H₀ : π₁(x) − π₂(x) = 0, (2.3)

for any covariate x. The test statistic is derived in the following section.

2.4 Test of Hypothesis

The cure rate corresponding to patients in treatment group g who are associated with a specific setting of covariates x is

π_g(x) = S(∞; θ) = exp{−e^{β^T x} ∫₀^∞ h(u; λ, κ(j), J(u)) du}
       = exp{−e^{β^T x} Σ_{i=1}^{I} (e^{λ_{ig} τ_i} − e^{λ_{ig} τ_{i−1}})/λ_{ig}}, (2.4)

where the term e^{λ_{Ig} τ_I} is interpreted as 0, since τ_I = ∞ and λ_{Ig} < 0. The hypothesis defined in Equation (2.3) is equivalent to

H₀ : Σ_{i=1}^{I} (e^{λ_{i1} τ_i} − e^{λ_{i1} τ_{i−1}})/λ_{i1} − Σ_{i=1}^{I} (e^{λ_{i2} τ_i} − e^{λ_{i2} τ_{i−1}})/λ_{i2} = 0.
Let the MLE of θ = (β^T, λ^T)^T be denoted by θ̂ = (β̂^T, λ̂^T)^T. The test statistic is derived as

T = Σ_{i=1}^{I} (e^{λ̂_{i1} τ_i} − e^{λ̂_{i1} τ_{i−1}})/λ̂_{i1} − Σ_{i=1}^{I} (e^{λ̂_{i2} τ_i} − e^{λ̂_{i2} τ_{i−1}})/λ̂_{i2}. (2.5)

The variance of this test statistic can be derived by the delta method. The test statistic in Equation (2.5) is a function of the estimates of λ. We will show in later chapters that θ̂ follows an asymptotic normal distribution with mean θ and variance-covariance matrix Σ. The variance estimate of the test statistic can be derived as

V̂(T) = (∂T/∂λ |_{λ=λ̂})^T Σ̂* (∂T/∂λ |_{λ=λ̂}),

where ∂T/∂λ |_{λ=λ̂} is the derivative of the test statistic with respect to λ evaluated at the MLE of λ, Σ* is the variance-covariance matrix of λ̂, and Σ̂* is its estimate. The components of the derivative vector are

∂T/∂λ_{i1} = [λ̂_{i1}(τ_i e^{λ̂_{i1} τ_i} − τ_{i−1} e^{λ̂_{i1} τ_{i−1}}) − (e^{λ̂_{i1} τ_i} − e^{λ̂_{i1} τ_{i−1}})] / λ̂²_{i1},
∂T/∂λ_{i2} = −[λ̂_{i2}(τ_i e^{λ̂_{i2} τ_i} − τ_{i−1} e^{λ̂_{i2} τ_{i−1}}) − (e^{λ̂_{i2} τ_i} − e^{λ̂_{i2} τ_{i−1}})] / λ̂²_{i2},

for i = 1, 2, . . . , I − 1, and, since τ_I = ∞ and λ̂_{Ig} < 0,

∂T/∂λ_{I1} = −(λ̂_{I1} τ_{I−1} − 1) e^{λ̂_{I1} τ_{I−1}} / λ̂²_{I1},
∂T/∂λ_{I2} = (λ̂_{I2} τ_{I−1} − 1) e^{λ̂_{I2} τ_{I−1}} / λ̂²_{I2}.

The normalized test statistic is T/√V̂(T); it follows an asymptotic standard normal distribution.
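The point estimate T in Equation (2.5) requires only the estimated λ's and the cut points. A minimal sketch for two groups follows; the helper name is hypothetical, and the delta-method variance is omitted.

```python
import numpy as np

def cure_diff_statistic(lam1, lam2, taus):
    """T = sum_i (e^{lam_i1 tau_i} - e^{lam_i1 tau_{i-1}})/lam_i1  (group 1)
         minus the same sum for group 2, with tau_0 = 0 and tau_I = inf.

    lam1, lam2 : (I,) estimated lambda_ig per interval (last entries < 0)
    taus       : finite cut points tau_1 < ... < tau_{I-1}
    """
    edges = np.concatenate(([0.0], taus, [np.inf]))

    def group_sum(lam):
        s = 0.0
        for i, li in enumerate(lam):
            # exp(li * inf) = 0 when li < 0, keeping the sum finite
            s += (np.exp(li * edges[i + 1]) - np.exp(li * edges[i])) / li
        return s

    return group_sum(lam1) - group_sum(lam2)
```

The group-specific cure rate itself is exp(−e^{β^T x} · group_sum), so T = 0 corresponds to equal cure rates for every covariate value x.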
CHAPTER 3
ESTIMATION OF PARAMETERS

3.1 Likelihood Function and Information Matrices

Based on the log-likelihood function defined in Equation (2.2), the score statistic S_nj(θ) for the j-th individual can be expressed as

S_nj(θ) = (S^β_nj(θ)^T, S^λ_nj(θ)^T)^T, (3.1)

where

S^β_nj(θ) = δ_j x_j − x_j e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) du (3.2)

and

S^λ_nj(θ) = δ_j Q(t*_j, j) t*_j − e^{β^T x_j} ∫₀^{t*_j} Q(u, j) u h(u; λ, κ(j), J(u)) du. (3.3)

Note that S^β_nj(θ), S^λ_nj(θ), and S_nj(θ) are p-, IG-, and (p + IG)-dimensional vectors, respectively. Furthermore, the matrices formed by the second-order derivatives of l_nj can be derived as

F^{ββ}_nj(θ) = −∂²l_nj/∂β∂β^T = x_j x_j^T e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) du,
F^{βλ}_nj(θ) = −∂²l_nj/∂β∂λ^T = x_j e^{β^T x_j} ∫₀^{t*_j} Q^T(u, j) u h(u; λ, κ(j), J(u)) du,
F^{λβ}_nj(θ) = −∂²l_nj/∂λ∂β^T = e^{β^T x_j} ∫₀^{t*_j} Q(u, j) x_j^T u h(u; λ, κ(j), J(u)) du,
F^{λλ}_nj(θ) = −∂²l_nj/∂λ∂λ^T = e^{β^T x_j} ∫₀^{t*_j} Q(u, j) Q^T(u, j) u² h(u; λ, κ(j), J(u)) du.
Then the observed Fisher information matrix is written as

F_nj(θ) = e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) (x_j^T, uQ^T(u, j))^T (x_j^T, uQ^T(u, j)) du, (3.4)

which is a (p + IG) × (p + IG) matrix. The Fisher information matrix is the expectation of F_nj(θ), written as

D_nj(θ) = E(F_nj(θ)) = E{e^{β^T x_j} ∫₀^{T*_j} h(u; λ, κ(j), J(u)) (x_j^T, uQ^T(u, j))^T (x_j^T, uQ^T(u, j)) du}. (3.5)

3.2 Concavity of the Likelihood Function

The log-likelihood function for n individuals is written as

l_n(θ) = Σ_{j=1}^{n} {δ_j[log(h(t*_j; λ, κ(j), J(t*_j))) + β^T x_j] − e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) du}. (3.6)

The score statistic, the observed Fisher information matrix, and the Fisher information matrix for the n individuals in the sample are, respectively,

S_n(θ) = Σ_{j=1}^{n} S_nj(θ), (3.7)
F_n(θ) = Σ_{j=1}^{n} F_nj(θ), (3.8)
D_n = Σ_{j=1}^{n} D_nj. (3.9)

Let c^T = (a^T, b^T), where a is a p-dimensional vector and b = (b₁₁, . . . , b_{I1}, . . . , b_{1G}, . . . , b_{IG})^T is an IG-dimensional vector. We will show that F_n(θ) is positive definite and that, therefore, the log-likelihood function defined in Equation (3.6) is strictly concave. To show that F_n(θ) is positive definite, it suffices to show that c^T F_n(θ) c > 0 for any nonnull vector c of dimension (p + IG). In order to prove this, we assume that there is at least one observation in each interval for each treatment group. Note that

c^T F_n(θ) c = Σ_{j=1}^{n} e^{β^T x_j} ∫₀^{t*_j} h(u; λ, κ(j), J(u)) [a^T x_j + u b^T Q(u, j)]² du ≥ 0.

This implies that F_n(θ) is positive semidefinite. Since h(u; λ, κ(j), J(u)) > 0, c^T F_n(θ) c = 0 implies that for any u ∈ (0, t*_j), j = 1, 2, . . . , n,

a^T x_j + u b^T Q(u, j) = 0. (3.10)

For any integers i and g with 1 ≤ i ≤ I and 1 ≤ g ≤ G, there exists an individual j (say) in group g with t*_j ∈ (τ_{i−1}, τ_i). Then Equation (3.10) implies that a^T x_j + u b_{ig} = 0 for u ∈ (τ_{i−1}, t*_j). Consequently,

a^T x_j = 0 and b_{ig} = 0. (3.11)

Since Equation (3.11) holds for any integers i and g with 1 ≤ i ≤ I and 1 ≤ g ≤ G, it follows that b = 0. Therefore, Equation (3.10) implies that a^T x_j = 0 for all j = 1, 2, . . . , n, which contradicts Assumption 2. Therefore, F_n(θ) is positive definite.

The strict concavity of the log-likelihood function as a function of the unknown parameters θ guarantees the uniqueness of the MLEs, provided that the MLEs exist. Given the existence of the MLEs and the concavity of the log-likelihood function, the Newton-Raphson algorithm converges.
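Because l_n(θ) is strictly concave, the Newton-Raphson iteration θ ← θ + F_n(θ)^{-1} S_n(θ) converges to the unique maximizer whenever it exists. A generic sketch with a hypothetical interface:

```python
import numpy as np

def newton_raphson(score, fisher, theta0, tol=1e-8, max_iter=100):
    """Newton-Raphson for a strictly concave log-likelihood.

    score(theta)  -> gradient S_n(theta)
    fisher(theta) -> observed information F_n(theta), positive definite
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        # solve F_n(theta) * step = S_n(theta) instead of inverting F_n
        step = np.linalg.solve(fisher(theta), score(theta))
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta
```

Solving the linear system at each step, rather than forming the inverse explicitly, is the standard numerically stable choice.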
34 Therefore, Equation (3.10) implies that for any j Â— 1, 2, . . . , n, = 0 , i.e., Xj = 0 for all j = 1,2, ,n. which is a contradiction to Assumption 2. Therefore, Fn{9) is positive definite. The strict concavity of the loglikelihood function as a function of unknown parameters 0 guarantees the uniqueness of the MLEs of the parameters provided that the MLEs exist. Based on the existence of the MLEs and the concavity of the loglikelihood function, the NewtonRaphson algorithm converges.
CHAPTER 4 PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES 4.1 Introduction In the previous chapter, we proved the strict concavity of the loglikelihood function. We pointed out that we obtain unique MLEs of parameters if we can prove that these MLEs exist. In this chapter, we will first provide some preliminary results as a preparation for later derivations. Then we will show the existence, the consistency, and the asymptotic normality of the MLEs obtained. The method applied to prove the properties of the MLEs follows a similar method provided by Mailer and Zhou (1996, PP21 1222). Mailer and Zhou proved the asymptotic properties of MLEs in a survival model which assumes that the failure time follows an exponential distribution with covariates information. The exponential model with covariates in their book is a special case of the Gompertz model. By studying the piecewise Gompertz, we move one step further through the incorporation of different hazard functions on different time intervals. In our model, if we allow A = 0, the hazard function for the failure time T is h{t) = Therefore, the hazard is a constant, which implies that T follows an exponential distribution. In the next section, we state several lemmas which will be applied to prove the existence, the consistency, and the asymptotic normality of the MLEs. The proofs of the lemmas are given in detail in the Appendices. We prove the existence and the consistency of the MLEs in Section 4.3. The proofs of the asymptotic normality is stated in Section 4.4. In Section 4.5, a goodnessoffit test which compares models with different numbers of pieces is derived. We also show that the corresponding test statistic follows an asymptotic distribution. 35
PAGE 44
36 4.2 Preliminary Proofs Lemma 4.1. Let E{) and F() represent the expected value and the variance of a random variable, respectively. Then the following two equations hold: E{5Â„(6>)} = 0 and V{Sn{d)} = D,. Proof. See Appendix B. Lemma 4.2. Let 6q = (/3g, Aq) be the true parameter vector, where Aq = , . . . , Xfi , . . . , Ajq, . . . , Xf^). For any positive integer m and for I = 1,2, there exist an e > 0 and a F > 0 such that for any A satisfies A Â— Ao < e, pTI Eeo{ ' rt(u; A',k(j), J(u))du} < F, for _; = 1, 2, . . . , n. Note that F is a constant which depends on and e. Proof. See Appendix C. Lemma 4.3. Define 0q as in Lemma 4.2. For any positive integers I and m, there exists a 7 > 0 such that u'^h^{u\Xo,K{j),J{u))du} > (4.1) for j = 1, 2, . . . , n. Note that 7 is a constant depending on Oq. Proof. See Appendix D. Lemma 4.4. Under Assumptions 2, 3, and 4, there exist a 7 > 0 and an integer no > 0 such that when n > no, Amin(F^n) n > 7 , where Amin(DÂ„) is the smallest eigenvalue of
37 Proof. See Appendix E. 4.3 Existence and Consistency of the MLE For each A > 1, define an elliptical neighborhood of the true value do of unknown parameters by N{do) = {e:{eOoVDnie do) < a}. (4.2) As shown in Lemma 4.4, Amin(Dn) > when n > noHereafter we assume that the condition n > no is satisfied. We observe that for any d G N{do), A>[ddo)'^Dn[ddo) > \\ddofXr^n{Dn) > Wddofn'y, i.e., 00o < ^/n^j 7 ' Let us consider an arbitrary vector d which is on the boundary dN{do) of N{do) then 0 is a vector which satisfies the following equation: (4.3) (9 eÂ„fDÂ„(9 eÂ„) = >1. If we can show that ln{d) < Ini^o) for all d G dN{do) with probability approaching 1, then the loglikelihood function has a local maximum in N (do). In addition, the gloabl maximum exists and lies in N{do) since the loglikelihood function is strictly concave. The consistency of is proved by observing that Equation (4.3) holds. Theorem 4.1. There exists a unique and consistent which maximizes ln{d). Proof. For any G dNn{do), the TaylorÂ’s expansion of the loglikelihood function about 00 is written as IniOn) IniOo) (0Â„ 0q)^5Â„(0o) " ^(^n ~ 0 q)"Â’T;(0~ ) (0Â„ do), (4.4)
38 where is a random vector on the line segment between and Oq. Hence, the vector OnE N{9o). Recall that 5Â„(0o) is the vector of the firstorder derivatives of ln{9) evaluated at the true parameter do and Fn{0n) is the negative matrix of the second derivatives of ln{0) evaluated at We have ln{Oo) < o > P{{0n 0ofSn{0o) < \{en 0of Fr,{0n){0n ~ ^o)} > P[{0n 0ofSn{0o) < \a, 0oVFn{0n){0n ~ ^o) > > P[{0n 0oVFn{0r,){0n 0o) > \a} P{(0Â„ 0ofSn{0o) > \a] . We proved in Lemma 4.1 that E{Sr,{0o)) = 0 and H(5Â„(0o)) = LÂ»n Since (0Â„ Â— 0o)^Sni0o) is a sum of independent random variables with mean zero, we obtain the following inequality by applying the ChebyshevÂ’s inequality: P{{en0ofSn{0o)>^}< 16(6>Â„ 0ofDn{0n 0o) _ 16 AÂ’ (4.5) since G dN(0o). The probability in Equation (4.5) can be made arbitrarily small by choosing A sufficiently large. Next we need to show that P{(0n 0ofFn(0n)(0n ~ Oq) > \a] ^ 1, (4.6) as n Â— ) oo. Let I be a (p t7G) x {p + 7(7)dimensional identity matrix. If we can show r\j that, uniformly in N{0o), A I, as n Â» oo, (4.7)
39 then we obtain the following: (0Â„0orFÂ„(0i)(0Â„0o) = {9n 6o) > {0n 0ofDn{0n 0,)\rrnn{D^'^Fr,{0n)D^l^) = AK,,{D^l'^Fr,{0n)D^'^) 4 A, where G dN{0o) and 0Â„G N{0q). Therefore, Equation (4.6) holds as n Â— > oo. Consequently, if we let both n and A become large, P{ln{0n) < in{0o)} 1 for 0n ^ dN{0o). Unique MLEs of parameters in the model exist by the strict concavity of the loglikelihood function and are consistent by Equation (4.3). We shall now prove Equation (4.7) through the following three lemmas. Note that Equation (4.7) indeed shows that the information matrix is consistently estimated by FÂ„(/9Â„). Let = (a^,b^) be any {pF /G')dimensional unit vector, where a and b are pand /Gdimensional, respectively. Lemma 4.5. Eor any (p I7G')dimensional unit vector c, ^D^l^\FÂ„(e,) A 0, (4.8) where $o = (/^o ) FV is the true parameter. Proof. See Appendix E. Lemma 4.6. Eor any (p 47G)dimensional unit vector c and any 0 G N{0q), c^L>i/2(FÂ„(0) Fn{0o))D^^^c A 0. Proof. See Appendix G.
40 Lemma 4.7. For any 0 G N{Gq) A I(p+/G)x(p+/G) (4.9) Proof. See Appendix H. 4.4 Asymptotic Normality of the MLE Lemma 4.8. Let I be an arbitrary positive integer, then EgQ\\Snj{0)\\Â‘ is uniformly bounded for all j = 1, 2, . . . , n. Note that Snj{0) is defined in Equation (3.1). Proof. See Appendix I. Lemma 4.9. For any {p + /G')dimensional unit vector c, c^D^^^Sn{eo)^N{0,l). (4.10) Proof. Denote zÂ„ = 1=1 where Snj{0o), j = 1, 2, . . . , n are independent random variables. By Lemma 4.1 E{Zn) = 0 and V{Zr,) = 1 . By LiapounovÂ’s condition, it suffices to show that 1=1 as n Â— >Â• oo. Note that (411)
41 Since Â£'e(,S'Â„j(0o)^ is uniformly bounded for all individuals by Lemma 4.8, Equation (4.11) holds and the proof is accomplished by the Central Limit Theorem. Theorem 4.2. The MLE are asymptotic normal, i.e., D]^Â‘^{dn Â— 0o) ^ N{0, I) as n ^ oo. Proof. By TaylorÂ’s expansion, we obtain that for some 9 on the line segment between On and Oq and for any unit vector u, n^D^I^Sn{K) n^D^I^SniOo) 9o). Also because \x^ Dn^^^Sn{9n) = 0 since is the MLE, n^Dy^Sn{9,) = M^D^'^FnmTn ~ 9,) = n^D^l\Fn{9) Dn){Tn 0o) + n^D^I^Dn{Tn 0o) = n^D^l\Fn{9) Dn)D^lÂ‘^D]l\Tn 0o) + n^D]!\Tn 9o) = n^[D^l\Fn{9) Dn)D^l^]D]l^{rn 0o) + Vi^D]!\Tn 0o). Since G N{9q), Equation (4.9) holds by Lemma 4.9. Therefore, the sum of the absolute values of all elements of [T)n^^^(FÂ„(0) Â— DÂ„)Dn^^^] goes to 0 in probability. Hence, = o^, where Op(l) is a matrix with all elements converging to 0 as n oo. Then we obtain that u^D^/^Sn{9o) = u^[0p(l)]F>y2(^ _ ^ _ 0,). Therefore, 0,) u^D^^^Sn{9o) + Op(l).
PAGE 50
42 Since we have proved the normality of in Lemma 4.7, the normality of Dn^^Â‘^{6n Â— 9 q) follows. Hence, Theorem 4 .2 is proved. 4.5 GoodnessofFit Test In practice, one might be interested in knowing how many intervals are in fact necessary in a specific study. Therefore, a goodnessoffit test can be performed to test how many time intervals are adequate. Assume that we have a saturated model with A + 12 intervals and an unsaturated model with I\ intervals. We are interested in testing whether there is significant difference between the two models. We assume that the unsaturated model is obtained by combining intervals A + 1, 7i + 2, . . ., 7i + 12 in the saturated model with interval 7i. The null hypothesis can be written as Ho A(/j+i)p = A(/i+ 2 )g = Â• Â• Â• = A(/i+/ 2 )g = A/jg, V 5 ' = 1, 2, . . . , G. (4.12) Since 6^ = {0^ A^), the null hypothesis defined above can also be written as 77o : 770 = 0, (4.13) where 77 is a design matrix with full row rank I2G and is written as 0 0;_i 0 0 0 0 0 0 0 0 1 Of^_i 0 0 0 0 0 0 or,i 1 1 0 Of,i 1 0 1 0 0 0 0 0 0 Ot.I 1 0 0
43 where Os in bold are null vectors. Note that the first p columns of the matrix H put no restriction on (3. Columns (p + 1) through (p + 7i Â— 1 ) represent the nonrestrictiveness of A for treatment group 1 corresponding to the first /i Â— 1 intervals. Similar patterns are presented for the rest of the columns. We will now construct a likelihood ratio test for the hypothesis in Equation (4.13). To maximize the loglikelihood function ln{G) under Hq is equivalent to maximize U9) unrestrictively in 0 Â€ i?[p+Ci+^ 2 )G] ^ ^ where 7 is a vector of Lagrange multipliers. If we let 9(1 be the true parameters, 9n be the parameters estimated under ^ Â• T Hq, and be the estimated parameters under the saturated model, then satisfies the following equation: SnmH^ = 0, (4.14) where 7 is a vector of the estimated 7 under Hq. Also under the null hypothesis, H9ji = 0 and H9q = 0, H{9n 9q) = 0. (4.15) Now we need to solve for 9n from Equations (4.14) and (4.15). By TaylorÂ’s expansion, So) + 0 ,( 1 ), hence D]l\9n 9q) + DV25Â„(0Â„) = D^/^Sn{9o) + 0 ,( 1 ). By Equation (4.14), the above equation can be written as D'Ae. Â«o) + = D'/^SÂ„(e,) + Op(l). (4.16)
PAGE 52
44 We can rewrite Equation (4.15) as 9 Â„) = 0 . Now express Equations (4.16) and (4.17) in a matrix formula, (4.17) ( 11/2 \ / ^ Dn ' HD, 1/2 0 / V 7 / J^n \ 5n(6>o) + Op(l) 0 (4.18) / Since H is of full row rank DG and DÂ„ is nonsingular, is of full rank I 2 G. Hence, the inverse of matrix written as HD, 1/2 0 exists and can be / 1 1 ^ I P Q ^ Â° ) \ V where Q = and R = Therefore, we can solve Equations (4.18) as ^ E>y'(0Â„ 9o) ^ ^ IP V ^ ) ^ R j \ T>n'^n(6>o) + Op(l) 0 \ hence, D\lÂ‘^(en Oo) = (/ P)D^^^Sn{do) + Op(l). Similarly for 0Â„, we have Dl^^iOn do) = D^lÂ‘^Sn{9o) + Op(l),
45 by TaylorÂ’s expansion and by observing the fact that = 0. Therefore, D]!\9nen) = D]i\e^e,)Dll\9r,e, = PD^I^Sn{9,) + o,{l). (4.19) Note that PP = P is idempotent with full rank / 2 G'; therefore, P can be written as A, P = A^ ( I 0^ V 0 0 / where A is an orthogonal matrix. Let then zÂ„4iV(o,/), by Lemma 4.9. From Equation (4.19), we obtain that (0Â„ 9^^ Dn{9n 9n) = S^{9o)D^/^PD^^^Sr^i9o) + Op(l) = S'^{9o)D^/^A^ ^ I 0^ t/sG U V 0 0 APi/2pÂ„(0o) + o^(l) Zi ( 1 0^ P%G U / 0 0 4g 0 0 0 Zn + Gp(l) Z + Op(l) hG(4.20) The likelihood ratio test statistic for the hypothesis in Equation (4.12) can be written as A SUPegeo ^n{Q) ^ Ln{9n) supeeÂ©TÂ„(6>) Â” LÂ„(0Â„)Â’
46 where Â© represents the overall parameter space and Â©o the parameter space under Hq. By TaylorÂ’s expansion, we can express Â— 21og(A) as 2l0g(A) = 2[ln{dn) ln{K)] = 2 [( 0 Â„ OnfFr^mOn ~ K). where 6 lies between and On and therefore 0 G Nn{A). Note that Sn{6n) = 0 and Fn{9) can be replaced by from Lemma 4.7. Hence, Â— 21og(A) is written as 2 log(A) = [On OnfDniOn ~ On) + Op{l), which converges in distribution to ^7 Equation (4.20).
CHAPTER 5 SIMULATIONS 5.1 Introduction The results discussed in the previous chapters showed the asymptotic properties of parameter MLEs in the piecewise Gompertz model. These results enable us to carry out the hypothesis test of which the null hypothesis is proposed in Equation (2.3). In this chapter, Monte Carlo simulation studies are performed in order to evaluate the normality approximation of the proposed test statistic in Equation (2.6) and the power of the test under different settings based on the amount of location shift and the sample size. Due to the complexity of the piecewise model, the procedure for generating the random sample is quite complicated and is discussed in section 5.2. Results are given on observed Type I errors and powers for both the piecewise Gompertz model and the exponentiallogistic mixture model. All simulations were completed on the Sun Ultra, Enterprise 450 computer using Ox 2.0 (Doornik 1998). Random number generators in the Ox 2.0 environment were used to generate uniformly distributed random numbers. 5.2 Generation of Random Samples In this section we describe the procedure for generating the random sample when the sample follows a piecewise Gompertz model based on certain parameter settings. In order to generate a random sample of size n with a specified proportion of censoring, we need to generate a sample of failure observations of size n and a sample of censored observations of size n. The sample of failure observations follows the piecewise Gompertz model. The sample of censored observations 47
48 is assumed to follow a uniform (0, M) distribution, where M is defined by the censoring proportion. The final sample is obtained by comparing the two samples generated. A subject is considered failed if its observation from the first sample is smaller than the one obtained from the second sample. The subject is censored if its observation from the first sample is greater than the one from the second sample. 5.2.1 Parameter Settings We consider five different parameter settings in our Monte Carlo simulations. There is no treatment difference in random samples generated based on Settings 1 and 2. Therefore we can compare Type I errors between tests based on random samples generated from these two settings. In parameter Settings 3, 4, and 5, treatment effects exist between groups. Settings 3 to 5 present the smallest, the intermediate, and the largest difference between treatment groups, respectively. The comparison of powers of tests are based on random samples generated from these three parameter settings. All five parameter settings are listed as follows: Setting 1: Â• Two treatment groups Â• Three intervals: [0,3), [3,7), and [7, oo) Â• One covariate with value 0 or 1 Â• Cure proportion varied with covariate values Â• Censoring rates: 30% Â• Aio = All = Â—4.2 Â• A 20 = A 21 == Â—0.52 Â• A30 = A31 = Â—0.2 Â• /?o = 0.1, A = 0.2 Setting 2: Â• Two treatment groups
49 Three intervals: [0,4), [4,10), and [10, oo) One covariate with value 0 or 1 Cure proportion varied with covariate values Censoring rates: 30% ^10 = = Â—3.5 A20 = A21 = Â—0.45 A30 = A31 = Â—0.15 /3o = 0.2, A = 0.15 Setting 3: Two treatment groups Three intervals: [0,3), [3,7), and [7, 00) One covariate with value 0 or 1 Cure proportion varied with covariate values and treatment groups Censoring rates: 30% Aio = Â—3.5 and An = Â—4.2 A20 = Â—0.45 and A21 = Â—0.52 A30 = Â— 0.21 and A31 Â— 0.2 /3o = 0.1, /?i = 0.2 Setting 4: Two treatment groups Three intervals: [0,4), [4,10), and [10, 00) One covariate with value 0 or 1 Cure proportion varied with covariate values and treatment groups Censoring rates: 30% Aio = Â—3.9 and An = Â—4.2 A20 = Â—0.45 and A21 = Â—0.4 Â• A30 = Â—0.165 and A31 = Â—0.18
Â• /?o = 0.2, 13, = 0.15 Setting 5: Â• Two treatment groups 50 Â• Three intervals: [0,4), [4,10), and [10, oo) Â• One covariate with value 0 or 1 Â• Cure proportion varied with covariate values and treatment groups Â• Censoring rates: 30% Â• Aio = Â—3.5 and An = Â—4.2 Â• A20 = Â—0.45 and A21 = Â—0.4 Â• A30 = Â—0.15 and A31 = Â—0.18 Â• po = 0.2, A = 0.15 The survival curves for the above five parameter settings are shown in Figures (51) through (55). The survival probabilities are calculated by applying S(Â«) = + p^ig^ m=\ A mg A *5 )}. (5.1) for t e [rj_i,ri). There are two separate survival curves in the first two figures for individuals with covariate values of 0 and 1. For Figures (53), (54), and (55), we draw four survival curves for all four possible combinations of the covariate and the treatment group indicator. 5.2.2 Generation of Censored Observations From the assumption, the sample of censored observations follows a uniform (0, M) distribution. Therefore, we can obtain a sample with a uniform (0, 1) distribution and then transform them into a sample that follows a uniform (0, M) distribution. The primary task is to obtain an appropriate value for M. Let U and T denote random variables for censored and failure observations, respectively. By taking parameter setting 1 as an example, we illustrate the derivation of the
51 Figure 51. Survival curves for patients with parameter setting 1.
52 Figure 52. Survival curves for patients with parameter setting 2.
53 0 10 20 30 40 50 Time Figure 53. Survival curves for patients with parameter setting 3.
54 Figure 54. Survival curves for patients with parameter setting 4.
PAGE 63
55 Figure 55. Survival curves for patients with parameter setting 5.
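The survival curves in Figures (5-1) through (5-5) come from evaluating Equation (5.1) over a grid of t. A minimal sketch of that evaluation follows; the function name and argument layout are illustrative choices, and the last cut point is taken to be infinite so that S(∞) returns the cure proportion directly.

```python
import math

def piecewise_gompertz_survival(t, taus, lams, beta0, beta1, x):
    """Survival probability S(t) under the piecewise Gompertz model of
    Equation (5.1).  `taus` are the interval cut points (tau_0 = 0 implied,
    last entry math.inf), `lams` the per-interval Gompertz parameters for
    one treatment group.  Illustrative API, not notation from the text."""
    scale = math.exp(beta0 + beta1 * x)
    cum = 0.0       # running value of sum_m (e^{lam_m u} integrated over piece m)
    lower = 0.0
    for tau, lam in zip(taus, lams):
        upper = min(t, tau)           # tau may be math.inf for the last piece
        if upper <= lower:
            break
        # integral of e^{lam * u} over [lower, upper]
        cum += (math.exp(lam * upper) - math.exp(lam * lower)) / lam
        lower = upper
    return math.exp(-scale * cum)
```

Calling the function with t = math.inf and a negative last λ gives the limiting survival probability, i.e., the cure proportion for that covariate and group combination.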
censoring proportion P(U < T) as follows.

For M ∈ [0, 3),

P(U < T) = (1/M) ∫_0^M exp{ −e^{β0+β1x} (e^{λ_1g c} − 1)/λ_1g } dc;

for M ∈ [3, 7),

P(U < T) = (1/M) ∫_0^3 exp{ −e^{β0+β1x} (e^{λ_1g c} − 1)/λ_1g } dc
         + (1/M) ∫_3^M exp{ −e^{β0+β1x} [ (e^{3λ_1g} − 1)/λ_1g + (e^{λ_2g c} − e^{3λ_2g})/λ_2g ] } dc;

and for M ∈ [7, ∞),

P(U < T) = (1/M) ∫_0^3 exp{ −e^{β0+β1x} (e^{λ_1g c} − 1)/λ_1g } dc
         + (1/M) ∫_3^7 exp{ −e^{β0+β1x} [ (e^{3λ_1g} − 1)/λ_1g + (e^{λ_2g c} − e^{3λ_2g})/λ_2g ] } dc
         + (1/M) ∫_7^M exp{ −e^{β0+β1x} [ (e^{3λ_1g} − 1)/λ_1g + (e^{7λ_2g} − e^{3λ_2g})/λ_2g + (e^{λ_3g c} − e^{7λ_3g})/λ_3g ] } dc.

The integrals in the above equations are evaluated approximately using Simpson's formula; details of Simpson's formula are provided in Appendix J. We then search for a value of M such that P(U < T) reaches the required censoring proportion.

5.2.3 Generation of Failure Observations

In order to generate a sample of failure observations, we first generate a sample of uniform (0, 1) observations. Each uniform (0, 1) value is treated as the survival probability of the failure time T. By inverting the survival function defined in Equation (5.1), we obtain the value of the failure time. Note that special attention is required at the inversion because we are dealing with
a piecewise distribution. Examples are given for observations with covariate x = 0 in the first parameter setting. From Figure (5-1), we observe that each survival curve consists of three pieces, corresponding to the three intervals we consider in the study. Therefore, the inversion function for generating the failure time observation depends on the value of the generated uniform random number u*. The survival probability at t = 0 is 1,

P(T > 3) = exp{ e^{−0.1} (1 − e^{−4.2×3})/(−4.2) } = 0.81,

and

P(T > 7) = exp{ e^{−0.1} [ (1 − e^{−4.2×3})/(−4.2) + (e^{−0.52×3} − e^{−0.52×7})/(−0.52) ] } = 0.59.

Therefore, if u* ∈ [0.81, 1], then

T = (1/λ_1g) log[ e^{λ_1g τ_0} − λ_1g e^{−(β0+β1x)} log(u*) ];

if u* ∈ [0.59, 0.81), then

T = (1/λ_2g) log{ e^{λ_2g τ_1} − λ_2g [ e^{−(β0+β1x)} log(u*) + (e^{λ_1g τ_1} − e^{λ_1g τ_0})/λ_1g ] };

and if u* ∈ [0, 0.59), then

T = (1/λ_3g) log{ e^{λ_3g τ_2} − λ_3g [ e^{−(β0+β1x)} log(u*) + (e^{λ_1g τ_1} − e^{λ_1g τ_0})/λ_1g + (e^{λ_2g τ_2} − e^{λ_2g τ_1})/λ_2g ] }.

Note that at least one failure observation is required in each interval for each treatment group in order to make inferences on all parameters.

5.3 Type I Error and Power

Type I errors and powers of tests are calculated based on the piecewise Gompertz model and on the exponential-logistic mixture model for all five sets of parameters. The test statistic based on the piecewise Gompertz model
proposed in this dissertation was, as we recall, presented in Equation (2.6). For the exponential-logistic mixture model, we assume that there is one covariate x in the model. The time to failure is assumed to follow an exponential distribution with mean 1/λ, where λ is modelled as a function of the covariate x and the treatment indicator g; specifically, λ = exp(γ0 + γ1 x + γ2 g). The cure rate P(Z = 0) is defined as 1/(1 + e^{β0 + β1 x + β2 g}). Therefore, for a fixed covariate value, the test of the equality of cure rates between the two treatment groups is equivalent to the test of H0 : β2 = 0. Hence, the standardized test statistic for this null hypothesis is TS = β̂2 / √V(β̂2).

Type I errors based on the first two parameter settings for both models are shown in Table (5-1). Powers for tests based on parameter settings 3 through 5 for both models are presented in Table (5-2). All tests are carried out for 10000 generated random samples and for three total sample sizes: 400, 1000, and 2000. The numbers of unobtainable test statistics (# NaNs) for the exponential-logistic mixture model are shown in the last column of each table. The unobtainable test statistics occur because the likelihood function for the mixture model is nearly flat at the point where the MLEs are obtained. When the sample size becomes larger, however, unobtainable or unidentifiable test statistics are eliminated.

We present bar plots of all standardized test statistics obtained for all three sample sizes based on the five parameter settings. These plots help to visualize the advantages of the piecewise Gompertz model in terms of the normal approximation of the test under the null hypothesis and the power of the test under the alternative hypothesis. Figures (5-6), (5-7), and (5-8) show the distributions of the test statistics under parameter setting 1 for both the mixture model and the piecewise Gompertz model. These three figures use sample sizes of 400, 1000, and
Table 5-1. Type I Errors for the Test of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter   Sample        Type I Error          # NaNs
Setting     Size          Mixture   Gompertz    Mixture   Gompertz
1           (200, 200)    0.0305    0.0375      220       –
1           (500, 500)    0.0171    0.0412      1         –
1           (1000, 1000)  0.0365    0.0412      –         –
2           (200, 200)    0.0203    0.0373      69        –
2           (500, 500)    0.0356    0.0360      –         –
2           (1000, 1000)  0.0402    0.0408      –         –

Table 5-2. Powers for Tests of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter   Sample        Power                 # NaNs
Setting     Size          Mixture   Gompertz    Mixture   Gompertz
3           (200, 200)    0.0260    0.1016      127       –
3           (500, 500)    0.0328    0.2024      –         –
3           (1000, 1000)  0.0679    0.3703      –         –
4           (200, 200)    0.0616    0.1896      5         –
4           (500, 500)    0.0357    0.4351      –         –
4           (1000, 1000)  0.2721    0.7379      –         –
5           (200, 200)    0.2706    0.5572      28        –
5           (500, 500)    0.7180    0.9331      –         –
5           (1000, 1000)  0.9601    0.9981      –         –
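The entries in Tables 5-1 and 5-2 are empirical rejection rates of the standardized statistics over repeated simulations, with NaN replicates excluded and counted. A generic harness for that bookkeeping might look as follows; this is a sketch of the procedure, not the dissertation's code, and the function name, defaults, and seed are illustrative.

```python
import random

def rejection_rate(simulate_stat, n_rep=10000, crit=1.96, seed=1):
    """Empirical two-sided rejection rate at nominal level 0.05 for a
    standardized statistic, dropping replicates where the statistic is NaN
    (mirroring the '# NaNs' columns).  `simulate_stat` takes an RNG and
    returns one simulated test statistic."""
    rng = random.Random(seed)
    stats = [simulate_stat(rng) for _ in range(n_rep)]
    usable = [s for s in stats if s == s]   # NaN is the only value with s != s
    n_reject = sum(1 for s in usable if abs(s) > crit)
    return n_reject / len(usable), n_rep - len(usable)
```

Under the null, plugging in a standard-normal statistic should return a rate near 0.05; under an alternative, the same harness returns the empirical power.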
2000, respectively. Similarly, Figures (5-9), (5-10), and (5-11) are for parameter setting 2; Figures (5-12), (5-13), and (5-14) are for parameter setting 3; Figures (5-15), (5-16), and (5-17) are for parameter setting 4; and Figures (5-18), (5-19), and (5-20) are for parameter setting 5.

From these tables and plots, we observe that the Type I errors obtained from the three-piece Gompertz model are closer to 0.05 than those from the mixture model. The powers presented in Table (5-2) for the three-piece Gompertz model are much larger than those for the mixture model at all sample sizes. The estimation procedure for obtaining MLEs is also easier to carry out for the three-piece Gompertz model than for the exponential-logistic model. In addition, the algorithm for the three-piece Gompertz model converges much faster.
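Putting Sections 5.2.2 and 5.2.3 together, generating one simulated data set amounts to calibrating M (Simpson's rule plus a search) and then inverting the survival function interval by interval. A minimal sketch follows, with illustrative function names and search bounds; it assumes every λ is negative, as in all five settings, and the third-interval λ used in the usage check below is illustrative rather than setting 1's actual value.

```python
import math

def censoring_proportion(M, survival, n=400):
    """P(U < T) for U ~ uniform(0, M): (1/M) * integral_0^M S(c) dc,
    approximated by the composite Simpson's rule with n (even) panels."""
    h = M / n
    total = survival(0.0) + survival(M)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * survival(i * h)
    return (h / 3.0) * total / M

def find_M(target, survival, lo=1e-6, hi=1000.0, tol=1e-6):
    """P(U < T) decreases in M (it averages a decreasing function over
    [0, M]), so plain bisection recovers the M that gives the target
    censoring proportion."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if censoring_proportion(mid, survival) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def failure_time(u, taus, lams, beta0, beta1, x):
    """Invert the piecewise Gompertz survival function at u = S(T):
    spend each full interval's cumulative-hazard contribution until the
    remainder falls inside one interval, then solve in closed form.
    `taus` end with math.inf; every lam is assumed negative and nonzero."""
    target = -math.log(u) * math.exp(-(beta0 + beta1 * x))
    lower = 0.0
    for tau, lam in zip(taus, lams):
        arg = math.exp(lam * lower) + lam * target
        if arg > math.exp(lam * tau):   # T falls inside [lower, tau)
            return math.log(arg) / lam
        target -= (math.exp(lam * tau) - math.exp(lam * lower)) / lam
        lower = tau
    return math.inf   # u is below the cure proportion: a long-term survivor
```

A full replicate would draw u* ~ uniform(0, 1) for each subject, set T = failure_time(u*, ...), draw U ~ uniform(0, M), and record min(T, U) together with its censoring indicator.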
Figure 5-6. Standardized test statistics for parameter setting 1 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-7. Standardized test statistics for parameter setting 1 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-8. Standardized test statistics for parameter setting 1 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-9. Standardized test statistics for parameter setting 2 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-10. Standardized test statistics for parameter setting 2 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-11. Standardized test statistics for parameter setting 2 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-12. Standardized test statistics for parameter setting 3 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-13. Standardized test statistics for parameter setting 3 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-14. Standardized test statistics for parameter setting 3 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-15. Standardized test statistics for parameter setting 4 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-16. Standardized test statistics for parameter setting 4 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-17. Standardized test statistics for parameter setting 4 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-18. Standardized test statistics for parameter setting 5 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-19. Standardized test statistics for parameter setting 5 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-20. Standardized test statistics for parameter setting 5 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.
CHAPTER 6
EXAMPLE

6.1 Introduction

In Chapter 2 we proposed a test statistic that addresses the hypothesis presented in Equation (2.3). Subsequently, in Chapters 3 and 4, we established the asymptotic normality of the parameter estimates in the piecewise Gompertz model. The asymptotic normality validated the test of hypothesis proposed in Chapter 2. Chapter 5 reported Monte Carlo Type I error and power calculations, which established the strength of our test statistic for the comparison of cure rates under various parameter settings. In this chapter, we demonstrate the applicability of our test statistic with a data set.

The data set comes from a study by the Pediatric Oncology Group (POG) at the University of Florida (Ravindranath et al. 1996). The motivation of this study is to compare cure proportions among patients with acute myeloid leukemia (AML) under different treatments. Three treatments are compared in the study: allogeneic bone marrow transplantation (allo. BMT), autologous bone marrow transplantation (auto. BMT), and intensive chemotherapy. In order to register in the study, all patients are required to be in remission after the first two courses of induction chemotherapy. The POG evaluated 666 patients under 21 years of age from June 1988 to March 1993. A total of 552 patients entered remission and were therefore eligible for the study. Of the eligible patients, 89 received allo. BMT; 117 were assigned to an additional six-course intensive chemotherapy; and 115 were assigned to receive auto. BMT. The remaining 231 patients did not participate in the study for various reasons.
6.2 Methodology

Our objective with this data set is to perform pairwise comparisons of cure rates between treatment groups after controlling for patient's gender. We compare tests for three parametric models and a proportion test based on KM estimates of cure. The three parametric models are the logistic-exponential mixture model, the one-piece Gompertz model, and the three-piece Gompertz model. In addition, we present cure rates and their standard deviations for every treatment and gender combination based on estimates from the parametric models and on KM estimates. Finally, we compare the standard one-piece Gompertz model with the three-piece Gompertz model by applying the goodness-of-fit test we derived in Chapter 5.

In this section, we briefly describe 1) the three parametric models and their corresponding test statistics; and 2) the proportion test based on KM estimates of cure proportions.

In the exponential-logistic mixture model, we assume an exponential distribution with parameter λ for the failure time and a logistic regression model for the cure proportion. The expression log(λ) is defined as a linear function of the covariate x and the treatment indicator. We define two dummy variables to represent the three-level treatment indicator: G1 = I[patient in treatment 1] and G2 = I[patient in treatment 2]. The survival function for the failure time, given that the observation is not a long-term survivor (Z = 1), is

S*(t | Z = 1) = e^{−λt}.   (6.1)

The cure probability is defined as

P(Z = 0) = 1 / (1 + e^{β0 + β1 x + β2 G1 + β3 G2}).   (6.2)
Therefore the hypothesis of testing the equality of cure rates in the three treatment groups can be written as

H0 : β2 = β3 = 0.   (6.3)

Pairwise comparisons between two treatments are based on the following standardized test statistics:

T1 = (β̂2 − β̂3) / √V(β̂2 − β̂3),   (6.4)

T2 = β̂2 / √V(β̂2),   (6.5)

and

T3 = β̂3 / √V(β̂3).   (6.6)

Note that T1 is the test statistic that compares cure proportions between Treatments 1 and 2; T2 compares cure proportions between Treatments 1 and 3; and T3 compares cure proportions between Treatments 2 and 3. The variances are obtained by inverting the observed information matrix, formed from the second derivatives of the log-likelihood function with respect to the parameters.

The one-piece Gompertz model is a special case of the piecewise Gompertz model. Let λg be the parameter associated with treatment g and β1 the coefficient associated with the covariate x. For λg < 0, the cure proportion for patients with treatment g and covariate x is written as

πg(x) = exp{ e^{β1 x} / λg }.   (6.7)

Thus, if λg < 0, a positive cure proportion exists for patients under treatment g. If λg < 0 for every g, then the hypothesis of testing the equality of cure proportions is written as

H0 : λ1 = λ2 = λ3.   (6.8)
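For a fixed covariate value, the cure proportions entering these hypotheses can be computed directly. The sketch below evaluates Equation (6.7) for the one-piece model and the piecewise cure proportion of Equation (2.4), written here as exp{−e^{x'β} Σ_i (e^{λ_ig τ_i} − e^{λ_ig τ_{i−1}})/λ_ig}; the function names and the `linear_predictor` convention are illustrative assumptions, not the dissertation's API.

```python
import math

def cure_one_piece(lam_g, beta1, x):
    """Equation (6.7): pi_g(x) = exp(e^{beta1*x} / lam_g), valid for lam_g < 0."""
    return math.exp(math.exp(beta1 * x) / lam_g)

def cure_piecewise(lams, taus, linear_predictor):
    """Piecewise Gompertz cure proportion exp(-e^{x'beta} * A_g), where
    A_g = sum_i (e^{lam_i tau_i} - e^{lam_i tau_{i-1}})/lam_i with tau_0 = 0;
    the final tau is math.inf and the final lam must be negative."""
    area, lower = 0.0, 0.0
    for lam, tau in zip(lams, taus):
        area += (math.exp(lam * tau) - math.exp(lam * lower)) / lam
        lower = tau
    return math.exp(-math.exp(linear_predictor) * area)
```

With a single interval whose endpoint is infinite, the piecewise formula collapses to the one-piece formula, which is a quick consistency check on both.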
Standardized test statistics for pairwise comparisons of cure proportions are presented as

T1 = (λ̂1 − λ̂2) / √V(λ̂1 − λ̂2),   (6.9)

T2 = (λ̂1 − λ̂3) / √V(λ̂1 − λ̂3),   (6.10)

and

T3 = (λ̂2 − λ̂3) / √V(λ̂2 − λ̂3).   (6.11)

Similarly, these three standardized test statistics are for the comparison of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. The variances are obtained from the inverse of the observed information matrix.

Finally, as we discussed in Chapter 2, the cure proportion in a piecewise Gompertz model as a function of the covariate x and the treatment g is expressed in Equation (2.4). We consider a three-interval case in this example. Denote the three intervals by [τ0, τ1), [τ1, τ2), and [τ2, τ3), where τ0 = 0 and τ3 = ∞, and write

A_g = Σ_{i=1}^{3} (e^{λ_ig τ_i} − e^{λ_ig τ_{i−1}}) / λ_ig,   g = 1, 2, 3.

When λ3g < 0 for every g, testing the equality of cure proportions for any given value of x is equivalent to testing

H0 : A_1 = A_2 = A_3.   (6.12)

Standardized test statistics for pairwise comparisons between treatment groups are presented as follows:

T1 = (Â_1 − Â_2) / √V(Â_1 − Â_2),   (6.13)

T2 = (Â_1 − Â_3) / √V(Â_1 − Â_3),   (6.14)

and

T3 = (Â_2 − Â_3) / √V(Â_2 − Â_3),   (6.15)

where Â_g denotes A_g evaluated at the MLEs λ̂_ig. These test statistics T1, T2, and T3 are for comparisons of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. Variance estimates are not as straightforward as for the tests based on the mixture model and on the one-piece Gompertz model; they have already been presented in Chapter 2 and will not be repeated here.

Proportion tests for pairwise comparisons between treatments are generated from the KM estimates of the cure proportions and the standard errors of those estimated cure proportions. P-values for the tests derived from the parametric models and for the proportion test will be presented in the next section. In addition, we present survival curves estimated from both the parametric models and the KM estimates. Furthermore, we present the cure proportion and the corresponding standard deviation for each treatment and gender combination based on parametric estimates and on KM estimates. The result of comparing the one-piece Gompertz model with the three-piece model by the goodness-of-fit test we derived earlier is also presented.

6.3 Results

6.3.1 Pairwise Comparisons on Cure Proportions

In Table (6-1), we present p-values for tests based on the three parametric models and for the proportion test based on KM estimates. The three intervals for the three-piece Gompertz model are defined as [0, 50), [50, 140), and [140, ∞); all numbers presented here are in terms of weeks of follow-up. Tests based on the exponential-logistic mixture model, the three-piece Gompertz model, and the KM estimates detect the difference in cure proportions between allo. BMT and auto.
BMT and the difference between allo. BMT and intensive chemotherapy. Allo. BMT increases the cure proportion significantly more than the other two treatments. There is no significant difference between auto. BMT and chemotherapy. The test based on the one-piece Gompertz model does not show a significant difference between any two of the three treatments.

The maximization procedures for obtaining the MLEs of the parameters for both the one-piece Gompertz model and the three-piece Gompertz model run smoothly; the algorithm converges within a few steps. However, there is a convergence problem with the mixture model, and different starting values for the parameters had to be chosen in order to obtain convergence. We discussed the convergence problems of the mixture model in Chapters 1 and 5, and the convergence problem we encounter here confirms our earlier assertion.

Table 6-1. Comparison of Tests from the Mixture Model, the One-Piece Gompertz Model, and the Three-Piece Gompertz Model, and the Proportion Test on Cure Proportions.

                              P-values for Treatment Comparisons
Comparisons                   Mixture   1-Piece Gompertz   3-Piece Gompertz   Proportion Test
Allo. BMT vs Auto. BMT        0.016     0.384              0.015              0.011
Allo. BMT vs chemotherapy     0.012     0.101              0.003              0.007
Auto. BMT vs chemotherapy     0.690     0.516              0.897              0.741

In Figures (6-1) to (6-4), we show the survival curves estimated from the different parametric models and from the KM estimates. The three curves in Figure (6-1)a are survival curves for the three treatment groups for male patients based on the mixture model; Figure (6-1)b presents the three treatment survival curves for female patients based on the mixture model. Figure (6-2) presents survival curves estimated from the one-piece Gompertz model for male and female patients in the three treatment groups. Survival curves estimated from the three-piece Gompertz model are
presented in Figure (6-3). The KM estimates of the survival curves are presented in Figure (6-4).

6.3.2 Cure Proportion Estimates

Estimates of the cure proportions are presented in Table (6-2). We present estimates of the cure rates and their standard deviations from the mixture model, the one-piece Gompertz model, the three-piece Gompertz model, and the KM estimate for all combinations of patient's gender and the three treatment groups. Treatment 1 represents allo. BMT, Treatment 2 represents auto. BMT, and Treatment 3 represents intensive chemotherapy. Note that a positive cure proportion exists for each of the combinations. Female patients show a higher cure probability than male patients. Standard errors for the cure proportions estimated from both the three-piece Gompertz model and the mixture model are smaller than those from the one-piece Gompertz model and the KM estimates.

Table 6-2. Estimates and Standard Deviations of Cure Proportion Generated from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Kaplan-Meier Estimates.

                     Cure Proportions (Standard Deviations)
Gender   Treatment   Mixture         1-Piece Gompertz   3-Piece Gompertz   KM
M        1           0.558 (0.059)   0.504 (0.062)      0.568 (0.056)      0.648 (0.076)
M        2           0.386 (0.050)   0.445 (0.060)      0.397 (0.050)      0.387 (0.065)
M        3           0.407 (0.033)   0.409 (0.050)      0.405 (0.033)      0.388 (0.036)
F        1           0.597 (0.056)   0.533 (0.062)      0.596 (0.054)      0.531 (0.071)
F        2           0.424 (0.051)   0.475 (0.061)      0.429 (0.051)      0.421 (0.065)
F        3           0.445 (0.036)   0.440 (0.051)      0.437 (0.035)      0.466 (0.041)

6.3.3 Goodness-of-Fit

From the above discussion, we observe that the three-piece Gompertz model fits better than the one-piece Gompertz model. It is natural to compare these two models by applying the goodness-of-fit test we derived in this dissertation. The null
Figure 6-1. Survival curves estimated from the mixture model for three treatments. a) Male; b) Female.

Figure 6-2. Survival curves estimated from the one-piece Gompertz model for three treatments. a) Male; b) Female.

Figure 6-3. Survival curves estimated from the three-piece Gompertz model for three treatments. a) Male; b) Female.

Figure 6-4. Survival curves from Kaplan-Meier estimates for three treatments. a) Male; b) Female.
hypothesis is expressed as

H0 : λ1g = λ2g = λ3g, for every g = 1, 2, and 3.

Note that λ2g and λ3g are the parameters for the second and third intervals in the three-piece Gompertz model, respectively. We showed that under H0,

2(l2 − l1) ~ χ²,

where l1 and l2 are the maximized log-likelihoods for the one-piece and the three-piece Gompertz models, respectively, and the degrees of freedom equal the difference in the number of parameters between the two models. From the computation, we obtain l1 = −1977.289 and l2 = −1950.979, so that 2(l2 − l1) = 52.62. Therefore, the test is highly significant, implying that the three-piece model fits significantly better than the one-piece Gompertz model.
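The likelihood-ratio comparison above can be reproduced from the two reported log-likelihoods. The sketch below computes the statistic and a chi-squared tail probability; the closed-form survival function used is exact only for even degrees of freedom, and the choice of 6 degrees of freedom (two extra λ parameters in each of the three treatment groups) is our reading of the parameter counts, not stated explicitly in the text.

```python
import math

def lr_statistic(loglik_onepiece, loglik_threepiece):
    """Likelihood-ratio statistic 2*(l2 - l1) for the nested models."""
    return 2.0 * (loglik_threepiece - loglik_onepiece)

def chi2_sf(x, df):
    """P(X > x) for X ~ chi-squared(df), via the Poisson-sum identity
    P(X > x) = e^{-x/2} * sum_{i<df/2} (x/2)^i / i!; exact for even df,
    which is all this sketch supports."""
    if df % 2 != 0:
        raise ValueError("even df only in this sketch")
    lam = x / 2.0
    term = math.exp(-lam)
    total = term
    for i in range(1, df // 2):
        term *= lam / i
        total += term
    return total
```

With l1 = −1977.289 and l2 = −1950.979 the statistic is 52.62, far beyond the 5% critical value 12.59 of a chi-squared distribution with 6 degrees of freedom, matching the "highly significant" conclusion.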
CHAPTER 7
SUMMARY

In this dissertation, we presented a piecewise Gompertz model to address the long-term survival problem. We constructed a standardized test statistic based on our model that compares cure rates between treatment groups. We derived the strict concavity of the likelihood function based on the model. In addition, we derived the asymptotic properties of the MLEs of the parameters in the model. A goodness-of-fit test was also derived to assess an appropriate number of intervals for the model.

In oncology studies, when there are long-term survivors and different hazard patterns over different time intervals, one might consider using our piecewise Gompertz model. Our model is more computationally feasible than both the mixture model and the semiparametric model with a bounded cumulative hazard function. In addition, our model is more flexible due to its piecewise structure. Our model also provides a more reliable estimate of the cure rate than the KM estimate when the sample size is small at the end of the study period.

In the future, we plan to find an optimal partition of the time span by deriving a test to compare different partitions. Sample size determination will also be assessed. Since the hazard defined by the piecewise Gompertz model jumps at the interval cut points, the survival curve is not smooth there, and we might consider using a log-spline model to achieve smoothness of the curve. In addition, we plan to perform more detailed simulation studies on the semiparametric model that assumes a bounded cumulative hazard function and to study the asymptotic behavior of that semiparametric model.
APPENDIX A
DERIVATION OF MARGINAL LIKELIHOOD

Define

I_k = ∫_{t_{n−k}}^{∞} ··· ∫_{t_{n−1}}^{∞} Π_{i=n−k+1}^{n} f(t_i) e^{−Θ_i F(t_i)} dt_n dt_{n−1} ··· dt_{n−k+1},   k = 1, 2, ..., n.

As k = 1,

I_1 = (1/Θ_n) [e^{−Θ_n F(t_{n−1})} − e^{−Θ_n}],   (A.1)

and when k = 2,

I_2 = ∫_{t_{n−2}}^{∞} ∫_{t_{n−1}}^{∞} Π_{i=n−1}^{n} f(t_i) e^{−Θ_i F(t_i)} dt_n dt_{n−1}
    = 1/(Θ_n(Θ_{n−1} + Θ_n)) [e^{−(Θ_{n−1}+Θ_n) F(t_{n−2})} − e^{−(Θ_{n−1}+Θ_n)}]
      − 1/(Θ_n Θ_{n−1}) [e^{−Θ_{n−1} F(t_{n−2})} − e^{−Θ_{n−1}}] e^{−Θ_n}.

By a similar derivation, we obtain that when k = 3,

I_3 = 1/(Θ_n(Θ_n + Θ_{n−1})(Θ_n + Θ_{n−1} + Θ_{n−2})) [e^{−(Θ_n+Θ_{n−1}+Θ_{n−2}) F(t_{n−3})} − e^{−(Θ_n+Θ_{n−1}+Θ_{n−2})}]
90 On 0n0nl(Â©nl + 0n2) g Â— (0n+0ril) 0 nlÂ©n2 ( 0 n + Â©nl) Jg Â— (0nl+0n2)fÂ’(tn3) g Â— (0nl+0n2)j g0n2^(
91 Therefore by Equation (A. 5) = A 1 ^nÂ—k+l,nÂ—i+l^nÂ—k,nÂ—i+l 1 Q c >^n (A.7) ^nÂ—i,nÂ—i+l^nÂ—iÂ—l,nÂ—i+l Â‘ ' ' SnÂ—k+l,nÂ—i+l PnÂ—k+l,nÂ—i+l for i < A: and by the definition of Pa,b given in Chapter 1. In addition by Equation (A.6), t=l Ak+i,k+i = Â“7 ^nk kÂ—l = E(i) fc + l,n Â— Â»+l Z=1 kÂ—l \ik ^S, ^nÂ—k ^ *^Tl Â— fc + l,Tl Â— t+l I kjk Â— k^1,71 Â— fc + 1 On. 1 g ^n Â— fc+l,7i Â— i+l 1=1 +Ak,k 'S'nÂ— i,nÂ— 1 +lÂ‘S'nÂ— 1 Â— l,nÂ— 1+1 SnÂ—k+l,nÂ—i+l SnÂ—k,nÂ—k g *^nÂ— fc+l,n Â— fe + 1 ^nÂ—k (A.8) Also we observe that Â©n where the first equality is true by Equation (A. 2) and the second is true from Equation (A.l). Therefore ^1, 1 1 *S'n,n Â©n Similarly we obtain through derived Equation (A.8) C p Â— Sn,n A Â—A ^ Â•^2,2 ^11 c o o ^ P ~'^ n^n ,S ,S i 1 ^ 1,71Â— 1 and If we let ~SnÂ— l,n A3, 3 Â— Â‘S'n Â— 1 ,nÂ— 1 Â‘5'n Â— 2,n Â— 2 Â‘S'u Â— 1 ,n g *^n Â— fc+2,n C , , , Dnfc+2,nÂ’ *^nÂ— fc+l,nÂ— fc+l* (A.9)
92 where PÂ” was defined earlier in Chapter 1 as rij=nfc +2 ^nk+ 2 ,jIn addition, we observe from Equation (A. 8) that fc 2 1 '1 ^ C . , , , ^P Â— fc+2,n Â— i+l / pnk+2,nÂ—i+l fe+l,nÂ— fe+1 _c _ kÂ—1 p *^71 Â— fc+2,n fe+i } Snk+l,nk+l pni+2,npnk+2,ni+l Hence from equating Equations (A. 9) and (A. 10), the following equation holds. (A.IO) k I ^ ^ V(D* = = (1)*Â“^ / ^ ' pnÂ—k\ly i=0 (A.ll) Now we can derive Ajt+i,fc+i as follows: 1A:+1,A:+1 SnÂ—k,nÂ—k , ^1=1 gSÂ„_*+2,n SnÂ—k.nÂ—k f ^ / v '' ' pnÂ—i+2,'i pnÂ—k+l,nÂ—i+l 1 + + Afc,fce 5nfc+l.n* + l 1 i=l f> *^n Â— fc+2,n I { 1 1 + Q , , pnÂ—k+\,n pnÂ— fc+2,n pnÂ— fc+l,7jÂ— fc+1 C , , , , , , pnÂ—k+2, ^nÂ—k,nÂ—k ' Â•* '^nfc+l,nÂ— fc+l* ^Â“*^71Â— /c+2,n C , , pnk+\,n Â’ Â‘Jnk,nk^ by Equation (A.ll). Therefore, } (A.12) ^i,k Â— Aj^i ^71 Â— 1+1,71 Â— 1 + 1 ' p , , . , pni+2,n Â’ OiÂ— A:+l,nÂ— 1 + 1 * for i < k. Ik, as defined in Equation (A. 2), therefore can be written as ^ Â— 1+2,71 Â— 1+1 ^(^nÂ—fc ) Â— Â— (A.13) 4 = ^(irÂ‘i=l (A.14)
Set k = n, we obtain since Â‘n n 1 = V(iyi n = y(iyi ' pnÂ—i+2,n [e Â— Sn Â— i+2,n *S'l,nÂ— t+l^(^o) g i=l n [e Â“^n Â— *+2,n g '^n Â— t+2,n ^ g *^l,n i=l nÂ— 1 pnÂ—i+2,n p, . , * ^ l,nÂ— i+l 1=1 Â— <7 Â• I i nÂ— 1 _c, E ^ Â— On Â— 1+1,71 *Jl,n (1 )*Â— ^ y^(i)* Â— ^ Â— ' ' pnÂ—i+l,np ZÂ— /' ' pnÂ—i+l,n i=0 i=0 T Â— i g ~' 5 n t + l,n = y^(i)* 1 (1 )Â’ '' pni+l,np . ^ i=0 n pn pl,n P\,nÂ—i p 'S'nÂ— t+l,n E(i)Â‘5^ 1=0 pni+l,np^ Â’ nÂ— 1 E = (l) nr pi,n Â’ (A. Pi Â„_jP"*+iÂ’" 1=0 by Equation (A. 11). Hence the marginal likelihood function defined in Equation (1.14) can be derived since we observe that poo poo L Â„=/ Â•Â•/ L = n Â« f /Â„, JtoÂ—0 Jtnl j_l where L was defined in Equation (1.13).
APPENDIX B
PROOF OF LEMMA 4.1

The results presented in Lemma 4.1 hold for any general parametric model if it satisfies certain regularity conditions. In our proof, we show that the results also hold for any parametric model with independent censoring under certain regularity conditions.

Let T* = min(T, U) be the observed survival/censoring time and Δ be the censoring indicator. Denote the hazard function, the probability density function, and the survival function for the survival time T by h(T; θ), f(T; θ), and S(T; θ), respectively. Denote the survival function for the censoring time U by G_U(u). Then the likelihood function is written as

L(T*, Δ) = h^Δ(T*; θ) S(T*; θ).

Therefore the expectation of the score function S(θ) has the following expression:

E[S(θ)] = E[ (∂/∂θ) log L(T*, Δ) ]
        = E[ {(∂/∂θ)(h^Δ(T*; θ) S(T*; θ))} / {h^Δ(T*; θ) S(T*; θ)} · I(Δ = 1) ]
          + E[ {(∂/∂θ)(h^Δ(T*; θ) S(T*; θ))} / {h^Δ(T*; θ) S(T*; θ)} · I(Δ = 0) ].   (B.1)

We can write the first term on the right-hand side of Equation (B.1) as

E[ {∂f(T; θ)/∂θ} / f(T; θ) · I(T < U) ] = ∫_0^∞ {∂f(u; θ)/∂θ} / f(u; θ) · f(u; θ) G_U(u) du
  = (∂/∂θ) ∫_0^∞ f(u; θ) G_U(u) du = (∂/∂θ) P(T < U).
Similarly, the second term of the right-hand side of Equation (B.1) can be written as

E[ {∂S(U; θ)/∂θ} / S(U; θ) · I(T > U) ] = ∫_0^∞ {∂S(u; θ)/∂θ} / S(u; θ) · S(u; θ) d(1 − G_U(u))
  = (∂/∂θ) ∫_0^∞ S(u; θ) d(1 − G_U(u)) = (∂/∂θ) P(T > U).

Hence,

E[S(θ)] = (∂/∂θ) [P(T < U) + P(T > U)] = 0.

Now we need to prove V[S(θ)] = D. Let ∂²L/∂θ∂θ^T and (∂L/∂θ)(∂L/∂θ)^T represent the (p + IG) × (p + IG) dimensional matrices formed by the corresponding second-order derivatives. Note that

∂² log L / ∂θ∂θ^T = (1/L) ∂²L/∂θ∂θ^T − (1/L²) (∂L/∂θ)(∂L/∂θ)^T.

It is clear that V[S(θ)] = D(θ) if we can prove

E[ (1/L) ∂²L/∂θ∂θ^T ] = 0.

Note that we can expand E[(1/L) ∂²L/∂θ∂θ^T] as

E[ (1/L) ∂²L/∂θ∂θ^T ] = E[ (1/L)(∂²L/∂θ∂θ^T) I(Δ = 1) ] + E[ (1/L)(∂²L/∂θ∂θ^T) I(Δ = 0) ].   (B.2)

The first term on the right-hand side of Equation (B.2) can be written as

E[ {∂²f(T; θ)/∂θ∂θ^T} / f(T; θ) · I(T < U) ] = ∫_0^∞ {∂²f(u; θ)/∂θ∂θ^T} / f(u; θ) · f(u; θ) G_U(u) du
  = (∂²/∂θ∂θ^T) ∫_0^∞ f(u; θ) G_U(u) du = (∂²/∂θ∂θ^T) P(T < U).   (B.3)
Similarly, the second term of the right-hand side of Equation (B.2) is written as

E[ {∂²S(U; θ)/∂θ∂θ^T} / S(U; θ) · I(T > U) ] = ∫_0^∞ {∂²S(u; θ)/∂θ∂θ^T} / S(u; θ) · S(u; θ) d(1 − G_U(u))
  = (∂²/∂θ∂θ^T) ∫_0^∞ S(u; θ) d(1 − G_U(u)) = (∂²/∂θ∂θ^T) P(T > U).   (B.4)

The combination of Equations (B.3) and (B.4) shows that

E[ (1/L) ∂²L/∂θ∂θ^T ] = (∂²/∂θ∂θ^T) [P(T < U) + P(T > U)] = 0.
APPENDIX C
PROOF OF LEMMA 4.2

We will prove this lemma for l = 1. The proof for l = 2 is similar and is omitted. Note that

E[ ∫_0^T u^m h(u; λ', κ(j), J(u)) du ]
  = ∫_0^∞ { ∫_0^t u^m h(u; λ', κ(j), J(u)) du } d[S_{nj}(t; θ0) G_U(t)],   (C.1)

where S_{nj}(t; θ0) and G_U(t) are the survival functions of the survival time T_j and the censoring time U_j, respectively. Note that

∫_0^∞ { ∫_0^t u^m h(u; λ', κ(j), J(u)) du } d[S_{nj}(t; θ0) G_U(t)]
  = { S_{nj}(t; θ0) G_U(t) ∫_0^t u^m h(u; λ', κ(j), J(u)) du }|_0^∞
    + ∫_0^∞ S_{nj}(t; θ0) G_U(t) t^m h(t; λ', κ(j), J(t)) dt
  ≤ ∫_0^∞ S_{nj}(t; θ0) t^m h(t; λ', κ(j), J(t)) dt.   (C.2)

Let ε > 0 be small enough that ε ≤ |λ_{ig}^{(0)}|/2 whenever λ_{ig}^{(0)} ≠ 0 and ε < e^{−B||θ0||}; then, under the assumption that the individual belongs to group g,

∫_0^∞ S_{nj}(t; θ0) t^m h(t; λ', κ(j), J(t)) dt
  = ∫_0^{τ_{I−1}} S_{nj}(t; θ0) t^m h(t; λ', κ(j), J(t)) dt
  + ∫_{τ_{I−1}}^∞ exp{ −∫_0^t h(u; λ0, κ(j), J(u)) du } t^m e^{λ'_{Ig} t} dt.   (C.3)

It is observed that the first term in Equation (C.3) is uniformly bounded for λ' satisfying ||λ' − λ0|| < ε. The second term in Equation (C.3) can be written as

∫_{τ_{I−1}}^∞ exp{ −∫_0^t h(u; λ0, κ(j), J(u)) du } t^m e^{λ'_{Ig} t} dt
  ≤ ∫_{τ_{I−1}}^∞ exp{ −∫_{τ_{I−1}}^t e^{λ'_{Ig} u} du } t^m e^{λ'_{Ig} t} dt,   (C.4)

since ||x_j|| ≤ B for any j = 1, 2, ..., n by Assumption 1. We will discuss Equation (C.4) in the following three cases.

Case 1. λ_{Ig}^{(0)} < 0. ||λ' − λ0|| < ε implies λ'_{Ig} < λ_{Ig}^{(0)} + ε ≤ (1/2) λ_{Ig}^{(0)} < 0. Therefore,

∫_{τ_{I−1}}^∞ exp{ −∫_{τ_{I−1}}^t e^{λ'_{Ig} u} du } t^m e^{λ'_{Ig} t} dt ≤ e^{B||θ0||} ∫_{τ_{I−1}}^∞ t^m e^{(1/2) λ_{Ig}^{(0)} t} dt = Γ_1 < ∞.

Case 2. λ_{Ig}^{(0)} = 0. The following inequality holds:

∫_{τ_{I−1}}^∞ exp{ −∫_{τ_{I−1}}^t e^{λ'_{Ig} u} du } t^m e^{λ'_{Ig} t} dt ≤ ∫_{τ_{I−1}}^∞ exp{ −e^{−B||θ0||} (t − τ_{I−1}) } t^m e^{λ'_{Ig} t} dt.

Since ||λ' − λ0|| < ε implies that |λ'_{Ig}| = |λ'_{Ig} − λ_{Ig}^{(0)}| < ε, and ε < e^{−B||θ0||}, the right-hand side satisfies

∫_{τ_{I−1}}^∞ exp{ −e^{−B||θ0||} (t − τ_{I−1}) } t^m e^{λ'_{Ig} t} dt ≤ ∫_{τ_{I−1}}^∞ exp{ −e^{−B||θ0||} (t − τ_{I−1}) } t^m e^{εt} dt = Γ_2 < ∞.

Case 3. λ_{Ig}^{(0)} > 0. Equation (C.4) can be written as

∫_{τ_{I−1}}^∞ exp{ −(e^{λ'_{Ig} t} − e^{λ'_{Ig} τ_{I−1}})/λ'_{Ig} } t^m e^{λ'_{Ig} t} dt ≤ Γ_3 < ∞,
since λ'_{Ig} is uniformly bounded. Hence, there exists a Γ > 0 such that

E_{θ0} { ∫_0^{T_j} u^m h(u; λ', κ(j), J(u)) du } < Γ,

for λ' satisfying ||λ' − λ0|| < ε.
APPENDIX D
PROOF OF LEMMA 4.3

Note that for 0 < δ < ∞,

∫_0^∞ [ ∫_0^t u^m h²(u; λ0, κ(j), J(u)) du ] d[S_{nj}(t; θ0) G_U(t)]
  ≥ ∫_δ^∞ [ ∫_0^δ u^m h²(u; λ0, κ(j), J(u)) du ] d[S_{nj}(t; θ0) G_U(t)]
  = [ ∫_0^δ u^m h²(u; λ0, κ(j), J(u)) du ] ∫_δ^∞ d[S_{nj}(t; θ0) G_U(t)]
  = [ ∫_0^δ u^m h²(u; λ0, κ(j), J(u)) du ] [ S_{nj}(δ; θ0) G_U(δ) − S_{nj}(∞; θ0) G_U(∞) ].

We observe that

S_{nj}(δ; θ0) G_U(δ) − S_{nj}(∞; θ0) G_U(∞) ≥ S_{nj}(δ; θ0) [G_U(δ) − G_U(∞)]
  ≥ exp{ −e^{B||θ0||} ∫_0^δ h(u; λ0, κ(j), J(u)) du } [G_U(δ) − G_U(∞)]
  → 1 − G_U(∞), as δ → 0,

uniformly for j = 1, 2, ..., n. Therefore, we can find a δ0 > 0 and an ε > 0 such that

S_{nj}(δ0; θ0) G_U(δ0) − S_{nj}(∞; θ0) G_U(∞) > ε

and

∫_0^{δ0} u^m h²(u; λ0, κ(j), J(u)) du > ε.

Thus the expectation term in Equation (4.1) is greater than ε² for all j = 1, 2, ..., n. If we take γ = ε², the lemma is proved.
APPENDIX E
PROOF OF LEMMA 4.4

Let c^T = (a^T, b^T) be a (p + IG)-dimensional unit vector, where a is p-dimensional and b^T = (b_{11}, \ldots, b_{I1}, \ldots, b_{1G}, \ldots, b_{IG}) is IG-dimensional. It suffices to show that c^T D_n c \ge n\gamma for any unit vector c. We showed in Chapter 3 that

c^T D_n c = \sum_{j=1}^n E\int_0^{T_j^*} h(u;\lambda,\kappa(j),J(u))\big(a^T x_j + u\,b^T\Omega(u,j)\big)^2\,du.

By expanding the expectation in the above equation, we obtain

c^T D_n c = \sum_{j=1}^n \int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big(a^T x_j + u\,b^T\Omega(u,j)\big)^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big]
   = \sum_{g=1}^G \sum_{j \in \text{group } g} \int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big(a^T x_j + u\,b^T\Omega(u,j)\big)^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big].   (E.1)

Assume that the j-th individual belongs to the treatment group g; then

u\,b^T\Omega(u,j) = \sum_{i=1}^I b_{ig}\,u\,1\{u \in [\tau_{i-1}, \tau_i)\}.

Note that there exist a g_0 with 1 \le g_0 \le G and an i_0 with 1 \le i_0 \le I such that

|b_{i_0 g_0}| \ge \frac{\|b\|}{\sqrt{IG}}.   (E.2)

We will discuss the behavior of the right-hand side of Equation (E.1) in the following four cases.

Case 1. \|a\|^2 < \tau_1^2/\big(2(8B^2GI + \tau_1^2)\big), and the j-th individual belongs to the treatment group g_0. Because u \ge \tau_1 on the region of integration, \|x_j\| \le B, and \|b\|^2 = 1 - \|a\|^2, we observe from (E.2) that

\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2 \ge \frac{1}{2}\big[u\,b^T\Omega(u,j)\big]^2 - (a^T x_j)^2 \ge \frac{\tau_1^2(1-\|a\|^2)}{2IG} - B^2\|a\|^2 \ge \frac{\tau_1^2}{16IG},

where the last step uses the bound on \|a\|^2. Hence

\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big]
   \ge \frac{\tau_1^2}{16GI}\int_{\tau_1}^\infty \Big\{\int_{\tau_1}^t h(u;\lambda,\kappa(j),J(u))\,du\Big\}\,d\big[-S_j(t)G_U(t)\big] \ge \gamma_1 > 0,

by Assumption 4.

Case 2. Under the same bound on \|a\|^2, with the j-th individual in the treatment group g_0,

\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big]
   \ge \int_{\tau_{I-1}+\delta}^\infty \Big\{\int_{\tau_{I-1}}^{\tau_{I-1}+\delta} h(u;\lambda,\kappa(j),J(u))\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big].   (E.3)

Since u \ge \tau_{I-1} \ge \tau_1 in the above expression, with the additional conditions that |b_{Ig_0}| \ge \|b\|/\sqrt{IG} and \|x_j\| \le B for any j = 1, 2, \ldots, n, the same pointwise lower bound \tau_1^2/(16IG) as in Case 1 applies. Note that

S_j(\tau_{I-1}+\delta)\big[G_U(\tau_{I-1}+\delta) - G_U(\infty)\big] \to S_j(\tau_{I-1})\big[G_U(\tau_{I-1}) - G_U(\infty)\big]
   \ge \exp\Big\{-e^{B\|\beta_0\|}\int_0^{\tau_{I-1}} h(u;\lambda,\kappa(j),J(u))\,du\Big\}\big[G_U(\tau_{I-1}) - G_U(\infty)\big] \ge \varepsilon > 0,

as \delta \to 0, where Assumption 4 is applied. Therefore, there exists a \delta > 0 such that

\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big] \ge \gamma_2 > 0.

Case 3. Under the same bound on \|a\|^2, restricting the inner integral to the interval [\tau_{i_0-1}, \tau_{i_0}) on which b_{i_0 g_0} is active gives

\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2\,du\Big\}\,d\big[-S_j(t)G_U(t)\big]
   \ge \frac{1}{16GI}\int_{\tau_{i_0-1}}^{\tau_{i_0}} h(u;\lambda,\kappa(j),J(u))\,du\,\big[S_j(\tau_{i_0})G_U(\tau_{i_0}) - S_j(\infty)G_U(\infty)\big] = \gamma_3 > 0,

where Assumption 4 is used again.

By combining the above three cases, we obtain that c^T D_n c \ge n_{g_0}\gamma', where \gamma' = \min(\gamma_1, \gamma_2, \gamma_3) and n_{g_0} is the number of individuals in group g_0. Under Assumption 3 the following holds:

\min_{1\le g\le G} \frac{n_g}{n} \ge d_1,

when n is sufficiently large. Therefore, c^T D_n c \ge n\,d_1\gamma' = n\gamma, where \gamma = d_1\gamma'.

Case 4. \|a\|^2 \ge \tau_1^2/\big(2(8B^2GI + \tau_1^2)\big) = \varepsilon_1^2, say. Note that |u\,b^T\Omega(u,j)| \le |u|\sqrt{IG}. Therefore,

\big(a^T x_j + u\,b^T\Omega(u,j)\big)^2 \ge \frac{1}{2}(a^T x_j)^2 - \big[u\,b^T\Omega(u,j)\big]^2.

By Assumption 2, there exist an \varepsilon > 0, a \delta > 0, and an n_0 > 0 such that, when n > n_0, there exists a subset A of \{1, 2, \ldots, n\} with size n_A \ge n\delta such that

\frac{|a^T x_j|}{\|a\|} > \varepsilon for j \in A.   (E.4)

Now we assume n > n_0 and (E.4) is satisfied, so that |a^T x_j| > \varepsilon\varepsilon_1 for j \in A. We can then choose a \delta_1 > 0 such that, for u \in (0, \delta_1],

\big[a^T x_j + u\,b^T\Omega(u,j)\big]^2 \ge \frac{1}{2}(a^T x_j)^2 - IG\delta_1^2 \ge \frac{\varepsilon^2\varepsilon_1^2}{4}.

Hence

c^T D_n c \ge \sum_{j \in A}\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\,\frac{\varepsilon^2\varepsilon_1^2}{4}\,1\{u \in (0,\delta_1]\}\,du\Big\}\,d\big[-S_j(t)G_U(t)\big].

Without loss of generality, let us assume \delta_1 < \tau_1. For an individual in the treatment group g,

\int_0^\infty \Big\{\int_0^t h(u;\lambda,\kappa(j),J(u))\,\frac{\varepsilon^2\varepsilon_1^2}{4}\,1\{u \in (0,\delta_1]\}\,du\Big\}\,d\big[-S_j(t)G_U(t)\big]
   \ge \frac{\varepsilon^2\varepsilon_1^2}{4}\int_0^{\delta_1} h(u;\lambda,\kappa(j),J(u))\,du\,\big[S_j(\delta_1)G_U(\delta_1) - S_j(\infty)G_U(\infty)\big] \ge \gamma_g > 0,

where Assumption 4 is used. Note that \gamma_g depends on g. If we take \gamma_0 = \min_{1\le g\le G}\gamma_g > 0, then c^T D_n c \ge n_A\gamma_0 \ge n\,\delta\gamma_0, and the lemma holds in this case with \gamma = \delta\gamma_0.
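Lemma 4.4 asserts a uniform lower bound c^T D_n c \ge n\gamma over unit vectors c, which is exactly a lower bound on the smallest eigenvalue of D_n, since \lambda_{\min}(D) = \min_{\|c\|=1} c^T D c. That equivalence, which drives the whole case analysis above, can be checked numerically on a toy 2x2 matrix (my own example; it is not the D_n of the dissertation):

```python
import math

def min_quadratic_form(d11, d12, d22, steps=3600):
    """Minimize c^T D c over unit vectors c = (cos t, sin t) by grid
    search, for the symmetric 2x2 matrix D = [[d11, d12], [d12, d22]]."""
    best = float("inf")
    for k in range(steps):
        t = 2.0 * math.pi * k / steps
        a, b = math.cos(t), math.sin(t)
        best = min(best, d11 * a * a + 2.0 * d12 * a * b + d22 * b * b)
    return best

# For D = [[2, 1], [1, 2]] the eigenvalues are 1 and 3, so the minimum
# of the quadratic form over unit vectors should come out near 1.
print(min_quadratic_form(2.0, 1.0, 2.0))
```

For the 2x2 case the minimizer (1, -1)/sqrt(2) lies on the grid, so the search recovers the smallest eigenvalue essentially exactly.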
APPENDIX F
PROOF OF LEMMA 4.5

From Lemma 4.1, we observe that

E\big[c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c\big] = c^T D_n^{-1/2}[D_n - D_n]D_n^{-1/2}c = 0.

In addition we have

c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c = \sum_{j=1}^n \Big\{\int_0^{T_j^*} h(u;\lambda_0,\kappa(j),J(u))\,c^T D_n^{-1/2}\begin{pmatrix} x_j x_j^T & u\,x_j\Omega^T(u,j) \\ u\,\Omega(u,j)x_j^T & u^2\,\Omega(u,j)\Omega^T(u,j) \end{pmatrix} D_n^{-1/2}c\,du - E[\,\cdot\,]\Big\}.

We showed in Lemma 4.4 that \lambda_{\min}(D_n) \ge n\gamma. This implies that \lambda_{\max}(D_n^{-1/2}) \le 1/\sqrt{n\gamma}. Therefore,

c^T D_n^{-1/2}\begin{pmatrix} x_j \\ u\,\Omega(u,j) \end{pmatrix}\big(x_j^T \;\; u\,\Omega^T(u,j)\big)D_n^{-1/2}c = \Big[\big(x_j^T \;\; u\,\Omega^T(u,j)\big)D_n^{-1/2}c\Big]^2 \le \frac{B^2 + IGu^2}{n\gamma}.   (F.1)

Since T_j^*, j = 1, 2, \ldots, n, are independent,

\operatorname{Var}\big(c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c\big)
   \le \sum_{j=1}^n E\Big\{\int_0^{T_j^*} h(u;\lambda_0,\kappa(j),J(u))\,\frac{B^2 + IGu^2}{n\gamma}\,du\Big\}^2
   \le \frac{(B^2+IG)^2}{n^2\gamma^2}\,e^{2B\|\beta_0\|}\sum_{j=1}^n \Big\{\max_{m=0,2} E_{\theta_0}\int_0^{T_j^*} u^m h(u;\lambda_0,\kappa(j),J(u))\,du\Big\}^2
   \le \frac{e^{2B\|\beta_0\|}(B^2+IG)^2\,\Gamma^2}{n\gamma^2} = O(n^{-1}).

The last inequality holds by Lemma 4.2, where \Gamma is a constant depending on \lambda_0. Therefore, for any \varepsilon > 0,

P\big(\big|c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c\big| > \varepsilon\big) \le \frac{\operatorname{Var}\big(c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c\big)}{\varepsilon^2} \to 0,

as n \to \infty, by Chebyshev's inequality. This implies that c^T D_n^{-1/2}[F_n(\theta_0) - D_n]D_n^{-1/2}c \to 0 in probability.
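The final step is an instance of Chebyshev's inequality, P(|X - EX| > eps) <= Var(X)/eps^2, so a variance of order O(n^{-1}) forces the centered quadratic form to 0 in probability. A small Monte Carlo sketch of the inequality for a sample mean (the distribution, sample sizes, and seed are my own illustrative choices):

```python
import random

random.seed(0)

def chebyshev_check(n=100, eps=0.1, trials=5000):
    """Compare the empirical frequency of |mean of n Uniform(0,1) draws - 1/2| > eps
    with the Chebyshev bound Var(mean)/eps^2 = (1/12)/(n * eps^2)."""
    exceed = 0
    for _ in range(trials):
        m = sum(random.random() for _ in range(n)) / n
        if abs(m - 0.5) > eps:
            exceed += 1
    bound = (1.0 / 12.0) / (n * eps * eps)
    return exceed / trials, bound

freq, bound = chebyshev_check()
print(freq <= bound)
```

The bound is loose (the true tail here is far smaller), but looseness is irrelevant for the proof: any bound tending to 0 suffices.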
APPENDIX G
PROOF OF LEMMA 4.6

Let \theta^T = (\beta^T, \lambda^T) and \theta_0^T = (\beta_0^T, \lambda_0^T). Then

c^T D_n^{-1/2}[F_n(\theta) - F_n(\theta_0)]D_n^{-1/2}c = \sum_{j=1}^n \int_0^{T_j^*} \big[e^{x_j^T\beta}h(u;\lambda,\kappa(j),J(u)) - e^{x_j^T\beta_0}h(u;\lambda_0,\kappa(j),J(u))\big]\,c^T D_n^{-1/2}\begin{pmatrix} x_j \\ u\,\Omega(u,j) \end{pmatrix}\big(x_j^T \;\; u\,\Omega^T(u,j)\big)D_n^{-1/2}c\,du.   (G.1)

By Taylor's expansion,

e^{x_j^T\beta}h(u;\lambda,\kappa(j),J(u)) - e^{x_j^T\beta_0}h(u;\lambda_0,\kappa(j),J(u)) = \big[x_j^T(\beta-\beta_0) + u\,\mathbf{1}^T(\lambda-\lambda_0)\big]\,e^{x_j^T\beta_j^*}h(u;\lambda_j^*,\kappa(j),J(u)),

where \mathbf{1} is an IG-dimensional vector of ones and (\beta_j^{*T}, \lambda_j^{*T}) = \alpha_j(\beta^T, \lambda^T) + (1-\alpha_j)(\beta_0^T, \lambda_0^T) for some 0 \le \alpha_j \le 1. By Equation (4.3) and Assumption 1,

\big|x_j^T(\beta-\beta_0) + u\,\mathbf{1}^T(\lambda-\lambda_0)\big| \le \frac{B + IGu}{\sqrt{n\gamma}},

which, together with (F.1), implies that the absolute value of the integrand in (G.1) is at most

\frac{(B + IGu)(B^2 + IGu^2)}{n\gamma\sqrt{n\gamma}}\,e^{x_j^T\beta_j^*}h(u;\lambda_j^*,\kappa(j),J(u)).   (G.2)

By Lemma 4.2,

E\Big\{\sum_{j=1}^n \int_0^{T_j^*} \frac{(B + IGu)(B^2 + IGu^2)}{n\gamma\sqrt{n\gamma}}\,e^{x_j^T\beta_j^*}h(u;\lambda_j^*,\kappa(j),J(u))\,du\Big\}
   \le \frac{n(B+IG)(B^2+IG)\,e^{B\|\beta_0\|}}{n\gamma\sqrt{n\gamma}}\,\max_{0\le m\le 3}\,\max_{1\le j\le n}\,E\int_0^{T_j^*} u^m h(u;\lambda_j^*,\kappa(j),J(u))\,du \to 0,

as n \to \infty. Hence, the lemma is proved, since convergence in mean implies convergence in probability.
APPENDIX H
PROOF OF LEMMA 4.7

From Lemmas 4.5 and 4.6,

c^T D_n^{-1/2}\big[F_n(\theta) - D_n\big]D_n^{-1/2}c \to 0 in probability,   (H.1)

where c is a (p + IG)-dimensional unit vector. As a result of Equation (H.1), c^T D_n^{-1/2}F_n(\theta)D_n^{-1/2}c \to 1 in probability for every unit vector c, which implies that

\lim_{n\to\infty} \lambda_{\min}\big(D_n^{-1/2}F_n(\theta)D_n^{-1/2}\big) = \lim_{n\to\infty} \lambda_{\max}\big(D_n^{-1/2}F_n(\theta)D_n^{-1/2}\big) = 1

in probability. Therefore, D_n^{-1/2}F_n(\theta)D_n^{-1/2} \to I_{(p+IG)\times(p+IG)} in probability.
APPENDIX I
PROOF OF LEMMA 4.8

It suffices to show that for each k, k = 1, 2, \ldots, p + IG, E_{\theta_0}|S_{njk}(\theta)|^3 is uniformly bounded for j = 1, 2, \ldots, n, where S_{njk}(\theta) is the k-th component of the vector S_{nj}(\theta). From Equations (3.2) and (3.3), we observe that |S_{njk}(\theta)| is less than or equal to the maximum of the following two quantities:

B + \int_0^{T_j^*} h(u;\lambda_0,\kappa(j),J(u))\,du and T_j^* + \int_0^{T_j^*} h(u;\lambda_0,\kappa(j),J(u))\,du.

Note that T_j is the survival time of the j-th individual and T_j^* = \min(T_j, U_j), where U_j is the censoring time associated with the j-th individual. To show that E_{\theta_0}|S_{njk}|^3 is bounded uniformly for j, it suffices to show that E[T_j^3] is uniformly bounded, since we have shown the uniform boundedness of E_{\theta_0}\int_0^{T_j^*} u^m h(u;\lambda_0,\kappa(j),J(u))\,du in Lemma 4.2. Assume that the j-th individual belongs to the treatment group g; then

E[T_j^3] = \int_0^\infty t^3 h(t;\lambda_0,\kappa(j),J(t))\,e^{x_j^T\beta_0}\,S_{nj}(t;\theta_0)\,dt
   \le e^{B\|\beta_0\|}\int_0^{\tau_{I-1}} t^3 h(t;\lambda_0,\kappa(j),J(t))\,dt + e^{B\|\beta_0\|}\int_{\tau_{I-1}}^\infty t^3 h(t;\lambda_0,\kappa(j),J(t))\,S_{nj}(t;\theta_0)\,dt.   (I.1)

Note that the first term in Equation (I.1) is uniformly bounded for any j. The second term is the same as in Equation (C.4), with the parameters replaced by their true values. Hence, the uniform boundedness is proved.
APPENDIX J
SIMPSON'S FORMULA

Suppose we want to approximate an integral numerically by dividing the integration interval into N_0 subintervals, where N_0 is even. Assume a is the lower bound of the integral, b is the upper bound of the integral, and the integral we are interested in is

\int_a^b f(c)\,dc,   (J.1)

then we divide the interval (a, b) into N_0 subintervals with the end points of the intervals represented as x_0, x_1, \ldots, x_{N_0}, where x_0 = a, x_{N_0} = b, x_j = a + jh, and h = (b-a)/N_0. The numerical approximation of (J.1) is then written as

\int_a^b f(c)\,dc \approx \frac{h}{3}\Big[f(x_0) + f(x_{N_0}) + 2\sum_{j=1}^{N_0/2 - 1} f(x_{2j}) + 4\sum_{j=0}^{N_0/2 - 1} f(x_{2j+1})\Big].
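The composite rule above translates directly into code. A minimal sketch (the function name `simpson` and its argument names are mine), which is exact for polynomials of degree up to three:

```python
def simpson(f, a, b, n0):
    """Composite Simpson approximation of the integral of f over [a, b]
    using n0 subintervals of width h = (b - a) / n0; n0 must be even."""
    if n0 % 2 != 0:
        raise ValueError("n0 must be even")
    h = (b - a) / n0
    total = f(a) + f(b)
    # interior even-indexed points get weight 2
    total += 2.0 * sum(f(a + 2 * j * h) for j in range(1, n0 // 2))
    # odd-indexed (midpoint) points get weight 4
    total += 4.0 * sum(f(a + (2 * j + 1) * h) for j in range(n0 // 2))
    return total * h / 3.0

# The integral of x^2 over [0, 1] is exactly 1/3.
print(simpson(lambda x: x * x, 0.0, 1.0, 10))  # close to 0.3333333333333333
```

Since Simpson's rule integrates cubics exactly, even a coarse grid returns 1/3 up to floating-point rounding; for general smooth integrands the error shrinks like h^4.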
BIOGRAPHICAL SKETCH

Haoyi Chen was born on November 29, 1972, in Jinan, People's Republic of China. She received her Bachelor of Economics degree from Shanghai University of Finance and Economics in July 1993, majoring in statistics. After two years of industrial experience in business statistics in China, she joined the graduate program in the Department of Statistics at the University of Florida. She received her Master of Statistics degree in January 1998. She is expected to receive her PhD degree in December 2001.
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Myron N. Chang, Chairman
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Randolph L. Carter
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Ronald H. Randles
Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Samuel S. Wu
Assistant Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Michael B. Resnick
Professor of Pediatrics
This dissertation was submitted to the Graduate Faculty of the Department of Statistics in the College of Liberal Arts and Sciences and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December 2001

Dean, Graduate School

