Citation
Piecewise Gompertz model on solving cure rate problem

Material Information

Title:
Piecewise Gompertz model on solving cure rate problem
Creator:
Chen, Haoyi, 1972-
Publication Date:
Language:
English
Physical Description:
viii, 118 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Censorship ( jstor )
Goodbyes ( jstor )
Mathematical independent variables ( jstor )
Multilevel models ( jstor )
Parametric models ( jstor )
Proportions ( jstor )
Reliability functions ( jstor )
Standardized tests ( jstor )
Statistical models ( jstor )
Statistics ( jstor )
Dissertations, Academic -- Statistics -- UF ( lcsh )
Statistics thesis, Ph. D ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 2001.
Bibliography:
Includes bibliographical references (leaves 115-117).
General Note:
Printout.
General Note:
Vita.
Statement of Responsibility:
by Haoyi Chen.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
028196784 ( ALEPH )
49281155 ( OCLC )



Full Text









PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM


By

HAOYI CHEN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2001































Copyright 2001

by

Haoyi Chen















I dedicate this work to Hua.














ACKNOWLEDGMENTS


I would like to take this opportunity to express my deepest gratitude to Professor Myron N. Chang, my advisor, for his guidance, tremendous help, and encouragement during my doctoral research. This dissertation would not have been possible without his forceful guidance and excellent advisement.

I also would like to thank Professor Randolph R. Carter, my graduate

research supervisor, for his guidance and support during the various epidemiological research projects with which I have been involved as a graduate research assistant. My deepest appreciation is extended to the other committee members, Dr. Ronald Randles, Dr. Samuel S. Wu, and Dr. Michael B. Resnick, for their advice on my doctoral research.

Lastly, I would like to extend my gratitude to my husband Hua and my parents for their never-ending support, understanding, and love.














TABLE OF CONTENTS

page
ACKNOWLEDGMENTS ............................. iv

ABSTRACT . . .... .. .... .. ... .. .. .... . .... ... . .. vii

CHAPTERS

1 INTRODUCTION .............................. 1

1.1 Problem and Common Practice ................... 1
1.1.1 M ixture M odel ......................... 2
1.1.2 Binary Regression Model ................... 6
1.1.3 Nonparametric Method .................... 7
1.1.4 Model with Bounded Cumulative Hazard Rate ......... 10
1.2 Computation Difficulty of the Semiparametric Model ......... 17
1.3 Piecewise Exponential Model ..................... 20

2 TOPIC OF RESEARCH ........................... 25

2.1 Data Structure and Model ............................ 26
2.2 Assumptions ..................................... 28
2.3 Hypothesis of Interest ............................... 29
2.4 Test of Hypothesis ....... .......................... 29

3 ESTIMATION OF PARAMETERS .......................... 31

3.1 Likelihood Function and Information Matrices .............. 31
3.2 Concavity of Likelihood Function ....................... 32

4 PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES . .. 35

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Preliminary Proofs .......................... 36
4.3 Existence and Consistency of the MLE ............... 37
4.4 Asymptotic Normality of the MLE ..................... 40
4.5 Goodness-of-Fit Test ...... ......................... 42

5 SIMULATIONS ....................................... 47

5.1 Introduction ..................................... 47
5.2 Generation of Random Samples ........................ 47
5.2.1 Parameter Settings ...... ...................... 48








5.2.2 Generation of Censored Observations ................ 50
5.2.3 Generation of Failure Observations .................. 56
5.3 Type I Error and Power ...... ....................... 57

6 EXAMPLE ......... .................................. 76

6.1 Introduction ..................................... 76
6.2 Methodology ..................................... 77
6.3 Results ......................................... 80
6.3.1 Pairwise Comparisons on Cure Proportions ............ 80
6.3.2 Cure Proportion Estimates ....................... 82
6.3.3 Goodness-of-Fit .............................. 82

7 SUMMARY ........ .................................. 88

APPENDICES

A DERIVATION OF MARGINAL LIKELIHOOD .................. 89

B PROOF OF LEMMA 4.1 ....... .......................... 94

C PROOF OF LEMMA 4.2 ....... .......................... 97

D PROOF OF LEMMA 4.3 ...... .......................... 100

E PROOF OF LEMMA 4.4 ...... .......................... 101

F PROOF OF LEMMA 4.5 .......................... 107

G PROOF OF LEMMA 4.6 .......................... 109

H PROOF OF LEMMA 4.7 .......................... 112

I PROOF OF LEMMA 4.8 .......................... 113

J SIMPSON'S FORMULA ........................... 114

REFERENCES ................................... 115

BIOGRAPHICAL SKETCH ............................ 118














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM

By

Haoyi Chen

December 2001

Chair: Myron N. Chang
Major Department: Statistics

In cancer research, some of the patients are disease-free after certain treatments. It is of interest to compare treatment efficacy in terms of long-term survival rates. One commonly used approach in analyzing this type of data is to compare the Kaplan-Meier estimates of the cure rate. Another approach is to apply the mixture model proposed by Farewell. However, the Kaplan-Meier estimates are unstable toward the end point, while the mixture model is computationally too complex.

To overcome these difficulties, we propose a test based on a piecewise

Gompertz model to compare drug efficacy on the cure rate. The proposed test also accommodates the situation where patients display different hazard patterns during different treatment stages. In this work, we have derived the strict concavity of the log-likelihood function and the existence, consistency, and asymptotic normality of the maximum likelihood estimates of the parameters. In addition, our Monte Carlo simulation study shows that the proposed test is more computationally feasible and more powerful than the test based on Farewell's mixture model. Moreover, an








example is given to show the utility of our proposed test. A goodness-of-fit test of our proposed model is also discussed.














CHAPTER 1
INTRODUCTION


1.1 Problem and Common Practice

In survival analysis, one is interested in comparing the efficiency of various drugs or treatments. The survival probability is generally assumed to converge to zero when the observation time becomes large. However, it is not uncommon in cancer research that a proportion of patients are disease-free (relapse-free) after certain treatments. These patients are called "long-term survivors" or "cured." For instance, in clinical trials dealing with childhood leukemia, some treatments do cure a proportion of patients permanently. This type of long-term survival pattern is different from what is dealt with under the traditional survival analysis. In addition, it is sometimes necessary to assume that hazard functions are different on different intervals. A piecewise exponential model proposed by Friedman (1981) accommodates this assumption. This model was considered semiparametric by Aitkin et al. (1989, PP. 287-292). They stated that the piecewise exponential model has the same flexibility as the proportional hazards model in which the baseline hazard function is unspecified. In this chapter, we first introduce the common methodologies in the literature that deal with cure rate problems. We also discuss the piecewise exponential model proposed by Friedman which assumes different hazard functions on different time intervals.

In the literature, there are several methods dealing with the long-term survival problem. They are as follows:
* Using a mixture model which assumes the underlying population is a mixture of
two populations, patients who are cured and patients who are not cured.








" Applying a modified binary regression model in which the main interest is to
compare the cure rates among patient groups.

" Using nonparametric methods to compare the survival curves or to compare cure
proportions among treatment arms.

" Assuming a bounded cumulative hazard rate instead of a cumulative hazard rate
with an infinite limit.

In the next four sections, we describe each method in more detail. Assume that there are n patients in the study. For the ith patient, let T_i denote the time to relapse (incidence), U_i the censoring time, T_i^* = min(T_i, U_i), and Δ_i = I[T_i ≤ U_i].
Let Z be a binary variable showing that a patient belongs to either the cured population or the noncured population: Z = 1 if the patient is noncured, and Z = 0 if the patient is cured.

Note that Z is not fully observable in the presence of censoring. Clearly, if the incidence is observed, then Δ = 1 and Z = 1; if the time to incidence is censored, then Δ = 0 and Z is not observable. Assume that the probability density function for the time to incidence is f^*(t | Z = 1) for the noncured population. Let S^*(t | Z = 1) be the corresponding survival function. If δ_i = 1, then the contribution of the ith patient to the likelihood is f^*(t_i | Z_i = 1) P(Z_i = 1). If δ_i = 0, then the contribution of the patient to the likelihood is

P(Z_i = 0) + P(Z_i = 1) S^*(t_i | Z_i = 1).     (1.1)








Therefore, the overall likelihood function is

L = \prod_{i=1}^{n} \{P(Z_i = 1) f^*(t_i^* | Z_i = 1)\}^{\delta_i} \{P(Z_i = 0) + P(Z_i = 1) S^*(t_i^* | Z_i = 1)\}^{1-\delta_i}.     (1.2)

Two influential articles addressing this method were written by Farewell in 1977 and 1982. In the first article, he expressed the binary variable Z using a logistic regression model

P(Z = 1) = \frac{e^{\beta^T x}}{1 + e^{\beta^T x}},     (1.3)

where β is a vector of coefficients associated with the vector of covariates x. Farewell (1977) also assumed the time to incidence T follows an exponential distribution with mean 1/λ, and λ is assumed to be known. Thus, when δ_i = 1, the contribution of the ith patient to the likelihood is

f^*(T = t_i; λ | Z_i = 1) P(Z_i = 1) = \lambda e^{-\lambda t_i} \frac{e^{\beta^T x_i}}{1 + e^{\beta^T x_i}}, \quad t_i > 0.

When δ_i = 0, the contribution of the ith patient to the likelihood is

P(Z_i = 0) + P(Z_i = 1) S^*(T_i = t_i; λ | Z_i = 1) = \frac{1}{1 + e^{\beta^T x_i}} + \frac{e^{\beta^T x_i}}{1 + e^{\beta^T x_i}} e^{-\lambda t_i}, \quad t_i > 0.

Type I censoring was assumed in Farewell (1977). The author also assessed the efficiency of the logistic-exponential mixture model proposed in his paper relative to a regular logistic model, which assumes Z = 0 if the patient does not have a relapse until T_0. He defined T_0 as the end point of the observation time. He found that the efficiency of the mixture model increases as the follow-up time T_0 increases and as P(Z = 1) decreases. When the true underlying distribution is a mixture model, the estimate of the cure rate from the regular logistic model is biased.
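As a concrete illustration of how the mixture likelihood in Equations (1.2)-(1.3) is evaluated, the following is a minimal Python sketch of the negative log-likelihood for Farewell's (1977) logistic-exponential mixture. The toy data, the single-covariate setup, and the use of scipy.optimize.minimize are illustrative assumptions, not material from the dissertation.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, t, delta, x):
    """Negative log-likelihood of the logistic-exponential mixture, Eqs. (1.2)-(1.3).

    params = (beta0, beta1, log_lambda); x is a single covariate.
    """
    b0, b1, log_lam = params
    lam = np.exp(log_lam)                      # exponential rate for the noncured
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))   # P(Z = 1), Eq. (1.3)
    # delta = 1: noncured and failed   -> p * lam * exp(-lam * t)
    # delta = 0: cured, or noncured and censored -> (1 - p) + p * exp(-lam * t)
    log_f = np.log(p) + np.log(lam) - lam * t
    log_s = np.log((1.0 - p) + p * np.exp(-lam * t))
    return -np.sum(delta * log_f + (1.0 - delta) * log_s)

# toy data (hypothetical): observed times, failure indicators, one covariate
rng = np.random.default_rng(0)
n = 200
x = rng.binomial(1, 0.5, n).astype(float)
t = rng.exponential(1.0, n)
delta = rng.binomial(1, 0.6, n).astype(float)

fit = minimize(neg_log_lik, x0=np.zeros(3), args=(t, delta, x), method="BFGS")
print(fit.x)   # (beta0, beta1, log lambda) estimates
```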

Farewell (1982) proposed two more general models for the time to incidence T. He assumed that T follows a Weibull distribution in his first model. The probability density function and the survival function of T given that the patient is








noncured are written as

f^*(T = t; λ | Z = 1, x) = \zeta\lambda(\lambda t)^{\zeta-1} \exp\{-(\lambda t)^{\zeta}\}, \quad t > 0,     (1.4)

and

S^*(T = t | Z = 1, x) = \exp\{-(\lambda t)^{\zeta}\}, \quad t > 0,     (1.5)

where ζ is an unknown parameter, λ = exp(-γ^T x) is a function of covariates, and γ is a vector of unknown parameters associated with covariates x. The likelihood function is obtained by plugging P(Z = 1) defined in Equation (1.3), the probability density function defined in Equation (1.4), and the survival function defined in Equation (1.5) into the likelihood function in Equation (1.2).

Farewell (1982) proposed the second model for grouped data on survival time. Assume there are k failure time groups (t_i, t_{i+1}], i = 1, 2, ..., k; the conditional probability for T = t_i given that T ≥ t_i is

P(T = t_i | Z = 1, T ≥ t_i, x) = \frac{e^{\alpha_i + \theta^T x}}{1 + e^{\alpha_i + \theta^T x}},     (1.6)

where α_i (i = 1, 2, ..., k) are group-specific parameters. The vector θ is associated with covariates x and is common to all time groups. By combining Equations (1.2), (1.3), and (1.6), the likelihood function is obtained.

Farewell (1982) commented that the mixture model is appealing from both biological and statistical points of view. It has been used frequently in cancer studies, especially in the study of breast cancer. Strong evidence of the existence of individuals coming from different populations, however, is required before using the mixture model. Farewell also pointed out that the likelihood function might be flat for some data sets, which would cause possible nonidentifiability of the parameters.
Farewell (1986) discussed some problems related to the mixture model by applying the model to two data sets. The first data set was from a breast cancer trial with 139 patients. He applied a logistic-Weibull mixture model to the data. Then he computed the profile likelihood function for β_0, the intercept parameter








in the logistic portion of the mixture model. The profile likelihood function for β_0 was obtained by replacing the other parameters in the model by their maximum likelihood estimates (MLE). He found that the profile likelihood function had a strongly nonquadratic shape, indicating that significant nonidentifiability existed with respect to the estimation of β_0. Therefore, this logistic-Weibull mixture model presented an unreliable estimate of the cure proportion. Farewell also applied a multinomial-Weibull mixture model to a data set from Risch (1983) on the time to the onset of a major affective illness in siblings of affected individuals. Risch's data were grouped into six time intervals. The onset probabilities were modelled by a multinomial distribution. The mixture model fitted the data well. Therefore, Farewell concluded that one should be cautious when using mixture models. The cure proportion might be very imprecise in some cases, and strong evidence for the existence of two populations is required to make a reliable inference.

A significant amount of literature discussed the mixture model. The common parametric distributions for the time to incidence for the noncured population can take many forms. For example, they can be exponential (Berkson & Gage 1952, Bithell & Upton 1977, Farewell 1977, Goldman 1984); lognormal (Boag 1949); Weibull (Farewell 1982); Gompertz (Gordon 1990); Generalized Gamma (Yamaguchi 1992); and Generalized F-distribution (Peng et al. 1998). The distribution for the time to incidence can also take nonparametric or semiparametric forms, such as Kaplan-Meier (KM) estimates (Laska & Meisner 1992, Taylor 1995) and the proportional hazards model (Kuk & Chen 1992). The probability P(Z = 1) can be treated as a parameter in a Bernoulli trial. It can also be written as a function of a vector of covariates x in a logistic regression model, such as the model defined in Equation (1.3).
To conclude the previous discussion, it is generally plausible to use the mixture model when there exists strong scientific evidence of the mixture of








different populations. However, as both Farewell (1977, 1982, & 1986) and Cantor and Shuster (1992) pointed out, the likelihood function is sometimes nonquadratic or flat, which could produce problems for both computation and inference. In addition, parameter estimates could fall out of the parameter space when the data do not support a nonzero surviving fraction. The asymptotic properties of the MLE from a logistic-parametric mixture model were developed systematically by Maller and Zhou (1996). The asymptotic properties related to the model parameter estimates in both logistic-semiparametric and logistic-nonparametric models have not been explored in the literature.

1.1.2 Binary Regression Model

Jung (1996) proposed an interesting method to estimate the cure rate,

which is defined to be the probability that the time to incidence T is greater than a fixed time τ. In his paper, the standard procedure for inference on parameters in a binary regression model is modified to accommodate censored observations. Let φ(π) = β^T x, where φ is a link function which is monotonic and differentiable on [0, 1]. Also let π^* = φ^{-1}, the inverse function of φ. Let \bar F = 1 - π^* and \dot\pi^*(θ) = ∂π^*(θ)/∂θ. The MLE of β in a standard binary regression model without censoring is obtained by solving

n^{-1/2} \sum_{i=1}^{n} \{I(t_i > \tau) - \pi^*(\beta^T x_i)\} \frac{\dot\pi^*(\beta^T x_i)}{\pi^*(\beta^T x_i)\,\bar F(\beta^T x_i)} x_i = 0.

In the case with censoring, denote T_i^* = min(T_i, U_i). Assume independent censoring, i.e., T_i and U_i are independent given x_i for i = 1, 2, ..., n. Also assume that U_1, U_2, ..., U_n are independently, identically distributed (iid) with a cumulative distribution function 1 - G. Let \hat G be the KM estimate of G. Then the MLE \hat\beta_{MLE} of β can be obtained by solving

n^{-1/2} \sum_{i=1}^{n} \left\{\frac{I(T_i^* > \tau)}{\hat G(\tau)} - \pi^*(\beta^T x_i)\right\} \frac{\dot\pi^*(\beta^T x_i)}{\pi^*(\beta^T x_i)\,\bar F(\beta^T x_i)} x_i = 0.








Numerically, \hat\beta_{MLE} can be obtained by applying the Newton-Raphson method. Under mild conditions, it is proved that \hat\beta_{MLE} is a unique and consistent estimate of β and n^{1/2}(\hat\beta_{MLE} - β) follows an asymptotic normal distribution with a zero mean. The problem with this method is that τ is very difficult to define. The result is not convincing if τ is too small. On the other hand, it is not cost effective for τ to be too long.

Farewell (1977) compared the efficiency of a logistic-exponential model to a binary logistic regression model that treats any individual with t^* > T_0 as a long-term survivor. He also pointed out that the estimate of the cure rate would be biased using a logistic model when the underlying true distribution is a mixture distribution. The logistic model discussed by Farewell (1977) did not consider censoring. However, he emphasized that a similar conclusion can be reached by expanding the model to include censored individuals. Farewell also noted that it is especially unreasonable for the logistic model to ignore the possible censoring times when T^* exceeds the largest observation time T_0.

1.1.3 Nonparametric Method

Nonparametric tests are widely applied for both comparing survival curves with the presence of cure and comparing cure proportions between treatment arms. When the main interest is to compare survival curves of several populations with the presence of cure, some of the nonparametric tests are considered standard, such as the log-rank test and the generalized Wilcoxon test. Halpern and Brown (1987) studied the power of these two tests by simulations when the mixture cure rate model by Goldman (1984) is appropriate. They compared the power of these two tests for detecting the difference between two treatment arms for selected values of cure rates, instant failure rates, and follow-up time. Each of the two treatment arms had 50 participating individuals. They used 200 replications for each set of simulation. The simulation results are somewhat inconclusive. The log-rank test is








more powerful than the Wilcoxon test in general. However, the log-rank test does not show any advantage over the Wilcoxon test in the following situations: 1) cure rates are high and are at similar levels in both arms; 2) there is a large difference in the time to the end point for both arms; and 3) the follow-up is long relative to the time to failure.

In the comparison of cure proportions, Sposto et al. (1992) studied several tests based on simulated samples. These tests are the likelihood ratio, the Wald, the score tests based on two different mixture models, and a test based on the differences of the KM estimates of the cure rates. The two mixture models are logistic-exponential and logistic-Weibull models proposed by Farewell (1977 & 1982). The time point in the KM estimate of the cure rate is defined as the last observed failure time. Simulated samples under assumed mixture models are generated in order to check the results in an ideal case. In addition, samples under some different models are generated in order to study the robustness of the tests to model misspecification. The authors also generated samples based on various levels of cure proportions. The test statistics calculated from the mixture models sometimes can not be obtained because of the nonconvergence during the maximization procedure. The authors pointed out two reasons for the nonconvergence: 1) the maximization is obtained on the boundary; and 2) the algorithm terminates abnormally with a jump to a nonnegative definite region. Test statistics with convergence problem are not presented in the article. In conclusion, the authors noted that the KM estimate of the cure rate is a reasonable alternative to the parametric estimate in general. When the underlying model includes unequal failure rates between treatment arms, the proportion test based on KM estimates is more powerful than the test based on the complicated parametric model. However, the KM estimate introduces more bias than the parametric








estimate of the cure rate when the sample is small, when failure rates are equal, and when not all failures are observed.

Gray and Tsiatis (1989) proposed a linear rank test to compare survival curves between two treatment arms over the entire time span, but their test focused power against the later difference on survival curves. This linear rank test is evaluated on a mixture distribution of a proportion representing the cure and a failure distribution with an unspecified form. They assumed that failure distribution functions for both treatment arms are constant over time. They specified a weight function in a two-sample non-parametric censored test proposed by Tarone and Ware (1977) in order to obtain their own test. Note that the logrank test also can be derived from Tarone and Ware's test with a different weight function. However, the difference between Gray and Tsiatis' test and the log-rank test is that the former put more weight towards the later difference and the latter put equal weight throughout the entire observation period.

Gray and Tsiatis (1989) calculated relative efficiencies under equal censoring for their test, the log-rank test, and the proportion test based on KM estimates of cure rates. When there is no censoring, their test gains efficiency over the log-rank test for a small cure rate. However, the efficiency gain is less for a cure rate greater than 50%. These results can be explained by observing that the weight functions become closer for their test and for the log-rank test as the cure rate increases. In addition, under the noncensoring assumption, their test is asymptotically equivalent to the proportion test. When there is censoring and when the censoring proportion increases, their test is increasingly more powerful than the proportion test. But note that their test is not a pure test of the equality of the cure rates. Therefore, it has power against alternatives with unequal cure rates and alternatives with equal cure rates and unequal survival curves.








In this section, two types of statistical tests are discussed: 1) comparing survival curves under the assumption of the existence of cure rates; and 2) comparing cure rates. Some of the nonparametric tests, such as the log-rank test, are quite standard in oncology for comparing survival curves. When the problem narrows down to the comparison of cure rates, the proportion test based on KM estimates is quite popular due to the computational difficulties related to the test based on the mixture model structure. However, there are problems associated with KM estimates for cure rates, as pointed out by Cantor and Shuster (1992). The standard error of the KM estimate increases in the right-hand extreme because of the small sample size at the end of the study. The KM estimate sometimes can produce counterintuitive results. For example, the time-to-relapse curve plateau could be higher than the time-to-death curve plateau because of the large downward steps at the end of the study. Therefore, one should be cautious relying on the KM estimate of the cure rate. As an alternative to the KM estimate of the cure rate, Cantor and Shuster proposed a parametric model with a bounded cumulative hazard rate to accommodate the cure proportion.
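For reference, the KM-based cure rate estimate described above (the value of the Kaplan-Meier curve at the last observed failure time) can be computed in a few lines. The sketch below is a minimal numpy implementation that assumes no tied event times; it is not taken from the dissertation.

```python
import numpy as np

def km_cure_estimate(t, delta):
    """Kaplan-Meier curve evaluated at the last observed failure time,
    i.e., the plateau used as a nonparametric estimate of the cure proportion."""
    order = np.argsort(t)
    t, delta = np.asarray(t)[order], np.asarray(delta)[order]
    n = len(t)
    surv, plateau = 1.0, 1.0
    for i in range(n):
        if delta[i] == 1:
            surv *= 1.0 - 1.0 / (n - i)   # n - i subjects still at risk at t[i]
            plateau = surv
    return plateau

# example: the last 30% of subjects never fail, so the KM curve plateaus at 0.3
print(km_cure_estimate([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                       [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]))
```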

1.1.4 Model with Bounded Cumulative Hazard Rate

Parametric Model

Cantor and Shuster (1992) proposed a Gompertz model to incorporate long-term survivors. The survival function is written as:

S(t) = \exp[\beta^{-1}\alpha(1 - e^{\beta t})], \quad \alpha > 0,     (1.7)

where α is the initial hazard. When β > 0, the survival function S(t) → 0 as t → ∞; and when β < 0, the survival function S(t) → exp(α/β) as t → ∞. The cure rate is presented as exp(α/β). The authors then maximized the likelihood function based on the Gompertz model using the Newton-Raphson method. Both Var(\hat\alpha) and Var(\hat\beta) are obtained from the observed information matrix, and the variance of the cure rate exp(\hat\alpha/\hat\beta) is obtained using the Delta method. In the article, they also compared the Gompertz model with the mixture model. When there is a cure proportion, both models produce similar curves and cure rate estimates. However, the Gompertz model is preferred due to the divergence problem associated with the mixture model. The authors also noted the instability of the KM estimate toward the end. Some other nice properties of the proposed Gompertz model are also provided in the article. For example, the cure probability given that the patient has survived through time t_0 is exp(\beta^{-1}\alpha e^{\beta t_0}); the log-likelihood function is strictly concave, so one can easily estimate the parameters and the cure rate; and finally, any positive power S(t)^c is also a Gompertz survival function when S(t) is a Gompertz survival function.
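To make the Gompertz machinery concrete, here is a minimal Python sketch of the log-likelihood implied by Equation (1.7) (hazard α e^{βt}) together with the cure rate exp(α/β) when β < 0. The simulated data, the log-α parameterization, and the optimizer choice are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def gompertz_neg_loglik(params, t, delta):
    """Negative log-likelihood for the Gompertz model of Equation (1.7):
    hazard alpha*exp(beta*t); alpha is kept positive via a log parameterization."""
    log_alpha, beta = params
    alpha = np.exp(log_alpha)
    cumhaz = alpha * t if abs(beta) < 1e-10 else alpha / beta * np.expm1(beta * t)
    return -np.sum(delta * (log_alpha + beta * t) - cumhaz)

# simulate data with a true cure fraction exp(alpha0/beta0) = exp(-2)
rng = np.random.default_rng(1)
alpha0, beta0, n = 1.0, -0.5, 400
u = rng.uniform(size=n)
t_fail = np.full(n, np.inf)
mask = u > np.exp(alpha0 / beta0)                  # noncured subjects
t_fail[mask] = np.log(1.0 - beta0 / alpha0 * np.log(u[mask])) / beta0
cens = rng.uniform(0.0, 10.0, size=n)
t_obs = np.minimum(t_fail, cens)
delta = (t_fail <= cens).astype(float)

fit = minimize(gompertz_neg_loglik, x0=np.zeros(2), args=(t_obs, delta), method="Nelder-Mead")
log_alpha_hat, beta_hat = fit.x
print("estimated cure rate:", np.exp(np.exp(log_alpha_hat) / beta_hat))
```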

Gieser et al. (1998) incorporated covariate information into the model

proposed by Cantor & Shuster (1992). The treatment-specific survival function for an individual is written as:

S(t; i, x) = \{S_i(t)\}^{e^{\gamma^T x}} = \exp\{e^{\gamma^T x + \alpha_i^*}\beta_i^{-1}(1 - e^{\beta_i t})\},     (1.8)

where α_i^* is defined as log α_i for α in Equation (1.7), i is the treatment index, x is a vector of covariates, and γ is a vector of parameters associated with x. Note that S(t; i, x) in Equation (1.8) is also a Gompertz survival function, as noted by Cantor and Shuster (1992). The cure rate in this case is π(i, x) = S(∞; i, x) = exp\{e^{\gamma^T x + \alpha_i^*}/\beta_i\}. The obtained log-likelihood function is concave with respect to the parameters. The MLEs of the parameters are efficient and asymptotically normal when the assumed model is correct and when the sample size is large. The authors compared the treatment effects on cure rates for any given covariate vector x. The ratio of the








log of the cure rates for two different treatment arms is expressed as

\theta_{ii'} = \frac{\log \pi(i, x)}{\log \pi(i', x)} = \frac{\beta_{i'} e^{\alpha_i^*}}{\beta_i e^{\alpha_{i'}^*}}.

The null hypothesis on testing the equality of cure rates is written as

H_0: \log \theta_{ii'} = 0,

and the test statistic is expressed as

Z = [\hat\alpha_i^* + \log(-\hat\beta_{i'}) - \log(-\hat\beta_i) - \hat\alpha_{i'}^*]/SE,

where the standard error SE in the denominator is obtained by the Delta method.

Gieser et al. (1998) applied their model to 763 patients from a multicenter prospective study performed in 1981 by the Pediatric Oncology Group. This study analyzed standard-risk acute lymphocytic leukemia in children. They compared two treatment effects on the cure rate. Children's age and sex are considered as covariates. The Newton-Raphson algorithm is used to yield MLEs of the parameters. The algorithm converges after nine iterations. Based on this study, they compared the estimates and the variances of the cure rates based on the Gompertz model, the logistic-exponential mixture model, and the KM method. They found that the Gompertz model produces a similar estimate of the cure rate to the KM estimate but with much smaller variance. They also provided a generalized model in which the treatment indicator is included as one of the covariates. One important advantage of this generalized model is the log-linear structure of the parameters, since the log-linear structure implies the concavity of the log-likelihood function and the existence, the uniqueness, and the asymptotic normality of the parameter estimates.








Semiparametric Model

Chen et al. (1999) proposed a semiparametric model for survival data with a cure fraction. They assumed that there are two variables: a) the number of carcinogenic cells N^* in the body of a patient; and b) the incubation time W_i for the ith carcinogenic cell (i = 1, 2, ..., N^*). The number of carcinogenic cells N^* = 0 implies that there is no such cell in the patient's body. Assume that N^* follows a Poisson distribution with mean θ and assume that W_i for i = 1, 2, ..., N^* are iid with distribution function F(w) and density f(w). The W_i and N^* are independent. Therefore, the time to incidence is T = min(W_i, 0 ≤ i ≤ N^*). Note that P(W_0 = ∞) = 1. Then the survival function for T is presented as


S(t) = P(no incidence by time t)
     = P(N^* = 0) + P(T > t, N^* > 0)
     = P(N^* = 0) + P(W_1 > t, W_2 > t, ..., W_{N^*} > t, N^* > 0)
     = e^{-\theta} + \sum_{k=1}^{\infty} [1 - F(t)]^k \frac{\theta^k}{k!} e^{-\theta}
     = e^{-\theta} + e^{-\theta}[e^{\theta(1 - F(t))} - 1]
     = e^{-\theta F(t)}.     (1.9)

Note that S(t) is an improper survival function because S(t) → e^{-θ} > 0 as t → ∞. The survival fraction e^{-θ} is indeed the probability that there exists no carcinogenic cell in a patient's body. The corresponding probability density function and hazard function of T are

f_T(t) = \theta f(t) \exp(-\theta F(t)), \qquad \lambda(t) = \theta f(t).

Let S^*(t) denote the survival function for the noncured proportion of patients; then

S^*(t) = P(T > t | N^* \ge 1) = \frac{e^{-\theta F(t)} - e^{-\theta}}{1 - e^{-\theta}}.     (1.10)








The corresponding probability density function is

f^*(t) = -\frac{d}{dt}\left[\frac{e^{-\theta F(t)} - e^{-\theta}}{1 - e^{-\theta}}\right] = \frac{\theta f(t) e^{-\theta F(t)}}{1 - e^{-\theta}},

and the hazard function is

\lambda^*(t) = \frac{f^*(t)}{S^*(t)} = \frac{\theta f(t) e^{-\theta F(t)}}{e^{-\theta F(t)} - e^{-\theta}} = \frac{\lambda(t)}{P(T < \infty \mid T > t)}.

The likelihood function is derived as follows:

L = \prod_{i=1}^{n} f_T(t_i^*)^{\delta_i} S(t_i^*)^{1-\delta_i}
  = \prod_{i=1}^{n} \lambda(t_i^*)^{\delta_i} S(t_i^*)
  = \prod_{i=1}^{n} [\theta f(t_i^*)]^{\delta_i} e^{-\theta F(t_i^*)}.     (1.11)

The model is semiparametric in the sense that the form of F(t) is unknown. It has a strong biological meaning by gathering contributions from two characteristics of tumor growth, the initial number of carcinogenic cells and the rate of their growth. Even without any biological meaning, the model still fits any data with a survival fraction. It can always be treated as being generated by an unknown number N^* of latent competing risks. In this model, S(t), f_T(t), and λ(t) are improper, while S^*(t), f^*(t), and λ^*(t) are proper survival, density, and hazard functions, respectively. One advantage of this model over the mixture model is that the overall hazard function λ(t) has a proportional hazards structure if the covariates enter the model through θ. In the mixture model, however, the overall hazard function does not have a proportional hazards structure if the covariates enter the model through the cure rate π. The hazard function λ^*(t) for the failure proportion is λ(t) magnified by the factor 1/P(T < ∞ | T > t) ≥ 1. As t goes to infinity, λ^*(t) converges to λ(t). In addition, the authors pointed out the close relationship between the model presented in Equation (1.9)








and the mixture model presented in Equation (1.1) by observing that S(t) defined in Equation (1.9) can also be written as


S(t) = e^{-\theta} + S^*(t)(1 - e^{-\theta}).

If we let P(Z_i = 0) in Equation (1.1) take the value e^{-θ} and let S^*(t | Z_i = 1) take the form of S^*(t) defined in Equation (1.10), then S(t) has the same form as in Equation (1.1). The model in Equation (1.1) also corresponds to some model of the form in Equation (1.9) for some θ and F(t). Note that F(t) is assumed to have a Weibull form for convenience in the article. The parameter θ is a function of covariates in the form of θ = exp(β^T x).
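The parametric special case just described (θ = exp(β^T x) with a Weibull F, as assumed by Chen et al. for convenience) has the simple likelihood (1.11). The following short Python sketch writes out that negative log-likelihood; the parameterization and the scipy Weibull routines are chosen here purely for illustration.

```python
import numpy as np
from scipy.stats import weibull_min

def neg_loglik_bch(params, t, delta, x):
    """Negative log-likelihood of Equation (1.11) with theta = exp(b0 + b1*x)
    and F a Weibull distribution (shape k, scale s); cure rate = exp(-theta)."""
    b0, b1, log_k, log_s = params
    k, s = np.exp(log_k), np.exp(log_s)
    theta = np.exp(b0 + b1 * np.asarray(x))
    F = weibull_min.cdf(t, k, scale=s)          # F(t)
    logf = weibull_min.logpdf(t, k, scale=s)    # log f(t)
    # the implied cure rate at covariate x is exp(-exp(b0 + b1*x))
    return -np.sum(delta * (np.log(theta) + logf) - theta * F)

# example call on toy data
val = neg_loglik_bch([0.5, -0.3, 0.0, 0.0],
                     t=np.array([1.0, 2.0]), delta=np.array([1, 0]), x=np.array([0.0, 1.0]))
print(val)
```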

Tsodikov (1998) proposed the same semiparametric model from a different angle. As above, let θ be a function of covariates, denoted by θ(x, β). Assume that n is the number of failure observations and θ_i^d is the parameter of interest for the ith failure observation. Let m_i be the number of censored observations in the time interval (t_i, t_{i+1}), where t_i is the failure time associated with the ith failure observation. Let t_{i1}, t_{i2}, ..., t_{im_i} be the censoring times for the observations censored within the interval (t_i, t_{i+1}). Let θ_{ij}^c (i = 1, 2, ..., n; j = 1, 2, ..., m_i) be the parameters of interest for the censored observations. Also let Θ_i be the sum of the θs for all individuals on the interval [t_i, t_{i+1}), and let

S_{a,b} = \sum_{k=a}^{b} \Theta_k,

P_{a,b} = S_{a,b} S_{a+1,b} \cdots S_{b,b}, and

p_{a,b} = S_{a,b} S_{a,b-1} \cdots S_{a,a},

for 0 ≤ a ≤ b ≤ n. Note that an empty sum is taken to be 0 and an empty product to be 1 when a > b. The form of F(t) is completely unspecified.








In order to draw inference on the cure proportion with the presence of an unknown form of F(t), the author applied the partial likelihood, the marginal likelihood, and the profile likelihood function of θ. The partial likelihood function is written in the form

L_p = \frac{\prod_{i=1}^{n} \theta_i^d}{S_{1,n} S_{2,n} \cdots S_{n,n}}.

The partial likelihood is valid due to the proportional hazards structure. It provides consistent estimates. However, it can only be used to estimate the ratio of θ and θ_b, the baseline value. It is therefore inconclusive for the estimation of θ_b. When there is no covariate, i.e., when all patients are homogeneous, the cure rate cannot be estimated.

Both the marginal and the profile likelihood are derived from the full likelihood function, which is written as

L = \prod_{i=1}^{n} \left\{ \theta_i^d f(t_i) e^{-\theta_i^d F(t_i)} \prod_{j=1}^{m_i} e^{-\theta_{ij}^c F(t_{ij})} \, \Delta Q(t_{ij}) \right\},     (1.12)

where Q(t) is the survival function for the censoring time. A Type I censoring mechanism with a fixed end point t_c is assumed throughout the article. Then Q is a single-step function such that Q(t_i) = 1 for i = 1, 2, ..., n, ΔQ(t_c) = 1, and t_{nj} = t_c for j = 1, 2, ..., m_n. The marginal likelihood function L_m is then derived from Equation (1.12); it depends on the parameters only through the θs and the quantities S_{a,b}, P_{a,b}, and p_{a,b} defined above.

The profile likelihood function is derived from Equation (1.12) by maximizing it with respect to F. The nonparametric maximum likelihood estimator (NPMLE) is used to estimate F.

Both Chen et al. (1999) and Tsodikov (1998) proposed the same semiparametric model, while using different estimation techniques. The former assumed independent and noninformative censoring as well as Weibull distribution for F(t) in the model. Therefore, their model is a parametric model although F(t) can take








a nonparametric form. The latter assumed a Type I censoring mechanism, which is restrictive.

Both the parametric and the semiparametric models with a bounded cumulative hazard rate are appealing because of the simplicity and intuitive meaning of these models. However, there are limitations for the more flexible semiparametric model. It is difficult in terms of computation. In addition, the asymptotic properties of this model have not been explored in the literature. In the next section, we will reexamine the semiparametric model proposed by Chen et al. (1999) and Tsodikov (1998). In this examination, we make no parametric assumption for F(t). In addition, we do not limit the censoring to Type I mechanism. By using the simulated data, we show the computational barrier raised by this model.


1.2 Computation Difficulty of the Semiparametric Model

We assume random and noninformative censoring. The notation of Tsodikov (1998) is used in this section, except that some changes are made in the definitions of P_{a,b} and p_{a,b}. We define them as follows:

P_{a,b} = S_{a,b} S_{a+1,b} \cdots S_{b,b}, and

p_{a,b} = S_{a,b} S_{a,b-1} \cdots S_{a,a}.

The likelihood function takes the following form,

L = \prod_{i=1}^{n}\left\{\theta_i^d f(t_i) e^{-\theta_i^d F(t_i)} \prod_{j=1}^{m_i} e^{-\theta_{ij}^c F(t_{ij})}\right\},     (1.13)

where f(t_i) is the density function corresponding to F(t_i). Define

I_i^* = \frac{e^{-S_{n-i+1,n}}}{(-1)^i p_{n-i+1,n} P_{1,n-i}}

and

J_i^* = S'_{n-i+1,n} + \sum_{k=n-i+1}^{n} \frac{S'_{n-i+1,k}}{S_{n-i+1,k}} + \sum_{k=1}^{n-i} \frac{S'_{k,n-i}}{S_{k,n-i}}.

Define I_n = \sum_{i=0}^{n} I_i^*, J_n = \sum_{i=0}^{n} J_i^*, and S'_{j,k} = \partial S_{j,k}/\partial\beta for any j, k = 1, 2, ..., n.

Let L_m denote the marginal likelihood function. As derived in Appendix A,

L_m = \prod_{i=1}^{n} \theta_i^d \, I_n.     (1.14)

Without loss of generality, assume that there is only one covariate x in the study, such as the treatment indicator. Also assume that θ is a function of the covariate x and is expressed in an exponential form as exp(α + βx). The first and the second derivatives of S_{i,j} with respect to β are denoted by S'_{i,j} and S''_{i,j}. They are

S'_{i,j} = \frac{\partial S_{i,j}}{\partial\beta} = \sum_{k=i}^{j} \sum_{l} x_{kl}\,\theta_{kl}, \qquad S''_{i,j} = \frac{\partial^2 S_{i,j}}{\partial\beta^2} = \sum_{k=i}^{j} \sum_{l} x_{kl}^2\,\theta_{kl},

where the inner sums run over the individuals contributing to Θ_k, and x_{kl} and θ_{kl} denote the covariate value and the θ parameter of such an individual.

The score function for α is obtained as

U(\alpha) = -\frac{\sum_{i=0}^{n} I_i^* S_{n-i+1,n}}{I_n},

and the score function for β is written as

U(\beta) = \sum_{i=1}^{n} x_i - \frac{\sum_{i=0}^{n} I_i^* J_i^*}{I_n}.

The Hessian matrix is calculated and written as

\begin{pmatrix} \partial^2\log(L_m)/\partial\alpha^2 & \partial^2\log(L_m)/\partial\alpha\partial\beta \\ \partial^2\log(L_m)/\partial\alpha\partial\beta & \partial^2\log(L_m)/\partial\beta^2 \end{pmatrix},

where

\partial^2\log(L_m)/\partial\alpha^2 = \frac{\sum_{i=0}^{n} I_i^* S_{n-i+1,n}(S_{n-i+1,n} - 1)}{I_n} - \left(\frac{\sum_{i=0}^{n} I_i^* S_{n-i+1,n}}{I_n}\right)^2,

\partial^2\log(L_m)/\partial\alpha\partial\beta = \frac{\sum_{i=0}^{n} I_i^*(J_i^* S_{n-i+1,n} - S'_{n-i+1,n})}{I_n} - \frac{\sum_{i=0}^{n} I_i^* S_{n-i+1,n}}{I_n}\cdot\frac{\sum_{j=0}^{n} I_j^* J_j^*}{I_n},

and

\partial^2\log(L_m)/\partial\beta^2 = \frac{\sum_{i=0}^{n} I_i^*\{(J_i^*)^2 - J_i^{*\prime}\}}{I_n} - \left(\frac{\sum_{i=0}^{n} I_i^* J_i^*}{I_n}\right)^2,

where J_i^{*\prime} is the first derivative of J_i^* with respect to β and is expressed as

J_i^{*\prime} = S''_{n-i+1,n} + \sum_{k=n-i+1}^{n} \frac{S_{n-i+1,k}S''_{n-i+1,k} - (S'_{n-i+1,k})^2}{S_{n-i+1,k}^2} + \sum_{k=1}^{n-i} \frac{S_{k,n-i}S''_{k,n-i} - (S'_{k,n-i})^2}{S_{k,n-i}^2}.

Suppose that the covariate x is the treatment indicator, i.e., x = 1 if the patient is in the treatment group and 0 if the patient is in the control group. We are now interested in testing the hypothesis:

H_0: \beta = 0.     (1.15)

The score test (Rao, 1973) is used to test the hypothesis in Equation (1.15). The score statistic for β under H_0 is denoted by U(\beta)|_{H_0}, and the asymptotic variance of the score statistic is -\partial^2\log(L_m)/\partial\beta^2 under the null hypothesis. If α is assumed known, then the standardized test statistic is

T = \frac{U(\beta)|_{H_0}}{\sqrt{-\partial^2\log(L_m)/\partial\beta^2\,|_{H_0}}}.     (1.16)

In order to explore the feasibility of the computation, we first generate 10000 samples under the assumption that F is an exponential distribution with a mean of 1. Then the survival function is exp(-θF) = exp(-(1 - e^{-t})e^{α+βx}). Assume α = 1.5 and β = 0. Each random sample contains 30 observations in the control group and 30 observations in the treatment group. The censoring proportion is around 50%. The standardized score statistic for β is calculated for all random samples generated. The plot of the 10000 standardized score statistics is presented in Figure (1-1). In this bar plot, we observe that the score statistic behaves well and is close to a normal shape given that α is known.
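For readers who want to reproduce this kind of experiment, the following Python sketch draws censored samples from the survival function exp{-(1 - e^{-t}) e^{α+βx}} by inverse transform. The exponential censoring distribution and the random seed are illustrative assumptions; the dissertation only states that the censoring proportion is around 50%.

```python
import numpy as np

def sample_times(theta, rng):
    """Draw failure times from S(t) = exp(-theta * (1 - exp(-t))) by inversion.
    With probability exp(-theta) the subject is cured (time = +inf)."""
    u = rng.uniform(size=theta.shape)
    t = np.full(theta.shape, np.inf)
    mask = u > np.exp(-theta)                 # noncured draws
    t[mask] = -np.log1p(np.log(u[mask]) / theta[mask])
    return t

rng = np.random.default_rng(2)
alpha, beta = 1.5, 0.0
x = np.repeat([0.0, 1.0], 30)                 # 30 controls, 30 treated
theta = np.exp(alpha + beta * x)
t_fail = sample_times(theta, rng)
cens = rng.exponential(1.0, size=60)          # illustrative independent censoring
t_obs = np.minimum(t_fail, cens)
delta = (t_fail <= cens).astype(int)
```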








The problem becomes difficult to handle, however, when α is unknown. It is common to use a profile likelihood method to handle the nuisance parameter α when our primary parameter of interest is β. The profile marginal likelihood function for β is obtained from the marginal likelihood function by replacing α by its MLE under H_0. The test statistic based on the profile marginal likelihood function of β is similar to the one defined in Equation (1.16), given that some adjustment to the denominator of Equation (1.16) is required. Therefore, our primary task is to solve for the MLE of α under H_0. By using the same parameter setting as for Figure (1-1), we generate a sample with 30 patients in the treatment group and 30 in the control group. We show in Figure (1-2) the plot of the log of the marginal likelihood function as a function of α on the interval [-2, 6] when β is valued at 0. The marginal likelihood for α under H_0 is poorly behaved. In order to further investigate the behavior of the marginal likelihood function in α and β, we show in Figure (1-3) a three-dimensional plot of the log marginal likelihood function, based on the same random sample, as a function of α on the interval [-2, 2] and β on the interval [-3, 3]. The poorly behaved log marginal likelihood as a function of α and β explains the behavior of the log marginal likelihood function in Figure (1-2).

1.3 Piecewise Exponential Model

As we mentioned earlier in the chapter, it is sometimes necessary to assume different hazard functions for the time to failure on different time intervals. Friedman (1981) proposed a piecewise exponential model to handle this problem. Let the time scale be divided into I(n) intervals (0, T_1], (T_1, T_2], ..., (T_{I(n)-1}, T_{I(n)}]. The hazard rate of each individual is assumed to be constant over any given interval. Denote the hazard rate of the jth individual in the ith interval by λ_{ij}. Assume λ_{ij} > 0 for all (i, j). Let l_{ij} = log λ_{ij} and let t_{ij} be the survival time of the jth individual in the ith interval. Define t_{ij} as








jth individual in the ith interval. Define tij as

t_{ij} = \max\{0, \min(T_i - T_{i-1}, t_j - T_{i-1})\}.     (1.17)

Let

I_{ij} = 1 if the jth individual fails in [T_{i-1}, T_i), and I_{ij} = 0 otherwise.

Then the log-likelihood function is

\log L(l) = \sum_{i,j} I_{ij} l_{ij} - \sum_{i,j} t_{ij} \exp(l_{ij}).     (1.18)

One of the forms l_{ij} can take is l_{ij} = \alpha_i + \sum_k \beta_k x_{kj}.

The author proved the existence and the uniqueness of the MLE of the vector l. The consistency and the asymptotic properties of the MLE were also explored. Karrison (1987) also considered a piecewise exponential model in his study.
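The two quantities that drive the log-likelihood (1.18) — the interval exposures t_ij of Equation (1.17) and the failure indicators I_ij — are straightforward to compute. The sketch below is a minimal Python version for a common (covariate-free) hazard per interval; the regression form l_ij = α_i + Σ_k β_k x_kj is deliberately omitted, so this is an illustration rather than Friedman's full model.

```python
import numpy as np

def interval_exposure(t_obs, cuts):
    """Exposure time t_ij of each subject in each interval (Equation 1.17).
    cuts = [T_1, ..., T_I] with T_0 = 0; interval i is (T_{i-1}, T_i]."""
    cuts = np.asarray(cuts, dtype=float)
    lower = np.concatenate(([0.0], cuts[:-1]))
    width = cuts - lower
    return np.clip(np.asarray(t_obs)[:, None] - lower[None, :], 0.0, width[None, :])

def piecewise_exp_loglik(log_lam, t_obs, delta, cuts):
    """Log-likelihood of Equation (1.18), one common log-hazard l_i per interval."""
    cuts = np.asarray(cuts, dtype=float)
    t_obs, delta, log_lam = map(np.asarray, (t_obs, delta, log_lam))
    expo = interval_exposure(t_obs, cuts)                               # n x I matrix of t_ij
    fail_int = np.minimum(np.searchsorted(cuts, t_obs), len(cuts) - 1)  # interval of each failure
    return float(np.sum(delta * log_lam[fail_int]) - np.sum(expo * np.exp(log_lam)[None, :]))

# toy check: two intervals cut at 1.0 and 3.0
print(piecewise_exp_loglik([-0.5, -1.5], t_obs=[0.4, 1.7, 2.5], delta=[1, 0, 1], cuts=[1.0, 3.0]))
```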

























Figure 1-1. Standardized score under H_0 (β = 0) for 10000 random samples.





























































Figure 1-2. Log marginal likelihood function of α when β = 0.


Figure 1-3. Log marginal likelihood function of α and β.














CHAPTER 2
TOPIC OF RESEARCH

The model we propose in this dissertation is a piecewise Gompertz model. It incorporates the cure proportion by assuming a bounded cumulative hazard rate. Moreover, our model allows the survival times on different time intervals to have different hazard functions. As noted by Aitkin et al. (1989), a piecewise exponential model has the same flexibility as a proportional hazards model under the assumption that each interval in the piecewise model is defined by one failure observation. The choice of a Gompertz model over an exponential model is made because of the possible plateau of the survival curve from a Gompertz model. We would speculate that under a piecewise Gompertz model, the more intervals we define, the more flexible the model is. Eventually, when there is only one failure observation on each interval, the flexibility of this model would be the same as that of the semiparametric model we discussed in Chapter 1. Therefore, our model preserves the flexibility of a semiparametric model to some extent. The degree of flexibility depends on the number of intervals we define for our model. Meanwhile, the computation for our model is much easier to handle than for a semiparametric model because of our model's parametric structure. However, when the number of intervals in our model increases, the number of individuals on each interval decreases. We would anticipate a large variance for the estimate of the cure rate, which is similar to the situation we discussed earlier for the KM estimate of the cure rate. An appropriate number of intervals needs to be assessed in order to reach maximum flexibility and computational feasibility of the model. In this chapter, we describe our model in detail. In Chapters 3 and 4, asymptotic








properties of the MLEs of parameters in the model are proved. In addition, a goodness-of-fit test is derived which can be applied to determine the appropriate number of time intervals in the model. In Chapter 5, a Monte-Carlo simulation study is carried out to show the advantages of our model. Chapter 6 gives a real example and demonstrates the application of our model to the example.

2.1 Data Structure and Model

Assume there are n patients in the study. Consider the jth patient, who is in treatment group g and is associated with a vector of covariates x_j, where x_j is a p-dimensional vector. Let T_j and U_j be the survival time and the censoring time of the jth patient, respectively. The observed data on the jth patient are (t_j^*, δ_j, x_j), where t_j^* is the observation on

T_j^* = \min(T_j, U_j),

and δ_j is the observation on

\Delta_j = I(T_j \le U_j).

Assume there are G treatment groups and that the interval [0, ∞) is divided into I intervals [τ_{i-1}, τ_i), i = 1, 2, ..., I, where τ_0 = 0 and τ_I = ∞. Let θ^T = (β^T, Λ^T), where β is p-dimensional and Λ = (Λ_1^T, Λ_2^T, ..., Λ_G^T)^T. Note that Λ_g^T = (λ_{1g}, λ_{2g}, ..., λ_{Ig}). Clearly, θ is a (p + IG)-dimensional vector. It is assumed that the hazard function for the jth patient in treatment group g has the form

e^{\lambda_{ig} t} e^{\beta^T x_j}, \quad t \in [\tau_{i-1}, \tau_i).     (2.1)


Define an indicator function ig(j), where Kg 1 if j h patient is in group g;

otherwise.








Let κ(j) be the vector of κ_g(j) for g = 1, 2, ..., G, i.e., κ(j) = (κ_1(j), κ_2(j), ..., κ_G(j))^T.


Let us also define

J_i(t) = 1 if t ∈ [τ_{i-1}, τ_i), and J_i(t) = 0 otherwise,

for i = 1, 2, ..., I, and J(t) = (J_1(t), J_2(t), ..., J_I(t))^T.

Let

Q(t, j) = κ(j) ⊗ J(t) = (κ_1(j)J(t)^T, κ_2(j)J(t)^T, ..., κ_G(j)J(t)^T)^T.

Note that Q(t, j) is a GI-dimensional vector. The hazard function in Equation (2.1) can also be written as

h(t; Λ, κ(j), J(t)) e^{\beta^T x_j} = e^{\lambda_{ig} t} e^{\beta^T x_j}, \quad \text{if } κ_g(j) = 1 \text{ and } J_i(t) = 1.

The cumulative hazard function H_j(t; θ), the survival function S_j(t; θ), and the density function f_j(t; θ) can be derived as follows:

H_j(t; θ) = e^{\beta^T x_j} \int_0^t h(u; Λ, κ(j), J(u))\,du,

S_j(t; θ) = \exp\left\{-e^{\beta^T x_j} \int_0^t h(u; Λ, κ(j), J(u))\,du\right\},

f_j(t; θ) = h(t; Λ, κ(j), J(t))\,e^{\beta^T x_j}\,S_j(t; θ).

The likelihood function contributed by (t_j^*, δ_j, x_j) is expressed as

L_{nj} = [h(t_j^*; Λ, κ(j), J(t_j^*))\,e^{\beta^T x_j}]^{\delta_j}\,S_j(t_j^*; θ).








The corresponding log-likelihood function is written as

l_{nj} = \log L_{nj} = \delta_j[\log(h(t_j^*; Λ, κ(j), J(t_j^*))) + \beta^T x_j] - e^{\beta^T x_j} \int_0^{t_j^*} h(u; Λ, κ(j), J(u))\,du.     (2.2)

Consequently, the overall likelihood function is defined as

L_n = \prod_{j=1}^{n} L_{nj},

and the corresponding log-likelihood function is

l_n = \sum_{j=1}^{n} l_{nj}.
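To illustrate how the quantities in Equations (2.1)-(2.2) are evaluated, the following Python sketch computes the baseline cumulative hazard and a single subject's log-likelihood contribution for one treatment group. It is a simplified illustration (it takes the group's λ vector directly rather than working through the indicators κ_g(j) and Q(t, j)) and is not code from the dissertation.

```python
import numpy as np

def cum_hazard_piecewise_gompertz(t, lam, taus):
    """Baseline cumulative hazard int_0^t exp(lam_i * u) du under Equation (2.1),
    for one treatment group; taus = [tau_1, ..., tau_{I-1}] are the interior cut points."""
    edges = np.concatenate(([0.0], taus, [np.inf]))
    H = 0.0
    for i, l in enumerate(lam):
        a, b = edges[i], min(edges[i + 1], t)
        if b <= a:
            break
        H += (np.exp(l * b) - np.exp(l * a)) / l if l != 0.0 else b - a
    return H

def log_lik_subject(beta, lam_g, taus, t_obs, delta, x):
    """Log-likelihood contribution l_nj of Equation (2.2) for a subject in group g."""
    lin = float(np.dot(beta, x))
    edges = np.concatenate(([0.0], taus, [np.inf]))
    i = np.searchsorted(edges, t_obs, side="right") - 1   # interval containing t_obs
    log_haz = lam_g[i] * t_obs + lin
    return delta * log_haz - np.exp(lin) * cum_hazard_piecewise_gompertz(t_obs, lam_g, taus)

# example: two intervals split at tau_1 = 2.0, group hazard exponents (0.1, -0.4)
print(log_lik_subject(beta=[0.3], lam_g=np.array([0.1, -0.4]), taus=[2.0],
                      t_obs=3.5, delta=1, x=[1.0]))
```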

2.2 Assumptions

Our derivation of asymptotic properties is based on the following four assumptions:

Assumption 1. There is a B > 0 such that ||x_j|| ≤ B for j = 1, 2, ..., n.

Assumption 2. There exist c > 0, δ > 0, and an integer n_0 such that when n > n_0, there exists a subset A of the individuals with n_A/n ≥ δ, where n_A is the size of the subset A, such that for any unit vector a,

|a^T x_j| ≥ c for j ∈ A.

Assumption 3. Let n_g be the number of individuals in treatment group g, where g = 1, 2, ..., G. There exists a δ > 0 such that min_{1≤g≤G} n_g/n ≥ δ when n is sufficiently large.

Assumption 4. The censoring variables U_1, U_2, ..., U_n are iid with a survival function G_U(u). Assume that there exists an ε > 0 such that

G_U(τ_{I-1}) - G_U(∞) ≥ ε.








The first assumption assumes that all covariates are uniformly bounded. It is reasonable in practice because the majority of the covariates in a cancer study are patients' characteristics, which are bounded uniformly. Note that we are primarily interested in time-independent covariates. The second assumption assumes that at least a set A of patients' covariates are dispersed to some extent. Assumption 3 means that the number of observations in each treatment group must increase proportionally to the sample size. Assumption 4 assumes that not all the censoring observations occur very early on the time span.

2.3 Hypothesis of Interest

We are interested in comparing cure rates among treatment groups. Let π_g(x) be the cure rate of patients who are in treatment group g and who are associated with covariate vector x. Considering the case with two treatment groups, the null hypothesis is

H_0: π_1(x) - π_2(x) = 0,     (2.3)

for any covariate x. The test statistic will be derived in the following section.

2.4 Test of Hypothesis

The cure rate corresponding to patients in treatment group g who are associated with a specific setting of covariates x is

π_g(x) = S(∞; θ) = \exp\left\{-e^{\beta^T x} \int_0^{\infty} h(u; Λ, κ(j), J(u))\,du\right\}
       = \exp\left\{e^{\beta^T x}\left[\sum_{i=1}^{I-1} \frac{e^{\lambda_{ig}\tau_{i-1}} - e^{\lambda_{ig}\tau_i}}{\lambda_{ig}} + \frac{e^{\lambda_{Ig}\tau_{I-1}}}{\lambda_{Ig}}\right]\right\}.     (2.4)

The hypothesis defined in Equation (2.3) is equivalent to

H_0: \sum_{i=1}^{I-1} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} + \frac{e^{\lambda_{I1}\tau_{I-1}}}{\lambda_{I1}} = \sum_{i=1}^{I-1} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}} + \frac{e^{\lambda_{I2}\tau_{I-1}}}{\lambda_{I2}}.







Let the MLEs of θ = (Λ^T, β^T)^T be denoted by \hatθ = (\hatΛ^T, \hatβ^T)^T. The test statistic is derived as

T = \sum_{i=1}^{I-1} \frac{e^{\hat\lambda_{i1}\tau_{i-1}} - e^{\hat\lambda_{i1}\tau_i}}{\hat\lambda_{i1}} + \frac{e^{\hat\lambda_{I1}\tau_{I-1}}}{\hat\lambda_{I1}} - \sum_{i=1}^{I-1} \frac{e^{\hat\lambda_{i2}\tau_{i-1}} - e^{\hat\lambda_{i2}\tau_i}}{\hat\lambda_{i2}} - \frac{e^{\hat\lambda_{I2}\tau_{I-1}}}{\hat\lambda_{I2}}.     (2.5)

The variance of the above test statistic can be derived by the Delta method. The test statistic given in Equation (2.5) is a function of the estimates of Λ. We will show in later chapters that \hatθ follows an asymptotic normal distribution with mean θ and variance-covariance matrix Σ. The variance estimate of the test statistic can be derived as follows:

\hat V(T) = \left(\frac{\partial T}{\partial Λ}\Big|_{\hatΛ}\right)^T \hatΣ^* \left(\frac{\partial T}{\partial Λ}\Big|_{\hatΛ}\right),

where \partial T/\partial Λ|_{\hatΛ} represents the derivative of the test statistic with respect to Λ evaluated at the MLE of Λ, Σ^* is the variance-covariance matrix of \hatΛ, and \hatΣ^* is the estimate of Σ^*. The components of \partial T/\partial Λ are derived as follows:

\frac{\partial T}{\partial\lambda_{i1}} = \frac{(\tau_{i-1} e^{\lambda_{i1}\tau_{i-1}} - \tau_i e^{\lambda_{i1}\tau_i})\lambda_{i1} - (e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i})}{\lambda_{i1}^2},

\frac{\partial T}{\partial\lambda_{i2}} = -\frac{(\tau_{i-1} e^{\lambda_{i2}\tau_{i-1}} - \tau_i e^{\lambda_{i2}\tau_i})\lambda_{i2} - (e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i})}{\lambda_{i2}^2},

for i = 1, 2, ..., I-1, and

\frac{\partial T}{\partial\lambda_{I1}} = \frac{(\lambda_{I1}\tau_{I-1} - 1)e^{\lambda_{I1}\tau_{I-1}}}{\lambda_{I1}^2}, \qquad \frac{\partial T}{\partial\lambda_{I2}} = -\frac{(\lambda_{I2}\tau_{I-1} - 1)e^{\lambda_{I2}\tau_{I-1}}}{\lambda_{I2}^2}.

The normalized test statistic is

\tilde T = \frac{T}{\sqrt{\hat V(T)}};     (2.6)

it follows an asymptotic standard normal distribution.
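The computation behind Equations (2.5)-(2.6) can be organized as in the Python sketch below. It assumes two groups, interior cut points τ_1, ..., τ_{I-1}, λ_{Ig} < 0, and a covariance matrix for the stacked vector (λ̂_1, λ̂_2) taken from the inverse observed information; the function and variable names are illustrative, not the dissertation's.

```python
import numpy as np

def cure_exponent(lam, taus):
    """sum_{i<I} (e^{lam_i tau_{i-1}} - e^{lam_i tau_i})/lam_i + e^{lam_I tau_{I-1}}/lam_I,
    the bracketed term of Equation (2.4) for one group (lam[-1] < 0 assumed)."""
    tau = np.concatenate(([0.0], taus))
    val = sum((np.exp(lam[i] * tau[i]) - np.exp(lam[i] * tau[i + 1])) / lam[i]
              for i in range(len(lam) - 1))
    return val + np.exp(lam[-1] * tau[-1]) / lam[-1]

def grad_cure_exponent(lam, taus):
    """Gradient of cure_exponent with respect to lam (the dT/dlambda terms below Eq. 2.5)."""
    tau = np.concatenate(([0.0], taus))
    g = np.empty(len(lam))
    for i in range(len(lam) - 1):
        a, b, l = tau[i], tau[i + 1], lam[i]
        g[i] = ((a * np.exp(l * a) - b * np.exp(l * b)) * l
                - (np.exp(l * a) - np.exp(l * b))) / l ** 2
    lI, tI = lam[-1], tau[-1]
    g[-1] = (lI * tI - 1.0) * np.exp(lI * tI) / lI ** 2
    return g

def cure_rate_test(lam1, lam2, taus, cov_lam):
    """Normalized statistic of Equation (2.6); cov_lam is the (2I x 2I) covariance of
    the stacked vector (lam1, lam2)."""
    T = cure_exponent(lam1, taus) - cure_exponent(lam2, taus)
    grad = np.concatenate((grad_cure_exponent(lam1, taus), -grad_cure_exponent(lam2, taus)))
    var = grad @ cov_lam @ grad
    return T / np.sqrt(var)
```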












CHAPTER 3
ESTIMATION OF PARAMETERS

3.1 Likelihood Function and Information Matrices
Based on the log-likelihood function defined in Equation (2.2), the score statistic S_{nj}(θ) for the jth individual can be expressed as

S_{nj}(θ) = \begin{pmatrix} S_{nj}^{(1)}(θ) \\ S_{nj}^{(2)}(θ) \end{pmatrix},     (3.1)

where

S_{nj}^{(1)}(θ) = \frac{\partial l_{nj}}{\partial\beta} = \delta_j x_j - x_j e^{\beta^T x_j} \int_0^{t_j^*} h(u; Λ, κ(j), J(u))\,du     (3.2)

and

S_{nj}^{(2)}(θ) = \frac{\partial l_{nj}}{\partial Λ} = \delta_j t_j^* Q(t_j^*, j) - e^{\beta^T x_j} \int_0^{t_j^*} Q(u, j)\,u\,h(u; Λ, κ(j), J(u))\,du.     (3.3)


Note that S_{nj}^{(1)}(θ), S_{nj}^{(2)}(θ), and S_{nj}(θ) are p-, IG-, and (p + IG)-dimensional vectors, respectively. Furthermore, the matrices formed by the second-order derivatives of l_{nj} can be derived as

F_{nj}^{(1,1)}(θ) = -\frac{\partial^2 l_{nj}}{\partial\beta\,\partial\beta^T} = x_j x_j^T e^{\beta^T x_j} \int_0^{t_j^*} h(u; Λ, κ(j), J(u))\,du,

F_{nj}^{(1,2)}(θ) = -\frac{\partial^2 l_{nj}}{\partial\beta\,\partial Λ^T} = x_j e^{\beta^T x_j} \int_0^{t_j^*} Q^T(u, j)\,u\,h(u; Λ, κ(j), J(u))\,du,

F_{nj}^{(2,1)}(θ) = -\frac{\partial^2 l_{nj}}{\partial Λ\,\partial\beta^T} = e^{\beta^T x_j} \int_0^{t_j^*} Q(u, j)\,x_j^T\,u\,h(u; Λ, κ(j), J(u))\,du,

and

F_{nj}^{(2,2)}(θ) = -\frac{\partial^2 l_{nj}}{\partial Λ\,\partial Λ^T} = e^{\beta^T x_j} \int_0^{t_j^*} Q(u, j) Q^T(u, j)\,u^2\,h(u; Λ, κ(j), J(u))\,du.






Then the observed Fisher information matrix is written as

F_{nj}(θ) = \begin{pmatrix} F_{nj}^{(1,1)}(θ) & F_{nj}^{(1,2)}(θ) \\ F_{nj}^{(2,1)}(θ) & F_{nj}^{(2,2)}(θ) \end{pmatrix},     (3.4)

which is a (p + IG) × (p + IG) matrix. The Fisher information matrix is the expectation of F_{nj}(θ), which is written as

D_{nj} = E(F_{nj}(θ)) = e^{\beta^T x_j} E\left\{\int_0^{t_j^*} h(u; Λ, κ(j), J(u)) \begin{pmatrix} x_j \\ u Q(u, j) \end{pmatrix}\begin{pmatrix} x_j^T & u Q^T(u, j) \end{pmatrix} du\right\}.     (3.5)

3.2 Concavity of Likelihood Function

The log-likelihood function for n individuals is written as
n

1n(0) = Y { [log(h(t*; A, ,.(j), J(t*))) + OiTxj] - e ,x jt h(u; A, '.(j), J(u))du}.
j=1
(3.6)

The score statistic, the observed Fisher's information matrix, and the Fisher's information matrix for n individuals in the sample are respectively as follows:
n
Sn(0) = Z Sj(0), (3.7) j=1
n
Fn (0) = 1: Fnj (0), (3.8) j=l

and
n
Dn = S Dnj. (3.9) j=1
Let cT (aT bT), where a is a p-dimensional vector and b = (bll - bi1 ... bIG ... bIG)T is a IG-dimensional vectors. We will show that Fn(O) is positive definite and therefore, the log-likelihood function defined in Equation (3.6) is strictly








concave. To show that F_n(θ) is positive definite, it suffices to show that c^T F_n(θ) c > 0 for any non-null vector c of dimension (p + IG). In order to prove that F_n(θ) is positive definite, we assume that there is at least one observation on each interval for each treatment group. Note that

c^T F_n(θ) c = (a^T, b^T)\,F_n(θ)\begin{pmatrix} a \\ b \end{pmatrix}
             = \sum_{j=1}^{n} e^{\beta^T x_j} \int_0^{t_j^*} h(u; Λ, κ(j), J(u))\,[a^T x_j + u\,b^T Q(u, j)]^2\,du
             \ge 0.
This implies that F_n(θ) is positive semidefinite. Since h(u; Λ, κ(j), J(u)) > 0, c^T F_n(θ) c = 0 implies that for any u ∈ (0, t_j^*), j = 1, 2, ..., n,

a^T x_j + b^T Q(u, j)\,u = 0.     (3.10)

For any integer i, 1 ≤ i ≤ I, and for any integer g, 1 ≤ g ≤ G, there exists an individual j (say) in group g with t_j^* ∈ (τ_{i-1}, τ_i). Then Equation (3.10) implies that a^T x_j + u\,b_{ig} = 0 for u ∈ (τ_{i-1}, t_j^*). Consequently,

a^T x_j = 0

and

b_{ig} = 0.     (3.11)

Since Equation (3.11) holds for any integer i, 1 ≤ i ≤ I, and g, 1 ≤ g ≤ G, it is then proved that

b = 0.

Therefore, Equation (3.10) implies that for any j = 1, 2, ..., n, a^T x_j = 0, i.e.,

\frac{a^T}{||a||}\,x_j = 0 for all j = 1, 2, ..., n,

which is a contradiction to Assumption 2.

Therefore, F_n(θ) is positive definite. The strict concavity of the log-likelihood function as a function of the unknown parameters θ guarantees the uniqueness of the MLEs of the parameters, provided that the MLEs exist. Based on the existence of the MLEs and the concavity of the log-likelihood function, the Newton-Raphson algorithm converges.
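As a sketch of the numerical side of this argument, a bare-bones Newton-Raphson update built from the score S_n(θ) and the observed information F_n(θ) (the negative Hessian) looks as follows; the callables and the starting value are placeholders, not the dissertation's implementation.

```python
import numpy as np

def newton_raphson(score, obs_info, theta0, tol=1e-8, max_iter=50):
    """Generic Newton-Raphson iteration theta <- theta + F_n(theta)^{-1} S_n(theta);
    with a strictly concave log-likelihood the iteration converges to the unique MLE."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(obs_info(theta), score(theta))
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta
```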













CHAPTER 4
PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES

4.1 Introduction

In the previous chapter, we proved the strict concavity of the log-likelihood function. We pointed out that we obtain unique MLEs of the parameters if we can prove that these MLEs exist. In this chapter, we will first provide some preliminary results as a preparation for later derivations. Then we will show the existence, the consistency, and the asymptotic normality of the MLEs obtained. The method applied to prove the properties of the MLEs follows a similar method provided by Maller and Zhou (1996, pp. 211-222). Maller and Zhou proved the asymptotic properties of MLEs in a survival model which assumes that the failure time follows an exponential distribution with covariate information. The exponential model with covariates in their book is a special case of the Gompertz model. By studying the piecewise Gompertz model, we move one step further through the incorporation of different hazard functions on different time intervals. In our model, if we let λ_{ig} = 0, the hazard function for the failure time T is h(t) = e^{β^T x}. Therefore, the hazard is a constant, which implies that T follows an exponential distribution.

In the next section, we state several lemmas which will be applied to prove the existence, the consistency, and the asymptotic normality of the MLEs. The proofs of the lemmas are given in detail in the Appendices. We prove the existence and the consistency of the MLEs in Section 4.3. The proof of the asymptotic normality is stated in Section 4.4. In Section 4.5, a goodness-of-fit test which compares models with different numbers of pieces is derived. We also show that the corresponding test statistic follows an asymptotic χ² distribution.








4.2 Preliminary Proofs

Lemma 4.1. Let E(·) and V(·) represent the expected value and the variance of a random variable, respectively. Then the following two equations hold:

E\{S_n(θ_0)\} = 0

and

V\{S_n(θ_0)\} = D_n.

Proof. See Appendix B.

Lemma 4.2. Let θ_0 = (β_0^T, Λ_0^T)^T be the true parameter vector, where Λ_0 = (λ_{11}^0, ..., λ_{I1}^0, ..., λ_{1G}^0, ..., λ_{IG}^0)^T. For any positive integer m and for l = 1, 2, there exist an ε > 0 and a Γ > 0 such that for any Λ' satisfying ||Λ' - Λ_0|| < ε,

E_{θ_0}\left\{\int_0^{t_j^*} u^m h^l(u; Λ', κ(j), J(u))\,du\right\} < Γ, for j = 1, 2, ..., n.

Note that Γ is a constant which depends on θ_0 and ε.

Proof. See Appendix C.

Lemma 4.3. Define θ_0 as in Lemma 4.2. For any positive integers l and m, there exists a γ > 0 such that

E_{θ_0}\left\{\int_0^{t_j^*} u^m h^l(u; Λ_0, κ(j), J(u))\,du\right\} > γ,     (4.1)

for j = 1, 2, ..., n. Note that γ is a constant depending on θ_0.

Proof. See Appendix D.

Lemma 4.4. Under Assumptions 2, 3, and 4, there exist a γ > 0 and an integer n_0 > 0 such that when n > n_0,

\frac{λ_{min}(D_n)}{n} ≥ γ,

where λ_{min}(D_n) is the smallest eigenvalue of D_n.

Proof. See Appendix E.

4.3 Existence and Consistency of the MLE

For each A > 1, define an elliptical neighborhood of the true value θ_0 of the unknown parameters by

N(θ_0) = \{θ: (θ - θ_0)^T D_n (θ - θ_0) ≤ A\}.     (4.2)

As shown in Lemma 4.4, λ_{min}(D_n) ≥ nγ when n > n_0. Hereafter we assume that the condition n > n_0 is satisfied. We observe that for any θ ∈ N(θ_0),

A ≥ (θ - θ_0)^T D_n (θ - θ_0) ≥ ||θ - θ_0||^2 λ_{min}(D_n) ≥ ||θ - θ_0||^2 nγ,

i.e.,

||θ - θ_0|| ≤ \sqrt{A/(nγ)}.     (4.3)

Let us consider an arbitrary vector θ which is on the boundary ∂N(θ_0) of N(θ_0); then θ is a vector which satisfies the following equation: (θ - θ_0)^T D_n (θ - θ_0) = A. If we can show that l_n(θ) < l_n(θ_0) for all θ ∈ ∂N(θ_0) with probability approaching 1, then the log-likelihood function has a local maximum in N(θ_0). In addition, the global maximum exists and lies in N(θ_0) since the log-likelihood function is strictly concave. The consistency of \hatθ_n is proved by observing that Equation (4.3) holds.

Theorem 4.1. There exists a unique and consistent \hatθ_n which maximizes l_n(θ).

Proof. For any θ_n ∈ ∂N(θ_0), the Taylor expansion of the log-likelihood function about θ_0 is written as

l_n(θ_n) - l_n(θ_0) = (θ_n - θ_0)^T S_n(θ_0) - \frac{1}{2}(θ_n - θ_0)^T F_n(θ_n^*)(θ_n - θ_0),     (4.4)
2








where θ_n* is a random vector on the line segment between θ_n and θ₀; hence θ_n* ∈ N(θ₀). Recall that S_n(θ₀) is the vector of the first-order derivatives of l_n(θ) evaluated at the true parameter θ₀ and F_n(θ_n*) is the negative of the matrix of second derivatives of l_n(θ) evaluated at θ_n*. We have

P{l_n(θ_n) − l_n(θ₀) < 0}
  ≥ P{(θ_n − θ₀)ᵀ S_n(θ₀) < (1/2)(θ_n − θ₀)ᵀ F_n(θ_n*)(θ_n − θ₀)}
  ≥ P{(θ_n − θ₀)ᵀ S_n(θ₀) ≤ A/4, (θ_n − θ₀)ᵀ F_n(θ_n*)(θ_n − θ₀) > A/2}
  ≥ P{(θ_n − θ₀)ᵀ F_n(θ_n*)(θ_n − θ₀) > A/2} − P{(θ_n − θ₀)ᵀ S_n(θ₀) > A/4}.


We proved in Lemma 4.1 that

E{S_n(θ₀)} = 0

and

V{S_n(θ₀)} = D_n.

Since (θ_n − θ₀)ᵀ S_n(θ₀) is a sum of independent random variables with mean zero, we obtain the following inequality by applying Chebyshev's inequality:

P{(θ_n − θ₀)ᵀ S_n(θ₀) > A/4} ≤ 16(θ_n − θ₀)ᵀ D_n (θ_n − θ₀)/A² = 16/A,    (4.5)

since θ_n ∈ ∂N(θ₀). The probability in Equation (4.5) can be made arbitrarily small by choosing A sufficiently large. Next we need to show that

P{(θ_n − θ₀)ᵀ F_n(θ_n*)(θ_n − θ₀) > A/2} → 1,    (4.6)

as n → ∞. Let I be the (p + IG) × (p + IG)-dimensional identity matrix. If we can show that, uniformly in θ_n* ∈ N(θ₀),

D_n^{-1/2} F_n(θ_n*) D_n^{-1/2} →_p I,  as n → ∞,    (4.7)








then we obtain the following:

(θ_n − θ₀)ᵀ F_n(θ_n*)(θ_n − θ₀)
  = (θ_n − θ₀)ᵀ D_n^{1/2} (D_n^{-1/2} F_n(θ_n*) D_n^{-1/2}) D_n^{1/2} (θ_n − θ₀)
  ≥ (θ_n − θ₀)ᵀ D_n (θ_n − θ₀) λ_min(D_n^{-1/2} F_n(θ_n*) D_n^{-1/2})
  = A λ_min(D_n^{-1/2} F_n(θ_n*) D_n^{-1/2})
  →_p A,

where θ_n ∈ ∂N(θ₀) and θ_n* ∈ N(θ₀). Therefore, Equation (4.6) holds as n → ∞. Consequently, if we let both n and A become large, P{l_n(θ_n) < l_n(θ₀)} → 1 for θ_n ∈ ∂N(θ₀). Unique MLEs of the parameters in the model exist by the strict concavity of the log-likelihood function and are consistent by Equation (4.3). □

We shall now prove Equation (4.7) through the following three lemmas. Note that Equation (4.7) indeed shows that the information matrix D_n is consistently estimated by F_n(θ̂_n). Let cᵀ = (aᵀ, bᵀ) be any (p + IG)-dimensional unit vector, where a and b are p- and IG-dimensional, respectively.

Lemma 4.5. For any (p + IG)-dimensional unit vector c,

cᵀ D_n^{-1/2} [F_n(θ₀) − D_n] D_n^{-1/2} c →_p 0,    (4.8)

where θ₀ = (β₀ᵀ, Λ₀ᵀ)ᵀ is the true parameter.

Proof. See Appendix F. □

Lemma 4.6. For any (p + IG)-dimensional unit vector c and any θ ∈ N(θ₀),

cᵀ D_n^{-1/2} [F_n(θ) − F_n(θ₀)] D_n^{-1/2} c →_p 0.

Proof. See Appendix G. □

Lemma 4.7. For any θ ∈ N(θ₀),

D_n^{-1/2} F_n(θ) D_n^{-1/2} →_p I_{(p+IG)×(p+IG)}.    (4.9)

Proof. See Appendix H. □


4.4 Asymptotic Normality of the MLE

Lemma 4.8. Let l be an arbitrary positive integer; then E_{θ₀}||S_{nj}(θ₀)||^l is uniformly bounded for all j = 1, 2, ..., n. Note that S_{nj}(θ) is defined in Equation (3.1).

Proof. See Appendix I. □

Lemma 4.9. For any (p + IG)-dimensional unit vector c,

cᵀ D_n^{-1/2} S_n(θ₀) →_d N(0, 1).    (4.10)

Proof. Denote

Z_n = cᵀ D_n^{-1/2} S_n(θ₀) = cᵀ D_n^{-1/2} Σ_{j=1}^{n} S_{nj}(θ₀),

where S_{nj}(θ₀), j = 1, 2, ..., n, are independent random variables. By Lemma 4.1,

E(Z_n) = 0

and

V(Z_n) = 1.

By Liapounov's condition, it suffices to show that

Σ_{j=1}^{n} E_{θ₀}|cᵀ D_n^{-1/2} S_{nj}(θ₀)|³ → 0    (4.11)

as n → ∞. Note that

|cᵀ D_n^{-1/2} S_{nj}(θ₀)|³ ≤ [λ_max(D_n^{-1/2})]³ ||S_{nj}(θ₀)||³ ≤ (nγ)^{-3/2} ||S_{nj}(θ₀)||³.

Since E_{θ₀}||S_{nj}(θ₀)||³ is uniformly bounded for all individuals by Lemma 4.8, the sum in Equation (4.11) is of order n(nγ)^{-3/2}, which goes to 0. Hence Equation (4.11) holds and the proof is accomplished by the central limit theorem. □



Theorem 4.2. The MLE θ̂_n is asymptotically normal; that is, D_n^{1/2}(θ̂_n − θ₀) →_d N(0, I) as n → ∞.

Proof. By Taylor's expansion, we obtain that for some θ̃ on the line segment between θ̂_n and θ₀ and for any unit vector u,

uᵀ D_n^{-1/2} S_n(θ̂_n) = uᵀ D_n^{-1/2} S_n(θ₀) − uᵀ D_n^{-1/2} F_n(θ̃)(θ̂_n − θ₀).

Also, because S_n(θ̂_n) = 0 since θ̂_n is the MLE,

uᵀ D_n^{-1/2} S_n(θ₀) = uᵀ D_n^{-1/2} F_n(θ̃)(θ̂_n − θ₀)
  = uᵀ [D_n^{-1/2}(F_n(θ̃) − D_n) D_n^{-1/2}] D_n^{1/2}(θ̂_n − θ₀) + uᵀ D_n^{1/2}(θ̂_n − θ₀).

Since θ̃ ∈ N(θ₀), Equation (4.9) holds by Lemma 4.7. Therefore, the sum of the absolute values of all elements of [D_n^{-1/2}(F_n(θ̃) − D_n) D_n^{-1/2}] goes to 0 in probability. Hence,

D_n^{-1/2}(F_n(θ̃) − D_n) D_n^{-1/2} = o_P(1),

where o_P(1) is a matrix with all elements converging to 0 in probability as n → ∞. Then we obtain that

uᵀ D_n^{-1/2} S_n(θ₀) = uᵀ o_P(1) D_n^{1/2}(θ̂_n − θ₀) + uᵀ D_n^{1/2}(θ̂_n − θ₀).

Therefore,

uᵀ D_n^{1/2}(θ̂_n − θ₀) = uᵀ D_n^{-1/2} S_n(θ₀) + o_P(1).







Since we have proved the asymptotic normality of uᵀ D_n^{-1/2} S_n(θ₀) in Lemma 4.9, the asymptotic normality of uᵀ D_n^{1/2}(θ̂_n − θ₀) follows. Hence, Theorem 4.2 is proved. □
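As a practical illustration of how Theorem 4.2 is used, the sketch below forms approximate standard errors and Wald statistics from the observed information F_n(θ̂_n), which by Equation (4.7) consistently estimates D_n. The variable names are illustrative; `theta_hat` and `obs_info` are assumed to come from whatever fitting routine maximizes l_n(θ).

```python
import numpy as np

def wald_summary(theta_hat, obs_info):
    """Approximate standard errors and Wald z-statistics based on Theorem 4.2.

    theta_hat : MLE of theta (length p + IG)
    obs_info  : observed information F_n evaluated at theta_hat (positive definite)
    """
    cov = np.linalg.inv(obs_info)   # estimates D_n^{-1}
    se = np.sqrt(np.diag(cov))      # asymptotic standard errors
    z = theta_hat / se              # Wald statistics, approximately N(0, 1)
    return se, z
```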


4.5 Goodness-of-Fit Test

In practice, one might be interested in knowing how many intervals are in fact necessary in a specific study. Therefore, a goodness-of-fit test can be performed to test how many time intervals are adequate. Assume that we have a saturated model with I₁ + I₂ intervals and an unsaturated model with I₁ intervals. We are interested in testing whether there is a significant difference between the two models. We assume that the unsaturated model is obtained by combining intervals I₁ + 1, I₁ + 2, ..., I₁ + I₂ of the saturated model with interval I₁. The null hypothesis can be written as


H₀ : λ_{(I₁+1)g} = λ_{(I₁+2)g} = ⋯ = λ_{(I₁+I₂)g} = λ_{I₁ g},  for all g = 1, 2, ..., G.    (4.12)


Since θᵀ = (βᵀ, Λᵀ), the null hypothesis defined above can also be written as

H₀ : Hθ = 0,    (4.13)

where H is an I₂G × [p + (I₁ + I₂)G] design matrix with full row rank I₂G whose rows express the contrasts λ_{(I₁+i)g} − λ_{I₁ g} = 0 for i = 1, ..., I₂ and g = 1, ..., G; the remaining entries, including the 0s in bold, are null vectors. Note that the first p columns of the matrix H put no restriction on β. Columns (p + 1) through (p + I₁ − 1) represent the nonrestrictiveness of λ for treatment group 1 corresponding to the first I₁ − 1 intervals. Similar patterns hold for the rest of the columns. We will now construct a likelihood ratio test for the hypothesis in Equation (4.13). Maximizing the log-likelihood function l_n(θ) under H₀ is equivalent to maximizing

l_n(θ) − θᵀ Hᵀ γ

unrestrictedly in θ ∈ R^{p+(I₁+I₂)G} and γ ∈ R^{I₂G}, where γ is a vector of Lagrange multipliers.

If we let θ₀ be the true parameter vector, θ̃_n the parameter estimate under H₀, and θ̂_n the parameter estimate under the saturated model, then θ̃_n satisfies the following equation:

S_n(θ̃_n) − Hᵀ γ̃_n = 0,    (4.14)

where γ̃_n is the estimate of γ under H₀. Also, under the null hypothesis Hθ̃_n = 0 and Hθ₀ = 0, so

H(θ̃_n − θ₀) = 0.    (4.15)

Now we need to solve for θ̃_n from Equations (4.14) and (4.15). By Taylor's expansion,

D_n^{-1/2} S_n(θ̃_n) = D_n^{-1/2} S_n(θ₀) − D_n^{1/2}(θ̃_n − θ₀) + o_P(1),

hence

D_n^{1/2}(θ̃_n − θ₀) + D_n^{-1/2} S_n(θ̃_n) = D_n^{-1/2} S_n(θ₀) + o_P(1).

By Equation (4.14), the above equation can be written as

D_n^{1/2}(θ̃_n − θ₀) + D_n^{-1/2} Hᵀ γ̃_n = D_n^{-1/2} S_n(θ₀) + o_P(1).    (4.16)







We can rewrite Equation (4.15) as

(H D_n^{-1/2}) D_n^{1/2}(θ̃_n − θ₀) = 0.    (4.17)

Now express Equations (4.16) and (4.17) in matrix form:

[ I             D_n^{-1/2} Hᵀ ] [ D_n^{1/2}(θ̃_n − θ₀) ]   [ D_n^{-1/2} S_n(θ₀) + o_P(1) ]
[ H D_n^{-1/2}       0        ] [        γ̃_n          ] = [              0               ].    (4.18)

Since H is of full row rank I₂G and D_n is nonsingular, H D_n^{-1} Hᵀ is of full rank I₂G. Hence, the inverse of the coefficient matrix in Equation (4.18) exists and can be written as

[ I             D_n^{-1/2} Hᵀ ]^{-1}   [ I − P   Q ]
[ H D_n^{-1/2}       0        ]      = [  Qᵀ     R ],

where

P = D_n^{-1/2} Hᵀ (H D_n^{-1} Hᵀ)^{-1} H D_n^{-1/2},
Q = D_n^{-1/2} Hᵀ (H D_n^{-1} Hᵀ)^{-1},  and
R = −(H D_n^{-1} Hᵀ)^{-1}.

Therefore, we can solve Equation (4.18) as

[ D_n^{1/2}(θ̃_n − θ₀) ]   [ I − P   Q ] [ D_n^{-1/2} S_n(θ₀) + o_P(1) ]
[        γ̃_n          ] = [  Qᵀ     R ] [              0               ],

hence

D_n^{1/2}(θ̃_n − θ₀) = (I − P) D_n^{-1/2} S_n(θ₀) + o_P(1).

Similarly, for θ̂_n we have

D_n^{1/2}(θ̂_n − θ₀) = D_n^{-1/2} S_n(θ₀) + o_P(1),








by Taylor's expansion and by observing the fact that S_n(θ̂_n) = 0. Therefore,

D_n^{1/2}(θ̂_n − θ̃_n) = D_n^{1/2}(θ̂_n − θ₀) − D_n^{1/2}(θ̃_n − θ₀) = P D_n^{-1/2} S_n(θ₀) + o_P(1).    (4.19)

Note that P is idempotent (PP = P) with rank I₂G; therefore, P can be written as

P = Aᵀ diag(I_{I₂G}, 0) A,

where A is an orthogonal matrix. Let Z_n = A D_n^{-1/2} S_n(θ₀); then

Z_n →_d N(0, I)

by Lemma 4.9. From Equation (4.19), we obtain that

(θ̂_n − θ̃_n)ᵀ D_n (θ̂_n − θ̃_n)
  = S_nᵀ(θ₀) D_n^{-1/2} P D_n^{-1/2} S_n(θ₀) + o_P(1)
  = S_nᵀ(θ₀) D_n^{-1/2} Aᵀ diag(I_{I₂G}, 0) diag(I_{I₂G}, 0) A D_n^{-1/2} S_n(θ₀) + o_P(1)
  = Z_nᵀ diag(I_{I₂G}, 0) Z_n + o_P(1)
  →_d χ²_{I₂G}.

The likelihood ratio test statistic for the hypothesis in Equation (4.12) can be written as

Λ = sup_{θ∈Θ₀} L_n(θ) / sup_{θ∈Θ} L_n(θ) = L_n(θ̃_n) / L_n(θ̂_n),    (4.20)








where Θ represents the overall parameter space and Θ₀ the parameter space under H₀. By Taylor's expansion, we can express −2 log(Λ) as

−2 log(Λ) = 2[l_n(θ̂_n) − l_n(θ̃_n)]
          = −2(θ̃_n − θ̂_n)ᵀ S_n(θ̂_n) + (θ̃_n − θ̂_n)ᵀ F_n(θ*)(θ̃_n − θ̂_n),

where θ* lies between θ̃_n and θ̂_n and therefore θ* ∈ N(θ₀). Note that S_n(θ̂_n) = 0 and F_n(θ*) can be replaced by D_n from Lemma 4.7. Hence, −2 log(Λ) can be written as

−2 log(Λ) = (θ̂_n − θ̃_n)ᵀ D_n (θ̂_n − θ̃_n) + o_P(1),

which converges in distribution to χ²_{I₂G} by the result derived from Equation (4.19).
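To make the null hypothesis in Equation (4.13) concrete, the sketch below builds a contrast matrix of the form described above: one row per pair (i, g), with a 1 in the column of λ_{I₁ g} and a −1 in the column of λ_{(I₁+i)g}. The ordering of the parameter vector (β first, then the λ's grouped by treatment with intervals running fastest) is an assumption made only for this illustration.

```python
import numpy as np

def contrast_matrix(p, I1, I2, G):
    """Build H for H0: lambda_{(I1+i)g} = lambda_{I1 g}, i = 1..I2, g = 1..G (Equation (4.13)).

    Assumes theta = (beta_1..beta_p, lambda_{1,1}..lambda_{I,1}, ..., lambda_{1,G}..lambda_{I,G}).
    """
    I = I1 + I2
    H = np.zeros((I2 * G, p + I * G))
    row = 0
    for g in range(G):
        base = p + g * I                      # start of group g's lambda block
        for i in range(1, I2 + 1):
            H[row, base + I1 - 1] = 1.0       # column of lambda_{I1, g}
            H[row, base + I1 - 1 + i] = -1.0  # column of lambda_{I1 + i, g}
            row += 1
    return H                                  # full row rank I2 * G
```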














CHAPTER 5
SIMULATIONS

5.1 Introduction

The results discussed in the previous chapters established the asymptotic properties of the parameter MLEs in the piecewise Gompertz model. These results enable us to carry out the hypothesis test whose null hypothesis is proposed in Equation (2.3). In this chapter, Monte Carlo simulation studies are performed in order to evaluate the normality approximation of the proposed test statistic in Equation (2.6) and the power of the test under different settings based on the amount of location shift and the sample size. Due to the complexity of the piecewise model, the procedure for generating the random samples is somewhat involved and is discussed in Section 5.2. Results are given on observed Type I errors and powers for both the piecewise Gompertz model and the exponential-logistic mixture model. All simulations were completed on a Sun Ultra Enterprise 450 computer using Ox 2.0 (Doornik 1998). Random number generators in the Ox 2.0 environment were used to generate uniformly distributed random numbers.

5.2 Generation of Random Samples

In this section we describe the procedure for generating a random sample that follows a piecewise Gompertz model under a given parameter setting. In order to generate a random sample of size n with a specified proportion of censoring, we need to generate a sample of failure observations of size n and a sample of censored observations of size n. The sample of failure observations follows the piecewise Gompertz model. The sample of censored observations is assumed to follow a uniform (0, M) distribution, where M is defined by the censoring proportion. The final sample is obtained by comparing the two samples generated. A subject is considered failed if its observation from the first sample is smaller than the one obtained from the second sample. The subject is censored if its observation from the first sample is greater than the one from the second sample.
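A minimal sketch of this comparison step is given below; the failure times `t_fail` and censoring times `t_cens` are assumed to have been generated as described in the following subsections.

```python
import numpy as np

def combine(t_fail, t_cens):
    """Form the observed data: time = min(T, U), delta = 1 if the failure was observed."""
    t_fail = np.asarray(t_fail, dtype=float)
    t_cens = np.asarray(t_cens, dtype=float)
    time = np.minimum(t_fail, t_cens)
    delta = (t_fail < t_cens).astype(int)   # failed if T is smaller than U, censored otherwise
    return time, delta
```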

5.2.1 Parameter Settings

We consider five different parameter settings in our Monte Carlo simulations. There is no treatment difference in random samples generated under Settings 1 and 2; therefore, we can compare Type I errors between tests based on random samples generated from these two settings. In parameter Settings 3, 4, and 5, treatment effects exist between groups. Settings 3 to 5 present the smallest, the intermediate, and the largest difference between treatment groups, respectively. The comparison of powers of the tests is based on random samples generated from these three parameter settings. All five parameter settings are listed as follows:

Setting 1:

* Two treatment groups
* Three intervals: [0, 3), [3, 7), and [7, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values
* Censoring rate: 30%
* λ₁₀ = λ₁₁ = −4.2
* λ₂₀ = λ₂₁ = −0.52
* λ₃₀ = λ₃₁ = −0.2
* β₀ = −0.1, β₁ = 0.2
Setting 2:

* Two treatment groups
* Three intervals: [0, 4), [4, 10), and [10, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values
* Censoring rate: 30%
* λ₁₀ = λ₁₁ = −3.5
* λ₂₀ = λ₂₁ = −0.45
* λ₃₀ = λ₃₁ = −0.15
* β₀ = −0.2, β₁ = 0.15

Setting 3:

* Two treatment groups
* Three intervals: [0, 3), [3, 7), and [7, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values and treatment groups
* Censoring rate: 30%
* λ₁₀ = −3.5 and λ₁₁ = −4.2
* λ₂₀ = −0.45 and λ₂₁ = −0.52
* λ₃₀ = −0.21 and λ₃₁ = −0.2
* β₀ = −0.1, β₁ = 0.2

Setting 4:

* Two treatment groups
* Three intervals: [0, 4), [4, 10), and [10, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values and treatment groups
* Censoring rate: 30%
* λ₁₀ = −3.9 and λ₁₁ = −4.2
* λ₂₀ = −0.45 and λ₂₁ = −0.4
* λ₃₀ = −0.165 and λ₃₁ = −0.18
* β₀ = −0.2, β₁ = 0.15

Setting 5:

* Two treatment groups
* Three intervals: [0, 4), [4, 10), and [10, ∞)
* One covariate with value 0 or 1
* Cure proportion varied with covariate values and treatment groups
* Censoring rate: 30%
* λ₁₀ = −3.5 and λ₁₁ = −4.2
* λ₂₀ = −0.45 and λ₂₁ = −0.4
* λ₃₀ = −0.15 and λ₃₁ = −0.18
* β₀ = −0.2, β₁ = 0.15
The survival curves for the above five parameter settings are shown in Figures (5-1) through (5-5). The survival probabilities are calculated by applying

S(t) = exp{ e^{βᵀx} [ Σ_{m=1}^{i−1} (e^{λ_{mg}τ_{m−1}} − e^{λ_{mg}τ_m})/λ_{mg} + (e^{λ_{ig}τ_{i−1}} − e^{λ_{ig}t})/λ_{ig} ] }    (5.1)

for t ∈ [τ_{i−1}, τ_i). There are two separate survival curves in the first two figures for individuals with covariate values of 0 and 1. For Figures (5-3), (5-4), and (5-5), we draw four survival curves for all four possible combinations of the covariate and the treatment group indicator.
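A minimal sketch of Equation (5.1) is given below. It assumes interval boundaries τ₀ = 0 < τ₁ < ⋯ < τ_{I−1}, per-interval parameters λ_{ig} for one treatment group, and a linear predictor βᵀx; the function name is illustrative.

```python
import numpy as np

def piecewise_gompertz_survival(t, lam, tau, linpred):
    """Survival probability S(t) from Equation (5.1).

    lam     : interval parameters lambda_{1g}, ..., lambda_{Ig} for one group
    tau     : interval starts tau_0 = 0 < tau_1 < ... < tau_{I-1}
              (the last interval is [tau_{I-1}, infinity))
    linpred : the linear predictor beta' x for this subject
    """
    i = int(np.searchsorted(tau, t, side="right")) - 1   # interval containing t
    acc = 0.0
    for m in range(i):                                   # completed intervals
        acc += (np.exp(lam[m] * tau[m]) - np.exp(lam[m] * tau[m + 1])) / lam[m]
    acc += (np.exp(lam[i] * tau[i]) - np.exp(lam[i] * t)) / lam[i]  # current interval
    return float(np.exp(np.exp(linpred) * acc))

# Setting 1 with covariate x = 0: S(3) and S(7) are about 0.81 and 0.59,
# matching the values used in Section 5.2.3.
lam, tau = [-4.2, -0.52, -0.2], [0.0, 3.0, 7.0]
print(piecewise_gompertz_survival(3.0, lam, tau, linpred=-0.1))
print(piecewise_gompertz_survival(7.0, lam, tau, linpred=-0.1))
```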

Figure 5-1. Survival curves for patients with parameter setting 1.

Figure 5-2. Survival curves for patients with parameter setting 2.

Figure 5-3. Survival curves for patients with parameter setting 3.

Figure 5-4. Survival curves for patients with parameter setting 4.

Figure 5-5. Survival curves for patients with parameter setting 5.

5.2.2 Generation of Censored Observations

From the assumption, the sample of censored observations follows a uniform (0, M) distribution. Therefore, we can obtain a sample from a uniform (0, 1) distribution and then transform it into a sample that follows a uniform (0, M) distribution. The primary task is to obtain an appropriate value for M. Let U and T denote the random variables for censored and failure observations, respectively. Taking parameter setting 1 as an example, we illustrate the derivation of the censoring proportion P(U < T) as follows.

For M ∈ [0, 3),

P(U < T) = (1/M) ∫₀^M exp{ e^{β₀+β₁x} (1 − e^{λ_{1g}c})/λ_{1g} } dc;

for M ∈ [3, 7),

P(U < T) = (1/M) ∫₀³ exp{ e^{β₀+β₁x} (1 − e^{λ_{1g}c})/λ_{1g} } dc
         + (1/M) ∫₃^M exp{ e^{β₀+β₁x} [ (1 − e^{3λ_{1g}})/λ_{1g} + (e^{3λ_{2g}} − e^{λ_{2g}c})/λ_{2g} ] } dc;

and for M ∈ [7, ∞),

P(U < T) = (1/M) ∫₀³ exp{ e^{β₀+β₁x} (1 − e^{λ_{1g}c})/λ_{1g} } dc
         + (1/M) ∫₃⁷ exp{ e^{β₀+β₁x} [ (1 − e^{3λ_{1g}})/λ_{1g} + (e^{3λ_{2g}} − e^{λ_{2g}c})/λ_{2g} ] } dc
         + (1/M) ∫₇^M exp{ e^{β₀+β₁x} [ (1 − e^{3λ_{1g}})/λ_{1g} + (e^{3λ_{2g}} − e^{7λ_{2g}})/λ_{2g} + (e^{7λ_{3g}} − e^{λ_{3g}c})/λ_{3g} ] } dc.

Calculation of the integrals in the above equations is carried out approximately using Simpson's formula; the details of Simpson's formula are provided in Appendix J. We then search for an appropriate value of M such that P(U < T) reaches the required censoring proportion.
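The following sketch illustrates one way this search can be carried out numerically: Simpson's rule approximates (1/M)∫₀^M S(c) dc, and a simple bisection adjusts M until the censoring proportion matches the target. It reuses the `piecewise_gompertz_survival` sketch from Section 5.2.1; the function names and the bracketing interval are assumptions made for illustration.

```python
import numpy as np

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with an even number of subintervals n."""
    if n % 2:
        n += 1
    x = np.linspace(a, b, n + 1)
    y = np.array([f(xi) for xi in x])
    h = (b - a) / n
    return h / 3 * (y[0] + y[-1] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum())

def censoring_proportion(M, surv):
    """P(U < T) = (1/M) * integral_0^M S(c) dc for U ~ uniform(0, M)."""
    return simpson(surv, 0.0, M) / M

def find_M(target, surv, lo=0.1, hi=200.0, tol=1e-4):
    """Bisection for M; assumes the proportion is decreasing in M and [lo, hi] brackets the root."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if censoring_proportion(mid, surv) > target:
            lo = mid       # proportion still too high: enlarge M
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: setting 1, covariate x = 0, target censoring proportion 30%.
surv = lambda c: piecewise_gompertz_survival(c, [-4.2, -0.52, -0.2], [0.0, 3.0, 7.0], -0.1)
M = find_M(0.30, surv)
```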

5.2.3 Generation of Failure Observations

In order to generate a sample of failure observations, we need to generate a sample of uniform (0, 1) observations. We can then treat the value from the uniform (0, 1) as the survival probability for the failure time T. By inverting the survival function defined in Equation (5.1), we obtain the value of the failure time. Note that special attention is required in the inversion because we are dealing with a piecewise distribution. Examples are given for observations with covariate x = 0 in the first parameter setting.

From Figure (5-1), we observe that each survival curve consists of three pieces based on the three intervals we consider in the study. Therefore, the inversion function for generating the failure time observation depends on the value of the generated uniform random number u*. The survival probability at t = 0 is 1, the survival probability

P(T > 3) = exp{ e^{β₀} (1 − e^{−4.2×3})/(−4.2) } = 0.81,

and

P(T > 7) = exp{ e^{β₀} [ (1 − e^{−4.2×3})/(−4.2) + (e^{−0.52×3} − e^{−0.52×7})/(−0.52) ] } = 0.59.

Therefore, if u* ∈ [0.81, 1], then

T = (1/λ_{1g}) log[ e^{λ_{1g}τ₀} − λ_{1g} e^{−βᵀx} log(u*) ];

if u* ∈ [0.59, 0.81), then

T = (1/λ_{2g}) log{ e^{λ_{2g}τ₁} − λ_{2g} [ e^{−βᵀx} log(u*) − (e^{λ_{1g}τ₀} − e^{λ_{1g}τ₁})/λ_{1g} ] };

and if u* ∈ [0, 0.59), then

T = (1/λ_{3g}) log{ e^{λ_{3g}τ₂} − λ_{3g} [ e^{−βᵀx} log(u*) − (e^{λ_{1g}τ₀} − e^{λ_{1g}τ₁})/λ_{1g} − (e^{λ_{2g}τ₁} − e^{λ_{2g}τ₂})/λ_{2g} ] }.

Note that it is necessary that there is at least one failure observation in each interval for each treatment group in order to make inferences on all parameters.
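The sketch below generalizes the three-case inversion above to an arbitrary number of intervals; the function name and argument conventions follow the earlier sketches and are assumptions made for illustration.

```python
import numpy as np

def sample_failure_time(u, lam, tau, linpred):
    """Invert Equation (5.1): given u* ~ uniform(0, 1) treated as S(T), return T.

    Returns np.inf when u* is below the cure proportion (the subject never fails).
    """
    a = np.exp(-linpred) * np.log(u)            # e^{-beta'x} * log(u*)
    for i in range(len(lam)):
        arg = np.exp(lam[i] * tau[i]) - lam[i] * a
        t = np.log(arg) / lam[i] if arg > 0 else np.inf
        upper = tau[i + 1] if i + 1 < len(tau) else np.inf
        if tau[i] <= t < upper:
            return t
        if i + 1 < len(tau):
            # remove the contribution of interval i before moving to interval i + 1
            a -= (np.exp(lam[i] * tau[i]) - np.exp(lam[i] * tau[i + 1])) / lam[i]
    return np.inf                                # cured subject

rng = np.random.default_rng(2001)
t = sample_failure_time(rng.uniform(), [-4.2, -0.52, -0.2], [0.0, 3.0, 7.0], linpred=-0.1)
```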

5.3 Type I Error and Power

Type I errors and powers of the tests are calculated based on the piecewise Gompertz model and on the exponential-logistic mixture model for all five sets of parameters. The test statistic based on the piecewise Gompertz model proposed in this dissertation, as we recall, was presented in Equation (2.6). For the exponential-logistic mixture model, we assume that there is one covariate x in the model. The time to failure is assumed to follow an exponential distribution with mean 1/λ, where λ is modelled as a function of the covariate x and the treatment indicator g: λ = exp(γ₀ + γ₁x + γ₂g). The cure rate P(Z = 0) is defined as 1/(1 + e^{β₀+β₁x+β₂g}). Therefore, for a fixed covariate value, the test of the equality of cure rates between two treatment groups is equivalent to the test of H₀ : β₂ = 0. Hence, the standardized test statistic for this null hypothesis is

TS = β̂₂ / √V̂(β̂₂).

Type I errors based on the first two parameter settings for both models are shown in Table (5-1). Powers for tests based on parameter settings 3 through 5 for both models are presented in Table (5-2). All tests are carried out for 10000 generated random samples and for three different sample sizes: 400, 1000, and 2000. The numbers of unobtainable test statistics (# NaNs) for the exponential-logistic mixture model are shown in the last column of each table. The unobtainable test statistics occur because the likelihood function for the mixture model is nearly flat at the point where the MLEs are obtained. When the sample size becomes larger, however, unobtainable or unidentifiable test statistics are eliminated.

Table 5-1. Type I Errors for the Test of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter Setting   Sample Size     Mixture   Gompertz   # NaNs (Mixture)
1                   (200, 200)      0.0305    0.0375     220
1                   (500, 500)      0.0171    0.0412
1                   (1000, 1000)    0.0365    0.0412
2                   (200, 200)      0.0203    0.0373     1
2                   (500, 500)      0.0356    0.0360
2                   (1000, 1000)    0.0402    0.0408

Table 5-2. Powers for Tests of Equality of Cure Proportions for the Mixture and the Three-Piece Gompertz Models.

Parameter Setting   Sample Size     Mixture   Gompertz   # NaNs (Mixture)
3                   (200, 200)      0.0260    0.1016     127
3                   (500, 500)      0.0328    0.2024
3                   (1000, 1000)    0.0679    0.3703
4                   (200, 200)      0.0616    0.1896     5
4                   (500, 500)      0.0357    0.4351
4                   (1000, 1000)    0.2721    0.7379
5                   (200, 200)      0.2706    0.5572     28
5                   (500, 500)      0.7180    0.9331
5                   (1000, 1000)    0.9601    0.9981

We present bar plots of all standardized test statistics obtained for all three sample sizes based on those five parameter settings. These plots help to visualize the advantages of the piecewise Gompertz model in terms of the normal approximation of the test under the null hypothesis and the power of the test under the alternative hypothesis. Figures (5-6), (5-7), and (5-8) are the distributions of the test statistics under parameter setting 1 for both the mixture model and the piecewise Gompertz model. These three figures use sample sizes of 400, 1000, and 2000, respectively. Similarly, Figures (5-9), (5-10), and (5-11) are for parameter setting 2; Figures (5-12), (5-13), and (5-14) are for parameter setting 3; Figures (5-15), (5-16), and (5-17) are for parameter setting 4; and Figures (5-18), (5-19), and (5-20) are for parameter setting 5.

From these tables and plots, we observe that the Type I errors obtained from the three-piece Gompertz model are closer to 0.05 than those from the mixture model. The powers presented in Table (5-2) for the three-piece Gompertz model are much larger than those for the mixture model for all sample sizes. The estimation procedure for obtaining the MLEs is also easier to carry out for the three-piece Gompertz model than for the exponential-logistic model. In addition, the algorithm for the three-piece Gompertz model converges much faster.


























Figure 5-6. Standardized test statistics for parameter setting 1 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-7. Standardized test statistics for parameter setting 1 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-8. Standardized test statistics for parameter setting 1 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-9. Standardized test statistics for parameter setting 2 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-10. Standardized test statistics for parameter setting 2 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-11. Standardized test statistics for parameter setting 2 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-12. Standardized test statistics for parameter setting 3 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-13. Standardized test statistics for parameter setting 3 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-14. Standardized test statistics for parameter setting 3 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-15. Standardized test statistics for parameter setting 4 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-16. Standardized test statistics for parameter setting 4 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-17. Standardized test statistics for parameter setting 4 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-18. Standardized test statistics for parameter setting 5 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-19. Standardized test statistics for parameter setting 5 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.

Figure 5-20. Standardized test statistics for parameter setting 5 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.













CHAPTER 6
EXAMPLE


6.1 Introduction

In Chapter 2 we proposed a test statistic that addresses the hypothesis presented in Equation (2.3). Subsequently, in Chapter 3 and Chapter 4, we established the asymptotic normality of the parameter estimates in the piecewise Gompertz model. The asymptotic normality validated the test of hypothesis proposed in Chapter 2. Chapter 5 reported Monte Carlo Type I error and power calculations, which established the strength of our test statistic for the comparison of cure rates under various parameter settings. In this chapter, we demonstrate the applicability of our test statistic with a data set.

The data set comes from a study by the Pediatric Oncology Group (POG) at the University of Florida (Ravindranath et al. 1996). The motivation of this study is to compare cure proportions among patients with acute myeloid leukemia (AML) under different treatments. Three treatments, allogeneic bone marrow transplantation (allo. BMT), autologous bone marrow transplantation (auto. BMT), and intensive chemotherapy, are performed in the study. In order to register in the study, all patients are required to be in remission after the first two courses of induction chemotherapy. The POG evaluated 666 patients whose ages were under 21 from June 1988 to March 1993. A total of 552 patients entered remission and were therefore eligible for the study. Of the eligible patients, 89 received allo. BMT; 117 were assigned to an additional six-course intensive chemotherapy; and 115 were assigned to receive auto. BMT. There were 231 patients who did not participate in the study for various reasons.








6.2 Methodology

Our objective with this data set is to perform pairwise comparisons of cure rates between treatment groups after controlling for the patient's gender. We compare tests from three parametric models and a proportion test based on KM estimates of cure. The three parametric models are the logistic-exponential mixture model, the one-piece Gompertz model, and the three-piece Gompertz model. In addition, we present cure rates and their standard deviations for every treatment and gender combination based on estimates from the parametric models and on KM estimates. In the end, we compare the standard one-piece Gompertz model with the three-piece Gompertz model by applying the goodness-of-fit test we derived in Chapter 4. In this section, we will briefly describe 1) the three parametric models and their corresponding test statistics; and 2) the proportion test based on KM estimates of cure proportions.

In the exponential-logistic mixture model, we assume an exponential distribution with parameter λ for the failure time and a logistic regression model for the cure proportion. The expression log(λ) is defined as a linear function of the covariate x and the treatment indicator. We define two dummy variables to represent the three-level treatment indicator: G₁ = 1 if the patient is in treatment 1 (and 0 otherwise) and G₂ = 1 if the patient is in treatment 2 (and 0 otherwise). The survival function for the failure time, given that the observation is not a long-term survivor (Z = 1), is

S*(t | Z = 1) = e^{−λt} = exp{ −t e^{γ₀+γ₁x+γ₂g₁+γ₃g₂} }.    (6.1)

The cure probability is defined as

π_g(x) = 1/(1 + e^{β₀+β₁x+β₂g₁+β₃g₂}).    (6.2)








Therefore, the hypothesis of testing the equality of cure rates in the three treatment groups can be written as

H₀ : β₂ = β₃ = 0.    (6.3)

Pairwise comparisons between two treatments are based on the following standardized test statistics:

T₁ = (β̂₂ − β̂₃) / √V̂(β̂₂ − β̂₃),    (6.4)

T₂ = β̂₂ / √V̂(β̂₂),  and    (6.5)

T₃ = β̂₃ / √V̂(β̂₃).    (6.6)

Note that T₁ is the test statistic that compares cure proportions between Treatments 1 and 2; T₂ compares cure proportions between Treatments 1 and 3; and T₃ compares cure proportions between Treatments 2 and 3. The variances are obtained by inverting the observed information matrix, which is obtained by taking the second derivatives of the log-likelihood function with respect to the parameters.
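The sketch below shows how T₁, T₂, and T₃ in Equations (6.4) to (6.6) can be computed once β̂ and the inverse of the observed information matrix are available; the argument names are illustrative.

```python
import numpy as np

def wald_pairwise(beta_hat, cov):
    """Standardized pairwise statistics from Equations (6.4)-(6.6).

    beta_hat : estimates (beta0, beta1, beta2, beta3)
    cov      : inverse of the observed information matrix (4 x 4)
    """
    b2, b3 = beta_hat[2], beta_hat[3]
    v22, v33, v23 = cov[2, 2], cov[3, 3], cov[2, 3]
    t1 = (b2 - b3) / np.sqrt(v22 + v33 - 2 * v23)   # treatment 1 vs 2
    t2 = b2 / np.sqrt(v22)                          # treatment 1 vs 3
    t3 = b3 / np.sqrt(v33)                          # treatment 2 vs 3
    return t1, t2, t3
```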

The one-piece Gompertz model is a special case of the piecewise Gompertz model. Let λ_g be the parameter associated with treatment g and β₁ the coefficient associated with the covariate x. For λ_g < 0, the cure proportion for patients with treatment g and covariate x is written as

π_g(x) = exp{ e^{β₀+β₁x} / λ_g }.    (6.7)

If λ_g < 0, there is a positive cure proportion for patients under the g-th treatment. If λ_g < 0 for every g, then the hypothesis of testing the equality of cure proportions is written as

H₀ : λ₁ = λ₂ = λ₃.    (6.8)








Standardized test statistics for pairwise comparisons of cure proportions are presented as

T₁ = (λ̂₁ − λ̂₂) / √V̂(λ̂₁ − λ̂₂),    (6.9)

T₂ = (λ̂₁ − λ̂₃) / √V̂(λ̂₁ − λ̂₃),  and    (6.10)

T₃ = (λ̂₂ − λ̂₃) / √V̂(λ̂₂ − λ̂₃).    (6.11)

Similarly, these three standardized test statistics are for the comparison of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. The variances are obtained from the inverse of the observed information matrix.

Finally, as we discussed in Chapter 2, the cure proportion in a piecewise Gompertz model, as a function of the covariate x and the treatment g, is expressed in Equation (2.4). We consider a three-interval case in this example. Denote the three intervals by [τ₀, τ₁), [τ₁, τ₂), and [τ₂, τ₃), where τ₀ = 0 and τ₃ = ∞. When λ_{3g} < 0 for every g, testing the equality of cure proportions for any given value of x is equivalent to testing

H₀ : c₁ = c₂ = c₃,  where  c_g = Σ_{i=1}^{3} (e^{λ_{ig}τ_{i−1}} − e^{λ_{ig}τ_i}) / λ_{ig},  g = 1, 2, 3,    (6.12)

and the term with τ₃ = ∞ is understood as e^{λ_{3g}τ₂}/λ_{3g} since λ_{3g} < 0. Standardized test statistics for pairwise comparisons between treatment groups are presented as follows:

T₁ = (ĉ₁ − ĉ₂) / √V̂(ĉ₁ − ĉ₂),    (6.13)

T₂ = (ĉ₁ − ĉ₃) / √V̂(ĉ₁ − ĉ₃),  and    (6.14)

T₃ = (ĉ₂ − ĉ₃) / √V̂(ĉ₂ − ĉ₃),    (6.15)

where ĉ_g denotes c_g evaluated at the MLEs λ̂_{ig}. These test statistics T₁, T₂, and T₃ are for comparisons of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. Variance estimates are not as straightforward as for tests based on the mixture model and on the one-piece Gompertz model; they have already been presented in Chapter 2 and will not be repeated here.

Proportion tests for pairwise comparisons between treatments are based on the KM estimates of the cure proportions and the standard errors of those estimated cure proportions.

P-values for tests derived from parametric models and for the proportion test will be presented in the next section. In addition, we present survival curves estimated based on both parametric models and KM estimates. Furthermore, we present the cure proportion and the corresponding standard deviation for each treatment and gender combination based on parametric estimates and on KM estimates. The result of comparing the one-piece Gompertz model with the three-piece model by the goodness-of-fit test we derived earlier is also presented.


6.3 Results

6.3.1 Pairwise Comparisons on Cure Proportions

In Table (6-1), we present p-values for tests based on the three parametric models and for the proportion test based on KM estimates. The three intervals for the three-piece Gompertz model are defined as [0, 50), [50, 140), and [140, ∞). All numbers presented here are in terms of weeks of follow-up. Tests based on the exponential-logistic mixture model, the three-piece Gompertz model, and the KM estimates detect the difference in cure proportions between allo. BMT and auto. BMT and the difference between allo. BMT and intensive chemotherapy. The allo. BMT increases the cure proportion significantly more than the other two treatments. There is no significant difference between auto. BMT and chemotherapy. A test based on the one-piece Gompertz model does not show any significant difference between any two of these three treatments. The maximization procedures for obtaining the MLEs of the parameters for both the one-piece Gompertz model and the three-piece Gompertz model are smooth; the algorithm converges within a few steps. However, there is a convergence problem related to the mixture model: different starting values for the parameters had to be chosen in order to obtain convergence. We have discussed the convergence problem related to the mixture model in both Chapter 1 and Chapter 5. The convergence problem we encounter here confirms our earlier assertion.
Table 6-1. Comparison of Tests from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Proportion Test on Cure Proportions.

                                      P-values for Treatment Comparisons
Comparison                      Mixture   1-Piece Gompertz   3-Piece Gompertz   Proportion Test
Allo. BMT vs auto. BMT          0.016     0.384              0.015              0.011
Allo. BMT vs chemotherapy       0.012     0.101              0.003              0.007
Auto. BMT vs chemotherapy       0.690     0.516              0.897              0.741


In Figures (6-1) to (6-4), we show the estimated survival curves from the different parametric models and from the KM estimates. The three curves in Figure (6-1)a are the survival curves for the three treatment groups for male patients based on the mixture model; Figure (6-1)b presents the three treatment survival curves for female patients based on the mixture model. Figure (6-2) presents the survival curves estimated from the one-piece Gompertz model for male and female patients in the three treatment groups. Survival curves estimated from the three-piece Gompertz model are presented in Figure (6-3). The KM estimates of the survival curves are presented in Figure (6-4).

6.3.2 Cure Proportion Estimates

Estimates of the cure proportions are presented in Table (6-2). We present estimates of the cure rates and their standard deviations from the mixture model, the one-piece Gompertz model, the three-piece Gompertz model, and the KM estimates for all combinations of the patient's gender and the three treatment groups. Treatment 1 represents allo. BMT, Treatment 2 represents auto. BMT, and Treatment 3 represents intensive chemotherapy. Note that a cure proportion exists for each of the combinations. Female patients show a higher cure probability than male patients. Standard errors for the cure proportions estimated from both the three-piece Gompertz model and the mixture model are smaller than those from the one-piece Gompertz model and the KM estimates.
Table 6-2. Estimates and Standard Deviations of Cure Proportions Generated from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Kaplan-Meier Estimates.

                              Cure Proportions (Standard Deviations)
Gender   Treatment   Mixture          1-Piece Gompertz   3-Piece Gompertz   KM
M        1           0.558 (0.059)    0.504 (0.062)      0.568 (0.056)      0.648 (0.076)
M        2           0.386 (0.050)    0.445 (0.060)      0.397 (0.050)      0.387 (0.065)
M        3           0.407 (0.033)    0.409 (0.050)      0.405 (0.033)      0.388 (0.036)
F        1           0.597 (0.056)    0.533 (0.062)      0.596 (0.054)      0.531 (0.071)
F        2           0.424 (0.051)    0.475 (0.061)      0.429 (0.051)      0.421 (0.065)
F        3           0.445 (0.036)    0.440 (0.051)      0.437 (0.035)      0.466 (0.041)



Figure 6-1. Survival curves estimated from the mixture model for three treatments. a) Male; b) Female.

Figure 6-2. Survival curves estimated from the one-piece Gompertz model for three treatments. a) Male; b) Female.

Figure 6-3. Survival curves estimated from the three-piece Gompertz model for three treatments. a) Male; b) Female.

Figure 6-4. Survival curves from Kaplan-Meier estimates for three treatments. a) Male; b) Female.

6.3.3 Goodness-of-Fit

From the above discussion, we observe that the three-piece Gompertz model fits better than the one-piece Gompertz model. It is natural to compare these two models by applying the goodness-of-fit test we derived in this dissertation. The null hypothesis is expressed as

H₀ : λ_{1g} = λ_{2g} = λ_{3g},

for g = 1, 2, and 3. Note that λ_{2g} and λ_{3g} are the parameters for the second and third intervals in the three-piece Gompertz model, respectively. We showed that under H₀,

−2(l₁ − l₂) →_d χ²₆,

where l₁ and l₂ are the maximized log-likelihoods for the one-piece and the three-piece Gompertz models, respectively. From the computation, we obtain l₁ = −1977.289 and l₂ = −1950.979, so the observed test statistic is −2(l₁ − l₂) = 52.62 on 6 degrees of freedom. Therefore, the test is highly significant, which implies that the three-piece model fits significantly better than the one-piece Gompertz model.
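A quick check of this statement, assuming SciPy is available:

```python
from scipy.stats import chi2

l1, l2 = -1977.289, -1950.979   # maximized log-likelihoods (one-piece, three-piece)
lrt = -2.0 * (l1 - l2)          # 52.62
p_value = chi2.sf(lrt, df=6)    # upper tail of the chi-square distribution with 6 df
print(lrt, p_value)             # the p-value is far below 0.001
```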













CHAPTER 7
SUMMARY


In this dissertation, we presented a piecewise Gompertz model to address the long-term survival problem. We constructed a standardized test statistic based on our model which compares cure rates between treatment groups. We derived the strict concavity of the log-likelihood function based on the model. In addition, we derived the asymptotic properties of the MLEs of the parameters in the model. A goodness-of-fit test was also derived to assess an appropriate number of intervals for the model.

In oncology studies, when there are long-term survivors and when there are different hazard patterns on different time intervals, one might consider using our piecewise Gompertz model. Our model is more computationally feasible than both the mixture model and the semiparametric model with a bounded cumulative hazard function. In addition, our model is more flexible due to its piecewise structure. Our model also provides a more reliable estimate of the cure rate than the KM estimate when the sample size is small at the end of the study period.

In the future, we plan to find an optimal partition of the time span by deriving a test to compare different partitions. Sample size determination will also be assessed. Since we notice that there are jumps on the survival curve defined by the piecewise Gompertz model, we might consider using a log-spline model to achieve smoothness of the curve. In addition, we plan to perform more detailed simulation studies on the semiparametric model which assumes a bounded cumulative hazard function and to study the asymptotic behavior of that semiparametric model.











APPENDIX A
DERIVATION OF MARGINAL LIKELIHOOD

Let

I_n = ∫_{t₀=0}^{∞} ∫_{t₁}^{∞} ⋯ ∫_{t_{n−1}}^{∞} ∏_{i=1}^{n} f(t_i) e^{−F(t_i)θ_i} dt_n dt_{n−1} ⋯ dt_1,

and define

I_k = ∫_{t_{n−k}}^{∞} ⋯ ∫_{t_{n−1}}^{∞} ∏_{i=n−k+1}^{n} f(t_i) e^{−F(t_i)θ_i} dt_n dt_{n−1} ⋯ dt_{n−k+1},   k = 1, 2, ..., n.

When k = 1,

I₁ = ∫_{t_{n−1}}^{∞} f(t_n) e^{−F(t_n)θ_n} dt_n = [e^{−θ_n F(t_{n−1})} − e^{−θ_n}] / θ_n,    (A.1)

and when k = 2,

I₂ = ∫_{t_{n−2}}^{∞} ∫_{t_{n−1}}^{∞} ∏_{i=n−1}^{n} f(t_i) e^{−F(t_i)θ_i} dt_n dt_{n−1}
   = ∫_{t_{n−2}}^{∞} f(t_{n−1}) e^{−F(t_{n−1})θ_{n−1}} { ∫_{t_{n−1}}^{∞} f(t_n) e^{−F(t_n)θ_n} dt_n } dt_{n−1}
   = (1/θ_n) ∫_{t_{n−2}}^{∞} f(t_{n−1}) e^{−F(t_{n−1})θ_{n−1}} { e^{−θ_n F(t_{n−1})} − e^{−θ_n} } dt_{n−1}
   = [e^{−(θ_{n−1}+θ_n)F(t_{n−2})} − e^{−(θ_{n−1}+θ_n)}] / [θ_n(θ_{n−1} + θ_n)]
     − e^{−θ_n} [e^{−θ_{n−1}F(t_{n−2})} − e^{−θ_{n−1}}] / (θ_n θ_{n−1}).

By a similar derivation, we obtain that when k = 3,

I₃ = ∫_{t_{n−3}}^{∞} I₂ f(t_{n−2}) e^{−F(t_{n−2})θ_{n−2}} dt_{n−2}
   = [e^{−(θ_n+θ_{n−1}+θ_{n−2})F(t_{n−3})} − e^{−(θ_n+θ_{n−1}+θ_{n−2})}] / [θ_n(θ_n + θ_{n−1})(θ_n + θ_{n−1} + θ_{n−2})]
     − e^{−θ_n} [e^{−(θ_{n−1}+θ_{n−2})F(t_{n−3})} − e^{−(θ_{n−1}+θ_{n−2})}] / [θ_n θ_{n−1}(θ_{n−1} + θ_{n−2})]
     + e^{−(θ_{n−1}+θ_n)} [e^{−θ_{n−2}F(t_{n−3})} − e^{−θ_{n−2}}] / [θ_{n−1} θ_{n−2}(θ_n + θ_{n−1})].

Let

I_k = Σ_{i=1}^{k} (−1)^{i−1} A_{i,k} [e^{−S_{n−k+1,n−i+1} F(t_{n−k})} − e^{−S_{n−k+1,n−i+1}}];    (A.2)
then

I_{k+1} = ∫_{t_{n−k−1}}^{∞} I_k f(t_{n−k}) e^{−F(t_{n−k})θ_{n−k}} dt_{n−k}
        = Σ_{i=1}^{k} (−1)^{i−1} A_{i,k} { [e^{−S_{n−k,n−i+1}F(t_{n−k−1})} − e^{−S_{n−k,n−i+1}}] / S_{n−k,n−i+1}
          − e^{−S_{n−k+1,n−i+1}} [e^{−θ_{n−k}F(t_{n−k−1})} − e^{−θ_{n−k}}] / θ_{n−k} }.    (A.3)

Also, by Equation (A.2) applied with k + 1,

I_{k+1} = Σ_{i=1}^{k+1} (−1)^{i−1} A_{i,k+1} [e^{−S_{n−k,n−i+1}F(t_{n−k−1})} − e^{−S_{n−k,n−i+1}}]
        = Σ_{i=1}^{k} (−1)^{i−1} A_{i,k+1} [e^{−S_{n−k,n−i+1}F(t_{n−k−1})} − e^{−S_{n−k,n−i+1}}]
          + (−1)^{k} A_{k+1,k+1} [e^{−S_{n−k,n−k}F(t_{n−k−1})} − e^{−S_{n−k,n−k}}].    (A.4)

Equating (A.3) and (A.4), we obtain that

A_{i,k+1} = A_{i,k} / S_{n−k,n−i+1}    (A.5)

and

A_{k+1,k+1} = (1/θ_{n−k}) Σ_{i=1}^{k} (−1)^{i−k} A_{i,k} e^{−S_{n−k+1,n−i+1}}.    (A.6)

Therefore, by Equation (A.5),

A_{i,k} = A_{i,i} / (S_{n−i,n−i+1} S_{n−i−1,n−i+1} ⋯ S_{n−k+1,n−i+1}) = A_{i,i} / P_{n−k+1,n−i+1},    (A.7)

for i < k, by the definition of P_{a,b} given in Chapter 1. In addition, by Equation (A.6),

A_{k+1,k+1} = (1/θ_{n−k}) { Σ_{i=1}^{k−1} (−1)^{i−k} A_{i,i} e^{−S_{n−k+1,n−i+1}} / (S_{n−i,n−i+1} S_{n−i−1,n−i+1} ⋯ S_{n−k+1,n−i+1}) + A_{k,k} e^{−S_{n−k+1,n−k+1}} }.    (A.8)
Also we observe that

I₁ = [e^{−S_{n,n}F(t_{n−1})} − e^{−S_{n,n}}] / S_{n,n},

where the expression follows from Equations (A.2) and (A.1). Therefore

A_{1,1} = 1/S_{n,n} = 1/θ_n.

Similarly, we obtain through Equation (A.8) that

A_{2,2} = e^{−S_{n,n}} / (S_{n,n} S_{n−1,n−1})

and

A_{3,3} = e^{−S_{n−1,n}} / (S_{n−1,n−1} S_{n−2,n−2} S_{n−1,n}).

If we let

A_{k,k} = e^{−S_{n−k+2,n}} / (S_{n−k+1,n−k+1} p_{n−k+2,n}),    (A.9)








where p_{n−k+2,n} was defined earlier in Chapter 1 as ∏_{j=n−k+2}^{n} S_{n−k+2,j}, then substituting Equations (A.7) and (A.9) into Equation (A.8), and observing that e^{−S_{n−i+2,n}} e^{−S_{n−k+1,n−i+1}} = e^{−S_{n−k+1,n}} for every i, we obtain

A_{k+1,k+1} = (e^{−S_{n−k+1,n}} / S_{n−k,n−k}) { Σ_{i=1}^{k−1} (−1)^{i−k} / (S_{n−i+1,n−i+1} p_{n−i+2,n} ∏_{j=n−k+1}^{n−i} S_{j,n−i+1}) + 1 / (S_{n−k+1,n−k+1} p_{n−k+2,n}) }.    (A.10)

The alternating sum in braces collapses to a single term; that is,

Σ_{i=1}^{k−1} (−1)^{i−k} / (S_{n−i+1,n−i+1} p_{n−i+2,n} ∏_{j=n−k+1}^{n−i} S_{j,n−i+1}) + 1 / (S_{n−k+1,n−k+1} p_{n−k+2,n}) = 1 / p_{n−k+1,n}.    (A.11)

Hence

A_{k+1,k+1} = e^{−S_{n−k+1,n}} / (S_{n−k,n−k} p_{n−k+1,n}),    (A.12)

by Equation (A.11); this is Equation (A.9) with k replaced by k + 1, so Equation (A.9) holds for all k. Therefore,

A_{i,k} = e^{−S_{n−i+2,n}} / (p_{n−i+2,n} ∏_{j=n−k+1}^{n−i+1} S_{j,n−i+1}),    (A.13)

for i ≤ k. Consequently, I_k, as defined in Equation (A.2), can be written as

I_k = Σ_{i=1}^{k} (−1)^{i−1} ( e^{−S_{n−i+2,n}} / (p_{n−i+2,n} ∏_{j=n−k+1}^{n−i+1} S_{j,n−i+1}) ) [e^{−S_{n−k+1,n−i+1} F(t_{n−k})} − e^{−S_{n−k+1,n−i+1}}].    (A.14)




Full Text

PAGE 1

L PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM > i ' ;T' ^ * By M s HAOYI CHEN I A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2001

PAGE 2

Copyright 2001 by Haoyi Chen

PAGE 3

I dedicate this work to Hua.

PAGE 4

ACKNOWLEDGMENTS I would like to take this opportunity to express my deepest gratitude to Professor Myron N. Chang, my advisor, for his guidance, tremendous help, and encouragement during my doctoral research. This dissertation would not have been possible without his forceful guidance and excellent advisement. I also would like to thank Professor Randolph R. Carter, my graduate research supervisor, for his guidance and support during the various epidemiological research projects with which I have been involved as a graduate research assistant. My deepest appreciation is extended to the other committee members. Dr. Ronald Randles, Dr. Samuel S. Wu, and Dr. Michael B. Resnick, for their advice on my doctoral research. Lastly, I would like to extend my gratitude to my husband Hua and my parents for their never-ending support, understanding, and love. IV

PAGE 5

TABLE OF CONTENTS page ACKNOWLEDGMENTS iv ABSTRACT vii CHAPTERS 1 INTRODUCTION 1 1.1 Problem and Common Practice 1 1.1.1 Mixture Model 2 1.1.2 Binary Regression Model 6 1.1.3 Nonparametric Method 7 1.1.4 Model with Bounded Cumulative Hazard Rate 10 1.2 Computation Difficulty of the Semiparametric Model 17 1.3 Piecewise Exponential Model 20 2 TOPIC OF RESEARCH 25 2.1 Data Structure and Model 26 2.2 Assumptions 28 2.3 Hypothesis of Interest 29 2.4 Test of Hypothesis 29 3 ESTIMATION OF PARAMETERS 31 3.1 Likelihood Function and Information Matrices 31 3.2 Concavity of Likelihood Function 32 4 PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES ... 35 4.1 Introduction 35 4.2 Preliminary Proofs 36 4.3 Existence and Consistency of the MLE 37 4.4 Asymptotic Normality of the MLE 40 4.5 Goodness-of-Fit Test 42 5 SIMULATIONS 47 5.1 Introduction 47 5.2 Generation of Random Samples 47 5.2.1 Parameter Settings 48 V

PAGE 6

5.2.2 Generation of Censored Observations 50 5.2.3 Generation of Failure Observations 56 5.3 Type I Error and Power 57 6 EXAMPLE 76 6.1 Introduction 76 6.2 Methodology 77 6.3 Results 80 6.3.1 Pairwise Comparisons on Cure Proportions 80 6.3.2 Cure Proportion Estimates 82 6.3.3 Goodness-of-Fit 82 7 SUMMARY 88 APPENDICES A DERIVATION OF MARGINAL LIKELIHOOD 89 B PROOF OF LEMMA 4.1 94 C PROOF OF LEMMA 4.2 97 D PROOF OF LEMMA 4.3 100 E PROOF OF LEMMA 4.4 101 F PROOF OF LEMMA 4.5 107 G PROOF OF LEMMA 4.6 109 H PROOF OF LEMMA 4.7 112 I PROOF OF LEMMA 4.8 113 J SIMPSONÂ’S FORMULA 114 REFERENCES 115 BIOGRAPHICAL SKETCH 118 VI

PAGE 7

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy PIECEWISE GOMPERTZ MODEL ON SOLVING CURE RATE PROBLEM By Haoyi Chen December 2001 Chair: Myron N. Chang Major Department: Statistics In cancer research, some of the patients are disease-free after certain treatments. It is interesting to compare treatment efficacy with long-term survival rates. One commonly used approach in analyzing this type of data is to compare the Kaplan-Meier estimates of the cure rate. Another approach is to apply the mixture model proposed by Farewell. However, the Kaplan-Meier estimates are unstable toward the end point, while the mixture model is computationally too complex. To overcome these difficulties, we propose a test based on a piecewise Gompertz model to compare drug efficacy on the cure rate. The proposed test also accommodates the situation where patients display different hazard patterns during different treatment stages. In this work, we have derived the strict concavity of the log-likelihood function and the existence, consistency, and asymptotic normality of the maximum likelihood estimates of the parameters. In addition, our Monte Carlo simulation study shows that the proposed test is more computationally feasible and more powerful than the test based on FarewellÂ’s mixture model. Moreover, an vii

PAGE 8

example is given to show the utility of our proposed test. A goodness-of-fit test of our proposed model is also discussed.

PAGE 9

CHAPTER 1 INTRODUCTION 1.1 Problem and Common Practice In survival analysis, one is interested in comparing the efficiency of various drugs or treatments. The survival probability is generally assumed to converge to zero when the observation time becomes large. However, it is not uncommon in cancer research that a proportion of patients are disease-free (relapse-free) after certain treatments. These patients are called “long-term survivors” or “cured.” For instance, in clinical trials dealing with childhood leukemia, some treatments do cure a proportion of patients permanently. This type of long-term survival pattern is different from what is dealt with under the traditional survival analysis. In addition, it is sometimes necessary to assume that hazard functions are different on different intervals. A piecewise exponential model proposed by Friedman (1981) accommodates this assumption. This model was considered semiparametric by Aitkin et al. (1989, PP. 287-292). They stated that the piecewise exponential model has the same flexibility as the proportional hazards model in which the baseline hazard function is unspecified. In this chapter, we first introduce the common methodologies in the literature that deal with cure rate problems. We also discuss the piecewise exponential model proposed by Friedman which assumes different hazard functions on different time intervals. In the literature, there are several methods dealing with the long-term survival problem. They are as follows: • Using a mixture model which assumes the underlying population is a mixture of two populations, patients who are cured and patients who are not cured. 1

PAGE 10

• Applying a modified binary regression model in which the main interest is to compare the cure rates among patient groups. 2 • Using nonparametric methods to compare the survival curves or to compare cure proportions among treatment arms. • Assuming a bounded cumulative hazard rate instead of a cumulative hazard rate with an infinite limit. In the next four sections, we describe each method in more detail. Assume that there are n patients in the study. For the patient, let Ti denote the time to relapse (incidence), Ui the censoring time, T* = min(rj, Ui), Aj = I[Ti
PAGE 11

3 Therefore, the overall likelihood function is n L = Y[{P{Zi = l)r{t* I Z, = 1)}'^P(Z, = 0) + P{Zi = I Z, = ( 1 . 2 ) Two influential articles addressing this method were written by Farewell in 1977 and 1982. In the first article, he expressed the binary variable Z using a logistic regression model where /3 is a vector of coefficients associated with the vector of covariates x. Farewell (1977) also assumed the time to incidence T follows an exponential distribution with mean 1/A, and A is assumed to be known. Thus, when = 1, the contribution of the patient to the likelihood is g)3^Xi f*{Ti = tf,\\Zi = l)P{Zi = 1) = Ae-^*^ti > 0. 1 _|_ gA» Xj When 5i = 0, the contribution of the patient to the likelihood is P{Zi = 0 ) + P{Zi = l)S*{Ti = ti,\\Zi = 1 ) 1 d/3 Xi + 1 + xi 1 + U > 0 . Type I censoring was assumed in Farewell (1977). The author also accessed the efficiency of the logistic-exponential mixture model proposed in his paper relative to a regular logistic model, which assumes Z = 0 if the patient does not have a relapse until T°. He defined as the end point of the observation time. He found that the efficiency of the mixture model increases as the follow-up time T° increases and as P(Z = 1) decreases. When the true underlying distribution is a mixture model, the estimate of the cure rate from the regular logistic model is biased. Farewell (1982) proposed two more general models for the time to incidence T. He assumed that T follows a Weibull distribution in his first model. The probability density function and the survival function of T given that the patient is

PAGE 12

4 noncured are written as f*{T = t] X,( \ Z = l,x) = exp{— (At)^}, t > 0, and (1.4) S'*(T = t I Z = 1, x) = exp{— (At)'’}, t > 0, (1.5) where C is an unknown parameter, A = exp(— 7 ^x) is a function of covariates, and 7 is a vector of unknown parameters associated with covariates x. The likelihood function is obtained by plugging in P{Z = 1) defined in Equation (1.3), the probability density function defined in Equation (1.4), and the survival function defined in Equation (1.5) to the likelihood function in Equation (1.2). Farewell (1982) proposed the second model for grouped data on survival time. Assume there are k failure time groups {ti, tj+i], i = 1,2, ... ,k, the conditional probability for T = ti given that T > U is ^Oi+O'^x P(T = ti \ Z = 1,T > ti,x) = 5 ;-, (1.6) where a.i (z = 1, 2, . . . , A;) are group-specific parameters. The vector 0 is associated with covariates x and is common to all time groups. By combining Equations (1.2), (1.3), and (1.6), the likelihood function is obtained. Farewell (1982) commented that the mixture model is appealing from both biological and statistical points of view. It has been used frequently in cancer studies, especially in the study of breast cancer. Strong evidence of the existence of individuals coming from different populations, however, is required before using the mixture model. Farewell also pointed out that the likelihood function might be fiat for some data sets, which would cause possible nonidentifiability of the parameters. Farewell (1986) discussed some problems related to the mixture model by applying the model to two data sets. The first data set was from a breast cancer trial with 139 patients. He applied a logisticWeibull mixture model to the data. Then he computed the profile likelihood function for (3q, the intercept parameter

PAGE 13

5 in the logistic portion of the mixture model. The profile likelihood function for Pq was obtained by replacing the other parameters in the model by their maximum likelihood estimates (MLE). He found that the profile likelihood function was in a strongly nonquadratic shape, indicating that significant nonidentifiability existed with respect to the estimation of /3 q. Therefore, this logistic-Weibull mixture model presented an unreliable estimate of the cure proportion. Farewell also applied a multinomial-weibull mixture model to a data set from Risch (1983) on the time to the onset of a major affective illness in siblings of affected individuals. RischÂ’s data were grouped into six time intervals. The onset probabilities were modelled by a multinomial distribution. The mixture model fitted data well. Therefore, Farewell concluded that one should be cautious when using mixture models. The cure proportion might be very imprecise in some cases and strong evidence for the existence of two populations is required to make a reliable inference. A significant amount of literature discussed the mixture model. The common parametric distributions for the time to incidence for the noncured population can take many forms. For example, they can be exponential (Berkson & Gage 1952, Bithell & Upton 1977, Farewell 1977, Goldman 1984); lognormal (Boag 1949); Weibull (Farewell 1982); Gompertz (Gordon 1990); Generalized Gamma (Yamaguchi 1992); and Generalized F-distribution (Peng et al. 1998). The distribution for the time to incidence can also take nonparametric or semiparametric forms, such as KaplanMeier (KM) estimates (Laska & Meisner 1992, Taylor 1995) and the proportional hazards model (Kuk & Chen 1992). The probability P{Z = 1) can be treated as a parameter in a Bernoulli trial. It can also be written as a function of a vector of covariates x in a logistic regression model, such as the model defined in Equation (1.3). To conclude the previous discussion, it is generally plausible to use the mixture model when there exists strong scientific evidence of the mixture of

PAGE 14

different populations. However, as both Farewell (1977, 1982, 1986) and Cantor and Shuster (1992) pointed out, the likelihood function is sometimes nonquadratic or flat, which can cause problems for both computation and inference. In addition, parameter estimates can fall outside the parameter space when the data do not support a nonzero surviving fraction. The asymptotic properties of the MLE from a logistic-parametric mixture model were developed systematically by Maller and Zhou (1996). The asymptotic properties of the parameter estimates in logistic-semiparametric and logistic-nonparametric mixture models have not been explored in the literature.

1.1.2 Binary Regression Model

Jung (1996) proposed an interesting method to estimate the cure rate, which is defined to be the probability that the time to incidence T is greater than a fixed time τ. In his paper, the standard procedure for inference on parameters in a binary regression model is modified to accommodate censored observations. Let φ(π) = β^T x, where φ is a link function that is monotonic and differentiable on [0, 1]. Let π*(·) denote the inverse function of φ, let π̄* = 1 − π*, and let π*'(φ) = dπ*(φ)/dφ. The MLE of β in a standard binary regression model without censoring is obtained by solving

n^{-1/2} Σ_{i=1}^{n} {I(t_i > τ) − π*(β^T x_i)} π*'(β^T x_i) / [π*(β^T x_i) π̄*(β^T x_i)] x_i = 0.

In the case with censoring, denote T_i^* = min(T_i, U_i). Assume independent censoring, i.e., T_i and U_i are independent given x_i for i = 1, 2, ..., n. Also assume that U_1, U_2, ..., U_n are independently, identically distributed (iid) with cumulative distribution function 1 − G. Let Ĝ be the KM estimate of G. Then the MLE β̂_MLE is obtained by solving

n^{-1/2} Σ_{i=1}^{n} {I(t_i^* > τ)/Ĝ(τ) − π*(β^T x_i)} π*'(β^T x_i) / [π*(β^T x_i) π̄*(β^T x_i)] x_i = 0.
Numerically, β̂_MLE can be obtained by applying the Newton-Raphson method. Under mild conditions, it is proved that β̂_MLE is a unique and consistent estimate of β and that n^{1/2}(β̂_MLE − β) follows an asymptotic normal distribution with a zero mean. The problem with this method is that τ is very difficult to define. The result is not convincing if τ is too small; on the other hand, it is not cost effective for τ to be too large. Farewell (1977) compared the efficiency of a logistic-exponential model to a binary logistic regression model that treats any individual with t_i^* > τ as a long-term survivor. He also pointed out that the estimate of the cure rate would be biased under a logistic model when the underlying true distribution is a mixture distribution. The logistic model discussed by Farewell (1977) did not consider censoring. However, he emphasized that a similar conclusion can be reached by expanding the model to include censored individuals. Farewell also noted that it is especially unreasonable for the logistic model to ignore the possible censoring times when T^* exceeds the largest observation time τ.

1.1.3 Nonparametric Method

Nonparametric tests are widely applied both for comparing survival curves in the presence of cure and for comparing cure proportions between treatment arms. When the main interest is to compare the survival curves of several populations in the presence of cure, some of the nonparametric tests are considered standard, such as the log-rank test and the generalized Wilcoxon test. Halpern and Brown (1987) studied the power of these two tests by simulation when the mixture cure rate model of Goldman (1984) is appropriate. They compared the power of the two tests for detecting the difference between two treatment arms for selected values of cure rates, instant failure rates, and follow-up times. Each of the two treatment arms had 50 participating individuals, and 200 replications were used for each simulation setting. The simulation results are somewhat inconclusive. The log-rank test is
more powerful than the Wilcoxon test in general. However, the log-rank test does not show any advantage over the Wilcoxon test in the following situations: 1) cure rates are high and are at similar levels in both arms; 2) there is a large difference in the time to the end point between the two arms; and 3) the follow-up is long relative to the time to failure.

For the comparison of cure proportions, Sposto et al. (1992) studied several tests based on simulated samples: the likelihood ratio, Wald, and score tests based on two different mixture models, and a test based on the difference of the KM estimates of the cure rates. The two mixture models are the logistic-exponential and logistic-Weibull models proposed by Farewell (1977, 1982). The time point in the KM estimate of the cure rate is defined as the last observed failure time. Simulated samples under the assumed mixture models were generated in order to check the results in an ideal case. In addition, samples under some different models were generated in order to study the robustness of the tests to model misspecification. The authors also generated samples based on various levels of cure proportions. The test statistics calculated from the mixture models sometimes cannot be obtained because of nonconvergence during the maximization procedure. The authors pointed out two reasons for the nonconvergence: 1) the maximum is attained on the boundary; and 2) the algorithm terminates abnormally with a jump into a region where the matrix of second derivatives is nonnegative definite. Test statistics with convergence problems are not presented in the article. In conclusion, the authors noted that the KM estimate of the cure rate is in general a reasonable alternative to the parametric estimate. When the underlying model includes unequal failure rates between treatment arms, the proportion test based on KM estimates is more powerful than the test based on the complicated parametric model. However, the KM estimate introduces more bias than the parametric
estimate of the cure rate when the sample is small, when failure rates are equal, and when not all failures are observed.

Gray and Tsiatis (1989) proposed a linear rank test to compare survival curves between two treatment arms over the entire time span, but their test focuses power against late differences in the survival curves. This linear rank test is evaluated under a mixture distribution consisting of a proportion representing the cure and a failure distribution of unspecified form; the failure distributions are assumed to be the same for both treatment arms. They specified a weight function in the two-sample nonparametric censored-data test proposed by Tarone and Ware (1977) in order to obtain their own test. Note that the log-rank test can also be derived from Tarone and Ware's test with a different weight function. The difference between Gray and Tsiatis' test and the log-rank test is that the former puts more weight toward late differences, whereas the latter puts equal weight throughout the entire observation period. Gray and Tsiatis (1989) calculated relative efficiencies under equal censoring for their test, the log-rank test, and the proportion test based on KM estimates of the cure rates. When there is no censoring, their test gains efficiency over the log-rank test for a small cure rate; the efficiency gain is smaller for a cure rate greater than 50%. These results can be explained by observing that the weight functions of their test and of the log-rank test become closer as the cure rate increases. In addition, under the noncensoring assumption, their test is asymptotically equivalent to the proportion test. When there is censoring, their test becomes increasingly more powerful than the proportion test as the censoring proportion increases. Note, however, that their test is not a pure test of the equality of cure rates: it has power against alternatives with unequal cure rates as well as against alternatives with equal cure rates and unequal survival curves.
In this section, two types of statistical tests were discussed: 1) tests comparing survival curves under the assumption that cure rates exist; and 2) tests comparing cure rates. Some of the nonparametric tests, such as the log-rank test, are quite standard in oncology for comparing survival curves. When the problem narrows down to the comparison of cure rates, the proportion test based on KM estimates is quite popular because of the computational difficulties of tests based on the mixture model structure. However, there are problems associated with KM estimates of cure rates, as pointed out by Cantor and Shuster (1992). The standard error of the KM estimate increases toward the right-hand extreme because of the small sample size at the end of the study. The KM estimate can also produce counterintuitive results; for example, the plateau of the time-to-relapse curve can be higher than the plateau of the time-to-death curve because of large downward steps at the end of the study. Therefore, one should be cautious about relying on the KM estimate of the cure rate. As an alternative, Cantor and Shuster proposed a parametric model with a bounded cumulative hazard rate to accommodate the cure proportion.

1.1.4 Model with Bounded Cumulative Hazard Rate

Parametric Model

Cantor and Shuster (1992) proposed a Gompertz model to incorporate long-term survivors. The survival function is written as

S(t) = exp{β^{-1} α (1 − e^{βt})},  α > 0,   (1.7)

where α is the initial hazard. When β > 0, the survival function S(t) → 0 as t → ∞; when β < 0, the survival function S(t) → exp(α/β) as t → ∞. The cure rate is therefore exp(α/β). The authors then maximized the likelihood function based on the Gompertz model using the Newton-Raphson method.
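As a small numerical illustration of Equation (1.7), the sketch below evaluates the Gompertz survival function and the implied cure fraction. It is written in Python rather than in the Ox language used for the simulations later in this dissertation, and the function names are ours, chosen purely for illustration.

import math

def gompertz_survival(t, alpha, beta):
    # S(t) = exp{beta^{-1} * alpha * (1 - exp(beta * t))}, alpha > 0
    return math.exp(alpha / beta * (1.0 - math.exp(beta * t)))

def gompertz_cure_rate(alpha, beta):
    # limit of S(t) as t -> infinity; a positive cure fraction requires beta < 0
    if beta >= 0:
        return 0.0          # survival function tends to 0 when beta >= 0
    return math.exp(alpha / beta)

# example: alpha = 0.3, beta = -0.5 gives a cure rate of exp(-0.6), about 0.549
print(gompertz_survival(2.0, 0.3, -0.5), gompertz_cure_rate(0.3, -0.5))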

Both Var(α̂) and Var(β̂) are obtained from the observed information matrix, and the variance of the estimated cure rate exp(α̂/β̂) is obtained using the Delta method. In the article, they also compared the Gompertz model with the mixture model. When there is a cure proportion, both models produce similar curves and similar cure rate estimates. However, the Gompertz model is preferred because of the divergence problems associated with the mixture model. The authors also noted the instability of the KM estimate toward the end of follow-up. Some other nice properties of the proposed Gompertz model are also provided in the article: the cure probability given that the patient has survived through time t_0 is exp{(α/β) e^{βt_0}}; the log-likelihood function is strictly concave, so one can easily estimate the parameters and the cure rate; and finally, any power S(t)^c of a Gompertz survival function S(t) is again a Gompertz survival function.

Gieser et al. (1998) incorporated covariate information into the model proposed by Cantor and Shuster (1992). The treatment-specific survival function for an individual is written as

S(t; i, x) = {S_i(t)}^{exp(γ^T x)} = exp{e^{γ^T x + α_i^*} β_i^{-1} (1 − e^{β_i t})},   (1.8)

where α_i^* is defined as log α_i for α_i in Equation (1.7), i is the treatment index, x is a vector of covariates, and γ is a vector of parameters associated with x. Note that S(t; i, x) in Equation (1.8) is also a Gompertz survival function, as noted by Cantor and Shuster (1992). The cure rate in this case is

π(i, x) = S(∞; i, x) = exp{β_i^{-1} e^{γ^T x + α_i^*}}.

The log-likelihood function is concave with respect to the parameters, and the MLEs of the parameters are efficient and asymptotically normal when the assumed model is correct and the sample size is large. The authors compared the treatment effects on cure rates for any given covariate vector x. The ratio of the
logs of the cure rates for two different treatment arms is expressed as

ψ_{i,i'} = log π(i, x) / log π(i', x) = (β_{i'} / β_i) exp(α_i^* − α_{i'}^*),

which does not depend on x. The null hypothesis of equality of the cure rates is written as H_0 : log ψ_{i,i'} = 0, and the test statistic is expressed as

Z = [α̂_i^* + log(−β̂_{i'}) − log(−β̂_i) − α̂_{i'}^*] / SE,

where the standard error in the denominator is obtained by the Delta method. Gieser et al. (1998) applied their model to 763 patients from a multicenter prospective study performed in 1981 by the Pediatric Oncology Group. This study analyzed standard-risk acute lymphocytic leukemia in children. They compared the effects of two treatments on the cure rate, with children's age and sex considered as covariates. The Newton-Raphson algorithm was used to obtain the MLEs of the parameters; the algorithm converged after nine iterations. Based on this study, they compared the estimates and the variances of the cure rates based on the Gompertz model, the logistic-exponential mixture model, and the KM method. They found that the Gompertz model produces an estimate of the cure rate similar to the KM estimate but with much smaller variance. They also provided a generalized model in which the treatment indicator is included as one of the covariates. One important advantage of this generalized model is the log-linear structure of the parameters, since the log-linear structure implies the concavity of the log-likelihood function and the existence, uniqueness, and asymptotic normality of the parameter estimates.
Semiparametric Model

Chen et al. (1999) proposed a semiparametric model for survival data with a cure fraction. They assumed two sets of variables: a) the number of carcinogenic cells N* in the body of a patient; and b) the incubation time W_i for the i-th carcinogenic cell (i = 1, 2, ..., N*). The value N* = 0 implies that there is no such cell in the patient's body. Assume that N* follows a Poisson distribution with mean θ and that the W_i, i = 1, 2, ..., N*, are iid with distribution function F(w). The W_i and N* are independent. The time to incidence is then T = min(W_i, 0 ≤ i ≤ N*), where P(W_0 = ∞) = 1. The survival function for T is

S(t) = P(no incidence by time t)
     = P(N* = 0) + P(T > t, N* > 0)
     = P(N* = 0) + P(W_1 > t, W_2 > t, ..., W_{N*} > t, N* > 0)
     = e^{−θ} + Σ_{k=1}^{∞} [1 − F(t)]^k (θ^k / k!) e^{−θ}
     = e^{−θ F(t)}.   (1.9)

Note that S(t) is an improper survival function because S(t) → e^{−θ} > 0 as t → ∞. The surviving fraction e^{−θ} is indeed the probability that there exists no carcinogenic cell in a patient's body. The corresponding (improper) density and hazard functions are

θ f(t) exp{−θ F(t)}  and  λ(t) = θ f(t),

where f(t) is the density corresponding to F(t). Let S*(t) denote the survival function for the noncured proportion of patients; then

S*(t) = P(T > t | N* ≥ 1) = [e^{−θ F(t)} − e^{−θ}] / (1 − e^{−θ}).   (1.10)
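A minimal Python sketch of Equations (1.9) and (1.10) is given below, again with illustrative names of our own; taking F to be an exponential distribution function in the example is an assumption made only for the illustration, not part of Chen et al.'s model.

import math

def improper_survival(t, theta, F):
    # S(t) = exp(-theta * F(t)); tends to exp(-theta) as t -> infinity
    return math.exp(-theta * F(t))

def noncured_survival(t, theta, F):
    # S*(t) = (exp(-theta*F(t)) - exp(-theta)) / (1 - exp(-theta)), Equation (1.10)
    return (math.exp(-theta * F(t)) - math.exp(-theta)) / (1.0 - math.exp(-theta))

# illustration only: F is an exponential cdf with rate 1 (our assumption);
# the cure fraction is exp(-theta)
F = lambda t: 1.0 - math.exp(-t)
theta = 1.2
print(improper_survival(5.0, theta, F), math.exp(-theta), noncured_survival(5.0, theta, F))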

The corresponding probability density function is

f*(t) = −dS*(t)/dt = θ f(t) e^{−θ F(t)} / (1 − e^{−θ}),

and the hazard function is

λ*(t) = f*(t) / S*(t).

The likelihood function is derived as

L = Π_{j=1}^{n} [θ f(t_j^*)]^{δ_j} exp{−θ F(t_j^*)}.   (1.11)

The model is semiparametric in the sense that the form of F(t) is unknown. It has a strong biological meaning, gathering contributions from two characteristics of tumor growth: the initial number of carcinogenic cells and the rate of their growth. Even without any biological meaning, the model still fits any data with a surviving fraction, since the data can always be treated as being generated by an unknown number N* of latent competing risks. In this model, S(t) and its associated density and hazard functions are improper, while S*(t), f*(t), and λ*(t) are proper survival, density, and hazard functions, respectively. One advantage of this model over the mixture model is that the overall hazard function λ(t) has a proportional hazards structure if the covariates enter the model through θ; in the mixture model, by contrast, the overall hazard function does not have a proportional hazards structure if the covariates enter the model through the cure rate π. The hazard function λ*(t) for the failure proportion is magnified by the factor e^{−θF(t)} / (e^{−θF(t)} − e^{−θ}) compared with λ(t); as θ goes to infinity, λ*(t) converges to λ(t). In addition, the authors pointed out the close relationship between the model presented in Equation (1.9)
and the mixture model presented in Equation (1.1), by observing that S(t) defined in Equation (1.9) can also be written as

S(t) = e^{−θ} + (1 − e^{−θ}) S*(t).

If we let P(Z_i = 0) in Equation (1.1) take the value e^{−θ} and let S*(t | Z_i = 1) take the form of S*(t) defined in Equation (1.10), then S(t) has the same form as in Equation (1.1). Conversely, the model in Equation (1.1) also corresponds to a model of the form in Equation (1.9) for some θ and F(t). Note that F(t) is assumed to have a Weibull form for convenience in the article, and the parameter θ is a function of covariates of the form θ = exp(β^T x).

Tsodikov (1998) proposed the same semiparametric model from a different angle. As above, let θ be a function of covariates, denoted by θ(x, β). Assume that n is the number of failure observations and that θ_i^f is the parameter of interest for the i-th failure observation. Let m_i be the number of censored observations in the time interval (t_i, t_{i+1}), where t_i is the failure time associated with the i-th failure observation, and let u_{i1}, ..., u_{im_i} be the censoring times for the observations censored within the interval (t_i, t_{i+1}). Let θ_{ij}^c (i = 1, 2, ..., n, j = 1, 2, ..., m_i) be the parameters of interest for the censored observations. Also let θ_{i·} be the sum of the θs for all individuals at risk on the i-th interval, and define the corresponding partial sums and products of these quantities over intervals a through b for 0 ≤ a ≤ b ≤ n, with the convention that an empty sum equals 0 and an empty product equals 1 when a > b. The form of F(t) is completely unspecified.
In order to draw inference on the cure proportion in the presence of an unknown form of F(t), the author applied the partial likelihood, the marginal likelihood, and the profile likelihood function of θ. Because of the proportional hazards structure, the partial likelihood is valid and takes the usual form of a product, over the observed failure times, of the θ of the failing individual divided by the sum of the θs of the individuals at risk; it provides consistent estimates. However, it can only be used to estimate the ratio of θ to θ_b, the baseline value. It is therefore inconclusive in the estimation of θ_b, and when there is no covariate, i.e., when all patients are homogeneous, the cure rate cannot be estimated. Both the marginal and the profile likelihood are derived from the full likelihood function, which is written as

L = Π_{i=1}^{n} { θ_i^f f(t_i) e^{−θ_i^f F(t_i)} Q(t_i) Π_{j=1}^{m_i} e^{−θ_{ij}^c F(u_{ij})} ΔQ(u_{ij}) },   (1.12)

where Q(t) is the survival function for the censoring time. A Type I censoring mechanism with a fixed end point t_c is assumed throughout the article, so Q is a single-step function such that Q(t_i) = 1 for i = 1, 2, ..., n, ΔQ(t_c) = 1, and u_{nj} = t_c for j = 1, 2, ..., m_n. The marginal likelihood function is then obtained from Equation (1.12). The profile likelihood function is derived from Equation (1.12) by maximizing it with respect to F, using the nonparametric maximum likelihood estimator (NPMLE) of F.

Both Chen et al. (1999) and Tsodikov (1998) proposed the same semiparametric model while using different estimation techniques. The former assumed independent and noninformative censoring as well as a Weibull distribution for F(t); therefore, their model is a parametric model, although F(t) can take
a nonparametric form. The latter assumed a Type I censoring mechanism, which is restrictive. Both the parametric and the semiparametric models with a bounded cumulative hazard rate are appealing because of their simplicity and intuitive meaning. However, the more flexible semiparametric model has limitations: it is computationally difficult, and its asymptotic properties have not been explored in the literature. In the next section, we reexamine the semiparametric model proposed by Chen et al. (1999) and Tsodikov (1998). In this examination, we make no parametric assumption for F(t), and we do not limit the censoring to a Type I mechanism. Using simulated data, we show the computational barrier raised by this model.

1.2 Computation Difficulty of the Semiparametric Model

We assume random and noninformative censoring. The notation of Tsodikov (1998) is used in this section, except that some changes are made in the definitions of P_{a,b} and the related product P̃_{a,b}, which we define as

P_{a,b} = s_{a,b} s_{a+1,b} ⋯ s_{b,b},   P̃_{a,b} = P_{a,b} P_{a,b−1} ⋯ P_{a,a}.

The likelihood function takes the form

L = Π_{i=1}^{n} [θ_i f(t_i)]^{δ_i} exp{−θ_i F(t_i)},   (1.13)

where f(t_i) is the density function corresponding to F(t_i). Auxiliary quantities f_i^* and J_i^*, which are functions of the P_{a,b}, are then defined for use in the marginal likelihood below,
and L_m denotes the marginal likelihood function, whose form is derived in Appendix A. Without loss of generality, assume that there is only one covariate x in the study, such as the treatment indicator, and that θ is a function of the covariate x expressed in the exponential form exp(α + βx). The first and second derivatives of the s_{i,j} with respect to θ are denoted by s'_{i,j} and s''_{i,j} and are given in Equation (1.14). The score functions for α and for β are obtained by differentiating log(L_m), and the Hessian matrix is

( ∂² log(L_m)/∂α²     ∂² log(L_m)/∂α∂β )
( ∂² log(L_m)/∂α∂β    ∂² log(L_m)/∂β²  ),
whose entries are expressed in terms of the s'_{i,j}, s''_{i,j}, and J_i^*, where J_i^{*'} denotes the first derivative of J_i^* with respect to β. Suppose that the covariate x is the treatment indicator, i.e., x = 1 if the patient is in the treatment group and x = 0 if the patient is in the control group. We are interested in testing the hypothesis

H_0 : β = 0.   (1.15)

The score test (Rao, 1973) is used to test the hypothesis in Equation (1.15). The score statistic for β under H_0 is denoted by U(β)|_{H_0}, and the asymptotic variance of the score statistic is −∂² log(L_m)/∂β² evaluated under the null hypothesis. If α is assumed known, the standardized test statistic is

U(β)|_{H_0} / √( −∂² log(L_m)/∂β² |_{H_0} ).   (1.16)

In order to explore the computational feasibility, we first generate 10000 samples under the assumption that F is an exponential distribution with a mean of 1, α = 1.5, and β = 0; the survival function is then exp{−θ F(t)} = exp{−e^{α+βx}(1 − e^{−t})}. Each random sample contains 30 observations in the control group and 30 observations in the treatment group, and the censoring proportion is around 50%. The standardized score statistic for β is calculated for each random sample generated. The plot of the 10000 standardized score statistics is presented in Figure (1-1). In this bar plot, we observe that the score statistic behaves well and is close to a normal shape given that α is known.
The problem becomes difficult to handle, however, when α is unknown. It is common to use a profile likelihood method to handle the nuisance parameter α when the primary parameter of interest is β. The profile marginal likelihood function for β is obtained from the marginal likelihood function by replacing α by its MLE under H_0. The test statistic based on the profile marginal likelihood function of β is similar to the one defined in Equation (1.16), except that some adjustment to the denominator of Equation (1.16) is required. Therefore, the primary task is to solve for the MLE of α under H_0. Using the same parameter setting as for Figure (1-1), we generate a sample with 30 patients in the treatment group and 30 in the control group. Figure (1-2) shows the log of the marginal likelihood as a function of α on the interval [−2, 6] when β is fixed at 0. The marginal likelihood for α under H_0 is poorly behaved. In order to investigate the behavior of the marginal likelihood function in α and β further, Figure (1-3) shows a three-dimensional plot of the log marginal likelihood, based on the same random sample, as a function of α on the interval [−2, 2] and β on the interval [−3, 3]. The poorly behaved log marginal likelihood as a function of α and β explains the behavior of the log marginal likelihood function in Figure (1-2).

1.3 Piecewise Exponential Model

As mentioned earlier in the chapter, it is sometimes necessary to assume different hazard functions for the time to failure on different time intervals. Friedman (1981) proposed a piecewise exponential model to handle this problem. Let the time scale be divided into I(n) intervals (0, T_1], (T_1, T_2], ..., (T_{I(n)−1}, T_{I(n)}]. The hazard rate of each individual is assumed to be constant over any given interval. Denote the hazard rate of the i-th individual in the j-th interval by λ_{ij}, and assume λ_{ij} > 0 for all (i, j). Let l_{ij} = log λ_{ij}, and let t_{ij} be the exposure time of the i-th
individual in the j-th interval, defined as

t_{ij} = max{0, min(t_i, T_j) − T_{j−1}}.   (1.17)

Let δ_{ij} = 1 when the i-th individual fails in [T_{j−1}, T_j), and δ_{ij} = 0 otherwise. Then the log-likelihood function is

log L(l) = Σ_{ij} δ_{ij} l_{ij} − Σ_{ij} t_{ij} exp(l_{ij}).   (1.18)

One of the forms l_{ij} can take is l_{ij} = α_j + β^T x_i. The author proved the existence and uniqueness of the MLE of the vector l. The consistency and the asymptotic properties of the MLE were also explored. Karrison (1987) also considered a piecewise exponential model in his study.
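Equation (1.18) can be evaluated directly once the exposure times t_ij and failure indicators δ_ij are formed. The following Python sketch does this under the setup above; the function and argument names are ours, and the last cut point may be taken as infinity.

import numpy as np

def piecewise_exp_loglik(l, times, delta, cuts):
    # log-likelihood of the piecewise exponential model, Equation (1.18)
    # l     : array (n, J) of log hazards l_ij = log(lambda_ij)
    # times : array (n,) of observed times t_i
    # delta : array (n,) of overall failure indicators
    # cuts  : array (J+1,) of cut points T_0 = 0 < T_1 < ... < T_J (T_J may be np.inf)
    n, J = l.shape
    loglik = 0.0
    for i in range(n):
        for j in range(J):
            # exposure time of individual i in interval j, Equation (1.17)
            t_ij = max(0.0, min(times[i], cuts[j + 1]) - cuts[j])
            # interval-specific failure indicator
            d_ij = 1.0 if (delta[i] == 1 and cuts[j] <= times[i] < cuts[j + 1]) else 0.0
            loglik += d_ij * l[i, j] - t_ij * np.exp(l[i, j])
    return loglik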

Figure 1-1. Standardized score under H_0 (β = 0) for 10000 random samples.

Figure 1-2. Log marginal likelihood function of α when β = 0.

Figure 1-3. Log marginal likelihood function of α and β.
CHAPTER 2
TOPIC OF RESEARCH

The model we propose in this dissertation is a piecewise Gompertz model. It incorporates the cure proportion by assuming a bounded cumulative hazard rate. Moreover, our model allows the survival times on different time intervals to have different hazard functions. As noted by Aitkin et al. (1989), a piecewise exponential model has the same flexibility as a proportional hazards model when each interval in the piecewise model is defined by one failure observation. The choice of a Gompertz model over an exponential model is made because of the possible plateau of the survival curve under a Gompertz model. We speculate that under a piecewise Gompertz model, the more intervals we define, the more flexible the model is; eventually, when there is only one failure observation on each interval, the flexibility of this model would be the same as that of the semiparametric model discussed in Chapter 1. Therefore, our model preserves the flexibility of a semiparametric model to some extent, with the degree of flexibility depending on the number of intervals we define. Meanwhile, the computation for our model is much easier to handle than for a semiparametric model because of our model's parametric structure. However, when the number of intervals in our model increases, the number of individuals on each interval decreases, and we would anticipate a large variance for the estimate of the cure rate, similar to the situation discussed earlier for the KM estimate of the cure rate. An appropriate number of intervals therefore needs to be assessed in order to reach maximum flexibility and computational feasibility of the model. In this chapter, we describe our model in detail. In Chapters 3 and 4, asymptotic
properties of the MLEs of the parameters in the model are proved. In addition, a goodness-of-fit test is derived which can be applied to determine the appropriate number of time intervals in the model. In Chapter 5, a Monte Carlo simulation study is carried out to show the advantages of our model. Chapter 6 gives a real example and demonstrates the application of our model to it.

2.1 Data Structure and Model

Assume there are n patients in the study. Consider the j-th patient, who is in treatment group g and is associated with a vector of covariates x_j, where x_j is p-dimensional. Let T_j and U_j be the survival time and the censoring time of this patient, respectively. The observed data on the j-th patient are (t_j^*, δ_j, x_j), where t_j^* is the observation on T_j^* = min(T_j, U_j) and δ_j is the observation on Δ_j = I(T_j ≤ U_j). Assume there are G treatment groups and that the interval [0, ∞) is divided into I intervals [τ_{i−1}, τ_i), i = 1, 2, ..., I, where τ_0 = 0 and τ_I = ∞. Let θ^T = (β^T, λ^T), where β is p-dimensional and λ = (λ_1^T, λ_2^T, ..., λ_G^T)^T with λ_g = (λ_{1g}, λ_{2g}, ..., λ_{Ig})^T. Clearly, θ is a (p + IG)-dimensional vector. It is assumed that the hazard function for the j-th patient in treatment group g has the form

h_j(t; θ) = exp{β^T x_j + λ_{ig} t},   t ∈ [τ_{i−1}, τ_i).   (2.1)

Define an indicator function κ_g(j) by κ_g(j) = 1 if the j-th patient is in group g, and κ_g(j) = 0 otherwise.
Let K(j) be the vector of the κ_g(j) for g = 1, 2, ..., G, i.e., K(j) = (κ_1(j), κ_2(j), ..., κ_G(j))^T. Let us also define, for i = 1, 2, ..., I,

J_i(t) = 1 if t ∈ [τ_{i−1}, τ_i), and J_i(t) = 0 otherwise,

and let J(t) = (J_1(t), J_2(t), ..., J_I(t))^T. Let

Q(t, j) = K(j) ⊗ J(t) = (κ_1(j) J(t)^T, κ_2(j) J(t)^T, ..., κ_G(j) J(t)^T)^T.

Note that Q(t, j) is a GI-dimensional vector. Writing h(t; λ, K(j), J(t)) = exp{λ^T Q(t, j) t}, the hazard function in Equation (2.1) can also be written as

e^{β^T x_j} h(t; λ, K(j), J(t)) = exp{β^T x_j + λ_{ig} t}   if κ_g(j) = 1 and J_i(t) = 1.

The cumulative hazard function H_j(t; θ), the survival function S_j(t; θ), and the density function f_j(t; θ) can be derived as follows:

H_j(t; θ) = e^{β^T x_j} ∫_0^t h(u; λ, K(j), J(u)) du,
S_j(t; θ) = exp{−H_j(t; θ)},
f_j(t; θ) = e^{β^T x_j} h(t; λ, K(j), J(t)) S_j(t; θ).

The likelihood function contributed by (t_j^*, δ_j, x_j) is expressed as

L_{nj} = [f_j(t_j^*; θ)]^{δ_j} [S_j(t_j^*; θ)]^{1−δ_j}.
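The following Python sketch evaluates the hazard, cumulative hazard, and survival function of the piecewise Gompertz model described above. The 0-based interval and group indices and the function names are our own conventions, chosen only for illustration.

import numpy as np

def hazard(t, x, beta, lam, g, taus):
    # Equation (2.1): exp(beta'x + lambda_{ig} * t) for t in [tau_{i-1}, tau_i)
    # lam  : array (I, G) of the lambda_{ig}; g is a 0-based group index
    # taus : array (I+1,) with taus[0] = 0 and taus[-1] = np.inf
    i = np.searchsorted(taus, t, side='right') - 1   # interval containing t
    return np.exp(np.dot(beta, x) + lam[i, g] * t)

def cumulative_hazard(t, x, beta, lam, g, taus):
    # H_j(t) = exp(beta'x) * sum over intervals of (e^{lam*b} - e^{lam*a}) / lam
    H = 0.0
    for i in range(len(taus) - 1):
        a, b = taus[i], min(t, taus[i + 1])
        if b <= a:
            break
        H += (np.exp(lam[i, g] * b) - np.exp(lam[i, g] * a)) / lam[i, g]
    return np.exp(np.dot(beta, x)) * H

def survival(t, x, beta, lam, g, taus):
    return np.exp(-cumulative_hazard(t, x, beta, lam, g, taus))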

The corresponding log-likelihood function is written as

l_{nj} = log L_{nj} = δ_j [log h(t_j^*; λ, K(j), J(t_j^*)) + β^T x_j] − e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) du.   (2.2)

Consequently, the overall likelihood function is defined as

L_n = Π_{j=1}^{n} L_{nj},

and the corresponding log-likelihood function is

l_n = Σ_{j=1}^{n} l_{nj}.

2.2 Assumptions

Our derivation of the asymptotic properties is based on the following assumptions.

Assumption 1. There is a B > 0 such that ||x_j|| ≤ B for j = 1, 2, ..., n.

Assumption 2. There exist ε > 0, δ > 0, and an integer n_0 such that when n > n_0, there exists a subset A of the n individuals with n_A/n > δ, where n_A is the size of the subset A, such that for any unit vector a, |a^T x_j| > ε for j ∈ A.

Assumption 3. Let n_g be the number of individuals in treatment group g, where g = 1, 2, ..., G. There exists a δ > 0 such that min_{1≤g≤G} n_g/n > δ when n is sufficiently large.

Assumption 4. The censoring variables U_1, U_2, ..., U_n are iid with survival function G_U(u). Assume that there exists an ε > 0 such that G_U(τ_{I−1}) − G_U(∞) > ε.
The first assumption states that all covariates are uniformly bounded. This is reasonable in practice because the majority of the covariates in a cancer study are patients' characteristics, which are uniformly bounded; note that we are primarily interested in time-independent covariates. The second assumption requires that the covariates of at least a subset A of patients be dispersed to some extent. Assumption 3 means that the number of observations in each treatment group must increase proportionally to the sample size. Assumption 4 requires that not all censoring occurs very early on the time span.

2.3 Hypothesis of Interest

We are interested in comparing cure rates among treatment groups. Let π_g(x) be the cure rate of patients who are under treatment group g and who are associated with covariate vector x. Considering the case with two treatment groups, the null hypothesis is

H_0 : π_1(x) − π_2(x) = 0   (2.3)

for any covariate value x. The test statistic is derived in the following section.

2.4 Test of Hypothesis

The cure rate corresponding to patients in treatment group g who are associated with a specific setting of covariates x is

π_g(x) = S(∞; θ) = exp{ −e^{β^T x} ∫_0^∞ h(u; λ, K(j), J(u)) du }
       = exp{ e^{β^T x} [ Σ_{i=1}^{I−1} (e^{λ_{ig} τ_{i−1}} − e^{λ_{ig} τ_i}) / λ_{ig} + e^{λ_{Ig} τ_{I−1}} / λ_{Ig} ] },   (2.4)

provided λ_{Ig} < 0 so that the cumulative hazard is bounded. The hypothesis defined in Equation (2.3) is then equivalent to

H_0 : { Σ_{i=1}^{I−1} (e^{λ_{i1} τ_{i−1}} − e^{λ_{i1} τ_i}) / λ_{i1} + e^{λ_{I1} τ_{I−1}} / λ_{I1} } − { Σ_{i=1}^{I−1} (e^{λ_{i2} τ_{i−1}} − e^{λ_{i2} τ_i}) / λ_{i2} + e^{λ_{I2} τ_{I−1}} / λ_{I2} } = 0.
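A small sketch of the cure rate in Equation (2.4) follows, using the same illustrative conventions as the previous sketch (0-based indices and function names of our own); it requires lam[I-1, g] < 0 so that the cumulative hazard is bounded.

import numpy as np

def cure_rate(x, beta, lam, g, taus):
    # pi_g(x) in Equation (2.4); taus[0] = 0 and taus[-1] is treated as infinity
    I = lam.shape[0]
    total = 0.0
    for i in range(I - 1):
        total += (np.exp(lam[i, g] * taus[i]) - np.exp(lam[i, g] * taus[i + 1])) / lam[i, g]
    total += np.exp(lam[I - 1, g] * taus[I - 1]) / lam[I - 1, g]
    return np.exp(np.exp(np.dot(beta, x)) * total)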

Let the MLE of θ = (β^T, λ^T)^T be denoted by θ̂ = (β̂^T, λ̂^T)^T. The test statistic is

T = { Σ_{i=1}^{I−1} (e^{λ̂_{i1} τ_{i−1}} − e^{λ̂_{i1} τ_i}) / λ̂_{i1} + e^{λ̂_{I1} τ_{I−1}} / λ̂_{I1} } − { Σ_{i=1}^{I−1} (e^{λ̂_{i2} τ_{i−1}} − e^{λ̂_{i2} τ_i}) / λ̂_{i2} + e^{λ̂_{I2} τ_{I−1}} / λ̂_{I2} }.   (2.5)

The variance of this test statistic can be derived by the Delta method. The test statistic in Equation (2.5) is a function of the estimates of λ. We will show in later chapters that θ̂ follows an asymptotic normal distribution with mean θ and variance-covariance matrix Σ. The variance estimate of the test statistic is

V̂(T) = (∂T/∂λ |_{λ=λ̂})^T Σ̂* (∂T/∂λ |_{λ=λ̂}),

where ∂T/∂λ |_{λ=λ̂} is the vector of derivatives of the test statistic with respect to λ evaluated at the MLE of λ, Σ* is the variance-covariance matrix of λ̂, and Σ̂* is the estimate of Σ*. The elements of ∂T/∂λ are, for i = 1, 2, ..., I − 1,

∂T/∂λ_{i1} = [ (τ_{i−1} e^{λ_{i1} τ_{i−1}} − τ_i e^{λ_{i1} τ_i}) λ_{i1} − (e^{λ_{i1} τ_{i−1}} − e^{λ_{i1} τ_i}) ] / λ_{i1}²,
∂T/∂λ_{i2} = −[ (τ_{i−1} e^{λ_{i2} τ_{i−1}} − τ_i e^{λ_{i2} τ_i}) λ_{i2} − (e^{λ_{i2} τ_{i−1}} − e^{λ_{i2} τ_i}) ] / λ_{i2}²,

and

∂T/∂λ_{I1} = (λ_{I1} τ_{I−1} − 1) e^{λ_{I1} τ_{I−1}} / λ_{I1}²,
∂T/∂λ_{I2} = −(λ_{I2} τ_{I−1} − 1) e^{λ_{I2} τ_{I−1}} / λ_{I2}².

The normalized test statistic is

T / √V̂(T);   (2.6)

it follows an asymptotic standard normal distribution.
CHAPTER 3
ESTIMATION OF PARAMETERS

3.1 Likelihood Function and Information Matrices

Based on the log-likelihood function defined in Equation (2.2), the score statistic S_{nj}(θ) for the j-th individual can be expressed as

S_{nj}(θ) = ( S_{nj}^{(β)}(θ)^T, S_{nj}^{(λ)}(θ)^T )^T,   (3.1)

where

S_{nj}^{(β)}(θ) = x_j [ δ_j − e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) du ]   (3.2)

and

S_{nj}^{(λ)}(θ) = δ_j t_j^* Q(t_j^*, j) − e^{β^T x_j} ∫_0^{t_j^*} Q(u, j) u h(u; λ, K(j), J(u)) du.   (3.3)

Note that S_{nj}^{(β)}(θ), S_{nj}^{(λ)}(θ), and S_{nj}(θ) are p-, IG-, and (p + IG)-dimensional vectors, respectively. Furthermore, the matrices formed by the second-order derivatives of l_{nj} can be derived as

−∂²l_{nj}/∂β∂β^T = x_j x_j^T e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) du,
−∂²l_{nj}/∂β∂λ^T = x_j e^{β^T x_j} ∫_0^{t_j^*} Q^T(u, j) u h(u; λ, K(j), J(u)) du,
−∂²l_{nj}/∂λ∂β^T = e^{β^T x_j} ∫_0^{t_j^*} Q(u, j) x_j^T u h(u; λ, K(j), J(u)) du,
−∂²l_{nj}/∂λ∂λ^T = e^{β^T x_j} ∫_0^{t_j^*} Q(u, j) Q^T(u, j) u² h(u; λ, K(j), J(u)) du.
The observed Fisher information matrix for the j-th individual is then written as

F_{nj}(θ) = e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) ( x_j^T, u Q^T(u, j) )^T ( x_j^T, u Q^T(u, j) ) du,   (3.4)

which is a (p + IG) × (p + IG) matrix. The Fisher information matrix is the expectation of F_{nj}(θ),

D_{nj} = E{ F_{nj}(θ) }.   (3.5)

3.2 Concavity of Likelihood Function

The log-likelihood function for the n individuals is written as

l_n(θ) = Σ_{j=1}^{n} { δ_j [ log h(t_j^*; λ, K(j), J(t_j^*)) + β^T x_j ] − e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) du }.   (3.6)

The score statistic, the observed Fisher information matrix, and the Fisher information matrix for the n individuals in the sample are, respectively,

S_n(θ) = Σ_{j=1}^{n} S_{nj}(θ),   (3.7)
F_n(θ) = Σ_{j=1}^{n} F_{nj}(θ),   (3.8)
D_n = Σ_{j=1}^{n} D_{nj}.   (3.9)

Let c^T = (a^T, b^T), where a is a p-dimensional vector and b = (b_{11}, ..., b_{I1}, ..., b_{1G}, ..., b_{IG})^T is an IG-dimensional vector. We will show that F_n(θ) is positive definite and that, therefore, the log-likelihood function defined in Equation (3.6) is strictly
concave. To show that F_n(θ) is positive definite, it suffices to show that c^T F_n(θ) c > 0 for any non-null vector c of dimension (p + IG). In order to prove this, we assume that there is at least one observation on each interval for each treatment group. Note that

c^T F_n(θ) c = Σ_{j=1}^{n} e^{β^T x_j} ∫_0^{t_j^*} h(u; λ, K(j), J(u)) [ a^T x_j + u b^T Q(u, j) ]² du ≥ 0.

This implies that F_n(θ) is nonnegative definite. Since h(u; λ, K(j), J(u)) > 0, c^T F_n(θ) c = 0 implies that for any u ∈ (0, t_j^*), j = 1, 2, ..., n,

a^T x_j + u b^T Q(u, j) = 0.   (3.10)

For any integers i and g with 1 ≤ i ≤ I and 1 ≤ g ≤ G, there exists an individual, j say, in group g with t_j^* ∈ (τ_{i−1}, τ_i). Then Equation (3.10) implies that

a^T x_j + u b_{ig} = 0   for u ∈ (τ_{i−1}, t_j^*).

Consequently,

a^T x_j = 0 and b_{ig} = 0.   (3.11)

Since Equation (3.11) holds for all integers i, 1 ≤ i ≤ I, and g, 1 ≤ g ≤ G, it follows that b = 0.
Therefore, Equation (3.10) implies that a^T x_j = 0 for all j = 1, 2, ..., n, which contradicts Assumption 2. Therefore, F_n(θ) is positive definite. The strict concavity of the log-likelihood function as a function of the unknown parameters θ guarantees the uniqueness of the MLEs of the parameters provided that the MLEs exist. Based on the existence of the MLEs and the concavity of the log-likelihood function, the Newton-Raphson algorithm converges.
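For illustration, a generic Newton-Raphson iteration of the kind referred to above can be sketched as follows, with the score S_n and the observed information F_n supplied as callables; this is a schematic helper of our own, not the Ox implementation used for the simulations in Chapter 5.

import numpy as np

def newton_raphson(score, fisher, theta0, tol=1e-8, max_iter=50):
    # Newton-Raphson for a strictly concave log-likelihood:
    # theta_{k+1} = theta_k + F_n(theta_k)^{-1} S_n(theta_k)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(fisher(theta), score(theta))
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta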

CHAPTER 4
PROPERTIES OF THE MAXIMUM LIKELIHOOD ESTIMATES

4.1 Introduction

In the previous chapter, we proved the strict concavity of the log-likelihood function and pointed out that we obtain unique MLEs of the parameters if we can prove that these MLEs exist. In this chapter, we first provide some preliminary results as a preparation for later derivations. We then show the existence, the consistency, and the asymptotic normality of the MLEs. The method applied to prove the properties of the MLEs is similar to that of Maller and Zhou (1996, pp. 211-222). Maller and Zhou proved the asymptotic properties of MLEs in a survival model which assumes that the failure time follows an exponential distribution with covariate information. The exponential model with covariates in their book is a special case of the Gompertz model. By studying the piecewise Gompertz model, we move one step further through the incorporation of different hazard functions on different time intervals. In our model, if we allow λ = 0, the hazard function for the failure time T is h(t) = e^{β^T x}; the hazard is then a constant, which implies that T follows an exponential distribution. In the next section, we state several lemmas which will be applied to prove the existence, the consistency, and the asymptotic normality of the MLEs. The proofs of the lemmas are given in detail in the Appendices. We prove the existence and the consistency of the MLEs in Section 4.3. The proof of the asymptotic normality is given in Section 4.4. In Section 4.5, a goodness-of-fit test which compares models with different numbers of pieces is derived, and we show that the corresponding test statistic follows an asymptotic chi-square distribution.
4.2 Preliminary Proofs

Lemma 4.1. Let E(·) and V(·) represent the expected value and the variance of a random variable, respectively. Then the following two equations hold: E{S_n(θ)} = 0 and V{S_n(θ)} = D_n.

Proof. See Appendix B.

Lemma 4.2. Let θ_0 = (β_0^T, λ_0^T)^T be the true parameter vector, where λ_0 = (λ_{11}^0, ..., λ_{I1}^0, ..., λ_{1G}^0, ..., λ_{IG}^0)^T. For any positive integer m and for l = 1, 2, there exist an ε > 0 and a Γ > 0 such that for any λ satisfying ||λ − λ_0|| < ε,

E_{θ_0} { ∫_0^{T_j^*} u^m h^l(u; λ, K(j), J(u)) du } < Γ   for j = 1, 2, ..., n.

Note that Γ is a constant which depends on θ_0 and ε.

Proof. See Appendix C.

Lemma 4.3. Define θ_0 as in Lemma 4.2. For any positive integers l and m, there exists a γ > 0 such that

E_{θ_0} { ∫_0^{T_j^*} u^m h^l(u; λ_0, K(j), J(u)) du } > γ   (4.1)

for j = 1, 2, ..., n. Note that γ is a constant depending on θ_0.

Proof. See Appendix D.

Lemma 4.4. Under Assumptions 2, 3, and 4, there exist a γ > 0 and an integer n_0 > 0 such that when n > n_0, λ_min(D_n)/n > γ, where λ_min(D_n) is the smallest eigenvalue of D_n.
Proof. See Appendix E.

4.3 Existence and Consistency of the MLE

For each A > 1, define an elliptical neighborhood of the true value θ_0 of the unknown parameters by

N(θ_0) = { θ : (θ − θ_0)^T D_n (θ − θ_0) ≤ A }.   (4.2)

As shown in Lemma 4.4, λ_min(D_n) ≥ nγ when n > n_0; hereafter we assume that the condition n > n_0 is satisfied. We observe that for any θ ∈ N(θ_0),

A ≥ (θ − θ_0)^T D_n (θ − θ_0) ≥ ||θ − θ_0||² λ_min(D_n) ≥ ||θ − θ_0||² nγ,

i.e.,

||θ − θ_0|| ≤ √(A/(nγ)).   (4.3)

Let us consider an arbitrary vector θ_n on the boundary ∂N(θ_0) of N(θ_0); then θ_n satisfies

(θ_n − θ_0)^T D_n (θ_n − θ_0) = A.

If we can show that l_n(θ_n) < l_n(θ_0) for all θ_n ∈ ∂N(θ_0) with probability approaching 1, then the log-likelihood function has a local maximum in N(θ_0). In addition, the global maximum exists and lies in N(θ_0), since the log-likelihood function is strictly concave. The consistency of the maximizer is then proved by observing that Equation (4.3) holds.

Theorem 4.1. There exists a unique and consistent θ̂_n which maximizes l_n(θ).

Proof. For any θ_n ∈ ∂N(θ_0), the Taylor expansion of the log-likelihood function about θ_0 is written as

l_n(θ_n) − l_n(θ_0) = (θ_n − θ_0)^T S_n(θ_0) − (1/2)(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0),   (4.4)
where θ̃_n is a random vector on the line segment between θ_n and θ_0; hence θ̃_n ∈ N(θ_0). Recall that S_n(θ_0) is the vector of first-order derivatives of l_n(θ) evaluated at the true parameter θ_0 and that F_n(θ̃_n) is the negative of the matrix of second derivatives of l_n(θ) evaluated at θ̃_n. We have

P{ l_n(θ_n) − l_n(θ_0) < 0 }
  ≥ P{ (θ_n − θ_0)^T S_n(θ_0) < (1/2)(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) }
  ≥ P{ (θ_n − θ_0)^T S_n(θ_0) < A/4, (θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) ≥ A/2 }
  ≥ P{ (θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) ≥ A/2 } − P{ (θ_n − θ_0)^T S_n(θ_0) ≥ A/4 }.

We proved in Lemma 4.1 that E{S_n(θ_0)} = 0 and V{S_n(θ_0)} = D_n. Since (θ_n − θ_0)^T S_n(θ_0) is a sum of independent random variables with mean zero, we obtain the following inequality by applying Chebyshev's inequality:

P{ (θ_n − θ_0)^T S_n(θ_0) ≥ A/4 } ≤ 16 (θ_n − θ_0)^T D_n (θ_n − θ_0) / A² = 16/A,   (4.5)

since θ_n ∈ ∂N(θ_0). The probability in Equation (4.5) can be made arbitrarily small by choosing A sufficiently large. Next we need to show that

P{ (θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0) ≥ A/2 } → 1   (4.6)

as n → ∞. Let I be the (p + IG) × (p + IG) identity matrix. If we can show that, uniformly in N(θ_0),

D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2} → I in probability as n → ∞,   (4.7)
then we obtain the following:

(θ_n − θ_0)^T F_n(θ̃_n)(θ_n − θ_0)
  = (θ_n − θ_0)^T D_n^{1/2} [ D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2} ] D_n^{1/2}(θ_n − θ_0)
  ≥ (θ_n − θ_0)^T D_n (θ_n − θ_0) λ_min( D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2} )
  = A λ_min( D_n^{−1/2} F_n(θ̃_n) D_n^{−1/2} ) → A in probability,

where θ_n ∈ ∂N(θ_0) and θ̃_n ∈ N(θ_0). Therefore, Equation (4.6) holds as n → ∞. Consequently, letting both n and A become large, P{ l_n(θ_n) < l_n(θ_0) } → 1 for θ_n ∈ ∂N(θ_0). Unique MLEs of the parameters in the model exist by the strict concavity of the log-likelihood function, and they are consistent by Equation (4.3). We shall now prove Equation (4.7) through the following three lemmas. Note that Equation (4.7) indeed shows that the information matrix is consistently estimated by F_n(θ̂_n). Let c^T = (a^T, b^T) be any (p + IG)-dimensional unit vector, where a and b are p- and IG-dimensional, respectively.

Lemma 4.5. For any (p + IG)-dimensional unit vector c,

c^T D_n^{−1/2} [ F_n(θ_0) − D_n ] D_n^{−1/2} c → 0 in probability,   (4.8)

where θ_0 = (β_0^T, λ_0^T)^T is the true parameter.

Proof. See Appendix F.

Lemma 4.6. For any (p + IG)-dimensional unit vector c and any θ ∈ N(θ_0),

c^T D_n^{−1/2} [ F_n(θ) − F_n(θ_0) ] D_n^{−1/2} c → 0 in probability.

Proof. See Appendix G.
Lemma 4.7. For any θ ∈ N(θ_0),

D_n^{−1/2} F_n(θ) D_n^{−1/2} → I_{(p+IG)×(p+IG)} in probability.   (4.9)

Proof. See Appendix H.

4.4 Asymptotic Normality of the MLE

Lemma 4.8. Let l be an arbitrary positive integer; then E_{θ_0} ||S_{nj}(θ_0)||^l is uniformly bounded for all j = 1, 2, ..., n. Note that S_{nj}(θ) is defined in Equation (3.1).

Proof. See Appendix I.

Lemma 4.9. For any (p + IG)-dimensional unit vector c,

c^T D_n^{−1/2} S_n(θ_0) → N(0, 1) in distribution.   (4.10)

Proof. Denote

Z_n = c^T D_n^{−1/2} S_n(θ_0) = Σ_{j=1}^{n} c^T D_n^{−1/2} S_{nj}(θ_0),

where the S_{nj}(θ_0), j = 1, 2, ..., n, are independent random variables. By Lemma 4.1, E(Z_n) = 0 and V(Z_n) = 1. By Liapounov's condition, it suffices to show that

Σ_{j=1}^{n} E_{θ_0} | c^T D_n^{−1/2} S_{nj}(θ_0) |³ → 0   (4.11)

as n → ∞.
Since E_{θ_0} ||S_{nj}(θ_0)||³ is uniformly bounded for all individuals by Lemma 4.8, and λ_min(D_n) ≥ nγ by Lemma 4.4, Equation (4.11) holds, and the proof is accomplished by the central limit theorem.

Theorem 4.2. The MLEs are asymptotically normal, i.e., D_n^{1/2}(θ̂_n − θ_0) → N(0, I) in distribution as n → ∞.

Proof. By Taylor's expansion, for some θ̄ on the line segment between θ̂_n and θ_0 and for any unit vector u,

u^T D_n^{−1/2} S_n(θ̂_n) = u^T D_n^{−1/2} S_n(θ_0) − u^T D_n^{−1/2} F_n(θ̄)(θ̂_n − θ_0).

Because u^T D_n^{−1/2} S_n(θ̂_n) = 0, since θ̂_n is the MLE,

u^T D_n^{−1/2} S_n(θ_0) = u^T D_n^{−1/2} F_n(θ̄)(θ̂_n − θ_0)
  = u^T D_n^{−1/2} [ F_n(θ̄) − D_n ] (θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0)
  = u^T [ D_n^{−1/2} ( F_n(θ̄) − D_n ) D_n^{−1/2} ] D_n^{1/2}(θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0).

Since θ̄ ∈ N(θ_0), Equation (4.9) holds by Lemma 4.7. Therefore, the sum of the absolute values of all elements of D_n^{−1/2}(F_n(θ̄) − D_n)D_n^{−1/2} goes to 0 in probability, i.e., D_n^{−1/2}(F_n(θ̄) − D_n)D_n^{−1/2} = o_p(1), where o_p(1) denotes a matrix with all elements converging to 0 in probability as n → ∞. We then obtain

u^T D_n^{−1/2} S_n(θ_0) = u^T [ o_p(1) ] D_n^{1/2}(θ̂_n − θ_0) + u^T D_n^{1/2}(θ̂_n − θ_0),

and therefore

u^T D_n^{1/2}(θ̂_n − θ_0) = u^T D_n^{−1/2} S_n(θ_0) + o_p(1).
Since we have proved the normality of u^T D_n^{−1/2} S_n(θ_0) in Lemma 4.9, the normality of D_n^{1/2}(θ̂_n − θ_0) follows. Hence, Theorem 4.2 is proved.

4.5 Goodness-of-Fit Test

In practice, one might be interested in knowing how many intervals are in fact necessary in a specific study. A goodness-of-fit test can therefore be performed to test how many time intervals are adequate. Assume that we have a saturated model with I_1 + I_2 intervals and an unsaturated model with I_1 intervals, and that we are interested in testing whether there is a significant difference between the two models. We assume that the unsaturated model is obtained by combining intervals I_1 + 1, I_1 + 2, ..., I_1 + I_2 of the saturated model with interval I_1. The null hypothesis can be written as

H_0 : λ_{(I_1+1)g} = λ_{(I_1+2)g} = ⋯ = λ_{(I_1+I_2)g} = λ_{I_1 g},   for all g = 1, 2, ..., G.   (4.12)

Since θ^T = (β^T, λ^T), the null hypothesis defined above can also be written as

H_0 : Hθ = 0,   (4.13)

where H is a design matrix with full row rank I_2 G. The row of H corresponding to group g and interval I_1 + m (m = 1, 2, ..., I_2) has a 1 in the column of λ_{I_1 g}, a −1 in the column of λ_{(I_1+m)g}, and zeros elsewhere.
Note that the first p columns of the matrix H put no restriction on β. Columns (p + 1) through (p + I_1 − 1) reflect that λ is unrestricted for treatment group 1 on the first I_1 − 1 intervals, and similar patterns hold for the remaining columns. We now construct a likelihood ratio test for the hypothesis in Equation (4.13). Maximizing the log-likelihood function l_n(θ) under H_0 is equivalent to maximizing

l_n(θ) + γ^T Hθ

unrestrictedly in θ ∈ R^{p+(I_1+I_2)G}, where γ is a vector of Lagrange multipliers. If we let θ_0 be the true parameter, θ̃_n be the parameter estimate under H_0, and θ̂_n be the parameter estimate under the saturated model, then θ̃_n satisfies

S_n(θ̃_n) − H^T γ̃ = 0,   (4.14)

where γ̃ is the vector of estimated Lagrange multipliers under H_0. Also, under the null hypothesis, Hθ̃_n = 0 and Hθ_0 = 0, so

H(θ̃_n − θ_0) = 0.   (4.15)

We now solve for θ̃_n from Equations (4.14) and (4.15). By Taylor's expansion,

D_n^{−1/2} S_n(θ̃_n) = D_n^{−1/2} S_n(θ_0) − D_n^{1/2}(θ̃_n − θ_0) + o_p(1),

hence

D_n^{1/2}(θ̃_n − θ_0) + D_n^{−1/2} S_n(θ̃_n) = D_n^{−1/2} S_n(θ_0) + o_p(1).

By Equation (4.14), this can be written as

D_n^{1/2}(θ̃_n − θ_0) + D_n^{−1/2} H^T γ̃ = D_n^{−1/2} S_n(θ_0) + o_p(1).   (4.16)
We can rewrite Equation (4.15) as

H D_n^{−1/2} · D_n^{1/2}(θ̃_n − θ_0) = 0.   (4.17)

Expressing Equations (4.16) and (4.17) in matrix form,

( I              D_n^{−1/2} H^T )   ( D_n^{1/2}(θ̃_n − θ_0) )     ( D_n^{−1/2} S_n(θ_0) + o_p(1) )
( H D_n^{−1/2}   0              )   ( γ̃                    )  =  ( 0                            ).   (4.18)

Since H has full row rank I_2 G and D_n is nonsingular, H D_n^{−1/2} has full row rank I_2 G. Hence the inverse of the matrix on the left-hand side of Equation (4.18) exists; its blocks are I − P, Q, R, and −(H D_n^{−1} H^T)^{−1}, where

P = D_n^{−1/2} H^T (H D_n^{−1} H^T)^{−1} H D_n^{−1/2},
Q = D_n^{−1/2} H^T (H D_n^{−1} H^T)^{−1},   and   R = (H D_n^{−1} H^T)^{−1} H D_n^{−1/2}.

Therefore, solving Equation (4.18) gives

D_n^{1/2}(θ̃_n − θ_0) = (I − P) D_n^{−1/2} S_n(θ_0) + o_p(1).

Similarly, for the unrestricted MLE θ̂_n we have

D_n^{1/2}(θ̂_n − θ_0) = D_n^{−1/2} S_n(θ_0) + o_p(1),
by Taylor's expansion and the fact that S_n(θ̂_n) = 0. Therefore,

D_n^{1/2}(θ̂_n − θ̃_n) = D_n^{1/2}(θ̂_n − θ_0) − D_n^{1/2}(θ̃_n − θ_0) = P D_n^{−1/2} S_n(θ_0) + o_p(1).   (4.19)

Note that P is idempotent (PP = P) with rank I_2 G; therefore, P can be written as

P = A^T diag(I_{I_2 G}, 0) A,

where A is an orthogonal matrix. Let Z_n = A D_n^{−1/2} S_n(θ_0); then Z_n → N(0, I) in distribution by Lemma 4.9. From Equation (4.19), we obtain

(θ̂_n − θ̃_n)^T D_n (θ̂_n − θ̃_n)
  = S_n^T(θ_0) D_n^{−1/2} P D_n^{−1/2} S_n(θ_0) + o_p(1)
  = S_n^T(θ_0) D_n^{−1/2} A^T diag(I_{I_2 G}, 0) A D_n^{−1/2} S_n(θ_0) + o_p(1)
  = Z_n^T diag(I_{I_2 G}, 0) Z_n + o_p(1),

which converges in distribution to χ²_{I_2 G}.   (4.20)

The likelihood ratio test statistic for the hypothesis in Equation (4.12) can be written as

Λ = sup_{θ ∈ Θ_0} L_n(θ) / sup_{θ ∈ Θ} L_n(θ) = L_n(θ̃_n) / L_n(θ̂_n),
where Θ represents the overall parameter space and Θ_0 the parameter space under H_0. By Taylor's expansion, we can express −2 log(Λ) as

−2 log(Λ) = −2 [ l_n(θ̃_n) − l_n(θ̂_n) ] = (θ̂_n − θ̃_n)^T F_n(θ̄)(θ̂_n − θ̃_n),

where θ̄ lies between θ̂_n and θ̃_n, so that θ̄ ∈ N(θ_0), and where the linear term vanishes because S_n(θ̂_n) = 0. By Lemma 4.7, F_n(θ̄) can be replaced by D_n, so

−2 log(Λ) = (θ̂_n − θ̃_n)^T D_n (θ̂_n − θ̃_n) + o_p(1),

which converges in distribution to χ²_{I_2 G} by Equation (4.20).
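In practice the goodness-of-fit test amounts to computing −2 log(Λ) from the two fitted log-likelihoods and referring it to a chi-square distribution with I_2 G degrees of freedom. A minimal Python sketch of this step (our own helper, using scipy only for the chi-square tail probability) is:

from scipy.stats import chi2

def goodness_of_fit(loglik_saturated, loglik_restricted, I2, G):
    # likelihood-ratio test of H_0 in Equation (4.12):
    # -2 log(Lambda) is compared with a chi-square on I2 * G degrees of freedom
    lr = -2.0 * (loglik_restricted - loglik_saturated)
    return lr, chi2.sf(lr, df=I2 * G)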

CHAPTER 5
SIMULATIONS

5.1 Introduction

The results discussed in the previous chapters established the asymptotic properties of the parameter MLEs in the piecewise Gompertz model. These results enable us to carry out the hypothesis test whose null hypothesis is given in Equation (2.3). In this chapter, Monte Carlo simulation studies are performed in order to evaluate the normal approximation of the proposed test statistic in Equation (2.6) and the power of the test under different settings, based on the amount of location shift and the sample size. Because of the complexity of the piecewise model, the procedure for generating the random samples is somewhat involved and is discussed in Section 5.2. Results are given on observed Type I errors and powers for both the piecewise Gompertz model and the exponential-logistic mixture model. All simulations were completed on a Sun Ultra Enterprise 450 computer using Ox 2.0 (Doornik 1998). Random number generators in the Ox 2.0 environment were used to generate uniformly distributed random numbers.

5.2 Generation of Random Samples

In this section we describe the procedure for generating a random sample that follows a piecewise Gompertz model under a given parameter setting. In order to generate a random sample of size n with a specified proportion of censoring, we generate a sample of failure observations of size n and a sample of censoring observations of size n. The sample of failure observations follows the piecewise Gompertz model. The sample of censoring observations
is assumed to follow a uniform(0, M) distribution, where M is determined by the censoring proportion. The final sample is obtained by comparing the two samples: a subject is considered failed if its observation from the first sample is smaller than the one from the second sample, and censored if its observation from the first sample is greater than the one from the second sample.

5.2.1 Parameter Settings

We consider five parameter settings in our Monte Carlo simulations. There is no treatment difference in random samples generated under Settings 1 and 2, so these two settings are used to compare Type I errors between the tests. In Settings 3, 4, and 5, treatment effects exist between groups; Settings 3 to 5 represent the smallest, an intermediate, and the largest difference between treatment groups, respectively, and random samples generated from these three settings are used to compare the powers of the tests. The five parameter settings are listed as follows:

Setting 1:
• Two treatment groups
• Three intervals: [0, 3), [3, 7), and [7, ∞)
• One covariate with value 0 or 1
• Cure proportion varies with covariate value
• Censoring rate: 30%
• λ_10 = λ_11 = −4.2
• λ_20 = λ_21 = −0.52
• λ_30 = λ_31 = −0.2
• β_0 = −0.1, β_1 = 0.2

Setting 2:
• Two treatment groups
• Three intervals: [0, 4), [4, 10), and [10, ∞)
• One covariate with value 0 or 1
• Cure proportion varies with covariate value
• Censoring rate: 30%
• λ_10 = λ_11 = −3.5
• λ_20 = λ_21 = −0.45
• λ_30 = λ_31 = −0.15
• β_0 = −0.2, β_1 = 0.15

Setting 3:
• Two treatment groups
• Three intervals: [0, 3), [3, 7), and [7, ∞)
• One covariate with value 0 or 1
• Cure proportion varies with covariate value and treatment group
• Censoring rate: 30%
• λ_10 = −3.5 and λ_11 = −4.2
• λ_20 = −0.45 and λ_21 = −0.52
• λ_30 = −0.21 and λ_31 = −0.2
• β_0 = −0.1, β_1 = 0.2

Setting 4:
• Two treatment groups
• Three intervals: [0, 4), [4, 10), and [10, ∞)
• One covariate with value 0 or 1
• Cure proportion varies with covariate value and treatment group
• Censoring rate: 30%
• λ_10 = −3.9 and λ_11 = −4.2
• λ_20 = −0.45 and λ_21 = −0.4
• λ_30 = −0.165 and λ_31 = −0.18
• β_0 = −0.2, β_1 = 0.15

Setting 5:
• Two treatment groups
• Three intervals: [0, 4), [4, 10), and [10, ∞)
• One covariate with value 0 or 1
• Cure proportion varies with covariate value and treatment group
• Censoring rate: 30%
• λ_10 = −3.5 and λ_11 = −4.2
• λ_20 = −0.45 and λ_21 = −0.4
• λ_30 = −0.15 and λ_31 = −0.18
• β_0 = −0.2, β_1 = 0.15

The survival curves for the above five parameter settings are shown in Figures (5-1) through (5-5). The survival probabilities are calculated from

S(t) = exp{ e^{β^T x} [ Σ_{m=1}^{i−1} (e^{λ_{mg} τ_{m−1}} − e^{λ_{mg} τ_m}) / λ_{mg} + (e^{λ_{ig} τ_{i−1}} − e^{λ_{ig} t}) / λ_{ig} ] }   (5.1)

for t ∈ [τ_{i−1}, τ_i). There are two separate survival curves in the first two figures, one for individuals with covariate value 0 and one for covariate value 1. For Figures (5-3), (5-4), and (5-5), we draw four survival curves, one for each of the four possible combinations of the covariate and the treatment group indicator.

5.2.2 Generation of Censored Observations

By assumption, the sample of censored observations follows a uniform(0, M) distribution. Therefore, we can generate a sample from a uniform(0, 1) distribution and transform it into a sample from a uniform(0, M) distribution. The primary task is to obtain an appropriate value for M. Let U and T denote the random variables for the censoring and failure observations, respectively. Taking parameter setting 1 as an example, we illustrate the derivation of the
Figure 5-1. Survival curves for patients with parameter setting 1.

Figure 5-2. Survival curves for patients with parameter setting 2.

Figure 5-3. Survival curves for patients with parameter setting 3.

Figure 5-4. Survival curves for patients with parameter setting 4.

Figure 5-5. Survival curves for patients with parameter setting 5.
censoring proportion P(U < T) as follows. Since U is uniform on (0, M), P(U < T) = (1/M) ∫_0^M S(c) dc, where S(·) is the survival function in Equation (5.1). For M ∈ [0, 3),

P(U < T) = (1/M) ∫_0^M exp{ e^{β_0 + β_1 x} (1 − e^{λ_{1g} c}) / λ_{1g} } dc;

for M ∈ [3, 7),

P(U < T) = (1/M) ∫_0^3 exp{ e^{β_0 + β_1 x} (1 − e^{λ_{1g} c}) / λ_{1g} } dc
          + (1/M) ∫_3^M exp{ e^{β_0 + β_1 x} [ (1 − e^{3λ_{1g}}) / λ_{1g} + (e^{3λ_{2g}} − e^{λ_{2g} c}) / λ_{2g} ] } dc;

and for M ∈ [7, ∞),

P(U < T) = (1/M) ∫_0^3 exp{ e^{β_0 + β_1 x} (1 − e^{λ_{1g} c}) / λ_{1g} } dc
          + (1/M) ∫_3^7 exp{ e^{β_0 + β_1 x} [ (1 − e^{3λ_{1g}}) / λ_{1g} + (e^{3λ_{2g}} − e^{λ_{2g} c}) / λ_{2g} ] } dc
          + (1/M) ∫_7^M exp{ e^{β_0 + β_1 x} [ (1 − e^{3λ_{1g}}) / λ_{1g} + (e^{3λ_{2g}} − e^{7λ_{2g}}) / λ_{2g} + (e^{7λ_{3g}} − e^{λ_{3g} c}) / λ_{3g} ] } dc.

The integrals in the above equations are calculated approximately using Simpson's rule; the details of Simpson's rule are provided in Appendix J. Therefore, we search for a value of M such that P(U < T) reaches the required censoring proportion.
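The quantity P(U < T) = (1/M) ∫_0^M S(c) dc can be approximated for any candidate M with the composite Simpson rule, after which M is adjusted until the target censoring proportion is reached. A short Python sketch of this step is given below; the names are our own and the failure-time survival function is passed in as a callable (for example, the survival function from the sketch in Section 2.1).

import numpy as np

def censoring_proportion(M, survival, n_points=1000):
    # P(U < T) = (1/M) * integral_0^M S(c) dc for U ~ Uniform(0, M),
    # approximated with the composite Simpson rule (n_points must be even)
    c = np.linspace(0.0, M, n_points + 1)
    s = np.array([survival(ci) for ci in c])
    h = M / n_points
    integral = h / 3.0 * (s[0] + s[-1] + 4.0 * s[1:-1:2].sum() + 2.0 * s[2:-1:2].sum())
    return integral / M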

5.2.3 Generation of Failure Observations

In order to generate a sample of failure observations, we first generate a sample of uniform(0, 1) observations. Each uniform value is treated as the survival probability of the failure time T, and the value of the failure time is obtained by inverting the survival function defined in Equation (5.1). Note that special attention is required in the inversion because we are dealing with a piecewise distribution. Examples are given for observations with covariate x = 0 in the first parameter setting. From Figure (5-1), we observe that each survival curve consists of three pieces corresponding to the three intervals considered in the study; therefore, the inversion formula for generating the failure time depends on the value of the generated uniform random number u*. The survival probability at t = 0 is 1,

P(T > 3) = exp{ e^{−0.1} (1 − e^{−4.2×3}) / (−4.2) } = 0.81,

and

P(T > 7) = exp{ e^{−0.1} [ (1 − e^{−4.2×3}) / (−4.2) + (e^{−0.52×3} − e^{−0.52×7}) / (−0.52) ] } = 0.59.

Therefore, if u* ∈ [0.81, 1], then

t = (1/λ_{1g}) log[ 1 − λ_{1g} e^{−β^T x} log(u*) ];

if u* ∈ [0.59, 0.81), then

t = (1/λ_{2g}) log{ e^{λ_{2g} τ_1} − λ_{2g} [ e^{−β^T x} log(u*) − (e^{λ_{1g} τ_0} − e^{λ_{1g} τ_1}) / λ_{1g} ] };

and if u* ∈ (0, 0.59), then

t = (1/λ_{3g}) log{ e^{λ_{3g} τ_2} − λ_{3g} [ e^{−β^T x} log(u*) − (e^{λ_{1g} τ_0} − e^{λ_{1g} τ_1}) / λ_{1g} − (e^{λ_{2g} τ_1} − e^{λ_{2g} τ_2}) / λ_{2g} ] }

(values of u* below the cure proportion correspond to cured individuals, whose failure times are taken to be infinite). Note that it is necessary to have at least one failure observation in each interval for each treatment group in order to make inferences on all the parameters.
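A sketch of this inversion for a general piecewise Gompertz survival function is given below; it is written in Python with argument conventions of our own (the λ values for a single treatment group are passed as a one-dimensional array), and it returns infinity when the uniform draw falls below the cure proportion.

import numpy as np

def sample_failure_time(u, x, beta, lam_g, taus):
    # invert the survival function (5.1) at u ~ Uniform(0, 1)
    # lam_g : lambda values for one treatment group, with lam_g[-1] < 0
    # taus  : interval endpoints, taus[0] = 0 and taus[-1] = np.inf
    target = np.log(u) / np.exp(np.dot(beta, x))   # log S(t) / exp(beta'x), a negative number
    acc = 0.0
    for i, lam in enumerate(lam_g):
        a, b = taus[i], taus[i + 1]
        # contribution of the whole interval to log S(t) / exp(beta'x)
        piece = (np.exp(lam * a) - (np.exp(lam * b) if np.isfinite(b) else 0.0)) / lam
        if target >= acc + piece:                  # level u is reached inside this interval
            return np.log(np.exp(lam * a) - lam * (target - acc)) / lam
        acc += piece
    return np.inf                                  # u is below the cure proportion: cured subject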

5.3 Type I Error and Power

Type I errors and powers of the tests are calculated based on the piecewise Gompertz model and on the exponential-logistic mixture model for all five parameter settings. The test statistic based on the piecewise Gompertz model proposed in this dissertation was presented in Equation (2.6). For the exponential-logistic mixture model, we assume that there is one covariate x in the model. The time to failure is assumed to follow an exponential distribution with mean 1/λ, where λ is modelled as a function of the covariate x and the treatment indicator g as λ = exp(γ_0 + γ_1 x + γ_2 g). The cure rate P(Z = 0) is modelled as

P(Z = 0) = 1 / (1 + e^{φ_0 + φ_1 x + φ_2 g}).

Therefore, for a fixed covariate value, the test of the equality of cure rates between the two treatment groups is equivalent to the test of H_0 : φ_2 = 0, and the standardized test statistic for this null hypothesis is

TS = φ̂_2 / √V̂(φ̂_2).

Type I errors based on the first two parameter settings for both models are shown in Table (5-1). Powers for tests based on parameter settings 3 through 5 for both models are presented in Table (5-2). All tests are carried out for 10000 generated random samples and for three different total sample sizes: 400, 1000, and 2000. The numbers of unobtainable test statistics (# NaNs) for the exponential-logistic mixture model are shown in the last column of each table. The unobtainable test statistics occur because the likelihood function for the mixture model is nearly flat at the point where the MLEs are obtained. When the sample size becomes larger, however, unobtainable or unidentifiable test statistics are eliminated. We present bar plots of all standardized test statistics obtained for all three sample sizes and all five parameter settings. These plots help to visualize the advantages of the piecewise Gompertz model in terms of the normal approximation of the test under the null hypothesis and the power of the test under the alternative hypothesis. Figures (5-6), (5-7), and (5-8) show the distributions of the test statistics under parameter setting 1 for both the mixture model and the piecewise Gompertz model, for sample sizes of 400, 1000, and 2000, respectively.
Table 5-1. Type I errors for the test of equality of cure proportions for the mixture and the three-piece Gompertz models.

Parameter Setting   Sample Size     Type I Error (Mixture)   Type I Error (Gompertz)   # NaNs (Mixture)
1                   (200, 200)      0.0305                    0.0375                    220
1                   (500, 500)      0.0171                    0.0412                    1
1                   (1000, 1000)    0.0365                    0.0412                    –
2                   (200, 200)      0.0203                    0.0373                    69
2                   (500, 500)      0.0356                    0.0360                    –
2                   (1000, 1000)    0.0402                    0.0408                    –

Table 5-2. Powers for tests of equality of cure proportions for the mixture and the three-piece Gompertz models.

Parameter Setting   Sample Size     Power (Mixture)   Power (Gompertz)   # NaNs (Mixture)
3                   (200, 200)      0.0260            0.1016             127
3                   (500, 500)      0.0328            0.2024             –
3                   (1000, 1000)    0.0679            0.3703             –
4                   (200, 200)      0.0616            0.1896             5
4                   (500, 500)      0.0357            0.4351             –
4                   (1000, 1000)    0.2721            0.7379             –
5                   (200, 200)      0.2706            0.5572             28
5                   (500, 500)      0.7180            0.9331             –
5                   (1000, 1000)    0.9601            0.9981             –
Similarly, Figures (5-9), (5-10), and (5-11) are for parameter setting 2; Figures (5-12), (5-13), and (5-14) are for parameter setting 3; Figures (5-15), (5-16), and (5-17) are for parameter setting 4; and Figures (5-18), (5-19), and (5-20) are for parameter setting 5. From these tables and plots, we observe that the Type I errors obtained from the three-piece Gompertz model are closer to 0.05 than those from the mixture model. The powers presented in Table (5-2) for the three-piece Gompertz model are much larger than those for the mixture model at all sample sizes. The estimation procedure for obtaining the MLEs is also easier to carry out for the three-piece Gompertz model than for the exponential-logistic model. In addition, the algorithm for the three-piece Gompertz model converges much faster.
[Figure 5-6. Standardized test statistics for parameter setting 1 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model. Each panel is a frequency histogram of the standardized test statistic.]
[Figure 5-7. Standardized test statistics for parameter setting 1 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-8. Standardized test statistics for parameter setting 1 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-9. Standardized test statistics for parameter setting 2 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-10. Standardized test statistics for parameter setting 2 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-11. Standardized test statistics for parameter setting 2 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-12. Standardized test statistics for parameter setting 3 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-13. Standardized test statistics for parameter setting 3 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-14. Standardized test statistics for parameter setting 3 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-15. Standardized test statistics for parameter setting 4 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-16. Standardized test statistics for parameter setting 4 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-17. Standardized test statistics for parameter setting 4 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-18. Standardized test statistics for parameter setting 5 for 10000 random samples with size 400. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-19. Standardized test statistics for parameter setting 5 for 10000 random samples with size 1000. a) Mixture model; b) Three-piece Gompertz model.]
[Figure 5-20. Standardized test statistics for parameter setting 5 for 10000 random samples with size 2000. a) Mixture model; b) Three-piece Gompertz model.]


CHAPTER 6
EXAMPLE

6.1 Introduction

In Chapter 2 we proposed a test statistic that addresses the hypothesis presented in Equation (2.3). Subsequently, in Chapters 3 and 4 we established the asymptotic normality of the parameter estimates in the piecewise Gompertz model, which validates the test of hypothesis proposed in Chapter 2. Chapter 5 reported Monte Carlo Type I error and power calculations, which established the strength of our test statistic for the comparison of cure rates under various parameter settings. In this chapter, we demonstrate the applicability of our test statistic with a data set.

The data set comes from a study by the Pediatric Oncology Group (POG) at the University of Florida (Ravindranath et al. 1996). The motivation of the study is to compare cure proportions among patients with acute myeloid leukemia (AML) under different treatments. Three treatments are compared: allogeneic bone marrow transplantation (allo. BMT), autologous bone marrow transplantation (auto. BMT), and intensive chemotherapy. In order to register in the study, all patients were required to be in remission after the first two courses of induction chemotherapy. The POG evaluated 666 patients under the age of 21 from June 1988 to March 1993. A total of 552 patients entered remission and were therefore eligible for the study. Of the eligible patients, 89 received allo. BMT, 117 were assigned to an additional six-course intensive chemotherapy, and 115 were assigned to receive auto. BMT. The remaining 231 patients did not participate in the study for various reasons.


6.2 Methodology

Our objective for this data set is to perform pairwise comparisons of cure rates between treatment groups after controlling for the patient's gender. We compare tests from three parametric models and a proportion test based on KM estimates of cure. The three parametric models are the logistic-exponential mixture model, the one-piece Gompertz model, and the three-piece Gompertz model. In addition, we present cure rates and their standard deviations for every treatment and gender combination based on estimates from the parametric models and on KM estimates. Finally, we compare the standard one-piece Gompertz model with the three-piece Gompertz model by applying the goodness-of-fit test we derived in Chapter 5.

In this section, we briefly describe 1) the three parametric models and their corresponding test statistics; and 2) the proportion test based on KM estimates of cure proportions.

In the exponential-logistic mixture model, we assume an exponential distribution with parameter \lambda for the failure time and a logistic regression model for the cure proportion. The expression \log(\lambda) is defined as a linear function of the covariate x and the treatment indicator. We define two dummy variables to represent the three-level treatment indicator: G_1 = I[patient in treatment 1] and G_2 = I[patient in treatment 2]. The survival function for the failure time, given that the observation is not a long-term survivor (Z = 1), is

S^*(t \mid Z = 1) = e^{-\lambda t}.   (6.1)

The cure probability is defined as

P(Z = 0) = \frac{1}{1 + e^{\beta_0 + \beta_1 x + \beta_2 G_1 + \beta_3 G_2}}.   (6.2)
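To make the dummy-variable coding and Equations (6.1)-(6.2) concrete, the short sketch below evaluates the mixture model's cure probability and unconditional survival for each treatment arm. The coefficient values are hypothetical, not the fitted values from this study, and the unconditional survival line uses the standard mixture decomposition S(t) = P(Z = 0) + P(Z = 1) e^{-\lambda t}, which is implied by the mixture structure rather than stated explicitly in the text.

```python
import math

# Hypothetical coefficients (not the estimates reported in this chapter)
b = (-0.3, 0.2, 0.6, 0.1)       # beta0..beta3 in the cure logit of Equation (6.2)
a = (-3.5, 0.1, -0.2, 0.05)     # coefficients of log(lambda): intercept, x, G1, G2

def mixture_quantities(x, treatment, t):
    g1, g2 = int(treatment == 1), int(treatment == 2)   # treatment 3 is the reference level
    cure = 1.0 / (1.0 + math.exp(b[0] + b[1] * x + b[2] * g1 + b[3] * g2))   # Equation (6.2)
    lam = math.exp(a[0] + a[1] * x + a[2] * g1 + a[3] * g2)
    surv = cure + (1.0 - cure) * math.exp(-lam * t)      # mixture decomposition of S(t)
    return cure, surv

for trt in (1, 2, 3):
    cure, surv = mixture_quantities(x=0.0, treatment=trt, t=100.0)
    print(f"treatment {trt}: cure = {cure:.3f}, S(100 weeks) = {surv:.3f}")
```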


Therefore, the hypothesis of testing the equality of cure rates in the three treatment groups can be written as

H_0: \beta_2 = \beta_3 = 0.   (6.3)

Pairwise comparisons between two treatments are based on the following standardized test statistics:

T_1 = \frac{\hat{\beta}_2 - \hat{\beta}_3}{\sqrt{\hat{V}(\hat{\beta}_2 - \hat{\beta}_3)}},   (6.4)

T_2 = \frac{\hat{\beta}_2}{\sqrt{\hat{V}(\hat{\beta}_2)}}, and   (6.5)

T_3 = \frac{\hat{\beta}_3}{\sqrt{\hat{V}(\hat{\beta}_3)}}.   (6.6)

Note that T_1 is the test statistic that compares cure proportions between Treatments 1 and 2; T_2 compares cure proportions between Treatments 1 and 3; and T_3 compares cure proportions between Treatments 2 and 3. The variances are obtained by inverting the observed information matrix, which is obtained by taking the second derivatives of the log-likelihood function with respect to the parameters.

The one-piece Gompertz model is a special case of the piecewise Gompertz model. Let \lambda_g be the parameter associated with treatment g and \beta_1 the coefficient associated with the covariate x. For \lambda_g < 0, the cure proportion for patients with treatment g and covariate x is written as

\pi_g(x) = \exp(e^{\beta_1 x} / \lambda_g).   (6.7)

Thus, if \lambda_g < 0, a positive cure proportion exists for patients under treatment g. If \lambda_g < 0 for every g, then the hypothesis of equality of cure proportions can be written as

H_0: \lambda_1 = \lambda_2 = \lambda_3.   (6.8)
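As a concrete illustration of Equation (6.7) and of the pairwise Wald statistics given in Equations (6.9)-(6.11) below, the sketch evaluates one-piece Gompertz cure proportions and one pairwise comparison. All numerical values, including the variances and the assumed covariance of the estimates, are made up for illustration; they are not the estimates obtained from the POG data.

```python
import math

# Hypothetical fitted values (illustration only)
beta1 = -0.10                               # coefficient of the covariate x
lam = {1: -1.6, 2: -1.1, 3: -1.05}          # lambda_g per treatment, all negative
var_lam = {1: 0.04, 2: 0.05, 3: 0.03}       # variances of the lambda estimates
cov_12 = 0.0                                # assumed covariance of (lambda_1, lambda_2)

def cure_proportion(x, g):
    """Equation (6.7): pi_g(x) = exp(exp(beta1 * x) / lambda_g), valid for lambda_g < 0."""
    return math.exp(math.exp(beta1 * x) / lam[g])

for g in (1, 2, 3):
    print(f"treatment {g}: cure proportion at x = 0 is {cure_proportion(0.0, g):.3f}")

# Wald statistic comparing treatments 1 and 2, the form used in Equation (6.9) below
t1 = (lam[1] - lam[2]) / math.sqrt(var_lam[1] + var_lam[2] - 2.0 * cov_12)
print(f"T1 = {t1:.2f}")
```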


Standardized test statistics for pairwise comparisons of cure proportions are presented as

T_1 = \frac{\hat{\lambda}_1 - \hat{\lambda}_2}{\sqrt{\hat{V}(\hat{\lambda}_1 - \hat{\lambda}_2)}},   (6.9)

T_2 = \frac{\hat{\lambda}_1 - \hat{\lambda}_3}{\sqrt{\hat{V}(\hat{\lambda}_1 - \hat{\lambda}_3)}}, and   (6.10)

T_3 = \frac{\hat{\lambda}_2 - \hat{\lambda}_3}{\sqrt{\hat{V}(\hat{\lambda}_2 - \hat{\lambda}_3)}}.   (6.11)

Similarly, these three standardized test statistics are for the comparison of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. The variances are obtained from the inverse of the observed information matrix.

Finally, as we discussed in Chapter 2, the cure proportion in a piecewise Gompertz model, as a function of the covariate x and the treatment g, is expressed in Equation (2.4). We consider a three-interval case in this example. Denote the three intervals by [\tau_0, \tau_1), [\tau_1, \tau_2), and [\tau_2, \tau_3), where \tau_0 = 0 and \tau_3 = \infty. When \lambda_{3g} < 0 for every g, testing the equality of cure proportions for any given value of x is equivalent to testing

H_0: \sum_{i=1}^{3} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} = \sum_{i=1}^{3} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}} = \sum_{i=1}^{3} \frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}.   (6.12)

Standardized test statistics for pairwise comparisons between treatment groups are presented as follows:

T_1 = \frac{\sum_{i=1}^{3} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}}}{\sqrt{\hat{V}\left(\sum_{i=1}^{3} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}}\right)}},   (6.13)

T_2 = \frac{\sum_{i=1}^{3} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}}{\sqrt{\hat{V}\left(\sum_{i=1}^{3} \frac{e^{\lambda_{i1}\tau_{i-1}} - e^{\lambda_{i1}\tau_i}}{\lambda_{i1}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}\right)}}, and   (6.14)


T_3 = \frac{\sum_{i=1}^{3} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}}{\sqrt{\hat{V}\left(\sum_{i=1}^{3} \frac{e^{\lambda_{i2}\tau_{i-1}} - e^{\lambda_{i2}\tau_i}}{\lambda_{i2}} - \sum_{i=1}^{3} \frac{e^{\lambda_{i3}\tau_{i-1}} - e^{\lambda_{i3}\tau_i}}{\lambda_{i3}}\right)}}.   (6.15)

In these sums the parameters are evaluated at their MLEs, and e^{\lambda_{3g}\tau_3} is interpreted as 0 because \tau_3 = \infty and \lambda_{3g} < 0. These test statistics T_1, T_2, and T_3 are for comparisons of cure proportions between Treatments 1 and 2, between Treatments 1 and 3, and between Treatments 2 and 3, respectively. Variance estimates are not as straightforward as for the tests based on the mixture model and on the one-piece Gompertz model; they were presented in Chapter 2 and will not be repeated here.

Proportion tests of cure for pairwise comparisons between treatments are constructed from the KM estimates of the cure proportions and the standard errors of those estimates. P-values for the tests derived from the parametric models and for the proportion test are presented in the next section. In addition, we present survival curves estimated from both the parametric models and the KM estimates. Furthermore, we present the cure proportion and the corresponding standard deviation for each treatment and gender combination based on the parametric estimates and on the KM estimates. The result of comparing the one-piece Gompertz model with the three-piece model by the goodness-of-fit test we derived earlier is also presented.

6.3 Results

6.3.1 Pairwise Comparisons on Cure Proportions

In Table 6-1, we present p-values for tests based on the three parametric models and for the proportion test based on KM estimates. The three intervals for the three-piece Gompertz model are defined as [0, 50), [50, 140), and [140, \infty); all numbers are in terms of weeks of follow-up. Tests based on the exponential-logistic mixture model, the three-piece Gompertz model, and the KM estimates detect the difference in cure proportions between allo. BMT and auto. BMT and the difference between allo. BMT and intensive chemotherapy.


Allo. BMT increases the cure proportion significantly more than the other two treatments. There is no significant difference between auto. BMT and chemotherapy. The test based on the one-piece Gompertz model does not show a significant difference between any two of the three treatments. The maximization procedures for obtaining the MLEs of the parameters are smooth for both the one-piece Gompertz model and the three-piece Gompertz model; the algorithm converges within a few steps. However, there is a convergence problem with the mixture model, and different starting values for the parameters had to be chosen in order to obtain convergence. We discussed the convergence problem related to the mixture model in both Chapter 1 and Chapter 5, and the problem encountered here confirms our earlier assertion.

Table 6-1. Comparison of Tests from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Proportion Test on Cure Proportions.

  Comparison                    P-values for treatment comparisons
                                Mixture   1-Piece Gompertz   3-Piece Gompertz   Proportion Test
  Allo. BMT vs auto. BMT        0.016     0.384              0.015              0.011
  Allo. BMT vs chemotherapy     0.012     0.101              0.003              0.007
  Auto. BMT vs chemotherapy     0.690     0.516              0.897              0.741

In Figures 6-1 to 6-4, we show estimated survival curves from the different parametric models and from the KM estimates. The three curves in Figure 6-1a are survival curves for the three treatment groups for male patients based on the mixture model; Figure 6-1b presents the three treatment survival curves for female patients based on the mixture model. Figure 6-2 presents survival curves estimated from the one-piece Gompertz model for male and female patients in the three treatment groups. Survival curves estimated from the three-piece Gompertz model are presented in Figure 6-3.


The KM estimates of the survival curves are presented in Figure 6-4.

6.3.2 Cure Proportion Estimates

Estimates of cure proportions are presented in Table 6-2. We present estimates of cure rates and their standard deviations from the mixture model, the one-piece Gompertz model, the three-piece Gompertz model, and the KM estimate for all combinations of the patient's gender and the three treatment groups. Treatment 1 represents allo. BMT, Treatment 2 represents auto. BMT, and Treatment 3 represents intensive chemotherapy. Note that a cure proportion exists for each of the combinations. Female patients show a higher cure probability than male patients. Standard errors for the cure proportions estimated from both the three-piece Gompertz model and the mixture model are smaller than those from the one-piece Gompertz model and the KM estimates.

Table 6-2. Estimates and Standard Deviations of Cure Proportions Generated from the Mixture Model, the One-Piece Gompertz Model, the Three-Piece Gompertz Model, and the Kaplan-Meier Estimates.

  Gender  Treatment   Mixture         1-Piece Gompertz   3-Piece Gompertz   KM
  M       1           0.558 (0.059)   0.504 (0.062)      0.568 (0.056)      0.648 (0.076)
  M       2           0.386 (0.050)   0.445 (0.060)      0.397 (0.050)      0.387 (0.065)
  M       3           0.407 (0.033)   0.409 (0.050)      0.405 (0.033)      0.388 (0.036)
  F       1           0.597 (0.056)   0.533 (0.062)      0.596 (0.054)      0.531 (0.071)
  F       2           0.424 (0.051)   0.475 (0.061)      0.429 (0.051)      0.421 (0.065)
  F       3           0.445 (0.036)   0.440 (0.051)      0.437 (0.035)      0.466 (0.041)

  Entries are cure proportions with standard deviations in parentheses.
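A cure proportion such as those in Table 6-2 follows from a fitted piecewise Gompertz model by evaluating the survival function at infinity. The sketch below does this for a three-piece fit with the cut points used in this example ([0, 50), [50, 140), [140, infinity) weeks), assuming the hazard on interval [\tau_{i-1}, \tau_i) is exp(\beta_1 x) exp(\lambda_i t); the parameter values are hypothetical stand-ins chosen only to make the arithmetic concrete, not the actual MLEs from the POG data.

```python
import math

tau = [0.0, 50.0, 140.0, math.inf]     # interval cut points in weeks
beta1 = -0.10                           # covariate coefficient (hypothetical)
lam = [-2.0, -0.10, -0.05]              # lambda_i for one treatment arm (hypothetical)

def cure_proportion(x):
    """pi(x) = exp(-H(inf)), H(inf) = exp(beta1*x) * sum_i (e^{lam_i tau_i} - e^{lam_i tau_{i-1}}) / lam_i."""
    total = 0.0
    for i, l in enumerate(lam):
        upper = 0.0 if math.isinf(tau[i + 1]) else math.exp(l * tau[i + 1])  # e^{lam*inf} -> 0 for lam < 0
        total += (upper - math.exp(l * tau[i])) / l
    return math.exp(-math.exp(beta1 * x) * total)

# x = 0 and x = 1 are the two gender codes (coding assumed for illustration)
print(f"cure proportion at x = 0: {cure_proportion(0.0):.3f}")
print(f"cure proportion at x = 1: {cure_proportion(1.0):.3f}")
```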


[Figure 6-1. Survival curves estimated from the mixture model for three treatments. a) Male; b) Female. Survival versus time in weeks (0 to 600).]
[Figure 6-2. Survival curves estimated from the one-piece Gompertz model for three treatments. a) Male; b) Female.]
[Figure 6-3. Survival curves estimated from the three-piece Gompertz model for three treatments. a) Male; b) Female.]
[Figure 6-4. Survival curves from Kaplan-Meier estimates for three treatments. a) Male; b) Female.]


6.3.3 Goodness-of-Fit

From the above discussion, we observe that the three-piece Gompertz model fits better than the one-piece Gompertz model. It is natural to compare these two models by applying the goodness-of-fit test we derived in this dissertation. The null hypothesis is expressed as

H_0: \lambda_{1g} = \lambda_{2g} = \lambda_{3g}, for every g = 1, 2, 3.

Note that \lambda_{2g} and \lambda_{3g} are the parameters for the second and third intervals of the three-piece Gompertz model, respectively. We showed that, under H_0,

-2(l_1 - l_2)

has an asymptotic chi-square distribution, with degrees of freedom equal to the number of constraints imposed by H_0 (two per treatment group here), where l_1 and l_2 are the maximized log-likelihoods for the one-piece and the three-piece Gompertz models, respectively. From the computation, we obtain l_1 = -1977.289 and l_2 = -1950.979. Therefore, the test is highly significant, which implies that the three-piece model fits significantly better than the one-piece Gompertz model.
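Using the two maximized log-likelihoods just quoted, the likelihood ratio statistic and its p-value can be computed as in the sketch below. The six degrees of freedom correspond to the two constraints per treatment group implied by H_0 and are an assumption of this sketch rather than a value stated explicitly in the text; the conclusion of high significance does not depend on this choice.

```python
from scipy.stats import chi2

l1 = -1977.289   # one-piece Gompertz maximized log-likelihood
l2 = -1950.979   # three-piece Gompertz maximized log-likelihood

lrt = -2.0 * (l1 - l2)        # likelihood ratio statistic, here 52.62
df = 6                        # assumed: 2 constraints x 3 treatment groups
p_value = chi2.sf(lrt, df)
print(f"LRT = {lrt:.2f}, df = {df}, p = {p_value:.2e}")
```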


CHAPTER 7
SUMMARY

In this dissertation, we presented a piecewise Gompertz model to address the long-term survival problem. We constructed a standardized test statistic based on our model that compares cure rates between treatment groups. We derived the strict concavity of the likelihood function based on the model, and we derived the asymptotic properties of the MLEs of the parameters in the model. A goodness-of-fit test was also derived to assess an appropriate number of intervals for the model.

In oncology studies, when there are long-term survivors and when there are different hazard patterns over different time intervals, one might consider using our piecewise Gompertz model. Our model is more computationally feasible than both the mixture model and the semiparametric model with a bounded cumulative hazard function. In addition, our model is more flexible due to its piecewise structure. Our model also provides a more reliable estimate of the cure rate than the KM estimate when the sample size is small at the end of the study period.

In the future, we plan to find an optimal partition of the time span by deriving a test to compare different partitions. Sample size determination will also be assessed. Since we notice that the survival curve defined by the piecewise Gompertz model is not smooth at the interval boundaries, we might consider using a log-spline model to achieve smoothness of the curve. In addition, we plan to perform more detailed simulation studies on the semiparametric model that assumes a bounded cumulative hazard function and to study the asymptotic behavior of that semiparametric model.


APPENDIX A DERIVATION OF MARGINAL LIKELIHOOD Let and define poo poo JtQ=0 Jtn — 1 j— j poo poo ^‘ = / • 7 n /('ije""'''’®'*, Jtn-k Jtn-l i=n-k+l dtn—1 ' dtfi^k+ii ^ 1, 2, . . . , n. As /c = 1, poo n h= e”®"], (A.l) Jtn-l and when k = 2, aOO n f{U)e~^^*''’^'dtndtn-i 1-1 i=n-l poo poo J tn — 2 "in— I poo 1 Jtn-2 1 On(On-l + On) Jg-(0n-l+©n)i^(tn-2) _ g-(©n-l+0n)j -0n [e' -0„_lF(tn-2) p — ©n e-®-7. OnOn— 1 By similar derivation, we obtain that when k — 3, poo Ju-% j'g-(©n+©n-l+©n-2)F(t„_3) _ g-(©n+©n-l+©n-2)j 1 On(©n + O„_i)(0„ -I©n-l + ©n2 ) 89


90 -On 0n0n-l(©n-l + 0n-2) g — (0n+0ri-l) 0 n-l©n2 ( 0 n + ©n-l) Jg — (0n-l+0n-2)f’(tn-3) g — (0n-l+0n-2)j g-0n-2^(

91 Therefore by Equation (A. 5) = A 1 ^n—k+l,n—i+l^n—k,n—i+l 1 Q c >^n (A.7) ^n—i,n—i+l^n—i—l,n—i+l ‘ ' ' Sn—k+l,n—i+l Pn—k+l,n—i+l for i < A: and by the definition of Pa,b given in Chapter 1. In addition by Equation (A.6), t=l Ak+i,k+i = “7 ^n-k k—l = E(-i) -fc + l,n — »+l Z=1 k—l \i-k ^-S, ^n—k ^ *^Tl — fc + l,Tl — t-+-l I kjk — k-^1,71 — fc + 1 On-. 1 g ^n — fc+l,7i — i+l 1=1 +Ak,k 'S'n— i,n— 1 +l‘S'n— 1 — l,n— 1+1 Sn—k+l,n—i+l Sn—k,n—k g *^n— fc+l,n — fe + 1 ^n—k (A.8) Also we observe that ©n where the first equality is true by Equation (A. 2) and the second is true from Equation (A.l). Therefore ^1, 1 1 *S'n,n ©n Similarly we obtain through derived Equation (A.8) C p — Sn,n A —A ^ •^2,2 -^11 c o o ^ P ~'^ n^n ,S ,S i 1 ^ 1,71— 1 and If we let ~Sn— l,n A3, 3 — ‘S'n — 1 ,n— 1 ‘5'n — 2,n — 2 ‘S'u — 1 ,n g *^n — fc+2,n C , , , Dn-fc+2,n’ *^n— fc+l,n— fc+l-* (A.9)


92 where P” was defined earlier in Chapter 1 as rij=n-fc +2 ^n-k+ 2 ,jIn addition, we observe from Equation (A. 8) that fc -2 1 '1 ^ C . , , , ^P — fc+2,n — i+l / pn-k+2,n—i+l fe+l,n— fe+1 _c _ k—1 p *^71 — fc+2,n fe+i -} Sn-k+l,n-k+l pn-i+2,npn-k+2,n-i+l Hence from equating Equations (A. 9) and (A. 10), the following equation holds. (A.IO) k I ^ ^ V(-D* = = (-1)*“^ / ^ ' pn—k-\-ly i=0 (A.ll) Now we can derive Ajt+i,fc+i as follows: 1A:+1,A:+1 Sn—k,n—k , ^1=1 g-S„_*+2,n Sn—k.n—k f ^ / v '' ' pn—i+2,'i pn—k+l,n—i+l 1 + + Afc,fce 5n-fc+l.n-* + l| 1 i=l f> *^n — fc+2,n I -{ 1 1 + Q , , pn—k+\,n pn— fc+2,n pn— fc+l,7j— fc+1 C , , , , , , pn—k+2, ^n—k,n—k ' •* '^n-fc+l,n— fc+l-* ^“*^71— /c+2,n C , , pn-k+\,n ’ ‘Jn-k,n-k^ by Equation (A.ll). Therefore, } (A.12) ^i,k — Aj^i ^71 — 1+1,71 — 1 + 1 ' p , , . , pn-i+2,n ’ Oi— A:+l,n— 1 + 1 -* for i < k. Ik, as defined in Equation (A. 2), therefore can be written as ^ — 1+2,71 — 1+1 ^(^n—fc ) — — (A.13) 4 = ^(-ir‘i=l (A.14)


Set k = n, we obtain since ‘n n 1 = V(-iy-i n = y(-iy-i ' pn—i+2,n [e — Sn — i+2,n *S'l,n— t+l^(^o) g i=l n [e “^n — *+2,n g '^n — t+2,n ^ g *^l,n i=l n— 1 pn—i+2,n p, . , * ^ l,n— i+l 1=1 — <7 • I i n— 1 _c, E ^ — On — 1+1,71 *Jl,n (1 )*— ^ y^(-i)* — ^ — ' ' pn—i+l,np Z— /' ' pn—i+l,n i=0 i=0 T — i g ~' 5 n t + l,n = y^(-i)* 1 (1 )’ '' pn-i+l,np . ^ i=0 n pn pl,n P\,n—i p 'S'n— t+l,n E(-i)‘-5^ 1=0 pn-i+l,np^ ’ n— 1 E = (-l) n-r pi,n ’ (A. Pi „_jP"-*+i’" 1=0 by Equation (A. 11). Hence the marginal likelihood function defined in Equation (1.14) can be derived since we observe that poo poo L „=/ ••/ L = n « f /„, Jto—0 Jtn-l j_l where L was defined in Equation (1.13).


APPENDIX B
PROOF OF LEMMA 4.1

The results presented in Lemma 4.1 hold for any general parametric model if it satisfies certain regularity conditions. In our proof, we show that the results also hold for any parametric model with independent censoring under certain regularity conditions. Let T^* = \min(T, U) be the observed survival/censoring time and \Delta be the censoring indicator. Denote the hazard function, the probability density function, and the survival function for the survival time T by h(T; \theta), f(T; \theta), and S(T; \theta), respectively. Denote the survival function for the censoring time U by G_U(u). Then the likelihood function is written as

L(T^*, \Delta) = h^{\Delta}(T^*; \theta) S(T^*; \theta).

Therefore the expectation of the score function S(\theta) has the following expression:

E[S(\theta)] = E\left[\frac{\partial}{\partial\theta} \log L(T^*, \Delta)\right]
            = E\left[\frac{\frac{\partial}{\partial\theta}\{h^{\Delta}(T^*;\theta)S(T^*;\theta)\}}{h^{\Delta}(T^*;\theta)S(T^*;\theta)} I(\Delta = 1)\right]
              + E\left[\frac{\frac{\partial}{\partial\theta}\{h^{\Delta}(T^*;\theta)S(T^*;\theta)\}}{h^{\Delta}(T^*;\theta)S(T^*;\theta)} I(\Delta = 0)\right].   (B.1)

We can write the first term on the right-hand side of Equation (B.1) as

E\left[\frac{\frac{\partial}{\partial\theta} f(T;\theta)}{f(T;\theta)} I(T < U)\right]
  = \int_0^\infty \frac{\partial}{\partial\theta} f(u;\theta)\, G_U(u)\, du
  = \frac{\partial}{\partial\theta} \int_0^\infty f(u;\theta)\, G_U(u)\, du
  = \frac{\partial}{\partial\theta} P(T < U).


Similarly, the second term on the right-hand side of Equation (B.1) can be written as

E\left[\frac{\frac{\partial}{\partial\theta} S(U;\theta)}{S(U;\theta)} I(T > U)\right]
  = \int_0^\infty \frac{\partial}{\partial\theta} S(u;\theta)\, d(1 - G_U(u))
  = \frac{\partial}{\partial\theta} \int_0^\infty S(u;\theta)\, d(1 - G_U(u))
  = \frac{\partial}{\partial\theta} P(T > U).

Hence,

E[S(\theta)] = \frac{\partial}{\partial\theta}\,[P(T < U) + P(T > U)] = \frac{\partial}{\partial\theta}(1) = 0.

Now we need to prove V[S(\theta)] = D. Let \partial^2 \log L / \partial\theta^2 and (1/L)\,\partial^2 L / \partial\theta^2 represent the (p + IG) \times (p + IG) matrices formed by the corresponding second-order derivatives. Note that

\frac{\partial^2 \log L}{\partial\theta^2} = \frac{1}{L}\frac{\partial^2 L}{\partial\theta^2} - \left(\frac{\partial \log L}{\partial\theta}\right)\left(\frac{\partial \log L}{\partial\theta}\right)^{T}.

It is clear that V[S(\theta)] = D(\theta) if we can prove

E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2}\right] = 0.

Note that we can derive E[(1/L)\,\partial^2 L/\partial\theta^2] as

E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2}\right] = E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2} I(\Delta = 1)\right] + E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2} I(\Delta = 0)\right].   (B.2)

The first term on the right-hand side of Equation (B.2) can be written as

E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2} I(\Delta = 1)\right]
  = E\left[\frac{\frac{\partial^2}{\partial\theta^2} f(T;\theta)}{f(T;\theta)} I(T < U)\right]
  = \int_0^\infty \frac{\partial^2}{\partial\theta^2} f(u;\theta)\, G_U(u)\, du
  = \frac{\partial^2}{\partial\theta^2} P(T < U).   (B.3)


Similarly, the second term on the right-hand side of Equation (B.2) is written as

E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2} I(\Delta = 0)\right]
  = E\left[\frac{\frac{\partial^2}{\partial\theta^2} S(U;\theta)}{S(U;\theta)} I(T > U)\right]
  = \int_0^\infty \frac{\partial^2}{\partial\theta^2} S(u;\theta)\, d(1 - G_U(u))
  = \frac{\partial^2}{\partial\theta^2} P(T > U).   (B.4)

The combination of Equations (B.3) and (B.4) shows that

E\left[\frac{1}{L}\frac{\partial^2 L}{\partial\theta^2}\right] = \frac{\partial^2}{\partial\theta^2}\,[P(T < U) + P(T > U)] = 0,

and hence V[S(\theta)] = D(\theta), which completes the proof.
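The zero-mean property of the score and the information identity established above can be checked numerically. The sketch below does this for the simplest special case, an exponential survival time with independent exponential censoring, where the score of the observed-data likelihood is \Delta/\theta - T^*; the distributions and parameter values are chosen only for illustration and are not part of the dissertation's model.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, c = 0.8, 0.5          # failure rate and censoring rate (illustrative)
n = 200_000

t = rng.exponential(1.0 / theta, n)      # survival times
u = rng.exponential(1.0 / c, n)          # independent censoring times
t_star = np.minimum(t, u)
delta = (t <= u).astype(float)

# per-observation log-likelihood is delta*log(theta) - theta*t_star, so the score is:
score = delta / theta - t_star
print(f"mean score = {score.mean():.4f} (should be near 0)")
print(f"var score  = {score.var():.4f}, mean of -d2 logL = {delta.mean() / theta**2:.4f}")
```

The two printed quantities on the last line should agree, illustrating that the variance of the score equals the expected information under independent censoring.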


APPENDIX C PROOF OF LEMMA 4.2 We will prove this lemma for / = 1. The proof for / = 2 is similar and is omitted. Note that '0 0 < u'^h{u]>^ ,K{j),J{u))du^ poo pt = / {/ u"^h{u;)< ,K{j),3{u))du} d[Snj{t-,0o)Gu{t)], (C.l) Jo Jo where and Gu{t) are survival functions of the survival time Tj and the censoring time Uj, respectively. Note that poo pt { u"^h{u-,x' ,K{j),J{u))du}d[Snj{t-,0o)Gu{t)] Jo Jo = 0a)Gu{t) [ f «’"A(a; A', «(j), J(«)M«] } |„” poo + / Snj{t;Oo)Gu{t)t^h{t-x\K{j),3{t))dt Jo poo < / Snj{t;6o)t'^h{t-,x! ,K{j),J{t))dt. (C.2) Jo Let e = mm I e 2,„.x(0) a then under the assumption that the individual belongs to group g, Snj{t; 9o)f^h{t\ a', k(j), J(t))dt / T/_i poo ^ = / 5„,(t;0o)r/i(t;A,/c(i),J(t))dt+ / 6>o)r/i(t; A , K(j), J(t))dt Jo Jtj-i f 97


98 Jo roo i-t , + / exp{— / h{u-,X(i^K{j)^3{u))du]f^e^'9^dt. (C.3) Jtj-i Jo It is observed that the first term in Equation (C.3) is uniformly bounded for A satisfying ||A' — Ao|| < e. The second term in Equation (C.3) can be written as noo rt , / exp{— / h{u\\o,K,{j)^i{u))du}f^e^’9^dt J Tj-l Jo < f exp {— f e^'9'^du}t'^e^^9^dt, Jri-i Jti-i (C.4) since ||xj|| < B for any j = 1, 2, . . . , n by Assumption 1. We will discuss Equation (C.4) in the following three cases; Case 1. Xfg < 0. ||A' — Ao|| < e implies X'jg < Xfg + e < ^Xfg < 0. Therefore, o-B\\0o\\ f Jti-i r Jti-i dt = Fi < oo. poo / exp{-< Jti-i Case 2. xfg = 0. The following inequality holds. roo ft , foo , I exp{— e^’s'^du}f^e^^9^dt < / exp{— — r/_i)}t”^e^^»*dt. Jti-i Jti-i Jti-i Since ||A' — Ao|| < e implies that jA;^| = — A^°^| < e and A^^ < Equation (C.4) satisfies < poo / exp{-( Jrj-i poo / exp{-( Jti-i -BWM J Tl-l -t — r/_i)}t"*dt = T 2 < oo. Case 3. xfg > 0. Equation (C.4) can be written as poo / exp{Jti-i poo = / exp{J Tl-l Jti-i }re^*dt < Ta < oo, f>^Ia ^ p'^Iq 'I-l x(0) '^ig


99 since X'jg is uniformly bounded. Hence, there exists a F > 0 such that Eoo{ J A',K(j), J(«))du| < F, for X' satisfying ||A' — Aoll < e.


APPENDIX D PROOF OF LEMMA 4.3 Note that for 0 < 5 < oo, poo pt [ u^h\u-,Xo,K,{j),3{u))du\d[Snj{t-,Oo)Gu{t)] Jo Jo poo pS > [ u^h^{u;XQ,K{j),3{u))du]d[Snj{t-,9o)Gu{t)] Js Jo pS poo = / u"^h^{u]Xo,K{j),3{u))du / d[Snj{t]9o)Gu{t)] Jo Js [J u'^h\u-,Xo,K{j),3{u))du][Snj{S;9o)Gu{S) Snj{oo-,9o)Gu{oo)]. We observe that Snj{5-,9o)Gu{S) Snj{oo-,9o)Gu{oo) > Snj{S-,9o)[Gu{6) G'u(oo)] pS > exp{-e^ll^“M h{u-,Xo,K{j),3{u))du}[Gu{S)-Gu{oo)] Jo — > 1 — G[/(oo), as (5 0 uniformly for j = 1, 2, . . . , n. Therefore, we can find a > 0 and an e > 0 such that Snj{So-,9o)Gu{do) Snj{oo;9o)Gu{oc) > e and pSo L vJ^h:{u\ Ao, K{j),3{u))du > e. Thus the expectation term in Equation (4.1) > for all j = 1, 2, . . . , n. If we take 7 = e^, the lemma is proved. 100


APPENDIX E PROOF OF LEMMA 4.4 Let = (a^, b^) be a (p + /G) -dimensional unit vector, where a is p-dimensional and b^ = {bu , . . . , bn , . . . , bic, , bjoY is /G-dimensional. It suffices to show that c^DnC > wy, for any unit vector c. We showed in Chapter 3 that YDjiC = / h{u; A, /c(j), J(u))(a^Xj + u\YQ,{u, j))^du. j=\ •'o By expanding the expectation in the above equation, we obtain c^D„c ^ noo f ft o') = ly h(u; A,/c(j), J(u))(a'^Xj + ub^Q(u, j)) du|dFr;(t) = h{u-,X,K{j),3{u)){a^yij + u\YQ{u,j)Ydu'^d[Sj{t)Gu{t)]. Since l|x,ll < B by Assumption 1, C^DnC ^ fOO ft > / / h{u-,X,K{j),J{u)){a^Xj + u)YCl{u,j)) du d[Sj{t)Gu{t)] Jo Jo G poo ft = Y2 / h{u-,X,K{j),J{u)){a^y:j + u\YQ{u,j)) du 9=1 jegroup g ^ ^ d[SYt)Gu{t)]. (E.l) 101


102 Assume that the j'" individual belongs to the treatment group g, then / ub Q('U,j) == ^ ^ ^ig^ue[Ti-i,Tj)] i=l and i=l Note that there exist a go where I < go < G and a io where I < io G such that *090 — JQ 1 IG (E. 2 ) We will discuss the behavior of the right-hand side of Equation (E.l) in the following four cases. Case 1 . ||a|p < 2(8 B^g/+t^) ^ treatment group ^0, poo pi / h{u;X,K{j),J{u))[a^yij + uh^Q,{u,j)]^dud[Sj{t)Gu{t)] Jo Jo poo pT\ > / h{u-,X,K{j),3{u))[a^Xj+uh^Q{u,j)Ydud[Sj{t)Gu{t)]. J Tt J hr-i Because u > |ti, ||x|| < B, and ^ , we observe that [a^Xj +ub^Q(u, j)]2 > hub^CliuJ)]'^ (a^Xj) 1 1 > -^(1llain-B^llalp > 8 IG^ " " ’ " " 16 /G Hence, > poo pt / h{u] X,K{j),J{u))[a^Xj + uh^Q{u,j)fdu d[Sj{t)Gu{t)] Jo Jo 2 poo pTi ~T^lJr lj,^^''^’^''du]d[Sj{t)Gu{t)] e 5 »-*.")[S,(rOGt;(Ti) S,(oo)G„(c»)l


103 > > 16 G/ AigQ 1 ^ ^g'^lSo'’’! _ exp {— f e^^®o“(iu}[G{/(ri) — Gt/(oo)] GI Algo Jq 16GI Algp 7i > 0, by Assumption 4. 2 Case 2. ||a||^ < 2 { 8 b^gi+t^) ^ treatment group go, -IM h{u] A,k(j), J(u))[a^Xj +ub^Q,{u,j)]^du^d[Sj{t)Gu{t)] {/ + uh^Q{fJ'J)?du^d[Sj{t)Gu{t)] Jti-I ''Jtj -1 roo . pTi-i+S ^ > { h{u-,X,K{j),J{u))[a^:x.j + uh^Cl{u,j)]^du\d[Sj{t)Gu{t)]. Jtj-i+S Since u > r/_i > |ti in the above expression, with the additional conditions that i-INI igo — IG and ||xj|| < B for any j = 1, 2, . . . , n, we obtain that [a^Xj + ub^Q(u,j)] > 2 \ '’'i 16IG Therefore, />00 pTj-i+S . {/ h{u-,X,K{j),J{u))[a^yij + uh^Q{u,j)fdu\d[Sj{t)Gu{t)] i_(e^/9oC^-i+<^) e^’^o'^'-^)[Sj{Ti-i + 6 )Gu{ti-i + 5) 5j(oo)G[/(oo)] > > 16 G 7 A/gp + 6)[Gu{t,-, + S) G„(oo)|. iOGi A/go (E.3) Note that Sj(r/_i + S)[Gu{ti-i + ( 5 ) — G£/(oo)] Sj(Tl-l)[Gu{Tl-l) — Gj/(oo)] > exp{-e®ll^ll / /i(u; A, /c(j), J(u))dn}[G[;(r/_i) G[/(oo)] > e > 0,


104 where Assumption 4 is applied. Therefore, there exists a 5 > 0 such that J ly /i(u; A, K(j), J(u))[a^Xj + uh^ Q{u, j)]‘^du'^d[Sj{t)Gu{t)] > 72 > 0. 2 Case 3. |la|p < 2 {s b^gi+t^) I ^. Therefore, -/ [J h{u-,X,K{j),J{u))[a^y.j + uh^Q{u,j)fdu^d[Sj{t)Gu{t)] poo pTig > — / h{u;X,K{j),J{u))[a^Xj + uh^Q{u,j)]‘^dud[Sj{t)Gu{t)] Jn^ JriQ-i > > > 16GI 1 16G/ e^'o9o^du[Sj{Tio)Gu{Tio) 5j(oo)Gt;(oo)] T-f 1 16G/ Aj(jgo [Gu{Tio) ~ G'f/(oo)] .(e^iosoUo _ eNo9o^io-i)5'^.(riJ[Gt/(riJ G[/(oo)] I (e'^»o9o'^‘o — 6^090 ^•0-1) / h{w,X,K{j),J{u))du} Jo = 73 > 0, where Assumption 4 is used again. By combining the above three cases, we obtain that c^DnC > . where j' = min(7i, 72, 73) and Ug^ is the number of individuals in group ^oUnder Assumption 3 the following holds: • r mm — > di, l

105 when n is sufficiently large. Therefore, c^DnC > = n'y where 7 = 5iVe Case 4. ||a|r > 2{SB'^GI+Tf) say el Note that |ub^Q(«, j)| < \u\GI. Therefore, (a^x, + !ib''’Q{ti, j))^ > l(a’’x,)^ [tib^Q(a, j)]^ By Assumption 2, there exist an e > 0, a 5 > 0, and a no > 0. When n > no, there exists a subset A of {1,2, ...,n} with size ua > nS such that T I AtXjI > e for j € A. (E-4) HI '' Now we assume n > uq and (E.4) is satisfied. Therefore, if we denote by Si, for u e (0,5i], [a^Xj + nb^Q(u, j)]^ > -(a^Xj)^ — > 1 2,2 -€ e Hence, C^DnC A(«;A.K(i),J(a))^/K(o.,,ll

Without loss of generality, let us assume tii < Ti. For the individual in treatment group g, 106 / OO -2 2 [ J h{u] K{j), J{u))—^I[ue{0,5i]]du\d[Sj{t)Gu{t)] poo pSi 2,2 > -y [J h{u]X,K{j),3{u))^du]d[Sj{t)Gu{t)] > 5 ,(oo)Gc,(oo)] > 7^ > 0 , where Assumption 4 is used. Note that 7^ depends on g. If we take 7o = min 7 „ > 0 and 1<5 e > e = ’^7-


APPENDIX F PROOF OF LEMMA 4.5 From Lemma 4.1, we observe that E[c^D-^l^[Fr,{e,) = c^D-^l\Dn Dr;\D-^l\ = 0 . In addition we have ,=1 -^0 1/2 n J=1 DZ^^'^cdu. \ uQ{u,j) xj uQ'^iuJ) 1 /rt We showed in Lemma 4.4 that Amin(D„) > ri'y. This implies that Xmax{Dn ' ) < -F=. Therefore, y/fpy ’ < ( ^ ^uQ{u,j) ^ (xj uQ \ Y • yxj uQ^{u,j) j uQ{u,j) ^ ( \ D-'l^c = / / (F.l) Since T*, j = 1, 2, . . . , n are independent. J=1 / \ pT* 1 / ' /i(u; Ao,K(j), X, 'A (xj uQ^{u,j)) 107


108 / )-l/2 \ ^uQ{u,j) ^ < [ /i(u; Ao,K(j), J(m))c^D„ j=i Jo D-‘/’cd«}' < f ' /i(u;Ao,KO),J(ti))— (B^ + /Gti^)dti}^ maxE'l / u’^/i(u; Ao, K(j), J(u))(iw > ^m= 0,2 1^0 J (xj uQ^(«,i)) < e2B||j3o|| (B2 + iGf n^'f< e2B||/3o|| = 0(n->). (B^ + IG)'r)2^2 nT The last inequality holds by Lemma 4.2 and F is a constant depending on Aq. Therefore, for any e > 0, B(|c’’£);‘/"(E„(e„) > e) < ^ ^ 0 , as n ^ 00 by the Chebyshev’s inequality. This implies c^D-^^^[Fn{eo) D,,]D-^/^c 4 0.


APPENDIX G PROOF OF LEMMA 4.6 Let 0^ = {f3^,X^) and 9^ = (/3q, A g), then c^D-^^^[Fr,{9)-Fn{0o)]D-^^^c = f J(^)) Ao,k(j), j(w))] j=i -^0 / By Taylor’s expansion, \ uQU) ! (x^ uQ^(;))D„^/2cdu. (G.l) [e^"^^h{u-, A, k(j), J(^)) e^"^^h{uAg, K{j),3{u))] = [xJ(/3-/3g) +m 1^(AAo)]e^j'’'^'/i(u;Aj,K;(j), J(u)), r\i where 1 is a /G-dimensional vector of 1 and (/3j, AJ) = cxj{0^, Ag ) + (1 — aj){0^,X^) for 0 < aj < 1. By Equation (4.3) and Assumption 1, {/3 /3g) + ul^( A Ag) I < (R + IGu ) ^ which, together with (F.l), implies that the absolute value of the integrand in (G.l) < (R + /Gu)4=\/-e^"^“" — (5" + ^Gu2)/i(u;A„K(i),J(u)). (G.2) y 7 By Lemma 4.2, ' {B + /Gu)^^e^ll^°ll^(R2 + IGu^)h{uXj, k(j), J(«))du} 109


no < n(B + /G)(52 + /G)e®ll^»ll — max L ri\/n7v^ o

Hence, the lemma is proved, since mean square convergence implies convergence in probability.

APPENDIX H
PROOF OF LEMMA 4.7

From Lemmas 4.5 and 4.6,

c^T D_n^{-1/2} [F_n(\bar{\theta}) - D_n] D_n^{-1/2} c \to 0   (H.1)

in probability, where c is a (p + IG)-dimensional unit vector. As a result of Equation (H.1), c^T D_n^{-1/2} F_n(\bar{\theta}) D_n^{-1/2} c \to 1 in probability, which implies that

\lim_{n\to\infty} \lambda_{\min}\{D_n^{-1/2} F_n(\bar{\theta}) D_n^{-1/2}\} = \lim_{n\to\infty} \lambda_{\max}\{D_n^{-1/2} F_n(\bar{\theta}) D_n^{-1/2}\} = 1

in probability. Therefore,

D_n^{-1/2} F_n(\bar{\theta}) D_n^{-1/2} \to I_{(p+IG)\times(p+IG)}

in probability.


APPENDIX I PROOF OF LEMMA 4.8 It suffices to show that for each A:, A: = 1, 2, . . . ,p + IG, Eeo\\Snjk{9)\\' is uniformly bounded for j = 1, 2, . . . , n, where Snjk{0) is the component of the vector Snj{9)From Equations (3.2) and (3.3), we observe that \Snjk{0)\ is less than or equal to the maximum of the following two quantities; R + [ h{u-,Xo,K{j),3{u))du ' h{u;Xo,K{j),3{u))du. Jo Note that Tj is the survival time of the individual and T* = min(Tj, Uj), where Uj is the censoring time associated with the individual. To show Eeg\Snjk\ is bounded uniformly for j, it suffices to show that E[Tj]^ is uniformly bounded since we have shown the uniform boundedness of Ee^ Jq ^ h{u-, Aq, K{j),3{u))du in Lemma 4.2. Assume that the individual belongs to the treatment group g, then roo E[Tj]^ = / t^h{u-,Xo,K{j),3{u))e^o^^Snj{t;0o)dt Jo < r'“A(t;A„,/c(j),J(u))(ii Jo poo +e^ll^oll / t^h{u-,Xo,K{j),3{u))Snj{t-,do)dt. (LI) Jri-i Note that the first term in Equation (1.1) is uniformly bounded for any j. The second term is the same as in Equation (C.4) where the parameters are replaced by their true value. Hence, the uniform boundedness is proved. 113


APPENDIX J
SIMPSON'S FORMULA

Suppose we want to approximate an integral by dividing the integration interval into N_0 subintervals. Assume a is the lower bound of the integral, b is the upper bound of the integral, and the integral of interest is

\int_a^b f(c)\, dc.   (J.1)

We divide the interval (a, b) into N_0 subintervals with end points x_0, x_1, \ldots, x_{N_0}, where x_0 = a, x_{N_0} = b, x_j = a + jh, and h = (b - a)/N_0. Writing x_{j+1/2} = a + (j + 1/2)h for the subinterval midpoints, the numerical approximation of (J.1) is then written as

\int_a^b f(c)\, dc \approx \frac{h}{6}\left\{ f(x_0) + f(x_{N_0}) + 2\sum_{j=1}^{N_0-1} f(x_j) + 4\sum_{j=0}^{N_0-1} f(x_{j+1/2}) \right\}.
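A direct transcription of this rule is sketched below. It assumes the midpoint form of the composite Simpson approximation written above and checks it against an integral with a known closed-form value; the integrand and limits are chosen only for illustration.

```python
import math

def simpson(f, a, b, n0):
    """Composite Simpson approximation of the integral of f over (a, b) with n0 subintervals."""
    h = (b - a) / n0
    nodes = sum(f(a + j * h) for j in range(1, n0))          # interior end points x_1 .. x_{n0-1}
    mids = sum(f(a + (j + 0.5) * h) for j in range(n0))      # midpoints x_{j+1/2}
    return (h / 6.0) * (f(a) + f(b) + 2.0 * nodes + 4.0 * mids)

# Example: integral of exp(-t) over (0, 5) equals 1 - exp(-5)
approx = simpson(lambda t: math.exp(-t), 0.0, 5.0, 50)
print(approx, 1.0 - math.exp(-5.0))
```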


REFERENCES

Aitkin, M., Anderson, D., Francis, B. and Hinde, J. (1989). Statistical Modelling in GLIM, Clarendon Press [Oxford University Press].

Berkson, J. and Gage, R. P. (1952). Survival curve for cancer patients following treatment, Journal of the American Statistical Association 47: 501-515.

Bithell, J. F. and Upton, R. G. (1977). A mixed model for survival applied to British children with neuroblastoma, Recent Developments in Statistics, Proceedings of the 1976 European Meeting of Statisticians, North-Holland/Elsevier (Amsterdam), pp. 635-646.

Boag, J. W. (1949). Maximum likelihood estimates of the proportion of patients cured by cancer therapy, Journal of the Royal Statistical Society 11: 15-53.

Cantor, A. B. and Shuster, J. J. (1992). Parametric versus non-parametric methods for estimating cure rates based on censored survival data, Statistics in Medicine 11: 931-937.

Chen, M., Ibrahim, J. and Sinha, D. (1999). A new Bayesian model for survival data with a surviving fraction, Journal of the American Statistical Association 94: 909-949.

Doornik, J. A. (1998). Ox: An Object-Oriented Matrix Programming Language, Timberlake Consultants (London).

Farewell, V. T. (1977). A model for a binary variable with time-censored observations, Biometrika 64: 43-46.

Farewell, V. T. (1982). The use of mixture models for the analysis of survival data with long-term survivors, Biometrics 38: 1041-1046.

Farewell, V. T. (1986). Mixture models in survival analysis: Are they worth the risk?, The Canadian Journal of Statistics 14: 257-262.

Friedman, M. (1981). Piecewise exponential models for survival data with covariates, Annals of Statistics 10: 101-113.

Gieser, P. W., Chang, M. N., Rao, P. V., Shuster, J. J. and Pullen, J. (1998). Modelling cure rates using the Gompertz model with covariate information, Statistics in Medicine 17: 831-839.

Goldman, A. I. (1984). Survivorship analysis when cure is a possibility: A Monte Carlo study, Statistics in Medicine 3: 153-163.

Gordon, N. H. (1990). Application of the theory of finite mixtures for the estimation of "cure" rates of treated cancer patients, Statistics in Medicine 9: 397-407.

Gray, R. J. and Tsiatis, A. A. (1989). A linear rank test for use when the main interest is in differences in cure rates, Biometrics 45: 899-904.

Halpern, J. and Brown, B. W. J. (1987). Cure rate models: Power of the logrank and generalized Wilcoxon tests, Statistics in Medicine 6: 483-489.

Jung, S.-H. (1996). Regression analysis for long-term survival rate, Biometrika 83: 227-232.

Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data, Wiley (New York).

Karrison, T. (1987). Restricted mean life with adjustment for covariates, Journal of the American Statistical Association 82: 1169-1176.

Kuk, A. Y. C. and Chen, C.-H. (1992). A mixture model combining logistic regression with proportional hazards regression, Biometrika 79: 531-541.

Laska, E. M. and Meisner, M. J. (1992). Nonparametric estimation and testing in a cure model, Biometrics 48: 1223-1234.

Maller, R. A. (1988). On the exponential model for survival, Biometrika 75: 582-586.

Maller, R. A. and Zhou, X. (1996). Survival Analysis with Long-Term Survivors, Wiley (New York).

Peng, Y., Dear, K. B. G. and Denham, J. W. (1998). A generalized F mixture model for cure rate estimation, Statistics in Medicine 17: 813-830.

Rao, C. (1973). Linear Statistical Inference and Its Applications (Second Edition), Wiley (New York).

Ravindranath, Y., Yeager, A. M., Chang, M. N., Steuber, C. P., Krischer, J., Graham-Pole, J., Carroll, A., Inoue, S., Camitta, B. and Weinstein, H. J. (1996). Autologous bone marrow transplantation versus intensive consolidation chemotherapy for acute myeloid leukemia in childhood, The New England Journal of Medicine 334: 1428-1434.

Risch, N. (1983). Estimating morbidity risks with variable age of onset: Review of methods and a maximum likelihood approach, Biometrics 39: 929-939.

Sposto, R., Sather, H. N. and Baker, S. A. (1992). A comparison of tests of the difference in the proportion of patients who are cured, Biometrics 48: 87-99.

Tarone, R. E. and Ware, J. (1977). On distribution-free tests for equality of survival distribution, Biometrika 64: 156-160.

Taylor, J. M. G. (1995). Semi-parametric estimation in failure time mixture models, Biometrics 51: 899-907.

Tsodikov, A. (1998). A proportional hazards model taking account of long-term survivors, Biometrics 54: 1508-1516.

Yamaguchi, K. (1992). Accelerated failure-time regression models with a regression model of surviving fraction: An application to the analysis of "permanent employment" in Japan, Journal of the American Statistical Association 87: 284-292.


BIOGRAPHICAL SKETCH

Haoyi Chen was born on November 29, 1972, in Jinan, People's Republic of China. She received her Bachelor of Economics degree from Shanghai University of Finance and Economics in July 1993, majoring in statistics. After two years of industrial experience in business statistics in China, she joined the graduate program in the Department of Statistics at the University of Florida. She received her Master of Statistics degree in January 1998. She is expected to receive her PhD degree in December 2001.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Myron N. Chang, Chairman
    Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Randolph L. Carter
    Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Ronald H. Randles
    Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Samuel S. Wu
    Assistant Professor of Statistics

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

    Michael B. Resnick
    Professor of Pediatrics


This dissertation was submitted to the Graduate Faculty of the Department of Statistics in the College of Liberal Arts and Sciences and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December 2001

    Dean, Graduate School