1 − α. This approach gives even narrower intervals than those obtained by inverting the two-sided test with the ordinary P-value. Note that θ_- is the smallest θ satisfying P(θ) > α; thus, below θ_-, there is no point having P(θ) > α. Also note that P*(θ) is bounded by P(θ), that is, P*(θ) ≤ P(θ). For instance, at the ordinary lower limit, if P*(θ_-) = P(θ_-), then θ*_- = θ_-; otherwise, θ*_- > θ_-. By a symmetric argument, θ*_+ ≤ θ_+. Hence, the two-sided modified confidence interval is contained within the two-sided ordinary confidence interval.

We illustrate these alternative "exact" confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1, the 95% confidence interval obtained by inverting a two-sided test is (1.29, 261.49) based on the ordinary exact P-values and (1.38, 40.45) based on the modified exact P-values, with either T' = Σ_k X²_k(θ) or the table probability as the secondary statistic. Using Table 2.2, the confidence intervals are (0.88, 15.92) using the ordinary exact P-values, (1.01, 10.30) using P*(θ) with T' = Σ_k X²_k(θ), and (1.01, 11.14) using P*(θ) with the table probability. Table 2.3 contains 95% confidence intervals obtained using the two separate one-sided ordinary and modified exact P-values, and using the ordinary and modified two-sided exact P-values. For these tables, the confidence interval constructed using the ordinary two-sided P-value is shorter than the ordinary one based on two one-sided P-values. In fact, for each data set, the upper endpoint of the two-sided interval equals the endpoint that would be obtained with the one-sided method for


a 90% confidence interval. For each type of interval, the intervals based on the modified P-value are narrower yet. For Table 2.2, the modified confidence interval based on T' = Σ_k X²_k(θ) is shorter than the corresponding confidence interval based on the table probability, in both the one-sided and two-sided cases.

One way to compare the methods of constructing the confidence interval, and to quantify their degree of conservativeness, is the coverage function (Vollset and Hirji 1991). The coverage function, for a given value of θ, is computed by summing P(t; θ) over the values t for which the confidence interval contains the given value of θ. The function is then plotted against θ. Hence, it displays how closely the actual coverage probability falls to the nominal coverage probability. For the conditional distribution having the fixed marginal counts of Table 2.1, Figures 2.9 and 2.10 show the actual coverage probability as a function of the true log odds ratio, for 95% confidence intervals based on inverting separate one-sided tests using the ordinary or modified P-value. For the secondary partitioning in the modified P-value, we use T' = Σ_k X²_k(θ) for Figure 2.9 and the table probability for Figure 2.10. There is a clear advantage to using the interval based on the modified P-value. For Table 2.2, this calculation requires a huge amount of computing time, and we have not been able to obtain results using the conditional distribution based on the margins of all 18 partial tables. Thus, we display results using various subsets of the partial tables of Table 2.2. Figure 2.11 gives an analogous display using various numbers of partial tables from Table 2.2. It shows how the conservativeness is reduced by using confidence intervals based on inverting tests with modified P-values. As the number of strata increases, the modified approach yields an actual level closer to the nominal level, and this holds over a broader range of odds ratio values.
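The endpoint search behind these intervals (find the smallest θ whose one-sided exact P-value exceeds α/2) can be sketched for a single 2 × 2 table. The function names below are ours and the code is illustrative only, not the dissertation's FORTRAN program; it bisects on θ using the noncentral hypergeometric conditional distribution.

```python
from math import comb

def odds_ratio_pmf(t, n1, n2, m, theta):
    # noncentral hypergeometric pmf: conditional distribution of the count T
    # in a 2x2 table with row totals n1, n2, column total m, odds ratio theta
    lo, hi = max(0, m - n2), min(n1, m)
    w = {s: comb(n1, s) * comb(n2, m - s) * theta**s for s in range(lo, hi + 1)}
    return w[t] / sum(w.values())

def upper_pvalue(t0, n1, n2, m, theta):
    # one-sided exact P-value P(T >= t0; theta), increasing in theta
    hi = min(n1, m)
    return sum(odds_ratio_pmf(t, n1, n2, m, theta) for t in range(t0, hi + 1))

def lower_limit(t0, n1, n2, m, alpha=0.05, tol=1e-8):
    # smallest theta with P(T >= t0; theta) > alpha/2, by bisection on log theta
    lo, hi = 1e-8, 1e8
    while hi / lo > 1 + tol:
        mid = (lo * hi) ** 0.5
        if upper_pvalue(t0, n1, n2, m, mid) < alpha / 2:
            lo = mid
        else:
            hi = mid
    return hi
```

The upper limit is found the same way from the opposite tail; the stratified (common odds ratio) case replaces the single-table pmf with the conditional distribution over all K strata.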
For either approach, for sufficiently large θ, all tables with those margins would have the lower bound of the interval below θ; for sufficiently small θ, all tables would have the upper bound above θ. In such cases, the actual probability of coverage of a


100(1 − α)% confidence interval has lower bound 1 − α/2. That bound is achieved at values of θ that are potential endpoints of the intervals (Neyman 1935). To show this, let (θ_-, θ_+) denote the ordinary interval based on a one-sided test. Suppose that the value of the upper limit θ_+ is large enough that all the lower limits from other possible tables are less than θ_+. Since θ_+ is constructed by inverting the one-sided α/2 test, we have P(T ≤ t₀; θ_+) = α/2 and, accordingly, P(T ≥ t₀ + 1; θ_+) = 1 − α/2. The coverage function at θ = θ_+ is

C(θ_+) = Σ_t I(t, θ_+) P(t; θ_+) = P(T ≥ t₀ + 1; θ_+) = 1 − α/2,

where I(t, θ_+) is an indicator of whether or not θ_+ is within the confidence interval at T = t. Note that at θ = θ_+ we have P(T ≤ t₀; θ_+) = α/2, and θ_+ is the upper limit. At a value T = t', the fact that θ_+ is within the interval corresponds to P(T ≤ t'; θ_+) > α/2. To satisfy this, we need t' ≥ t₀ + 1, since P(T ≤ t₀; θ_+) = α/2. Hence, the coverage probability, which is the summation of P(t; θ_+) over t such that t ≥ t₀ + 1, is 1 − α/2. For θ > θ_+, the coverage function has C(θ) ≥ 1 − α/2.

Figures 2.12 and 2.13 give an analogous display for the confidence intervals based on inverting two-sided tests using the ordinary or modified P-value, for Table 2.1. For the secondary statistic T', Figure 2.12 uses Σ_k X²_k(θ) and Figure 2.13 uses the table probability. Again, there is an advantage to the interval based on the modified P-value. Comparing the coverage probability figures for the confidence intervals, we see there is almost always an advantage to using the confidence interval based on inverting two-sided tests. Figure 2.14 gives an analogous display using some fixed sets of margins of Table 2.2. There is a dramatic improvement in the two-sided modified confidence intervals when the number of strata is large. As the number of


strata increases, we can expect the actual coverage probability to be very close to the nominal coverage probability. When log θ is between −2 and 2, we see a large increase in the coverage probability for both the ordinary two-sided and the modified two-sided confidence intervals. At that point, many new tables whose confidence intervals contain the given value of θ are added to the calculation of the coverage probability, and the jump comes from the newly included non-null table probabilities. For the coverage probability based on two-sided ordinary tests, the big jump occurs before the big jump for the two-sided modified tests, and the amount of increase is greater than that of the two-sided modified tests. Also, at that jump point, more new tables are included for the coverage probability based on two-sided ordinary tests than for that based on two-sided modified tests. We have observed similar results using other sets of fixed margins.

In particular, for the two-sided approach, for large |log θ|, the true coverage probability has 0.95 as a lower bound rather than 0.975. For the proof, let (θ_-, θ_+) be the ordinary confidence interval based on the two-sided test. Suppose that the value of the upper limit θ_+ is large enough that all of the lower limits from other possible tables are less than θ_+. At θ = θ_+, the coverage function is

C(θ_+) = Σ_t I(t, θ_+) P(t; θ_+) = Σ_{t : P(t; θ_+) > P(t₀; θ_+)} P(t; θ_+) ≥ 1 − α,

since at θ = θ_+ we have Σ_{t : P(t; θ_+) ≤ P(t₀; θ_+)} P(t; θ_+) =

α. For θ_+ to be within the interval at T = t', the two-sided P-value at t' must exceed α; to satisfy this, we need P(t'; θ_+) > P(t₀; θ_+). Then the two-sided


ordinary P-value is larger than α at T = t'. Hence, the coverage probability, which is the summation of P(t; θ_+) over t such that P(t; θ_+) > P(t₀; θ_+), is at least 1 − α. Also, for θ > θ_+, the coverage function has C(θ) ≥ 1 − α. For a special case, suppose that P(t; θ_+) > P(t₀; θ_+) for all t > t₀. Then at θ = θ_+,

C(θ_+) = Σ_{t : P(t; θ_+) > P(t₀; θ_+)} P(t; θ_+)

= Σ_{t ≥ t₀ + 1} P(t; θ_+) = 1 − α, since at a value T = t', the fact that θ_+ is within this interval corresponds to P(T ≤ t'; θ_+) > α; this requires t' ≥ t₀ + 1, since P(T ≤ t₀; θ_+) = α. Hence, the coverage function has C(θ_+) ≥ 1 − α. This relates to the property mentioned previously, by which an interval endpoint for the two-sided approach with error probability α can equal one for the one-sided approach with error probability 2α.

So far, we have used the coverage probability to compare the methods of constructing the confidence interval. An alternative way to compare them is to compute the expected length of the confidence intervals for θ or for log θ. A complication results from the infinite endpoints that occur at T = t_min or T = t_max. Figure 2.15 displays the expected length of confidence intervals for θ, for four methods, using the margins of Table 2.1. The two-sided modified confidence interval has the smallest expected length, uniformly in θ. For instance, the expected lengths at θ = 1 are 21.84, 17.22, 13.78, and 11.21 for the one-sided ordinary, one-sided modified, two-sided ordinary, and two-sided modified intervals, respectively. For this figure, we arbitrarily set the


upper limit equal to 1000 whenever T = t_max. Since the expected length depends on the upper limit at T = t_max, that value was chosen to be almost two times the maximum finite upper limit among the four methods. Figure 2.16 presents the analogous expected length of confidence intervals for log θ, using the margins of Table 2.1. Again, the two-sided modified confidence interval has uniformly the smallest expected length. We use 1.0 × 10⁻³ for the lower limit of θ at T = t_min and 1000 for the upper limit of θ at T = t_max. Figures 2.17 and 2.18 give analogous displays using the margins of Table 2.1, comparing the lengths conditional on T ≠ t_min or t_max. Then the expected length does not depend on the values of the lower limit at T = t_min and the upper limit at T = t_max. Again, the two-sided modified confidence interval has uniformly the smallest expected length.

2.4.3 The One-Sided Mid P Confidence Interval

For confidence intervals for a common odds ratio based either on inverting two separate one-sided tests or on inverting a two-sided test, one can construct even narrower intervals, albeit not "exact" ones, by inverting the tests based on the modified mid P-value. The ordinary mid P confidence limits based on inverting two separate one-sided tests are found using the functions

P_mid(1)(θ) = P₁(θ) − (1/2) P(t₀; θ),   P_mid(2)(θ) = P₂(θ) − (1/2) P(t₀; θ).   (2.17)

The limits are determined by the same method used for the modified exact confidence interval, using P_mid(1)(θ) for the lower limit and P_mid(2)(θ) for the upper limit. Though


approximate, this type of confidence interval based on the ordinary mid P-value has been observed empirically to behave well (Mehta and Walsh 1992).

Following the modified approach based on a one-sided modified mid P-value, let B₁(θ) = {Z : Z ∈ Γ, T = t₀, T'(θ) = t'₀(θ)}. The modified mid P confidence interval based on inverting two separate one-sided tests uses

P*_mid(1)(θ) = P*₁(θ) − (1/2) P(B₁(θ); θ),   P*_mid(2)(θ) = P*₂(θ) − (1/2) P(B₁(θ); θ).   (2.18)

The limits are chosen by the same method used for the modified exact confidence interval, using P*_mid(1)(θ) for the lower limit and P*_mid(2)(θ) for the upper limit. This approach tends to give narrower intervals than those obtained by inverting the one-sided test with the ordinary mid P-value.

We illustrate these confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1, the 95% confidence interval obtained by inverting a one-sided test is (1.34, 266.54) based on the ordinary mid P-values and (2.22, 56.00) based on the modified mid P-values using either T' = Σ_k X²_k(θ) or the table probability for T'. Using Table 2.2, the confidence intervals are (0.98, 16.89) using the ordinary mid P-values, (1.01, 13.61) using the modified mid P-values with T' = Σ_k X²_k(θ), and (1.04, 14.85) using the modified mid P-values with the table probability for T'.

2.4.4 The Two-Sided Mid P Confidence Interval

As the two-sided approach tends to give an interval that is usually narrower than the one based on inverting two separate one-sided tests, we can construct a shorter interval using two-sided mid P-values. Though these cannot guarantee achieving at


least the nominal confidence level, one can define mid P versions of the ordinary two-sided and modified two-sided intervals. For testing a particular value of θ, a two-sided mid P-value can be defined as

P_mid(θ) = P(θ) − (1/2) P({Z : Z ∈ Γ, P(t; θ) = P(t₀; θ)}).   (2.19)

The limits are determined by the same method used for the two-sided exact confidence interval. Following the modified approach, one can construct a modified confidence interval based on two-sided tests by using a modified mid P-value. We define a modified two-sided mid P-value for testing a particular value of θ as

P*_mid(θ) = P*(θ) − (1/2) P({Z : Z ∈ Γ, P(t; θ) = P(t₀; θ), T'(θ) = t'₀(θ)}).   (2.20)

Also, the limits are determined by the same method used for the two-sided exact confidence interval.

We illustrate these confidence intervals for the common odds ratio using Tables 2.1 and 2.2. For Table 2.1, the 95% confidence interval obtained by inverting a two-sided test is (1.38, 131.51) based on the ordinary mid P-values and (1.38, 35.51) based on the modified mid P-values using T' = Σ_k X²_k(θ). Using Table 2.2, the confidence intervals are (1.01, 12.58) and (1.01, 10.29) using the ordinary and modified mid P-values with T' = Σ_k X²_k(θ), respectively. For these data sets, the confidence interval constructed using the ordinary two-sided mid P-values is shorter than the ordinary one based on two one-sided mid P-values. For each type of interval, the modified interval is narrower than the ordinary one. Table 2.4 summarizes these 95% confidence intervals for Tables 2.1 and 2.2.

For the conditional distribution having the fixed marginal counts of Table 2.1, Figure 2.19 shows the actual coverage probability as a function of the true log odds ratio, for the 95% confidence intervals based on inverting separate one-sided tests using the ordinary mid P-value or the modified mid P-value with T' = Σ_k X²_k(θ). The


exact method yields a coverage probability exceeding the nominal level, whereas the coverage of the mid P-value method fluctuates around the nominal level. For either approach, for sufficiently large |log θ|, the actual probability of coverage of a 100(1 − α)% confidence interval is centered about 1 − α/2, and that of the modified mid P-value deviates less from 1 − α/2. Figure 2.20 gives an analogous display for the confidence intervals based on inverting two-sided tests using the ordinary mid P-value or the modified mid P-value with T' = Σ_k X²_k(θ). There is an advantage to the interval based on the modified mid P-value. For either approach, the actual probability of coverage of a 100(1 − α)% confidence interval is centered about the nominal level, and that of the modified mid P-value is even closer to the nominal level. For intervals using mid P-values, we suggest using the confidence interval based on inverting two-sided tests with the modified mid P-value.

Table 2.3. Various 95% confidence intervals for the common odds ratio.

Method                                          Data set 1     Data set 2
Exact CI
  Ordinary 1-sided P                            1.08, 531.51   0.86, 21.37
  Modified 1-sided P (T' = Σ_k X²_k(θ))         2.08, 67.35    1.01, 13.63
  Modified 1-sided P (table probability)        2.08, 67.35    1.04, 14.87
  Ordinary 2-sided P                            1.29, 261.49   0.88, 15.92
  Modified 2-sided P (T' = Σ_k X²_k(θ))         1.38, 40.45    1.01, 10.30
  Modified 2-sided P (table probability)        1.38, 40.45    1.01, 11.14
Approximate CI
  Mantel-Haenszel                               1.03, 47.73    0.86, 12.93
  ML                                            1.28, 128.12   0.99, 17.64
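The mid P-values of Sections 2.4.3 and 2.4.4 all subtract half of a null probability from an exact tail probability. A minimal single-table sketch of the ordinary one-sided version (the helper names are ours; the dissertation's computations use the full conditional distribution across strata):

```python
from math import comb
from fractions import Fraction

def hypergeom_pmf(t, n1, n2, m):
    # central hypergeometric pmf for one 2x2 table (odds ratio 1)
    return Fraction(comb(n1, t) * comb(n2, m - t), comb(n1 + n2, m))

def exact_and_mid_p(t0, n1, n2, m):
    # one-sided upper-tail exact P-value, and the mid P-value that
    # subtracts half of the observed outcome's null probability
    hi = min(n1, m)
    p_exact = sum(hypergeom_pmf(t, n1, n2, m) for t in range(t0, hi + 1))
    p_mid = p_exact - Fraction(1, 2) * hypergeom_pmf(t0, n1, n2, m)
    return float(p_exact), float(p_mid)
```

The modified mid P-value (2.18) replaces the observed outcome's probability by the probability of the set B₁(θ) of tables tied on both the primary and secondary statistics.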


Table 2.4. Various 95% confidence intervals for the common odds ratio using mid P-values.

Method                                          Data set 1     Data set 2
Approximate CI
  Ordinary 1-sided mid P                        1.34, 266.54   0.98, 16.89
  Modified 1-sided mid P (T' = Σ_k X²_k(θ))     2.22, 56.00    1.01, 13.61
  Modified 1-sided mid P (table probability)    2.22, 56.00    1.04, 14.85
  Ordinary 2-sided mid P                        1.38, 131.51   1.01, 12.58
  Modified 2-sided mid P (T' = Σ_k X²_k(θ))     1.38, 35.51    1.01, 10.29

Figure 2.9. Coverage probability for confidence intervals based on inverting one-sided tests with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.
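Coverage curves like the one in Figure 2.9 follow the sum-over-outcomes recipe of Vollset and Hirji (1991) described earlier. As a self-contained illustration of that recipe we use the binomial Clopper-Pearson interval, a simpler exact interval than the odds-ratio intervals of the text, but computed by the same structure: invert two one-sided exact tests, then sum P(t; parameter) over outcomes whose interval covers the parameter.

```python
from math import comb

def binom_pmf(t, n, p):
    return comb(n, t) * p**t * (1 - p) ** (n - t)

def clopper_pearson(t, n, alpha=0.05, tol=1e-10):
    # exact binomial interval, from inverting two one-sided tests by bisection
    def upper_tail(p):  # P(T >= t; p), increasing in p
        return sum(binom_pmf(s, n, p) for s in range(t, n + 1))
    def lower_tail(p):  # P(T <= t; p), decreasing in p
        return sum(binom_pmf(s, n, p) for s in range(0, t + 1))
    lo = 0.0
    if t > 0:
        a, b = 0.0, 1.0
        while b - a > tol:
            mid = (a + b) / 2
            if upper_tail(mid) < alpha / 2:
                a = mid
            else:
                b = mid
        lo = a
    hi = 1.0
    if t < n:
        a, b = 0.0, 1.0
        while b - a > tol:
            mid = (a + b) / 2
            if lower_tail(mid) < alpha / 2:
                b = mid
            else:
                a = mid
        hi = b
    return lo, hi

def coverage(p, n, alpha=0.05):
    # coverage function: sum P(T = t; p) over outcomes t whose interval covers p
    return sum(binom_pmf(t, n, p) for t in range(n + 1)
               if clopper_pearson(t, n, alpha)[0] <= p <= clopper_pearson(t, n, alpha)[1])
```

Plotting coverage(p, n) against p produces the sawtooth curves, always at or above the nominal level, that the figures in this section display for the odds-ratio intervals.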


Figure 2.10. Coverage probability for confidence intervals based on inverting one-sided tests with T' = P(Z), for conditional distribution based on margins of Table 2.1.


Figure 2.11. Coverage probability for confidence intervals based on inverting one-sided tests with T' = Σ_k X²_k(θ), for conditional distribution based on first K partial tables of Table 2.2. Panels show K = 3, 6, 9, and 12.


Figure 2.12. Coverage probability for confidence intervals based on inverting two-sided tests with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


Figure 2.13. Coverage probability for confidence intervals based on inverting two-sided tests with T' = P(Z), for conditional distribution based on margins of Table 2.1.


Figure 2.14. Coverage probability for confidence intervals based on inverting two-sided tests with T' = Σ_k X²_k(θ), for conditional distribution based on first K partial tables of Table 2.2. Panels show K = 3, 6, 9, and 12.


Figure 2.15. Expected length of confidence intervals for θ, with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.
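Expected lengths such as those plotted in Figure 2.15 are computed as Σ_t [U(t) − L(t)] P(t; θ), the interval length at each outcome weighted by that outcome's probability. A self-contained binomial illustration (Clopper-Pearson as a stand-in exact procedure, redefined here so the sketch runs on its own; binomial endpoints are all finite, so no truncation rule like the 1000 cap above is needed):

```python
from math import comb

def binom_pmf(t, n, p):
    return comb(n, t) * p**t * (1 - p) ** (n - t)

def clopper_pearson(t, n, alpha=0.05, tol=1e-10):
    # exact binomial interval by bisection on the two one-sided tail probabilities
    def upper_tail(p):  # P(T >= t; p)
        return sum(binom_pmf(s, n, p) for s in range(t, n + 1))
    def lower_tail(p):  # P(T <= t; p)
        return sum(binom_pmf(s, n, p) for s in range(0, t + 1))
    lo = 0.0
    if t > 0:
        a, b = 0.0, 1.0
        while b - a > tol:
            mid = (a + b) / 2
            a, b = (mid, b) if upper_tail(mid) < alpha / 2 else (a, mid)
        lo = a
    hi = 1.0
    if t < n:
        a, b = 0.0, 1.0
        while b - a > tol:
            mid = (a + b) / 2
            a, b = (a, mid) if lower_tail(mid) < alpha / 2 else (mid, b)
        hi = b
    return lo, hi

def expected_length(p, n, alpha=0.05):
    # E(length) = sum over outcomes t of (upper - lower) * P(T = t; p)
    total = 0.0
    for t in range(n + 1):
        lo, hi = clopper_pearson(t, n, alpha)
        total += (hi - lo) * binom_pmf(t, n, p)
    return total
```

For the odds-ratio intervals of the text, the same weighted sum is taken over the conditional distribution of T, either capping or conditioning away the infinite endpoints at t_min and t_max.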


Figure 2.16. Expected length of confidence intervals for log θ, with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


Figure 2.17. Expected length of confidence intervals for θ, conditional on T ≠ t_min or t_max, with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


Figure 2.18. Expected length of confidence intervals for log θ, conditional on T ≠ t_min or t_max, with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


Figure 2.19. Coverage probability for confidence intervals based on inverting one-sided tests using mid P-values with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


Figure 2.20. Coverage probability for confidence intervals based on inverting two-sided tests using mid P-values with T' = Σ_k X²_k(θ), for conditional distribution based on margins of Table 2.1.


2.5 Connections with Logistic Regression

Consider a set of independent binary variables Y₁, ..., Y_n. Corresponding to each variable Y_j there is a (p × 1) vector x_j = (x_1j, ..., x_pj)' of explanatory variables. Let π_j be the probability that Y_j = 1. Suppose the response is related to the explanatory variables by the logistic regression model

log[π_j / (1 − π_j)] = γ + x_j'β.   (2.21)

The likelihood function is

exp[Σ_{j=1}^n y_j(x_j'β + γ)] / Π_{j=1}^n [1 + exp(x_j'β + γ)].

The p × 1 vector of sufficient statistics for β is t = Σ_j y_j x_j. Suppose p = 2, and we want to conduct inferences about β₁. Again, one can eliminate β₂ by conditioning on its sufficient statistic, t₂ = Σ_j y_j x_2j. One can treat the data for the logistic regression model as a three-way 2 × I × K table, where I and K are the numbers of distinct values of the explanatory variables X₁ and X₂, respectively.

Exact inference in logistic regression is often highly discrete, even degenerate. One can often alleviate this problem somewhat by treating the data as a contingency table and using the alternative way, discussed in Section 2, of constructing P-values. To illustrate, for Table 2.1 we let π_ij denote the probability of cure for the jth individual at the ith penicillin level. The logistic model has the form

log[π_ij / (1 − π_ij)] = γ_i + βx_ij,   i = 1, ..., 3,

where x_ij is a dummy variable for delay. The observed value of the sufficient statistic T is 14. For testing H₀: β = 0, the exact one-sided P-value is P = P(T ≥ 14) = 0.0200. The modified exact P-value, using T' = Σ_k X²_k(θ) or the table probability, is 0.0028.
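Under H₀: β = 0, the conditional null distribution of T is the convolution of the central hypergeometric distributions of the strata; a one-sided exact P-value such as the P(T ≥ 14) = 0.0200 quoted above is a tail sum of that distribution. A sketch (the margins below are invented for illustration and are not those of Table 2.1):

```python
from math import comb

def conditional_dist(strata):
    # null (beta = 0) conditional distribution of T = sum_k T_k, obtained by
    # convolving the central hypergeometric distribution of each stratum;
    # strata: list of (n1, n2, m) margins of the 2x2 partial tables
    dist = {0: 1.0}
    for n1, n2, m in strata:
        lo, hi = max(0, m - n2), min(n1, m)
        tot = comb(n1 + n2, m)
        pmf = {t: comb(n1, t) * comb(n2, m - t) / tot for t in range(lo, hi + 1)}
        new = {}
        for a, pa in dist.items():
            for b, pb in pmf.items():
                new[a + b] = new.get(a + b, 0.0) + pa * pb
        dist = new
    return dist

def upper_p(strata, t_obs):
    # exact one-sided conditional P-value P(T >= t_obs)
    return sum(p for t, p in conditional_dist(strata).items() if t >= t_obs)
```

For example, upper_p([(5, 5, 5), (4, 6, 3)], t_obs) sums the convolved distribution's upper tail for a hypothetical two-stratum data set.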


2.6 Discussion

We have shown that use of a modified P-value leads to exact tests and confidence intervals that are less conservative than the usual ones. The improvement can be considerable when K is large but n is not, in which case there may be a large number of tables with different secondary statistic values that share the same primary test statistic value. We prefer modified exact tests and confidence intervals over the ordinary exact ones, because they are less conservative than the ordinary ones yet still guarantee at least the nominal level. We prefer confidence intervals based on inverting two-sided tests over those based on inverting two separate one-sided tests, because they tend to be less conservative. Likewise, for confidence intervals using mid P-values, we prefer intervals based on inverting two-sided tests using modified mid P-values. For the secondary statistic, we have used Σ_k X²_k(θ) and the table probability in our examples, and clearly the reduction in conservativeness also occurs with test statistics for more general alternatives.

A FORTRAN program has been prepared, designed for IBM-compatible PCs or UNIX workstations, for computing modified P-values for tests of conditional independence and modified confidence intervals for an assumed common odds ratio. This program also computes the actual coverage probability and the expected length of confidence intervals using the four methods. The program, for 2 × 2 × K tables, is an adaptation of one written by Vollset and Hirji (1991) for ordinary exact inference for such tables. Appendix A contains the FORTRAN source code.


CHAPTER 3
APPROXIMATING EXACT INFERENCE ABOUT CONDITIONAL ASSOCIATION

3.1 Introduction

For three-way tables, consider the hypothesis of conditional independence of X and Y, given Z. This hypothesis is usually tested against the alternative of no three-factor interaction. The general alternative that permits three-factor interaction is the general loglinear model for a three-way table,

log m_ijk = μ + λ_i^X + λ_j^Y + λ_k^Z + λ_ij^XY + λ_ik^XZ + λ_jk^YZ + λ_ijk^XYZ.   (3.1)

When X or Y is ordinal, narrower alternatives can be constructed for the exact tests. We suggest exact inference regarding conditional associations in three-way contingency tables. For I × J × K tables, we discuss six test statistics for conditional independence that have natural connections with loglinear models for various alternatives. We use a simulation algorithm to obtain precise estimates of exact P-values in cases that are currently computationally infeasible.

For three-way contingency tables, current computational algorithms for the exact methods are restricted to certain analyses for 2 × J × K tables. Also, when the sample size is small or the contingency table is sparse, large-sample approximations can be questionable. The Monte Carlo method is an alternative to either the exact or the asymptotic methods. This method is based on estimating the exact conditional sampling distribution of the statistic by generating random tables having the relevant fixed margins. The advantage of this method is that the number of tables


generated is fixed in advance, and the computing time does not depend greatly on the sample size n or the table size, in contrast to methods for exact analysis. For the random table generation, we use the procedure of Patefield (1981), which simulates hypergeometric distributions.

Section 2 discusses exact tests of conditional independence in I × J × K tables using three statistics that are popular for asymptotic tests. These are naturally linked to alternatives corresponding to loglinear models that assume a lack of three-factor interaction. Section 3 presents three other statistics that do not require this assumption. All six test statistics are score statistics for loglinear models that treat none, one, or both of the classifications as ordinal. Section 4 discusses possible alternative ways of forming modified exact P-values in I × J × K contingency tables, generalizing the modified P-value discussed in Chapter 2. We propose modified exact P-values for the six tests of conditional independence in I × J × K tables. Computational algorithms have limited availability for tests of conditional independence when I and J exceed two. Section 5 describes a Monte Carlo sampling routine that approximates the ordinary and modified exact P-values, utilizing the six test statistics for exact tests of conditional independence. Section 6 illustrates approximate exact tests of conditional independence with examples, and Section 7 describes a FORTRAN program utilizing the simulation algorithm.

3.2 Tests of Conditional Independence Assuming No Three-factor Interaction

This section presents three test statistics, proposed by Birch (1965), for testing conditional independence of X and Y, given Z, in I × J × K contingency tables. We present loglinear models for which these are score statistics. These models assume a lack of three-factor interaction. We then present three adaptations of these statistics


that do not require that assumption in the next section. In each case, one test treats both X and Y as nominal, one treats X as nominal and Y as ordinal, and one treats both as ordinal. The asymptotic chi-squared theory is well developed for the statistics we present. Our focus is to construct exact tests of conditional independence, using these statistics with the reference set Γ of tables having the same margins. We use score statistics for loglinear models rather than likelihood-ratio or Wald statistics. This makes the computations for exact analyses simpler, since one does not need to fit the model for each table in Γ.

3.2.1 Nominal-by-Nominal Test

Birch (1965), Landis et al. (1978), and Mantel and Byar (1978) generalized the Cochran-Mantel-Haenszel statistic to handle more than two groups or more than two responses. Suppose X and Y are nominal. Let n_k denote the counts for cells in the first I − 1 rows and J − 1 columns of stratum k of Z. Conditional on the row and column totals in that stratum, let m_k denote the null expected value of n_k. Then d = Σ_k (n_k − m_k) represents the (I − 1)(J − 1) × 1 vector having elements

d_ij = Σ_k (n_ijk − m_ijk),   i = 1, ..., I − 1,  j = 1, ..., J − 1.   (3.2)

Let V_k denote the null covariance matrix of n_k, where

Cov(n_ijk, n_i'j'k) = [n_i+k(δ_ii' n_++k − n_i'+k)][n_+jk(δ_jj' n_++k − n_+j'k)] / [n²_++k(n_++k − 1)].   (3.3)
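Summed over strata, (3.2) and (3.3) combine into the statistic d'V⁻¹d discussed next; a direct numpy transcription (the function name is ours, and this is an illustrative sketch rather than the dissertation's implementation):

```python
import numpy as np

def generalized_cmh(tables):
    # Generalized Cochran-Mantel-Haenszel statistic d' V^{-1} d of (3.2)-(3.4).
    # tables: K stratum tables, each I x J (nested lists or arrays).
    tables = [np.asarray(t, float) for t in tables]
    I, J = tables[0].shape
    d = np.zeros((I - 1) * (J - 1))
    V = np.zeros(((I - 1) * (J - 1),) * 2)
    for n in tables:
        N = n.sum()
        r, c = n.sum(axis=1), n.sum(axis=0)
        m = np.outer(r, c) / N                      # null expected counts m_ijk
        d += (n - m)[:-1, :-1].ravel()
        # stratum covariance (3.3), a Kronecker product of row and column parts
        Vr = (np.diag(r) * N - np.outer(r, r))[:-1, :-1]
        Vc = (np.diag(c) * N - np.outer(c, c))[:-1, :-1]
        V += np.kron(Vr, Vc) / (N * N * (N - 1))
    return float(d @ np.linalg.solve(V, d))
```

For K = 1 this reproduces the (n − 1)/n multiple of the Pearson chi-squared statistic, as the text notes.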


Then V = Σ_k V_k is the null covariance matrix of d. The efficient score statistic for testing conditional independence against the alternative of no three-factor interaction is

C² = d'V⁻¹d.   (3.4)

This is also called the generalized Cochran-Mantel-Haenszel statistic. Under conditional independence, this statistic has a large-sample chi-squared distribution with df = (I − 1)(J − 1). For K = 1 stratum with n observations, the statistic reduces to the multiple (n − 1)/n of the Pearson chi-squared statistic for testing independence. The statistic is sensitive for detecting conditional associations when the association is similar in each stratum. Hence, the generalized Cochran-Mantel-Haenszel statistic has low power for detecting an association in which the patterns of association for some of the strata are in the opposite direction of the patterns displayed by other strata.

3.2.2 Ordinal-by-Ordinal Test

When X and Y are ordinal, it often makes sense to test against a narrower alternative corresponding to a monotone trend in the conditional association. It then makes sense to form a test statistic using a model that is a special case of the no three-factor interaction model and reflects the ordinality, such as the model of homogeneous linear-by-linear association,

log m_ijk = μ + λ_i^X + λ_j^Y + λ_k^Z + βu_i v_j + λ_ik^XZ + λ_jk^YZ.   (3.5)

It replaces the general association term λ_ij^XY by a linear-by-linear term βu_i v_j, where {u_i} and {v_j} are monotone scores for the levels of X and Y. The parameter β in the model describes the X-Y partial association. The model of conditional independence


of X and Y is its special case in which β = 0. For this model, the sufficient statistic for β is Σ_k[Σ_i Σ_j u_i v_j n_ijk]. When I = J = 2, the usual statistic results from the scores u₁ = v₁ = 1, u₂ = v₂ = 0. This is Birch's exact test statistic for testing conditional independence in 2 × 2 × K contingency tables, and we utilized this statistic in Chapter 2 for the conditional exact test. Also, Mehta, Patel and Gray (1985) and Vollset, Hirji and Elashoff (1991) used this statistic to implement the exact test.

For the asymptotic test of H₀: β = 0, one can use Mantel's (1963) generalized statistic for detecting association between ordinal variables. This ordinal test focuses the departure from independence on a single degree of freedom. Suppose we expect a monotone conditional relationship between X and Y, with the same direction at each level of Z, and suppose that we can assign monotone scores {u_i} to the levels of X and {v_j} to the levels of Y. Then there is evidence of a positive trend if, within each stratum, the statistic Σ_i Σ_j u_i v_j n_ijk is greater than its expectation under independence. For model (3.5), given the marginal totals in each stratum and under conditional independence of X and Y,

E(Σ_i Σ_j u_i v_j n_ijk) = (Σ_i u_i n_i+k)(Σ_j v_j n_+jk) / n_++k,

Var(Σ_i Σ_j u_i v_j n_ijk) = [Σ_i u_i² n_i+k − (Σ_i u_i n_i+k)²/n_++k][Σ_j v_j² n_+jk − (Σ_j v_j n_+jk)²/n_++k] / (n_++k − 1).

To summarize the correlation information from the K strata, Mantel (1963) proposed the statistic

M² = {Σ_k[Σ_i Σ_j u_i v_j n_ijk − E(Σ_i Σ_j u_i v_j n_ijk)]}² / Σ_k Var(Σ_i Σ_j u_i v_j n_ijk).   (3.6)

This is the score statistic for testing conditional independence in model (3.5). It has an asymptotic chi-squared distribution with df = 1.
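Equation (3.6) translates directly into code; a numpy sketch with our own function name (for a single 2 × 2 table with 0-1 scores it agrees with the nominal-by-nominal statistic, as the theory of this section implies):

```python
import numpy as np

def mantel_m2(tables, u, v):
    # Mantel (1963) statistic (3.6): pooled correlation-type statistic on 1 df
    u, v = np.asarray(u, float), np.asarray(v, float)
    num, den = 0.0, 0.0
    for n in tables:
        n = np.asarray(n, float)
        N = n.sum()
        r, c = n.sum(axis=1), n.sum(axis=0)
        num += u @ n @ v - (u @ r) * (v @ c) / N      # T_k minus E(T_k)
        su = u**2 @ r - (u @ r) ** 2 / N
        sv = v**2 @ c - (v @ c) ** 2 / N
        den += su * sv / (N - 1)                      # Var(T_k)
    return num * num / den
```

Note that deviations of opposite sign in different strata cancel in the numerator, which is exactly why this statistic targets a trend with a common direction across strata.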


3.2.3 Nominal-by-Ordinal Test

Suppose the row variable X is nominal and the column variable Y is ordinal. A useful loglinear model replaces the ordered row scores in model (3.5) by unordered parameters {μ_i},

log m_ijk = μ + λ_i^X + λ_j^Y + λ_k^Z + μ_i v_j + λ_ik^XZ + λ_jk^YZ.   (3.7)

The sufficient statistics for {μ_i} are Σ_k Σ_j v_j n_ijk, i = 1, ..., I. These can be interpreted as the row sums for a response Y within each level of X, using the scores {v_j}, summed over the strata. Assuming the model holds, we can test conditional independence by testing μ₁ = μ₂ = ⋯ = μ_I. Let Y₁, ..., Y_{n_++k} be a random sample within stratum k, taking scores v₁, ..., v_J. Let l denote the (I − 1) × 1 vector having elements

l_i = Σ_k n_i+k(W_ik − W_k),   i = 1, ..., I − 1,   (3.8)

where W_ik = Σ_j n_ijk v_j / n_i+k, i = 1, ..., I, and W_k = Σ_j n_+jk v_j / n_++k. Note that W_ik is the row mean on Y at level i of X and level k of Z, treating Y as a response with scores {v_j}. Similarly, W_k is the kth stratum mean for Y. Let Λ denote the null covariance matrix of l, which has elements

Λ_ii' = Σ_k [n_i+k(δ_ii' n_++k − n_i'+k) / (n_++k(n_++k − 1))][Σ_j v_j² n_+jk − (Σ_j v_j n_+jk)²/n_++k].   (3.9)


Then the efficient score statistic for testing conditional independence against the alternative of (3.7) is l'Λ⁻¹l. This statistic is sensitive to location differences among the I conditional distributions of Y that are similar at each level of Z. The asymptotic null distribution is chi-squared with df = I − 1. The three statistics just discussed were suggested by Birch (1965) for testing conditional independence. The three asymptotic tests are available in SAS (PROC FREQ).

3.2.4 Generalized Tests

The previous three statistics are special cases of a general statistic proposed by Landis et al. (1978). Let n_k denote a column vector of the cell counts in stratum k, and let m_k denote their expected values. Also let p_i+k denote the marginal proportion of the ith row and p_+jk the marginal proportion of the jth column. We introduce the following notation to define the generalized test statistic:

n_k' = (n_11k, ..., n_1Jk, ..., n_I1k, ..., n_IJk),
p_i+k = n_i+k / n_++k,   p_+jk = n_+jk / n_++k,
p_*+k' = (p_1+k, p_2+k, ..., p_I+k),
p_+*k' = (p_+1k, p_+2k, ..., p_+Jk).


Assume that cell counts from different strata are independent. Landis et al. (1978) showed that under the hypothesis of conditional independence, the expected value and covariance matrix of the frequencies are, respectively,

m_k = E[n_k | H₀] = n_++k (p_*+k ⊗ p_+*k)   (3.10)

and

Var[n_k | H₀] = [n²_++k / (n_++k − 1)][(D_{p_*+k} − p_*+k p_*+k') ⊗ (D_{p_+*k} − p_+*k p_+*k')],   (3.11)

where ⊗ denotes Kronecker product multiplication and D_a is a matrix with the elements of a on the main diagonal. The generalized statistic for testing conditional independence is defined as

Q_M = G'V_Q⁻¹G,   (3.12)

where

G = Σ_k B_k(n_k − m_k),   V_Q = Σ_k B_k[Var(n_k | H₀)]B_k',

and where B_k = R_k ⊗ C_k is a matrix of fixed constants based on row scores R_k and column scores C_k for the kth stratum. When the null hypothesis is true, the statistic Q_M is approximately distributed as chi-squared with degrees of freedom equal to the rank of B_k.

Suppose the row variable X is nominal and the column variable Y is ordinal. Then the mean score of Y is meaningful. In this case, the mean score is computed for each row of the table, and the alternative hypothesis is that, for at least one stratum, the mean scores of the I rows are unequal. Then the statistic is sensitive to location differences among the I distributions of Y.
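The mean-score case can also be computed directly from (3.8) and (3.9), without assembling the full Q_M machinery; a numpy sketch (the function name is ours, and for a 2 × 2 table with 0-1 scores it agrees with the other statistics of this section, as expected):

```python
import numpy as np

def mean_score_stat(tables, v):
    # mean-score statistic l' Lambda^{-1} l of (3.8)-(3.9): compares the I row
    # mean scores on Y with the stratum mean score, pooled over the K strata
    v = np.asarray(v, float)
    tables = [np.asarray(t, float) for t in tables]
    I = tables[0].shape[0]
    l = np.zeros(I - 1)
    Lam = np.zeros((I - 1, I - 1))
    for n in tables:
        N = n.sum()
        r, c = n.sum(axis=1), n.sum(axis=0)
        W_k = v @ c / N              # stratum mean score
        W_ik = (n @ v) / r           # row mean scores
        l += (r * (W_ik - W_k))[:-1]
        s2 = v**2 @ c - (v @ c) ** 2 / N
        Lam += (np.diag(r) * N - np.outer(r, r))[:-1, :-1] * s2 / (N * (N - 1))
    return float(l @ np.linalg.solve(Lam, l))
```

This is the df = I − 1 statistic of Section 3.2.3 expressed in the notation of the present section.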


For this case we can define the matrix $R_k$ that has dimension $(I-1) \times I$ as
$$ R_k = [\, I_{I-1}, \; -J_{I-1} \,], \tag{3.13} $$
where $I_{I-1}$ is an identity matrix of rank $I-1$, and $J_{I-1}$ is an $(I-1) \times 1$ vector of ones. The matrix has the effect of forming $I-1$ independent contrasts of the $I$ mean scores. The matrix $C_k$ has dimension $1 \times J$ and contains the column scores, one for each column. Then $Q_M$ combines across the K strata information about how the $I$ row means compare to their null expected values, and it has df = I - 1.

When both variables are ordinal, $R_k$ and $C_k$ can be defined as $R_k = (u_1, \ldots, u_I)$ and $C_k = (v_1, \ldots, v_J)$. If the scores $R_k$ and $C_k$ are the same for all strata, $Q_M$ simplifies to $M^2$. When both variables are nominal, $R_k = [\, I_{I-1}, -J_{I-1} \,]$ and $C_k = [\, I_{J-1}, -J_{J-1} \,]$ can be used. Then $Q_M$ simplifies to $d'V^{-1}d$ with df = (I-1)(J-1).

For exact tests of conditional independence in $I \times J \times K$ tables, we discussed test statistics assuming a lack of three-factor interaction. These are score statistics for loglinear models that treat none, one, or both of the classifications as ordinal, and they have asymptotic chi-squared distributions.

3.3 Tests of Conditional Independence Permitting Three-factor Interaction

The tests discussed so far assume no three-factor interaction. Suppose, instead, we expect the nature of the association between X and Y to vary considerably across levels of Z. Then one would test against an alternative that permits the association to vary across the strata of Z.
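The contrast matrix (3.13) and the three choices of score matrices can be set up in a few lines; this is my own illustrative sketch (the function name is hypothetical):

```python
import numpy as np

def contrast_matrix(I):
    """R_k = [ I_{I-1}, -J_{I-1} ] of (3.13): I-1 independent contrasts of the
    I rows, each comparing one row with the last one."""
    return np.hstack([np.eye(I - 1), -np.ones((I - 1, 1))])

# The three versions of Q_M differ only in the score matrices:
#   mean score (X nominal, Y ordinal):  R_k = contrast_matrix(I), C_k = (v_1,...,v_J)
#   correlation (both ordinal):         R_k = (u_1,...,u_I),      C_k = (v_1,...,v_J)
#   general association (both nominal): R_k = contrast_matrix(I), C_k = contrast_matrix(J)
```

Each row of `contrast_matrix(I)` sums to zero, and the matrix has full row rank $I-1$, which is what gives $Q_M$ its df in the mean-score case.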


3.3.1 Nominal-by-Nominal Test

Suppose X and Y are nominal. Then one could test conditional independence against the saturated loglinear model, since the only more general model is the saturated model. An efficient score statistic is the Pearson statistic for testing conditional independence against the alternative of the saturated model (Agresti 1992). Letting $X_k^2$ denote the Pearson statistic for testing independence within the kth level of Z, this statistic is
$$ X^2 = \sum_{k=1}^{K} X_k^2. $$
The asymptotic distribution of this statistic is chi-squared with df = K(I-1)(J-1), since at each partial table $X_k^2$ has an asymptotic chi-squared distribution with df = (I-1)(J-1), and we have K independent partial tables. Also, this is the df for testing a loglinear model of conditional independence against the most general alternative.

3.3.2 Ordinal-by-Ordinal Test

The model of homogeneous linear-by-linear association (3.5) allows association between the two ordinal variables in each table, and this association is homogeneous across levels of Z. When X and Y are ordinal, one sometimes expects a monotone association between X and Y that changes strength across levels of Z. We consider a loglinear model that permits association between X and Y within each level of Z, but heterogeneity among levels of Z, with the degree of heterogeneity described by stratum-specific association parameters. A relevant loglinear model is then the heterogeneous linear-by-linear association model
$$ \log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta_k u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \tag{3.14} $$
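The summed Pearson statistic is straightforward to compute; the sketch below is my own (hypothetical function name) and assumes every fitted count is positive:

```python
import numpy as np

def pearson_sum(tables):
    """X^2 = sum_k X_k^2: the Pearson statistic computed within each stratum
    and summed over the K strata (df = K(I-1)(J-1)). Assumes all fitted
    counts n_{i+k} n_{+jk} / n_{++k} are positive."""
    total = 0.0
    for n_k in tables:
        fitted = np.outer(n_k.sum(axis=1), n_k.sum(axis=0)) / n_k.sum()
        total += ((n_k - fitted) ** 2 / fitted).sum()
    return total
```

For a single 2 x 2 table this agrees with the shortcut formula $X^2 = N(ad-bc)^2/(n_{1+}n_{2+}n_{+1}n_{+2})$.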


For this model, the null hypothesis of conditional independence is $H_0 : \beta_1 = \cdots = \beta_K = 0$. The loglikelihood is
$$ L(m) = \sum_i \sum_j \sum_k n_{ijk} \log m_{ijk} - \sum_i \sum_j \sum_k m_{ijk} $$
$$ = \sum_i \sum_j \sum_k n_{ijk} (\mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \beta_k u_i v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}) - \sum_i \sum_j \sum_k m_{ijk} $$
$$ = n\mu + \sum_i \lambda_i^X n_{i++} + \sum_j \lambda_j^Y n_{+j+} + \sum_k \lambda_k^Z n_{++k} + \sum_k \beta_k \sum_i \sum_j u_i v_j n_{ijk} + \sum_i \sum_k \lambda_{ik}^{XZ} n_{i+k} + \sum_j \sum_k \lambda_{jk}^{YZ} n_{+jk} - \sum_i \sum_j \sum_k m_{ijk}. \tag{3.15} $$
For this model the sufficient statistic for $\beta_k$ is $\sum_i \sum_j u_i v_j n_{ijk}$. For $k = 1, \ldots, K$, the derivative of the loglikelihood is
$$ \frac{\partial L(m)}{\partial \beta_k} = \sum_i \sum_j u_i v_j n_{ijk} - \sum_i \sum_j u_i v_j m_{ijk}. $$
Under the hypothesis of conditional independence, we have
$$ \hat m_{ijk} = \frac{n_{i+k}\, n_{+jk}}{n_{++k}}. $$
Hence, for $k = 1, \ldots, K$,
$$ \frac{\partial L(m)}{\partial \beta_k} = \sum_i \sum_j u_i v_j (n_{ijk} - \hat m_{ijk}) = n \sum_i \sum_j u_i v_j \left( p_{ijk} - \frac{p_{i+k}\, p_{+jk}}{p_{++k}} \right). $$
Let s denote the $K \times 1$ vector having elements
$$ s_k = \sum_i \sum_j u_i v_j \left( p_{ijk} - \frac{p_{i+k}\, p_{+jk}}{p_{++k}} \right) = \frac{1}{n} \sum_i \sum_j u_i v_j \left( n_{ijk} - \frac{n_{i+k}\, n_{+jk}}{n_{++k}} \right). \tag{3.16} $$
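The score vector s of (3.16) is cheap to compute from the observed tables; the following is my own minimal sketch (hypothetical name), where n is the grand total over all strata:

```python
import numpy as np

def score_vector(tables, u, v):
    """s of (3.16): s_k = (1/n) sum_ij u_i v_j (n_ijk - n_{i+k} n_{+jk} / n_{++k}),
    with n the grand total over all K strata."""
    n = sum(t.sum() for t in tables)
    s = []
    for n_k in tables:
        fitted = np.outer(n_k.sum(axis=1), n_k.sum(axis=0)) / n_k.sum()
        s.append(np.einsum('i,j,ij->', u, v, n_k - fitted) / n)
    return np.array(s)
```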


Then s can be written as
$$ s = \begin{pmatrix} \sum_i \sum_j u_i v_j \left( p_{ij1} - \dfrac{p_{i+1}\, p_{+j1}}{p_{++1}} \right) \\ \sum_i \sum_j u_i v_j \left( p_{ij2} - \dfrac{p_{i+2}\, p_{+j2}}{p_{++2}} \right) \\ \vdots \\ \sum_i \sum_j u_i v_j \left( p_{ijK} - \dfrac{p_{i+K}\, p_{+jK}}{p_{++K}} \right) \end{pmatrix} = \frac{1}{n} \begin{pmatrix} \sum_i \sum_j u_i v_j \left( n_{ij1} - \dfrac{n_{i+1}\, n_{+j1}}{n_{++1}} \right) \\ \sum_i \sum_j u_i v_j \left( n_{ij2} - \dfrac{n_{i+2}\, n_{+j2}}{n_{++2}} \right) \\ \vdots \\ \sum_i \sum_j u_i v_j \left( n_{ijK} - \dfrac{n_{i+K}\, n_{+jK}}{n_{++K}} \right) \end{pmatrix}. $$
For fixed k, let $G_k(\pi) = \sum_i \sum_j u_i v_j \left( \pi_{ijk} - \pi_{i+k}\pi_{+jk}/\pi_{++k} \right)$. Let $g_k$ represent the $IJ \times 1$ vector having elements
$$ \frac{\partial G_k(\pi)}{\partial \pi_{ijk}} = \frac{1}{\pi_{++k}^2} \left( u_i \pi_{++k} - \sum_a u_a \pi_{a+k} \right) \left( v_j \pi_{++k} - \sum_b v_b \pi_{+bk} \right), $$
and let $g_k^*$ be the $IJK \times 1$ vector with $g_k^{*\prime} = (0_{(k-1)IJ}', \, g_k', \, 0_{(K-k)IJ}')$. For example,
$$ g_1^* = \frac{\partial G_1(\pi)}{\partial \pi} = \frac{1}{\pi_{++1}^2} \begin{pmatrix} (u_1 \pi_{++1} - \sum_a u_a \pi_{a+1})(v_1 \pi_{++1} - \sum_b v_b \pi_{+b1}) \\ (u_1 \pi_{++1} - \sum_a u_a \pi_{a+1})(v_2 \pi_{++1} - \sum_b v_b \pi_{+b1}) \\ \vdots \\ (u_I \pi_{++1} - \sum_a u_a \pi_{a+1})(v_J \pi_{++1} - \sum_b v_b \pi_{+b1}) \\ 0_{(K-1)IJ} \end{pmatrix}, $$
which, evaluated at the sample proportions, is
$$ \hat g_1^* = \frac{1}{n_{++1}^2} \begin{pmatrix} (u_1 n_{++1} - \sum_a u_a n_{a+1})(v_1 n_{++1} - \sum_b v_b n_{+b1}) \\ (u_1 n_{++1} - \sum_a u_a n_{a+1})(v_2 n_{++1} - \sum_b v_b n_{+b1}) \\ \vdots \\ (u_I n_{++1} - \sum_a u_a n_{a+1})(v_J n_{++1} - \sum_b v_b n_{+b1}) \\ 0_{(K-1)IJ} \end{pmatrix} = \begin{pmatrix} \hat g_1 \\ 0_{(K-1)IJ} \end{pmatrix}. $$
Similarly, for general k,
$$ g_k^* = \frac{\partial G_k(\pi)}{\partial \pi} = \frac{1}{\pi_{++k}^2} \begin{pmatrix} 0_{(k-1)IJ} \\ (u_1 \pi_{++k} - \sum_a u_a \pi_{a+k})(v_1 \pi_{++k} - \sum_b v_b \pi_{+bk}) \\ \vdots \\ (u_I \pi_{++k} - \sum_a u_a \pi_{a+k})(v_J \pi_{++k} - \sum_b v_b \pi_{+bk}) \\ 0_{(K-k)IJ} \end{pmatrix}, $$
which, evaluated at the sample proportions, is
$$ \hat g_k^* = \frac{1}{n_{++k}^2} \begin{pmatrix} 0_{(k-1)IJ} \\ (u_1 n_{++k} - \sum_a u_a n_{a+k})(v_1 n_{++k} - \sum_b v_b n_{+bk}) \\ \vdots \\ (u_I n_{++k} - \sum_a u_a n_{a+k})(v_J n_{++k} - \sum_b v_b n_{+bk}) \\ 0_{(K-k)IJ} \end{pmatrix} = \begin{pmatrix} 0_{(k-1)IJ} \\ \hat g_k \\ 0_{(K-k)IJ} \end{pmatrix}. $$
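Evaluated at the sample proportions, these gradients are all the ingredients of the score statistic: stack the $\hat g_k$ as rows of a $K \times IJK$ matrix D, form $H = D \Sigma D'/n$ with $\Sigma = \mathrm{Diag}(p) - pp'$, and compute $s'H^{-1}s$. Below is my own minimal sketch (hypothetical name); it assumes positive stratum margins and a nonsingular H:

```python
import numpy as np

def score_test_het_linear(tables, u, v):
    """Score statistic s' H^{-1} s for H0: beta_1 = ... = beta_K = 0 in the
    heterogeneous linear-by-linear association model, with H = D Sigma D'/n
    and Sigma = Diag(p) - p p'."""
    K = len(tables)
    n = float(sum(t.sum() for t in tables))
    p = np.concatenate([t.ravel() for t in tables]) / n   # p_ijk, one block per stratum
    s = np.zeros(K)
    D = np.zeros((K, p.size))
    offset = 0
    for k, n_k in enumerate(tables):
        Nk = n_k.sum()
        U = u @ n_k.sum(axis=1)                  # sum_a u_a n_{a+k}
        V = v @ n_k.sum(axis=0)                  # sum_b v_b n_{+bk}
        fitted = np.outer(n_k.sum(axis=1), n_k.sum(axis=0)) / Nk
        s[k] = np.einsum('i,j,ij->', u, v, n_k - fitted) / n
        g_hat = np.outer(u * Nk - U, v * Nk - V) / Nk ** 2   # elements of g_k
        D[k, offset:offset + g_hat.size] = g_hat.ravel()
        offset += g_hat.size
    Sigma = np.diag(p) - np.outer(p, p)
    H = D @ Sigma @ D.T / n
    return float(s @ np.linalg.solve(H, s))
```

A table that satisfies independence exactly within each stratum gives s = 0 and hence a statistic of zero, which is a useful check on the implementation.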


Also let D represent the $K \times IJK$ matrix whose kth row is $g_k^{*\prime}$, that is,
$$ D = \begin{pmatrix} \partial G_1(\pi)/\partial \pi' \\ \vdots \\ \partial G_K(\pi)/\partial \pi' \end{pmatrix}. $$
The null asymptotic covariance matrix of s is $H = D \Sigma D'/n$, where $n = n_{+++}$ and $\Sigma = \mathrm{Diag}(p) - pp'$ with $p = \{p_{ijk}\}$. The score statistic for testing $H_0 : \beta_1 = \cdots = \beta_K = 0$ is then $s'H^{-1}s$. From Rao (1973, page 418), the asymptotic distribution of s is K-variate normal. Its mean is zero and its dispersion matrix is the information matrix. Hence the asymptotic distribution of $s'H^{-1}s$ is chi-squared with df = K. The number of df is the number of parameters being tested, or the rank of the asymptotic covariance matrix.

3.3.3 Nominal-by-Ordinal Test

The loglinear model (3.7) implies there are row effects on the association, and these row effects are the same for each level of Z. In general, when X is nominal and Y is ordinal, we might expect heterogeneity in the row effects on the association. A relevant loglinear model that allows heterogeneity across the strata is
$$ \log m_{ijk} = \mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \mu_{ik} v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}. \tag{3.17} $$
The model is sensitive to alternatives whereby means on Y vary across levels of both X and Z. For identifiability, we use the constraints $\mu_{Ik} = 0$. For this model, the null hypothesis of conditional independence is $H_0 : \mu_{ik} = 0$ for $i = 1, \ldots, I-1$ and


$k = 1, \ldots, K$. The loglikelihood is
$$ L(m) = \sum_i \sum_j \sum_k n_{ijk} \log m_{ijk} - \sum_i \sum_j \sum_k m_{ijk} $$
$$ = \sum_i \sum_j \sum_k n_{ijk} (\mu + \lambda_i^X + \lambda_j^Y + \lambda_k^Z + \mu_{ik} v_j + \lambda_{ik}^{XZ} + \lambda_{jk}^{YZ}) - \sum_i \sum_j \sum_k m_{ijk} $$
$$ = n\mu + \sum_i \lambda_i^X n_{i++} + \sum_j \lambda_j^Y n_{+j+} + \sum_k \lambda_k^Z n_{++k} + \sum_i \sum_k \mu_{ik} \sum_j v_j n_{ijk} + \sum_i \sum_k \lambda_{ik}^{XZ} n_{i+k} + \sum_j \sum_k \lambda_{jk}^{YZ} n_{+jk} - \sum_i \sum_j \sum_k m_{ijk}. \tag{3.18} $$
For this model the sufficient statistic for $\mu_{ik}$ is $\sum_j v_j n_{ijk}$. For fixed i and k, the derivative of the loglikelihood is
$$ \frac{\partial L(m)}{\partial \mu_{ik}} = \sum_j v_j n_{ijk} - \sum_j v_j m_{ijk}. $$
Under the hypothesis of conditional independence, we have
$$ \hat m_{ijk} = \frac{n_{i+k}\, n_{+jk}}{n_{++k}}. $$
Hence, for fixed i and k,
$$ \frac{\partial L(m)}{\partial \mu_{ik}} = \sum_j v_j (n_{ijk} - \hat m_{ijk}) = n \sum_j v_j \left( p_{ijk} - \frac{p_{i+k}\, p_{+jk}}{p_{++k}} \right). $$
For $i = 1, \ldots, I-1$ and $k = 1, \ldots, K$, let q be the $K(I-1) \times 1$ vector having elements
$$ q_{ik} = \sum_j v_j \left( p_{ijk} - \frac{p_{i+k}\, p_{+jk}}{p_{++k}} \right) = \frac{1}{n}\, n_{i+k} (W_{ik} - \bar W_k), \tag{3.19} $$


where $W_{ik} = \sum_j n_{ijk} v_j / n_{i+k}$ and $\bar W_k = \sum_i \sum_j n_{ijk} v_j / n_{++k}$. Then q can be written as
$$ q = \begin{pmatrix} \sum_j v_j \left( p_{1j1} - \dfrac{p_{1+1}\, p_{+j1}}{p_{++1}} \right) \\ \vdots \\ \sum_j v_j \left( p_{(I-1)j1} - \dfrac{p_{(I-1)+1}\, p_{+j1}}{p_{++1}} \right) \\ \vdots \\ \sum_j v_j \left( p_{1jK} - \dfrac{p_{1+K}\, p_{+jK}}{p_{++K}} \right) \\ \vdots \\ \sum_j v_j \left( p_{(I-1)jK} - \dfrac{p_{(I-1)+K}\, p_{+jK}}{p_{++K}} \right) \end{pmatrix} = \frac{1}{n} \begin{pmatrix} \sum_j v_j \left( n_{1j1} - \dfrac{n_{1+1}\, n_{+j1}}{n_{++1}} \right) \\ \vdots \\ \sum_j v_j \left( n_{(I-1)j1} - \dfrac{n_{(I-1)+1}\, n_{+j1}}{n_{++1}} \right) \\ \vdots \\ \sum_j v_j \left( n_{1jK} - \dfrac{n_{1+K}\, n_{+jK}}{n_{++K}} \right) \\ \vdots \\ \sum_j v_j \left( n_{(I-1)jK} - \dfrac{n_{(I-1)+K}\, n_{+jK}}{n_{++K}} \right) \end{pmatrix}. $$
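The two expressions for $q_{ik}$ in (3.19) can be checked against each other numerically; the sketch below is my own (hypothetical names), stacking the $i = 1, \ldots, I-1$ elements stratum by stratum:

```python
import numpy as np

def q_direct(tables, v):
    """q_ik = sum_j v_j (p_ijk - p_{i+k} p_{+jk} / p_{++k}), i = 1,...,I-1."""
    n = sum(t.sum() for t in tables)
    out = []
    for n_k in tables:
        fitted = np.outer(n_k.sum(axis=1), n_k.sum(axis=0)) / n_k.sum()
        out.extend(((n_k - fitted) @ v)[:-1] / n)   # drop row I (constrained)
    return np.array(out)

def q_mean_scores(tables, v):
    """Same vector via q_ik = (1/n) n_{i+k} (W_ik - Wbar_k), the mean-score form."""
    n = sum(t.sum() for t in tables)
    out = []
    for n_k in tables:
        row = n_k.sum(axis=1)
        W_i = (n_k @ v) / row                   # W_ik = sum_j n_ijk v_j / n_{i+k}
        W_bar = (n_k @ v).sum() / n_k.sum()     # Wbar_k = sum_ij n_ijk v_j / n_{++k}
        out.extend((row * (W_i - W_bar))[:-1] / n)
    return np.array(out)
```

Agreement of the two functions on arbitrary tables confirms the algebraic identity behind (3.19).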