Citation |

- Permanent Link:
- http://ufdc.ufl.edu/AA00031473/00001
## Material Information- Title:
- An empirical comparison of methods for estimating latent variable interactions
- Creator:
- Moulder, Bradley C
- Publication Date:
- 2000
- Language:
- English
- Physical Description:
- xi, 101 leaves : ill. ; 29 cm.
## Subjects- Subjects / Keywords:
- Error rates ( jstor )
Estimate reliability ( jstor ) Mathematical variables ( jstor ) Maximum likelihood estimations ( jstor ) Modeling ( jstor ) Sample size ( jstor ) Standard error ( jstor ) Statistical discrepancies ( jstor ) Statistical models ( jstor ) Statistics ( jstor ) Dissertations, Academic -- Educational Psychology -- UF ( lcsh ) Educational Psychology thesis, Ph. D ( lcsh ) - Genre:
- bibliography ( marcgt )
theses ( marcgt ) non-fiction ( marcgt )
## Notes- Thesis:
- Thesis (Ph. D.)--University of Florida, 2000.
- Bibliography:
- Includes bibliographical references (leaves 98-100).
- General Note:
- Printout.
- General Note:
- Vita.
- Statement of Responsibility:
- by Bradley C. Moulder.
## Record Information- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. Â§107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
- Resource Identifier:
- 025861119 ( ALEPH )
47116866 ( OCLC )
## UFDC Membership |

Downloads |

## This item has the following downloads: |

Full Text |

AN EMPIRICAL COMPARISON OF METHODS FOR ESTIMATING LATENT VARIABLE INTERACTIONS By BRADLEY C. MOULDER A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2000 ACKNOWLEDGMENTS I would like to thank some of the people who have provided support for my education which have made this paper possible. I would first like to thank my wife, Kim, who has made so many sacrifices over the last several years. I have appreciated each and every one. I would also like to thank my parents whose words were always supportive and whose examples attest to a belief in the value of education that was never far from my mind. I would like to thank all the members of my committee. Each of them is exceptional and I have taken something from each of their examples. I would like to thank James Algina. His love of research is infectious and his example as both a researcher and instructor set a truly high mark. I would like to thank Margaret Bradley. The experiences I had working with her have been a valuable resource providing a grounding in both practical analysis and experimental design. I would like to thank Linda Crocker. In the years I worked with her, I very much appreciated that she always acted as though I was a student first. I also appreciate the many opportunities she made available to me. I would also like to thank David Miller. His very approachable manner has made a valuable resource of the breadth of knowledge and experience he possesses. TABLE OF CONTENTS page ACKNOWLEDGMENTS................................................................. ii LIST OF TABLES........................................................................... v LIST OF FIGURES...................................................................... vi INTRODUCTION.......................................................................... 1 M easurem ent Error................................................................. 1 Multiple Regression and Structural Equation Modeling.................... 2 Latent Variable Interaction...................................................... 4 Indicant Product Approaches............................................ 4 Two-Step Approaches................................................... 13 Two-Stage Least Squares Approach.................................. 15 Statement of the Problem......................................................... 18 M E TH O D .................................................................................. 19 Data Simulation.....................................................................19 E ffect Size.................................................................. 19 Squared Multiple Correlation......................................... 19 Correlation Among Variables......................................... 20 Reliability of Observed Variables.......................................20 Sam ple Size................................................................ 20 Simulation Proper ..................................................................21 Parameter Recovery ................................... .................. 26 Comparison of Methods................................................ 28 R E SU L T S .....................................................................................30 B ias................................................................................... 32 Standard Error Ratios...............................................................33 G am m a 4................................................................. 33 P si......................................................................... 39 Gamma 1 and Gamma 2................................................. 41 G am m a 3...................................................................47 iii Mean Squared Error............................................................... 51 G am m a 4................................................................... 51 P si........................................................................... 53 Gamma 1, 2, and 3.........................................................53 Type 1 Error Rate and Power....................................................... 54 G am m a 4................................................................... 54 G am m a 1................................................................... 62 G am m a 2 ................................................................... 63 G am m a 3................................................................... 65 Fit Statistics ....................................................... .................... 65 Chi-squared test of Exact Fit .......................... ................ 65 Comparative Fit Index ............................... ....................68 Non-Normed Fit Index ................................................... 69 Standardized Root Mean Squared Residual...........................70 DISCUSSION............................................................................... 73 F it...................................................................................... 7 5 Hypothesis Testing.................................................................75 Confidence Intervals...............................................................76 C onclusion .......................................................................... 77 L im itations..........................................................................77 APPENDIX SUPPLEMENTARY TABLES...........................................79 R E FE R E N C E S .............................................................. ..................98 BIOGRAPHICAL SKETCH........................................... ................ 101 LIST OF TABLES Table page 1. Kenny and Judd Parameters and Estimates....................................... 6 2. Performance of the Kenny-Judd Model in the Jaccard and Wan Study.........7 3. Yang-Jonsson Parameters and Estimates...........................................10 4. Comparison of Parameters and Estimates for Joreskog-Yang Approaches.....13 5. Ping's Estimates of the Kenny-Judd Parameters................................ 14 6. Bollen's Estimates of the Kenny-Judd Parameters............................... 18 7. Simulation Parameter Values........................................................25 8. Proportions of Mean Square and Variance Component Sum for y4 Standard Error Ratios......................................................... 34 9. Proportions of Mean Square and Variance Component Sum for Y Standard Error Ratios......................................................... 39 10. Proportions of Mean Square and Variance Component Sum for y3 Standard Error Ratios......................................................... 48 11. Proportions of Mean Square and Variance Component Sum for y3 Power..... 55 12. Proportions of Mean Square and Variance Component Sum for 74 Power......56 13. Proportions of Mean Square and Variance Component Sum for yl Type 1 Error Rate................................................................ 62 14. Proportions of Mean Square and Variance Component Sum for y2 Type 1 Error Rate................................................................ 64 15. Proportions of Mean Square and Variance Component Sum for y3 Pow er.......................................................................... . 66 16. Proportions of Mean Square and Variance Component Sum for 2 ............. 67 V 17. Proportions of Mean Square and Variance Component Sum for CFI.......... 68 18. Proportions of Mean Square and Variance Component Sum for NNFI ........ 69 19. Proportions of Mean Square and Variance Component Sum for Standardized Root Mean Squared Residual ................................................. 71 20. O verview of Effects...................................................................74 21. Bias in Estimates of Gamma 4.......................................................79 22. Standard Error Ratios for Gamma 4................................................ 82 23. Standard Error Ratios for Psi ........................................................ 85 24. Root Mean Squared Error for y ......................................................88 25. Type 1 Error Rate for Gamma 4..................................................... 90 26. Type 1 Error Rate for Gamma 1..................................................... 92 27. Type 1 Error Rate for Gamma 2..................................................... 95 LIST OF FIGURES Figure page 1. Bias in Estimates of Gamma 4.......................................................33 2. Standard Error Ratios by Reliability for Gamma 4: Robust Standard Errors... 35 3. Standard Error Ratios by Reliability for Gamma 4: Ordinary Standard Errors ......................................................... ............................ .... 3 5 4. Standard Error Ratios by Multiple Correlation for Gamma 4: Robust Standard E rrors ............................................ ....................................... 36 5. Standard Error Ratios by Multiple Correlation for Gamma 4: Ordinary Standard E rrors ................................................................................... 37 6. Standard Error Ratios by Sample Size for Gamma 4: Robust Standard Errors .............................................. ............ .................... ......... .... 3 8 7. Standard Error Ratios by Sample Size for Gamma 4: Ordinary Standard Errors ........................................... ................... ............................ 3 8 8. Standard Error Ratios by Reliability for Psi: Robust Standard Errors .........40 9. Standard Error Ratios by Reliability for Psi: Ordinary Standard Errors ....... 41 10. Standard Error Ratios by Reliability for Gamma 1-- Expanded View: Robust Standard Errors...................................................................... 42 11. Standard Error Ratios for Gamma 2--Expanded View: Robust Standard Errors ...................................................................... .................. . 43 12. Standard Error Ratios by Reliability for Gamma 1: Robust Standard Errors ......................................... ............................................... . 4 3 13. Standard Error Ratios for Gamma 2: Robust Standard Errors ..................44 14. Standard Error Ratios by Reliability for Gamma 1: Ordinary Standard Errors ....... .............................................. ...................................... 4 5 15. 16. Standard Error Ratios for Gamma 2: Ordinary Standard Errors ................45 17. Standard Error Ratios by P, for Gamma 1...................................46 18. Standard Error Ratios by p',, for Gamma 2.................................... 46 19. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors ..48 20. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors ..49 21. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors 49 21. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors 50 22. Standard Error Ratios by P,,,,, for Gamma 3....................................51 23. Mean Squared Errors for Gamma 4 by Sample Size............................ 52 24. Mean Squared Errors for Gamma 4 by Procedure............................... 52 25. Mean Squared Errors for Gamma 1 and Gamma 2................................53 26. Mean Squared Errors for Gamma 3................................................. 54 27. Type 1 Error Rate for Gamma 4 by MATRIX ................................. 55 28. Power by pr., and N for Gamma 4: Robust Standard Errors ................. 57 29. Power by p,,. and N for Gamma 4: Ordinary Standard Errors ..................59 30. Power by p,,,,, and N for Gamma 4: Robust Standard Errors................. 60 31. Power by p,,.,,, and N for Gamma 4: Ordinary Standard Errors ............ 61 32. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 1 .................................................................. .......................... 6 3 33. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 2 .................................................. .......................................... 64 34. Gamma 3 Power for the Joreskog-Yang Model and Robust Standard Errors. 66 35. Chi-Squared by Procedure, N, and p,, ............................................. 68 36. CFI by PROCEDURE and p'..... 69 36. CFI by PROCEDURE and P ,, ..................................................69 37. NNFI by P,2 and PROCEDURE............................................. 70 38. Standardized Root Mean Squared Residual by N and p,, .......................71 39. Standardized Root Mean Squared Residual by PROCEDURE and p,,.........72 40. Standardized Root Mean Squared Residual by MATRIX, PROCEDURE, and Sam ple size............................................................................. 72 Abstract of Dissertation Presented to the Graduate School in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy AN EMPIRICAL COMPARISON OF METHODS FOR ESTIMATING LATENT VARIABLE INTERACTIONS By Bradley C. Moulder December 2000 Chair: James J. Algina Major Department: Educational Psychology Researchers in the social sciences and elsewhere use multiple regression to evaluate relationships among observed variables. In simple linear regression, it is well known that the presence of measurement error tends to decrease power in tests of the 3 coefficient. The influence of measurement error becomes more troublesome with multiple regression as 0 coefficients are unpredictably changed, sometimes overestimating and other times underestimating population Ps. One solution for this problem is to evaluate relationships among latent variables using structural equation modeling. When researchers hypothesize the presence of interaction among variables, this solution has until recently not been available. This is ironic given the particular sensitivity interaction terms have shown to measurement error. A number of approaches to interaction using structural equation modeling have recently been proposed. However, these studies have generally failed to either consider multiple methods or multiple experimental parameters. In this study, the approaches of Bollen, Ping, Kenny and Judd, Joreskog and Yang, as well as two new approaches, were evaluated in a study which considers several of the experimental parameters likely to vary in experimentation. These are (1) the effect size of the interaction, (2) the multiple correlation of the latent variable model, (3) the sample size, (4) the reliability of the observed variables, and (5) the correlation between the latent variables included in the interaction. The study revealed large differences in the ability of procedures to detect interaction and to properly estimate the associated parameter. Bollen's two stage least squares procedure exhibited a particular lack of power. Joreskog and Yang's procedure exhibited a particular sensitivity to a number of parameters likely to vary in experimentation. One of the new approaches, a revision of the Joreskog and Yang approach, proved both accurate and robust leading to the recommendation that it be the procedure of choice. INTRODUCTION Measurement Error Measurement is the assignment of a quantitative value to a sample of behavior from a specified domain (Crocker and Algina, 1986). In practical situations, the value assigned to a particular sample of behavior is a composite of both the trait of interest and other random influences. For example, a student's score on a math test is due to the student's knowledge of math as well as such things as hours of sleep the previous night, illness, and many other factors. Spearman (1907) conceptualized this relationship in saying that an observed variable x is made up of two parts x = t + e, (1) where t is the latent or "true" portion of the score and e is measurement error. If an average is taken over multiple samples from the same domain, the consistency this average reflects is the "true" portion of the score. That is, E(x) = t (2) because t is a constant influence, and E(e) = 0. (3) If it were possible to give the same student the same test repeatedly without exposure influencing the outcome, for example, the resulting average would reflect only the 2 influence of the student's math knowledge. The proportion of observed variance made up of true score is often referred to as reliability. Measurement error is the remaining inconsistent part of the score. Multiple Regression and Structural Equation Modeling Using the multiple regression model, the value of a particular variable y is predicted as a function of several other variables. That is, y= a+ x +fl2 3 Ax3+...+ kk + e (4) A problem with this approach is that in multiple regression one estimates relationships among observed variables contaminated by measurement error and may not clearly reflect relationships among latent traits that underlie the observed variables. In the case of simple linear regression, the familiar correction for correlation coefficient attenuation, pj = p,,.,, x Pr p,,, (5) indicates that the effect of measurement error is to reduce the strength of association between two variables (Crocker & Algina, 1986). In multiple regression, the effect of measurement error is more complex. Like the simple correlation, the multiple correlation is diminished when variables include measurement error. Similarly, a i will be underestimated when xj is the only variable containing measurement error in a model. However, when all variables contain measurement error, P coefficients may be larger than, smaller than, or even a different sign than the corresponding coefficients in a model for variables measured without error (Bollen, 1989). Structural equation modeling affords a solution in providing a framework for evaluating relationships among latent traits or true scores. It does so by joining the 3 methodologies of factor analysis and multiple regression. In factor analysis, a model we will call the measurement model for x variables xi- Pi = i l + 9,i= 1,., I (6) Xi -i =i +8,i= I1+ 1,2 (7) X, - -&2kk +,i Ik_ + l,...,Ik (8) and also for y variables y, -P. = AJ,7 + 6,,j J= 1,...,J (9) is used to decompose the observed scores x, for example, into latent variables (or common factors) 4 and residuals 8 which reflect measurement error. By grouping terms, one can clearly see Spearman's original formulation. For example, 1,,( +4 = (A,,()+S, = t, +e,. (10) Combining the measurement models in equations 6-9 with the multiple regression model in equation 4 yields the structural equation model r= a + j +---+ Ak 4k +, (11) allowing the prediction of values offl as a function of , to (k. One limitation of structural equation modeling approach has been that it does not include a straightforward way to model the familiar idea of interaction among latent variables. Of course, for those using structural equation modeling the ability to evaluate 4 interaction is important as many theories predict such effects. Moreover, it is also of practical concern to researchers using multiple regression because products of observed variables often include more measurement error than do their component variables (Busemeyer & Jones, 1983). In fact, the reliability of the products of two observed variables is PX1 , x P.,X + 'O2 (12) P(IXx2, IX2) = + and reduces to the product of reliabilities of the component terms when xi and x2 are uncorrelated. This means, for example, that if x, and x2 have reliabilities of .7 and are uncorrelated, the reliability of XIX2 will be .49. Correlation between x, and x2 increases the reliability of the product. However, the reliability of xxx2 will always be less than the more reliable of x, and X2. Latent Variable Interaction Indicant Product Approaches Kenny and Judd (1984) devised a latent variable interaction model and used COSAN (McDonald,1978) to compute maximum likelihood estimates of the model parameters. The structural equation in the model specified the relationship of one latent dependent variable to the latent independent variables (~ and 42) and their product: r7= 714 +7242 + 734 2 +4. (13) The measurement models for the latent variables had the usual form: x, -'p, = A,I + '4, i = 1,., (14) 5 x, - 4, = , + 5,,j = I+ 1,...,j (15) k pk 2k+ ek,k = 1,...,K (16) The measurement model for the product of the latent variables was defined by taking products of indicator variables for the latent x variables, (x, -P,)(x, - ,)= A,,2PZ 12 j + j2C29, + 6j, (17) so that the interaction loading is A',1 j2 (18) and the measurement error variance is 2,,,, +,A 2 &o oi' (19) AilA I0 + 12 22Od 6j 0 Kenny and Judd simulated a set of data to provide an example for the use of their technique. In the model they used to simulate the data, Kenny and Judd replaced equation 13 by Y - Py = 7!1 + Y722 + T3142 + (20) and eliminated equation 16. A single replication of 500 cases was generated. Estimates were similar to simulated values and therefore supported the technique (see Table 1). However, the use of a single parameter set with only one replication of data means that nothing can be said about standard errors of structural coefficients, and therefore about hypothesis tests involving them. Subsequently Hayduk (1987) provided a demonstration of how to estimate the Kenny-Judd model by using the releases of LISREL available in 1987. Table 1 Kenny and Judd Parameters and Estimates Parameter Simulated Value Estimated Value 7Y1 -.150 -.169 72 .350 .321 73 .700 .710 W .160 .265 22 .600 .646 24 .700 .685 In 1995, Jaccard and Wan conducted an extensive study estimating the KennyJudd model using the non-linear constraints that can be implemented in LISREL8. In the study, Jaccard and Wan investigated the influence of sample size, reliability of indicator variables, method of estimation, proportion of variance associated with the interaction term, correlation between component variables of the interaction term, and multiple correlation on evaluation of latent variable interaction. A summary of these results is presented in Table 2. In the model they used to simulate data, Y3 = 1.0 in all conditions. They reported many of the expected findings. As expected, estimates for the interaction (Y3) improved as sample size increased. Kenny and Judd also found the expected negative bias in estimation of y, using multiple regression and much more accurate estimates using structural equation modeling procedures that take measurement error into Table 2 Performance of the Kennv-Judd Model in the Jaccard and Wan Study Regression SEM Condition' Mean RMSE Mean RMSE Mean Mean Proportion of Est.SE Est. SE Tests Rejected2 RMSE .00,175,.70 .00 .10 -.01 .14 .135 94.6 .05 .00,175,.90 .00 .10 .00 .10 .103 100.0 .04 .00,400,.70 .01 .06 .00 .09 .086 92.0 .06 .00,400,.90 .01 .07 .00 .07 .067 93.6 .08 .05,175,.70 .51 .56 1.01 .46 .396 85.9 .50 .05,175,.90 .82 .33 1.00 .31 .297 95.8 .83 .05,400,.70 .54 .50 1.02 .30 .252 84.4 .85 .05,400,.90 .84 .25 1.02 .21 .194 90.4 .94 .10,175,.70 .51 .53 1.02 .35 .287 82.4 .74 .10,175,.90 .83 .25 1.00 .22 .212 94.4 .98 .10,400,.70 .54 .49 1.02 .23 .187 80.9 .97 .10,400,.90 .84 .22 1.01 .15 .138 88.7 1.00 1. Entries in the condition column are the proportion of variance accounted for by the interaction, sample size, and the reliability of x and y variables. 2. Probability of rejecting Ho: y3 = 0 8 account. Estimation also improved when reliability increased. Manipulating the effect size of the interaction did not influence accuracy using structural equation modeling and effects for the other variables were unfortunately not reported. As 150 replications of simulated data were used, an estimate of the true standard error of y3 was possible by calculating the standard deviation of the estimates. Because root mean squared error is comprised of both bias and standard deviation, the relatively unbiased estimates of Y3 using structural equation modeling allow a similar comparison between estimated standard errors and root mean squared error. Root mean squared error suggested that average standard errors tended to underestimate their true value overall. Bias in estimated standard errors for the interaction y3 increased as effect size increased, increased as sample size increased, and decreased as reliability increased. Despite these findings, observed Type One errors were at acceptable levels and power was a minimum of .85 for a sample size of 400. Taking expectations of the left and right side of the equation 20, Joreskog and Yang (1996) noted that it was misspecified because the intercept will not necessarily be zero even though the model is formulated for variables in population mean centered form. More specifically, when ( and (2 are bivariate normal the expectation of the right side of equation 13 is 73412' (21) where , is the covariance of 4 and (2 . This is problematic as r1 is specified to have an expectation of zero in the Kenny-Judd model. To correct this problem, Joreskog and 9 Yang reformulated the model to include intercepts and to use variables not deviated from their means: q,=a+ y, 1+ 722 + Y342 +" (22) x, = r; + A,, + 9,, i= 1,...,I (23) x = r, +Aj242 +'j, j= I+1,...,J (24) y, = rk+ k+ ek,k = 1,...,K (25) X, Xj = (r", + I, 1 3,)(rj + /1,12 2 + /) (26) Using a different set of simulated data than that used by Kenny and Judd (1984), Joreskog and Yang (1996) and later Yang-Jonsson (1997) studied the precision of the estimates in this model and an alternative model in which only a single pair of observed variables was used an as indicator of ,. As in Kenny and Judd (1984), equation 22 was replaced by y= a+y1 +Y22 + Y,2 +( (27) and equation 25 was eliminated. Both studies evaluated the accuracy of estimated parameters in the models under various estimation procedures. Maximum likelihood provided more accurate estimates than either weighted least squares or weighted least squares using the augmented moment matrix. Estimates produced by maximum 10 likelihood estimation were most accurate when all pairs of observed variables were used as indicators of (2 as were estimates produced by weighted least squares using the augmented moment matrix. Weighted least squares estimates were less accurate when all indicators of (( were used. Both studies suggest that using the three estimation procedures with the model described by equations 22 and 24, 26 and 27 and the alternative model, the Joreskog-Yang method provides estimates which are consistent. A subset of the results from the Yang-Jonsson (1997) study, for a sample size of 400, all pairs of observed variables, and maximum likelihood estimation are presented in Table 3. Average estimates of the standard error were smaller than the standard deviations of the estimates suggesting that Type I error rate should be a problem when maximum likelihood estimation is used. Table 3 Yang-Jonsson Parameters and Estimates Parameter Simulated Estimated SD of Mean Mean Est.SE Value Value Estimate Est.SE SD of Estimate S1.0 1.000 .049 .039 .80 S0.2 .210 .103 .079 .77 72 0.4 .401 .079 .064 .81 73 0.7 .715 .167 .115 .69 .02 .010 .051 .040 .78 2 0.6 .600 .120 .079 .66 24 0.7 .695 .087 .060 .69 11 Joreskog and Yang (1996) as well as Yang-Jonsson (1997) provided valuable information about how many pairs of observed variables should be used as well as which estimation method should be used. However, these studies were not designed to compare the results of their method with results from the Kenny-Judd model. The lack of a common set of simulated data across the studies means that this information provides little insight into whether the Joreskog and Yang model or the Kenny and Judd model should be used. Joreskog and Yang (1996) and Yang-Jonsson (1997) set r, = r, = 0 in the model they used to simulate the data. Algina and Moulder (in press) allowed r, and r, to be non-zero but otherwise used the same parameter values used by Joreskog and Yang and Yang-Jonsson. Algina and Moulder showed that the maximum likelihood procedure frequently did not converge when the intercepts in equations 23 and 24 were non-zero. That is, the estimation procedure was unable to find an optimal solution for the system of equations. An alternative method was implemented that did not encounter convergence problems under the conditions cited above. This was done by revising the Joreskog-Yang procedure as follows: q = a+y +722 + Y34 2 +4 (28) x, - p, = AiA + ,, i = 1, ..., I (28) (30) 12 yk -Pk 2k 1+k,k = 1,...,K (31) (x, -u,)(x, - j)= A,,A2j2 2 + , + 2 + ,, (32) That is, deviation scores were used while allowing for a non-zero intercept in the structural equation model. Algina and Moulder compared estimates of the parameters in the revised model to estimates of the parameters in the Joreskog and Yang model under conditions in which the Joreskog and Yang model converged. Estimates were comparable and unbiased. Standard errors were similar using both approaches and were much smaller than the standard deviation of the estimates in both cases. Robust standard errors were also reported. These are standard errors calculated in a way that allows for non-normality (Chou, Bentler, & Satorra, 1991). Robust standard errors for the revised model proved nearer the standard deviation of the estimates for this model. However, robust standard errors for the Joreskog-Yang model were much larger than the standard deviation of the estimates using the Joreskog and Yang model. A summary of this comparison is presented for a sample size of 500 in Table 4. Two-Step Approaches Application of the previous methods is difficult because they require knowledge of matrix algebra and careful specification of non-linear constraints. Ping (1996) suggested a two-step approach that avoided the need to impose non-linear constraints on model parameters in equation 17. In the first step, one estimates the measurement models in equations 14 and 15. Estimates calculated in the first step are substituted into equations 18 and 19 to determine the measurement model for the latent product variable. Table 4 Comparison of Parameters and Estimates for Joreskog-Yang Approaches Estimates Mean SE Mean Robust SE SD of Estimates SD of Estimates Paramete Simulated Original Revised Original Revised Original Revised r Value Y -.15 .212 .213 .627 .627 1.582 0.818 Y2 .35 .401 .400 .675 .675 1.588 0.875 73 .70 .706 .706 .787 .800 1.191 0.906 W .16 .195 .194 .806 .803 1.153 0.930 22 .60 .594 .594 .775 .781 1.636 0.992 24 .70 .705 .706 .875 .875 1.400 1.050 . These estimates for factor loadings and measurement error variances are treated as known parameters in a subsequent analysis of the structural equation model using equations 20 and 17. The principal rationale for this is that it avoids much of the complex programming entailed in the previous approaches and can be implemented in programs without the non-linear constraints feature of LISREL8. However, this method misspecifies the model in the second step by treating estimated parameters as known. The impact of this is not known. Ping (1996) used Kenny and Judd's (1984) parameter values to simulate data and estimate parameters in the resulting interaction model. Therefore, a comparison of the two methods can be made. Ping's (1996) study suggests that estimates of y, produced by 14 his method tend to underestimate the interaction effect (see Table 5). A comparison of estimates produced by Ping's procedure with those produced by Kenny and Judd's procedure suggests that the two sets of estimates are comparable with the latter being slightly more accurate. A reduction in accuracy may not be unexpected given the misspecification previously described. However, some minimal decline in accuracy may be acceptable given the gain in ease of use. The extent of this inaccuracy remains Table 5 Ping's Estimates of the Kenny-Judd Parameters Parameter Simulated Value Estimated Value 7, -.150 -.132 72 .350 .318 73 .700 .666 I .160 .237 22 .600 .599 24 .700 .737 difficult to evaluate as only one replication using a sample size of 500 was simulated which did not allow for a comparison of standard errors. It is possible that the two methods have quite different standard errors which would focus interest on a comparison of Type I error rate and power. Because Joreskog and Yang's (1996) inclusion of means is compatible with the Ping approach, a two-step procedure for estimating the Joreskog-Yang model is also possible. Estimates calculated in the first step are substituted into equations 18 and 19 to 15 estimate the measurement model for the latent product variable. These estimates for factor loadings and measurement error variances are treated as known parameters in a subsequent analysis of the structural equation model using equations 27 and 17. Nothing is known about the accuracy of this procedure. However, removing the misspecification identified by Joreskog and Yang has the potential to make this approach superior to Ping's original formulation. Two-Stage Least Squares Approach Bollen (1996) suggested a two stage least squares approach for evaluating latent interaction. In this approach measurement models are defined such that each latent variable and its first observed variable are on the same scale. This is done by setting k equal to 1 for one observed variable (or indicator) for each latent variable as in equations 33, 35, and 37. x,-,= +6 (33) x, - p, = Aj, + 9, i = 2,..., I (34) xI+1 - Pi = + 5., (35) x, - p, = A f2 + 35,j = I + 2,..., J (36) Yl - Puk = 17+ -' (37) 16 Yk -k k+ k,k = 2,...,K (38) The structural equation model is the same as that used in the other deviation score models. ?7 = YI +Y2 2 + Y3 I2 + (39) Using the measurement models in Equations 31-36, one can substitute values for 71 and the 4 variables in equation 34 so that (0i -,Uk) = y((x -u,)-1)+ Y2((x + -,uj)- 61,+1) (40) +Y((x, -4)- ,)((x,+, -pj)- ,t, ) + which can be rearranged to form Yi = Y7(x, -4 )+ 72 (x1.1 - A) + Y x - i,,)+ - A) + u, (41) where the residual u is a composite of the additional terms. Ordinary least squares regression should not be used to estimate Equation 41 because the predictors and the residual u are correlated. Assuming three observed variables per latent variable, one can use the observed variables not included in Equation 40 (i.e., x2, x3, X5, and x6), to predict values for x, ,x4, and x, x4. x l - , =a+ A, (x2 - ,)+J 21(X3 - ,) +31 (x5 -P)+j 41(X6 - ,) (42) +5,(x2 -pX)(x5 - P) + AI(x2 -)(x6 - ,)+ ,, (x3 -,",)(x +A, (x3 - p,)(x6 - p ) 4 - 14 (2 -4 (3 34 (5 ) 44 (6 - (43) +A4 (2 A)(5 6 4 (2 -II)Xx - )+ 174 3X - AXx, -P,) +A (x8 4 3 I6 17 (i1 - ,)(4 -,u) = a + fl,(14)(x2 -, )+ / 14)(x3 -/, ) + 3(14 (x5 - ) (44) +4(14) (x6 - ) + 5(14)(X2 - ,)(x5 -/,) + f6(14)(X2 - P,)(x6 - P,) +fl7(14) (X3 -A,)(x5 - Pj) +,8(14)(x3 -A, )(xx6 - P,) The predicted values can be inserted into a regression equation resulting in: yA = 7,(ii - A,) + 72(4 -4Pi)+ j 3(, - AX 4 - 1U) + u. (45) The predictor variables are no longer correlated with the residual, and estimatators of the ys in Equation 45 are consistent for the ys in Equation 39 (i.e., the mean approaches the parameter value and the standard error approaches zero as the sample size becomes indefinitely large). Bollen (1996) applied his method to data simulated by using Kenny and Judd's (1984) parameter values and reported y estimates similar to those obtained by Kenny and Judd with one replication of the data (see Table 6). However, comparison between procedures was again difficult since Kenny and Judd did not provide standard errors for the ys making any useful comparative statement regarding impact on hypothesis tests or confidence intervals impossible. Furthermore, standard errors provided by Bollen are suspect as there is no way to verify their relation with true values. Table 6 Bollen's Estimates of the Kenny-Judd Parameters Parameter Simulated Estimated Est.SE Value Value 71 -.150 -.160 .052 72 .350 .360 .054 '3 .700 .710 .053 18 Statement of the Problem Comparisons of indicant product methods, two-step methods, and the two-stage least squares approach have been limited. Jaccard and Wan (1995) conducted an extensive simulation of the Kenny-Judd model. But, as their work preceded Joreskog and Yang (1996), Ping (1996), and Bollen (1996) a comparison to these methods was not possible. Ping used simulated data to compare his method to the Kenny-Judd method, implemented in COSAN and in LISREL7 (using Hayduk's method). However, Ping used one set of parameter values and generated only one replication of sample data. Joreskog and Yang (1996) and later Yang-Jonsson (1997) did not compare empirical results of their methods and the other methods. Algina and Moulder (1999) focused on a comparison of the Joreskog-Yang and the Revised Joreskog-Yang models using maximum likelihood estimation. Therefore, the primary objective of the current study is to compare estimates produced by Kenny-Judd, Joreskog-Yang, and Revised JoreskogYang models; two-step estimation of the Kenny-Judd (i.e., Ping's procedure) and the Joreskog-Yang model ( a revised Ping procedure), and Bollen's two stage least squares method for the latent interaction model. Of particular interest is how well the results of the two-step and two stage least squares procedures compare to the remaining procedures, which are more difficult to implement. METHOD Data Simulation In simulating data for model comparisons, ensuring that relevant factors are manipulated and that levels of these factors are similar to those observed in research is of primary importance. To that end, five factors were manipulated using values typical of those observed in the practical literature. Effect Size Champoux and Peters (1987) and later Chaplin (1991) compiled proportions of variance reported in studies using multiple regression. They found that interaction accounts for average of between 3 and 8% of a in multiple regression models. Since these proportions of variance were for observed variables, they are below those values which would be observed for error free variables and the values of '4 were manipulated to account for 0%, 5%, and 10% of cr. Squared Multiple Correlation Based on a survey of all APA journal articles that were published in 1992 and reported multiple regression results, Jaccard and Wan found the median squared multiple correlation in these studies to be .30. The 75h percentile was .50. Based on these results, data for ( and 2 were simulated so that the squared multiple correlation for a model including 4, 2, (2 and a covariate ( 3) fell between .20 and .50. This was accomplished by having a model that includes ,, 2, and 3 accounted for 20 or 40% of 2 o . This squared multiple correlation is denoted by pR where R stands for reduced model. Correlation Among Variables As described previously, correlation among variables has a strong influence on the reliability of the product between them. That is, increases in correlation among component variables are associated with decreased measurement error for the product term. Jaccard and Wan (1996) found that the median correlation among variables for studies using multiple regression in 1992 APA journals was .20 and the 75'h percentile was .40. Values of correlation between (I and (2 were .20 or .40. Reliability of Observed Variables Because structural equation modeling works to distinguish true score from measurement error, the degree of reliability (or the degree to which measurement error is absent) is important in determining the best procedure. As reliabilities of .70 are widely viewed as minimal (cf. Litwin, 1995), low and high reliabilities of observed variables were .70 and .90. Sample Size Accuracy of estimation improves as sample size increases. In a previous study. Yang-Jonsson (1997) used samples sizes from 100 to 3,200 to evaluate the impact of sample size. While this approach provided valuable information in determining the accuracy of statistical estimation procedures, one is rarely in the position of using sample sizes on the order of thousands in the social sciences. Jaccard and Wan (1996) reported median sample size to be 175 and 75h percentile to be 400 in the study of APA journals described previously. Therefore, sample sizes of 200 and 500 were used. Simulation Proper The design of the study was a 2 (sample size; N) x 2 (latent variable correlation; p12) x 2 (squared multiple correlation; p2 ) x 3 (proportion of variance associated with the interaction; Pk,) x 2 (level of reliability of the observed variables; p,.) completely crossed factorial design. PRELIS (Joreskog & Sorbom, 1989) was used to generate 250 replications for each of the 48 conditions. The model for generating the dependent latent variable (ri) was 77q = a + y, + Y22 + Y33 + 42 + . (46) Without loss of generality, the following specifications were made: Yl = Y2 = 0, (47) oi = 022 =033 = O = 1 (48) where Oak is the variance of k, and E(() = E((2) = E(3) = 0. (49) In addition, 43 was assumed to be uncorrelated with ( and 42. As a consequence, 43 was uncorrelated with 442. Because the expected value of 4 was zero and its variance was one, it could be generated as a random normal deviate. The variable 4 was generated by using 2 = 012 4 + I( 1 2) xz. (50) The variable z is a standard normal random deviate. With this calculation and the previous assumptions, terms associated with equation 46 were determined with the exception of y, and y4. With the specifications that y, = Y2 = 0 and = 22 33 = a = 1.0 the variance of q in equation 46 is: 2=l= y + y (1 + ) + (51) and the variance of the residual V/ is unknown. By virtue of the non-zero correlation between 43 and (g4 and the specfication y, = 7Y2 = 0, 2y = P, (52) 2P 2 (53) and 2 2 )12 (54) Pmc, - 4 2 because =2 1 2 2 P ,.cfl (55) (1+ 12) ^2 To facilitate comparison of Y4 in conditions where pa,2 > 0, it was convenient to set Y4 = 1. This was accomplished by dividing the left and right side of equation 46 by 74 resulting in 2 2 (1+,22 ) (56) 73 =PR 2 and (57) 2/2 P1 22 R =[P-Pinc~d 2 The expected value of I was c(r)= a+ 71E()+ y2E(42) + y3E(3) + Y4E(42)+ E(C) (58) which simplifies to pu = a +74012. (59) When constructing structural equation models, one defines the meaning of a latent variable in the selection of indicators. It is common for latent variables to have three indicators, which was the number used in this research. Measurement models describing the twelve variables were defined as: x, = r, + , + (60) x2 +2 + (2 (61) X3 = "3 + A3 I + 3 (62) x, = ', + A42 + (4 x5 = r5 + 25242 + 65 X6 = T6 + A62 2 + 66 x,7 = 7 + 2733 + 7 X8 = rT8 + A 8 + g X9 = r9 + 2933 + 9 y2 =ry2 + Air + , y,2 = Ty, + Az,r9 + e2 (63) (64) (65) (66) (67) (68) (69) (70) Y3 = ,3 +317 + c3 (71) where each X was defined as the square root of the reliability for the associated observed variable. The variance for the measurement model residuals (Ss) for Equations 60 to 68 were 06 Is, = 1-2, Og, = 1 - A92 (72) (73) because each 4 had a variance of one. Residuals for Equations 69 to 71 were (74) 04 = 1- k when pi,,,2 = 0 and (75) 2) = (l- + #2 otherwise. Equation 75 is a result of the decision to make 74 equal to one when 2 0. The parameter values produced in this way are presented in Table 7. A further decision which had to be made was how many pairs of indicators were used to define the interaction. Joreskog and Yang (1996) noted that their model is identified with only a single pair of indicators for the interaction. However, the use of all Table 7 Simulation Parameter Values 2 2 P -inc,73 .40 .20 .40 .20 .40 .20 73 2.03961 1.44222 2.88444 2.03961 0.63246( 0.44721 A2=.20 5.20 7.28 11.44 15.60 0.60 0.80 py,=.70 0.25944 0.25944 0.18345 0.18345 0.83666 0.83666 pyy,=.90 0.29417 0.29417 0.20801 0.20801 0.94868 0.94868 p12 =.40 .10 .40 2.15407 5.80 0.24565 0.27854 .20 1.52315 8.12 0.24565 0.27854 .05 .40 3.04631 12.76 0.17370 0.19696 .20 2.15407 17.40 0.17370 0.19696 .00 .40 0.63246 0.60 0.83666 0.94868 .20 0.44721 0.80 0.83666 0.94868 2 Pncj ,3 .10 .05 .00 indicators for the interaction satisfies the practical desire to use all available information. The primary concern is that each additional product indicator introduces non-normality which biases standard errors and chi-square model fit statistics (Chou, Bentler. & Satorra, 1991). Following Jaccard and Wan (1996), a model including four indicators for the interaction instead of the nine possible was used as a compromise between adequately defining variables and minimizing influences of non-normality. Parameter Recovery Estimation. The most common method of estimation in structural equation modeling is maximum likelihood estimation (ML; Kline, 1998). If the observed variables are multivariate normal, the maximum likelihood procedure has many desireable characteristics. Maximum likelihood estimators are consistent, efficient, and asymptotically normal. The maximum likelihood procedure allows a goodness of fit statistic which is the foundation of many goodness of fit indices. However, even when observed variables for both ( and 42 are normal, observed variables for (2 as well as for rl will not be normal. Boomsma (1983) reported that ML estimators of the parameters in structural equation modeling are consistent when the observed variables are not normally distributed. However, standard errors of parameter estimates tend to be too small and chi-square statistics of model fit are too large (Chou, Bentler, & Satorra, 1991; Yang-Jonsson, 1997). One solution to this problem is to use Browne's Asymptotic Distribution Free estimator (ADF; 1984). Another approach is to use Weighted Least Squares using the augmented moment matrix. However, both of these approaches provide correct estimates, standard errors, and chi-square statistics only with extremely large sample sizes (Chou, Bentler, & Satorra, 1991; Yang-Jonsson, 1997; Jaccard & Wan, 1996). As was previously suggested, sample sizes likely to be used will be relatively small in the social sciences, that is, between 200 and 500. Another approach to the problem of incorrect standard errors and chi-square statistics is to use the asymptotic covariance matrix provided by PRELIS (Joreskog & Sorbom, 1987) in association with maximum likelihood estimation to correct the estimated standard errors and chi-square fit statistic. These are referred to as robust standard errors and the Satorra-Bentler chi-square fit statistic respectively. Chou, Bentler, and Satorra (1991) compared this method to the ADF method using data with skew and/or kurtosis. They reported that the robust standard errors and and SatorraBentler chi-square fit statistics were comparable to or better than those of the ADF method with sample sizes of 200 and 400. In this study, uncorrected and corrected standard errors and chi-square tests are reported. Non-convergence. Pilot study suggested that estimation procedures would fail to converge in some proportion of the replications. Therefore, the methods were compared by using the results for the first 200 replications on which all estimation procedures converged. That is, results were removed for any replication on which any method fails to converge and for any replication beyond the first 200 on which all methods converge. This includes estimation of measurement models in the first step of the two stage procedures. Heywood cases (i.e., negative estimated variances) and other improper estimates did not result in deletion of results for a replication. Despite the simulation of additional replications, higher rates of non-convergence than expected were observed for the Kenny-Judd, Joreskog - Yang, and Revised Joreskog-Yang methods. The criterion for convergence was changed from the default s=.0000001 to E=.0001 to increase rates of convergence. Comparison of Methods A number of approaches were used to evaluate estimates produced for each parameter and each method. Accuracy of estimates. Average estimates were compared to their parameter values in each of the 48 conditions. Accuracy of standard errors. For each method and each parameter, averaged estimated standard errors were computed in each of the 48 conditions. Since estimates will be available for 200 replications of each condition, the standard deviation of the estimates can be used as an estimate of the parameter value for the standard error (Babakus, Ferguson, & Joreskog, 1987). This standard deviation is called an empirical standard error. To simplify interpretation, the average estimated standard error from the 200 replications was divided by the empirical standard error to obtain a measure of the proportional under- or over-estimation of standard errors. Combined comparison of accuracy. To provide a convenient single value describing the degree to which each parameter is accurately estimated, mean squared errors were compared across the models for each parameter in each of the 48 conditions. Mean squared error is the average squared deviation of an estimate from its respective parameter value. This value reflects both parameter bias and standard error of parameter estimates. Type 1 error rates and power. Operating characteristics for the hypothesis test of y coefficients in the latent variable model were estimated by calculating the proportion of replications that were significant. Type 1 error rate was estimated when there is no interaction. Power was estimated when the interaction accounted for 5 or 10% of the variance. Fit statistics. The Non-Normed Fit Index (NNFI; Bentler & Bonnet, 1980), X2 test of exact fit, Comparative Fit Index (CFI; Bentler, 1990) and standardized root mean squared residual were used to assess the degree of correspondence between the models and the data. As the simulation ensured that the models adequately fit the data, a consistent lack of fit for a particular method (e.g., Ping's method) would suggest that the method poorly recovers the simulated effect. Comparisons of fit indices are also reported within each method. These provide a measure of the utility of fit indices for individual methods. Finally, each of the fit indices makes use of the asymptotic covariance matrix information when available. Therefore, the above comparisons of fit indices were repeated including the asymptotic covariance matrix. RESULTS After data were simulated and the procedures under study were applied, the difference between the estimate and parameter value and the ratio of observed to empirical standard error were calculated for each parameter in each replication. Significance was also determined for each parameter as well as for fit statistics. Analysis of Variance (ANOVA) was applied to the 115,200 cases produced by the simulation (200 replications, 2 levels of P!2, 2 levels of p', 3 levels of , , 2 levels of p,, 2 levels of N, 6 procedures, and 2 levels of asymptotic covariance matrix use) for all measures except for the difference between estimates and parameters. In that case, the use of the asymptotic covariance matrix was irrelevant. ANOVA was, therefore, applied to the 57,600 cases from the no asymptotic covariance matrix condition. The six procedures (PROCEDURE) and whether the asymptotic covariance matrix was used (MATRIX) were treated as within subjects effects. This was because all levels of PROCEDURE and MATRIX were applied to each data set. Other factors were treated as a between subjects effects. This resulted in a model with 7 main effects and 119 interactions. The combination of a large number of effects and a large sample size meant that significance was virtually assured for many effects and a large number were significant. Therefore, it was necessary to obtain a measure of influence for each of these effects to select those associated with a meaningful proportion of variance. Mean square 31 components were calculated for each effect with negative components set to zero. These were summed and the ratio of each mean square component to the sum was used to gauge influence. Effects significant at the a=.01 level and that accounted for at least .5% of the mean square and variance component sum were investigated further. The denominator of the proportions described in the previous paragraph is not precisely the total variance because of the error terms in the analyses. These error terms were composites of two confounded variances. The expected values of the error terms followed the form kxoF2 + Ge2 where 0F2 was the variance associated with the interaction of the repeated factor and REPLICATION, ae2 was the residual error variance and k was the product of levels of the repeated measures factors not included in the error term. For example, when the error 2 2 term was PROCEDURExREPLICATIONS nested in P12, PR , P,, ,2, p,,., and N, k was 2 which is the number of levels for MATRIX. When the error term was 2 2 REPLICATIONS nested in P2, pj, p, ,,, P i,,, and N, k was 12 which is the product of the number of levels for MATRIX and PROCEDURE. Including the error terms in the sum would have meant that in some analyses OF2 would contribute up to 12 times making the sum somewhat overstated as a total of the mean squares and variance components. To avoid this, each error term was divided by k before the summation. This solution understates the total of the mean squares and the variances by e2e-(a,2/k) for each error term in the analysis. 32 Bias Analysis was limited to coefficients in the latent variable model 1= a+7 + 722 + Y3C3 +Y4A +1 where y and 72 had parameter values of zero and 74 had a parameter value of one. Parameter values for both y3 and y were varied in order to manipulate p2 and p Therefore, biases associated with y3 and y were evaluated as proportional changes from the parameter value to avoid influences of the parameter value on bias. Of the predictors in the latent variable model, changes in bias associated with manipulated factors were observed for 74. For y4, an interaction of PROCEDURE and p,, was observed for bias that accounted for 1.1% of the mean square and variance component sum. The interaction indicated that estimates of 74 were essentially unbiased for the Joreskog-Yang, KennyJudd, and Revised Joreskog-Yang procedures. However, both the Ping and the Revised Ping procedures were associated with negative bias that was greater as measurement error increased (See Figure 1). Bollen's procedure was associated with positive bias for Y4 when p,, =.70 and negative bias when p,,. =.90. 2 2 A significant interaction of sample size, p2cI P12, pR and p,,. was also observed for bias associated with 74 that accounted for 1.5% of the mean square and variance component sum. However, further examination of this effect did not lead to an interpretation and the effect will not be discussed. Average bias for 74 by sample size, 2 is2 P,,,'4142,, P121 PR, and p,,. is presented in Appendix 1. 0.1 - Pit. 0.075 10.7 S0.9 0.05 --- . 0.025 -0.025 Joreskog Revised Bollen Ping Kenny Revised -0.05 Ping Judd JY. -0.075 Figure 1. Bias in Estimates of Gamma 4 There were no significant effects associated with bias for any other coefficient. Average bias was -.004431 for y', .001151 for Y2, .006923 for Y3, and -3.5015 for y. Standard Error Ratios Gamma 4 A number of factors were associated with changes in standard error ratios for y4. These are presented in Table 8. A significant interaction of MATRIX, PROCEDURE, and p,,. was observed for standard error ratios associated with y4. Mean standard error ratios by these factors are presented in Figure 2. Robust standard errors for y4 for the Joreskog and Yang procedure overestimated true standard errors. Overestimation by the robust standard errors for y4 using the Joreskog and Yang procedure was more severe when observed variables were measured with lower reliability. Standard errors for y4 34 produced by the Bollen procedure were nearest the true standard error. Robust standard errors produced by the Revised Ping procedure most underestimated true standard errors with a value of approximately .85 across reliability. True standard errors were underestimated using the remaining procedures with standard error ratios of approximately .95. Table 8 Proportions of Mean Square and Variance Component Sum for y4 Standard Error Ratios Effect Proportion of Sum F n Nx 2 Nx p, p 2 X Nx , x p2 x pxp. 2 Nx P , , P , 2 Nx .012 X OR X . 2 2 Nx PC2 x p 2 X PR X p. MATRIX MATRIX x p,. PROCEDURE PROCEDURE x PR MATRIX x PROCEDURE MATRIX x PROCEDURE x N MATRIX x PROCEDURE x P MATRIX x PROCEDURE x p. 0.008 0.008 0.007 0.012 0.015 0.006 0.006 0.015 0.006 0.058 0.005 0.006 0.019 0.0 19 72 78 59 157 69 3632 1766 3220 592 12701 598 609 2091 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .000 1 Ordinary standard errors exhibited also changes associated with p,,.. In particular, ordinary standard errors produced by the Joreskog and Yang procedure for y4 underestimated the true standard errors when p,,. =.70 (See Figure 3). As p,,. increased to .90, standard error ratios for the Joreskog and Yang procedure increased indicating the ordinary standard error became a bit more accurate than that for the Revised Ping approach and slightly less accurate than that observed for the other procedures. Ordinary 35 standard errors for the other procedures were more modestly improved by increases in p1,. Estimated standard errors for 74 using Bollen's model were most accurate. .ï¿½ 9 1.5 o . 0 Z 1 0.5 Ri =.90 o 0 ~~C CD ~ a 0 " CL CL - - c' * a" ,'< a Figure 2. Standard Error Ratios by Reliability for Gamma 4: Robust Standard Errors 0 CLI 0 0 - I OI 0 Figure 3. Standard Error Ratios by Reliability for Gamma 4: Ordinary Standard Errors 1.2 o1.1 0 0 1 S0.9 0.8 0.7 36 Increases in overestimation of the true standard error by the robust standard error using the Joreskog and Yang model were associated with increased p (See Figure 4). The Bollen model provided standard errors for y4 nearest the true standard error with a standard error ratio of approximately 1.0. Standard errors produced by the Ping, KennyJudd, and Revised Joreskog - Yang approaches had similar accuracy across levels of pR and very similar to one another with standard error ratios of .95. The Revised Ping approach produced the least accurate standard errors with robust standard error ratios of less than .90. Increases in pR were associated with slight improvement in ordinary standard errors produced by all procedures except Bollen's (See Figure 5). 1.6 o 1.4 12 1.2 00.2 - 1*0.4 00.8.4 0.6 Joreskog Revised Bollen Ping Kenny Revised Ping Judd JY Figure 4. Standard Error Ratios by Multiple Correlation for Gamma 4: Robust Standard Errors An interaction of MATRIX, PROCEDURE, and sample size suggested that standard error ratios were related to sample size and type of standard error differently 1.1 - - E0.4 72 0.8 -o 0.7 . .. 0.6 Joreskog Revised Bollen Ping Kenny- Revised Ping Judd JY Figure 5. Standard Error Ratios by Multiple Correlation for Gamma 4: Ordinary Standard Errors across methods. In particular, the overestimation by the robust standard errors for y4 using the Joreskog and Yang approach was reduced with increased sample size. Robust standard errors also improved with increased sample size for the Ping, Revised Ping, Kenny-Judd, and Revised Joreskog-Yang models. Bollen's approach produced accurate standard errors for y4 with a sample size of 175 and this did not change with increased sample size (See Figure 6). Similarly, increased sample size was associated with slightly more accurate ordinary standard errors (See Figure 7). Average standard error ratios by 2 2 N, Pnc,2' P12 PR, and p,,. are presented in Appendix 2 as this effect was not interpretable. Other effects in the analysis are not discussed as higher level interactions involving the same variables are present. 1.4 1.2 1 0.8 - ~. ~. U, C-- 0 Z$ N O 175 0 400 91 L. .<--. rCD Figure 6. Standard Error Ratios by Sample Size for Gamma 4: Robust Standard Errors 1.05 ---- 1 0.95 0.9 0.85 0.8 0.75 N 0 175 0 400 o S. CD Figure 7. Standard Error Ratios by Sample Size for Gamma 4: Ordinary Standard Errors Psi A number of factors were associated with changes in standard error ratios for y and accounted for at least .5% of the mean square and variance component sum. These are presented in Table 9. An interaction of MATRIX, PROCEDURE, p,;c.,, and p,. was observed for y. This was due to changes in standard error ratios associated with use of the asymptotic covariance matrix. For the Joreskog - Yang procedure, robust standard errors were too large- an effect that was exacerbated by low levels of reliability (See Figure 8). Standard error ratios for the remaining procedures were similar to one another. Table 9 Proportions of Mean Square and Variance Component Sum for ! Standard Error Ratios Effect Proportion of Sum F p 2 Nxpc,,, ,oA2 0.007 174 .0001 Np2 2 N 4x , ,, x p2 x PI 0.008 103 .0001 Pi' 0.007 1041 .0001 2 2 p, c,2 x P2 X P x t' 0.007 87 .0001 MATRIX 0.017 7972 .0001 MATRIX x p,, 0.011 2587 .0001 PROCEDURE 0.054 13851 .0001 PROCEDURE x 2 R x P,,.2 0.005 437 .0001 POEUExNxp2 2 PROCEDURE x N x pc, 2 X P 0.006 62 .0001 PROCEDURE x pi, 0.026 3416 .0001 2 PROCEDURE x p,,,.,, X pit' 0.006 239 .0001 PROCEDURE x Nx p,. x p12 x p 0.006 68 .0001 2 2pi PROCEDURE x p,,,, , X Pl2 X P X' 0.005 60 .0001 2 2 PROCEDURE x N xpinc,g xpl2xPu'xpt 0.011 60 .0001 MATRIX x PROCEDURE 0.103 13403 .0001 MATRIX x PROCEDURE x Pi, 0.056 3637 .0001 MATRIX x PROCEDURE x P . x 2 0.008 171 .0001 MATRIX x PROCEDURE x p, c., x XPill 0.008 171 .0001 0 2.5 8 2 1.5 co = 1 " 0.5 0 li,=70 Of,=90 pr0 o C. CL 0 . , ". (JQ -~L. 2 P inc,42 00.00 W 0.05 00.10 0.. Figure 8. Standard Error Ratios by Reliability for Psi: Robust Standard Errors For the ordinary standard error, standard error ratios for the Joreskog and Yang procedure were quite accurate and comparable to those for the rest of the procedures (See Figure 9). In fact, all procedures were associated with very accurate ordinary standard errors when p,,. =.70. When p,, = .90, standard errors for the Kenny-Judd model were underestimated. Standard error ratios for the other models approached one. A 2 2 significant interaction of PROCEDURE xNx P , x p,12 x p x p,,. was observed that was not interpretable. Means by these factors are presented in Appendix 3. Other effects are not discussed due to higher level interactions involving the same variables. Gamma 1 and Gamma 2 Both yi and y2 were zero and 41 and 42 had the same statistical relationship to 3 and 142. Thus one would expect results on standard errors to be very similar for yr and y2. This expectation was confirmed by the results. Therefore, results for y and Y2 .< 0 1.1 0ï¿½.7 0.9 S0.8 0C CD~ ~C Figure 9. Standard Error Ratios by Reliability for Psi: Ordinary Standard Errors standard error ratios will be described together. A significant effect of MATRIX was observed for yi andy2 standard error ratios that accounted for 9.3% of the mean square and variance component sum. Significant effects of PROCEDURE were also observed for yI and Y2 that accounted for 28 and 29% of the mean square and variance component sum, respectively. The interaction of PROCEDURE and MATRIX was significant and accounted for 56% of the sum for both y, andY2. A significant interaction of MATRIX, PROCEDURE, and p,, was observed for standard error ratios associated with Yi which accounted for 1.1% of the mean square and variance component sum. Mean robust standard error ratios for y' by p,,. and PROCEDURE are presented in Figure 10. As the interaction of MATRIX, PROCEDURE, and p,,, did not account for 1% of the mean square and variance component sum for Y2, mean robust standard error ratios for Y2 by PROCEDURE are presented in Figure 11. For the robust standard error, the standard error ratios for yi and Y2 using the Joreskog and Yang procedure were much too large. As Pii,=.9 2 M] 0.00 ï¿½M 0.05 00.10 0 . 42 indicated by Figures 10 and 11, the size of the ratios for the Joreskog-Yang procedures hides differences among the other procedures. Standard error ratios for the other procedures are re-presented in Figures 12 and 13. Across procedures, increases in reliability were associated with larger standard error ratios indicating that robust standard errors become more accurate. However, increased standard error ratios associated with increased p,,. for the Joreskog Yang model meant more serious overestimation of standard errors for yi when the robust standard error was used. Bollen's procedure more accurately estimated the true standard error than did robust standard errors for the other procedures. It should be noted that Bollen's procedure cannot make use of the asymptotic covariance matrix and, so, results are the same in both MATRIX conditions throughout this study. Standard error ratios for the remaining procedures were comparable and indicated that robust standard errors tended to underestimate the true standard errors by approximately .05 (See Figures 12 and 13). 0 .c30 -- - -_ 30 0i., =.70 Pii, =.90 20 10 0 IQ Figure 10. Standard Error Ratios by Reliability for Gamma 1-- Expanded View: Robust Standard Errors 14 , 0 12 10 6 4 02 0 Joreskog Revised Bollen Ping Ping Kenny- Revised JY Judd Figure 11. Standard Error Ratios for Gamma 2--Expanded View: Robust Standard Errors ol.1- pii,=.70 pii,=90 -o 1 0.9 0.8 < = CDCD 3CD -. I-. Figure 12. Standard Error Ratios by Reliability for Gamma 1: Robust Standard Errors 1 o S0.98 o 0.96 0.94 S0.92 o 0.9 Revised Ping Bollen Ping Kenny-Judd Revised JY Figure 13. Standard Error Ratios for Gamma 2: Robust Standard Errors Results for ordinary standard error ratios were somewhat different than those for robust standard errors. Bollen's procedure continued to provide the most accurate standard errors overall. However, the Joreskog and Yang procedure provided ordinary standard errors for yj that were indistinguishable from those provided by the Revised Ping, Kenny-Judd, or the Revised Joreskog-Yang procedures when p,,. =.70 or p,,. =.90 (See Figure 14). Ordinary standard errors provided by the Joreskog and Yang procedure were slightly too large for Y2 (See Figure 15). Differences between ordinary and robust standard error ratios for yi and Y2 tended to be less than .01 across levels of P,,c,, . A priori expectations that ordinary standard errors produced by maximum likelihood estimation should be underestimated in the presence of non-normality suggested differences in the accuracy of robust and ordinary standard errors as changes as robust standard errors are designed to take standard errors as , 1 ,2 changes as robust standard errors are designed to take 0 1.1 e Pii' =.70 Aip =.90 03 0.9 0.8 0" CD 0 CD 0 CD 0 * C D n < = = . < . = <. 0 -. .. 0 . 0..a ( Q 5-O 0 Figure 14. Standard Error Ratios by Reliability for Gamma 1: Ordinary Standard Errors 1.06 ---- -1.04 o i 1.02 0 1 .u 0.98 8 0.96 0.94 0.9 0.9 Joreskog Revised Bollen Ping Ping Kenny- Revised JY Judd Figure 15. Standard Error Ratios for Gamma 2: Ordinary Standard Errors non-normality into account. Means of standard error ratios for robust and ordinary standard errors by p2 were calculated excluding the Joreskog - Yang method. These are presented in Figure 16. Ordinary standard errors were more accurate than robust ---------- - I 46 standard errors when P,,c,' =.00, the two were equivalent at p, =.05, and robust standard errors tended to be nearer to true standard errors when ,, =.10. The pattern was similar for Y2 (See Figure 17). Ordinary standard errors for the Joreskog-Yang model were similar across Pc.4,g to ordinary standard errors produced by the other procedures. 0.985 0.98 0.975 0.97 0.965 0.96 0.955 0.95 0 0.05 0.1 Figure 16. Standard Error Ratios by P2C for Gamma 1 1~ o 0.98 LL 0.96 S0.94 0.92 MATRIX - O-- Robust --- Ordinary 0.05 Figure 17. Standard Error Ratios by p,,. for Gamma 2 Gamma 3 Effects significant at the ot=.01 level that accounted for at least .5% of the mean square and variance component sum for y3 standard error ratios are presented in Table 10. Significant effects were subsumed by the interaction of MATRIX, PROCEDURE, p,,,. and p,,c, that was observed for standard error ratios associated with Y3. Comparing robust standard errors across methods revealed that the robust standard error again overestimated the true standard error for Y3 using the Joreskog and Yang procedure. The overestimation of robust standard errors using the Joreskog and Yang procedure was more serious as p,,. increased (See Figure 18). Overestimation of standard errors using the Joreskog - Yang method obscured differences among the other methods. Therefore, the other methods are re-presented in Figure 19. Bollen's procedure compared favorably with the other methods providing standard error ratios approaching one. Results for the other procedures were similar to one another and indicated that robust standard errors slightly underestimated the true standard errors. When p,, =.70 and robust standard errors were calculated, all procedures produced larger standard error ratios as p' increased. Bollen's procedure produced more accurate standard errors than did the other methods when ordinary standard errors for Y3 were calculated. Joreskog and Yang's procedure provided ordinary standard errors that substantially underestimated true standard errors for 73 so long as the interaction accounted for some proportion of the variance in the latent variable model (See Figure 20). The Joreskog and Yang procedure 48 also underestimated the true standard error for Y3 when p2,,, = .00 and p,,, =.70. Procedures other than Joreskog - Yang are re-presented on a smaller scale in Figure 21. Table 10 Proportions of Mean Square and Variance Component Sum for 33 Standard Error Ratios Effect Proportion of Sum F p 2 P ,nc,4,42 0.017 31252 0.0001 MATRIX 0.049 146304 0.0001 MATRIX x PI.., 0.031 30542 0.0001 PROCEDURE 0.128 120222 0.0001 PROCEDURE x pc,4 0.106 33248 0.0001 PROCEDUREx P,, 0.013 6087 0.0001 2 PROCEDURE x p,,e,< x P,' 0.010 1617 0.0001 MATRIX x PROCEDURE 0.299 148068 0.0000 2 MATRIX x PROCEDURE x Pc.4,4, 0.186 30660 0.0000 MATRIX x PROCEDURE x p,,. 0.024 6041 0.0000 MATRIX x PROCEDURE x p,4, x p2 0.019 1609 0.0000 MATRIX x PROCEDURE x p,. ,4, X Pill 0.019 1609 0.0000 Reliability = .70 Reliability = .90 0- 0- 2 ptnc,4, , O 0.00 0 0.05 00.10 x W n 0 < = Figure 18. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors 00.00 U 0.05 00.10 < UQ CD CD W '< CD (D '~- ~- ~ CL Figure 19. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors Reliability = .90 2 0.00 1 0.05 O 0.10 0- - CD 0Ia. -e C - Figure 20. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors o 0 1.1 Reliability = .70 Reliability = .90 P, 0E 0.00 r'1 0.10 S0.9 0 0 Figure 21. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors Ordinary standard errors for other procedures were comparable to one another and slightly underestimated true standard errors when p,,. =.70. With p,, =.90, the KennyJudd procedure produced ordinary standard errors that underestimated the true standard ~2 errors more than other procedures when p,, was .00 or. 10. When p,,.=.70, standard 2~ error ratios for ordinary standard errors increased as P, increased for all procedures. A comparison of robust and ordinary standard error ratios for y3 revealed that relative accuracy was not dependent upon Pic,4,14, as it was for y, and y2. Instead, ordinary standard error ratios were uniformly nearer to one than were robust standard error ratios across Pc (See Figure 22). Other effects in the analysis are not discussed as higher level interactions involving the same variables are present. 0.99 0.98 MATRIX 0.97 Robust SOrdinary 0.96 0 0.05 0.1 Figure 22. Standard Error Ratios by P,,,, for Gamma 3 Mean Squared Error Gamma 4 A significant interaction of sample size, P2 , and P was observed for mean P,,c'4,ï¿½, and pn. was observed for mean squared errors for y4 that accounted for 1.2% of the mean square and variance component 2 sum. Average mean squared errors by sample size, Pic and p,,. are presented in Figure 23. Mean squared errors were largest in the P,,,, =.05 condition. Mean squared errors were also larger when N=175 compared to the N=400 condition. Reliability influenced mean squared errors for y4 with larger mean squared errors as reliability diminished. The pattern of means across levels of P2,c,, did not change with changing sample sizes or with changing levels of p,,.. However, differences between levels of 2 Pin'. , 2 were larger with reduced sample sizes and also with lower reliability. A 52 significant interaction of PROCEDURE and p,,. was observed for mean squared errors for Y4 that accounted for .6% of the mean square and variance component sum. Bollen's procedure had the largest means squared error when p,,. =.70. Ping's procedure had the smallest mean squared error overall. Mean squared errors were similar for the other procedures (See Figure 24). 0 .6 . ..... .. .......... .. .. .. N=175 N=400 0.5 o 0.4 0.3 O 0.00 S0.2 00.05 4.) 00.10 0.1 0 0.7 0.9 0.7 0.9 Reliability Reliability Figure 23. Mean Squared Errors for Gamma 4 by Sample Size 0.14 0.12 0.1 Pll 0.08 Ol 0.7 0.06 0.9 0.04 0.02 0 0 0 (D ( Figure 24. Mean Squared Errors for Gamma 4 by Procedure Psi An interaction of PROCEDURE, sample size, p, 2, p12, and p2 was observed for mean squared error for y that accounted for 1% of the mean square and variance component sum. This effect was not interpretable and average mean squared error by sample size, Pcg1, P12, and pR for y are presented in Appendix 4. Gamma 1, 2, and 3 The effect of the factors on mean squared errors for y1, Y2, and 73 were very similar and will be reported together A significant interaction of sample size and p2, as observed for mean squared errors accounting for 4, 5, and 4% of the mean square and variance component sums for yt, Y2. and y3 respectively. Average mean squared errors by sample size and P2c, for ya and Y2 are presented in Figure 25. Mean squared errors were largest in the Pc. =.05 condition. Mean squared errors were also larger when N=175 compared to the N=400 condition. These effects are mirrored for y3 in Figure 26. 0.5 Gamma I Gamma 2 p2 00.00 0.4 "'OS 0.2 00.00 0.3 0 = N=175 N=400 N=175 N=400 Figure 25. Mean Squared Errors for Gamma 1 and Gamma 2 0.4 o 0.3 ~2 P'nc.4,'z 0.2 O 0.00 S0.05 0.1 00.10 0 0 N=175 N=400 Figure 26. Mean Squared Errors for Gamma 3 Type 1 Error Rate and Power When evaluating rates of significance, Type 1 error rate is relevant 71 and Y2. When p, = .00, Type 1 error rate is also relevant for Y4. Power is always relevant for y3 and it is relevant for y4 when P,2 .00. Gamma 4 Type 1 error rate. Effects for the Type 1 error rate when testing Ho: 74 = 0 that were significant and associated with at least .5% of the mean square and variance component sum are presented in Table 11. An interaction of MATRIX and PROCEDURE was observed for 74 Type 1 errors. Use of the asymptotic covariance matrix influenced Type 1 error rate for 74 differently depending upon which method was used. Using the Revised Ping, Ping, Kenny-Judd, or Revised Joreskog-Yang approaches, a greater Type 1 error rate was associated with robust standard errors (See Figure 27). 55 Using the Joreskog and Yang approach, Type 1 error rate was greater when using ordinary standard errors. The MATRIXx PROCEDURExN x p1,2 x p x p,,. term was not interpretable. Type 1 error rate by these variables are presented in Appendix 5. Table 11 Proportions of Mean Square and Variance Component Sum for y4 Type 1 Error Rate Proportion of Effect Sum F p PROCEDURE 0.010 28 .0001 MATRIXx PROCEDURE 0.030 111 .0001 2 MATRIXx PROCEDURExN x p2 X X .0111 4 .0030 t 0.1 0.075 MATRIX o 0.05 Robust 0 o U Ordinary . 0.025 00 o 0 t -. n 0 CD C 0 CD Figure 27. Type 1 Error Rate for Gamma 4 by MATRIX Power. Several effects were significantly associated with significance for Y4 and associated with at least .5% of the mean square and variance component sum. These are presented in Table 12. An interaction of PROCEDURE, MATRIX, sample size, and p,,. was observed for Y4 . Using robust standard errors, power for the test Ho: 74 =0 was 56 nearly 100% for all methods when p,, =.90 (See Figure 28). Power for hypothesis tests of y4 for both Bollen and Kenny -Judd's methods were reduced by approximately 5% when p,,. =.70. Power was reduced for all procedures when N=175. However, the reduction in power associated with small sample size was more serious for the JoreskogYang and Bollen procedures. When N=175, power for all procedures was somewhat reduced when p,, =.70. However, the Joreskog-Yang and Bollen procedures were much more strongly effected than were the other procedures. Table 12 Proportions of Mean Square and Variance Component Sum for y4 Power Proportion of Effect Sum F p N 0.153 677 .0001 2 Pnc.,2 0.099 438 .0001 Nx2 Nx p ,c,,2 0.138 307 .0001 P, 0.039 173 .0001 Nx p,, 0.045 100 .0001 2 Pi., x p,1 0.021 47 .0001 N ,x p2c,g x Pif 0.014 16 .0001 PROCEDURE 0.019 348 .0001 PROCEDURE x N 0.019 172 .0001 2 PROCEDURE x p, 0.007 61 .0001 PROCEDURE x p,, 0.017 152 .0001 PROCEDURE x N x p,,, 0.011 52 .0001 MATRIX x PROCEDURE 0.024 701 .0001 MATRIX x PROCEDURE x N 0.037 534 .0001 MATRIX x PROCEDURE x Pc, 0.006 84 .0001 ,,.4, 0.006 84 .0001 MATRIX x PROCEDURE x N x p 0.005 40 .0001 ,41462 0.005 40 .0001! MATRIX x PROCEDURE x p,,A 0.020 296 .0001 MATRIX x PROCEDURE x N x p,, 0.026 189 .0001 o Proportion of Models Rejected -e 0 0 0 0 O a Uld pOSIAa'd " UallOg coooo 0 ps-Auul CT o ~O)lSaJOf UOIalOf ppnf-Auua)N Af PQS!A3I En 00~ II 58 Using ordinary standard errors, nearly 100% power was observed for all procedures when N=400 (See Figure 29). The single exception was a reduction of 5% for the Bollen procedure when p,,. =.70. When N=175, a reduction in power was observed in general. Reduced reliability was more problematic for '4 power using the Bollen method which exhibited a large reduction in power when p,,, =.70. An interaction of MATRIX, PROCEDURE, P,2 , and N was observed for 4 t,,4,,, and N was observed for Y4 Type 1 error rate. For robust standard errors and N=400, power was very high for all procedures. Lower power was observed for Y4 with the Bollen and Joreskog - Yang approaches when ,,,=.05 (See Figure 30). When N=175, power was reduced for all methods. However, decreases in power for y4 using the Bollen and Joreskog - Yang procedures were larger than with the other procedures. Using ordinary standard errors, power for Y4 was similar to what was observed using robust standard errors. However, power for Y4 using the Joreskog - Yang procedure no longer differed from the other procedures and only Bollen's method was associated with reduced power for tests of Y4 (See Figure 31). Excluding the Joreskog - Yang procedure, power for y4 was very similar using robust and ordinary standard errors. Only the N=175, p,, =.70 condition were differences observed. In this condition, using robust standard errors resulted in 1% more rejected models compared to using ordinary standard errors. Other effects in the analysis are not discussed as higher level interactions involving the same variables were present. -TI t Proportion of Models Rejected o IL O 00 SUf pOS!AQ-d z U;)1o z 0 oo II ppnf-XuuoNl ........... kf pOSIAQ>J AUidPSTA3-J Cl>Z PpII o Proportion of Models Rejected 2 U.Id pOSIAO-d Sul. " ppnf.uuoN 0 C') kf pOS!AQd CL 0, SOlSolOf Uallï¿½Hgf '*41 SUid C>TA> ppnf-,(uuaN Af p;SIA;>J p 5 I' Proportion of Models Rejected 0 0 o0 0 o) o 5m1saof ,..ï¿½Sutd pQSIAO-d U)IaIOI z. ..B ppnf-XuuaNI kf pOSIAON SUt~d pOSTAQ' t UallogR ppnf-XuuaN Af pOSIA;D-j Gamma 1 Five effects were significantly associated with the proportion of replications for which Ho: yi = 0 was rejected and associated with at least .5% of the mean square and variance component sum. These are presented in Table 13. As the parameter value for y, was zero, the proportion of replications for which the Ho: y, = 0 was rejected estimates the Type 1 error rate for yl. A significant interaction of MATRIX and PROCEDURE was observed for the proportion of hypothesis tests rejected for Yi. Table 13 Proportions of Mean Square and Variance Component Sum for yi Type 1 Error Rate Proportion of Effect Sum F p 2 2 P,4,14, x P2 x PR 0.071 8 .0040 PROCEDURE 0.012 101 .0001 MATRIX x PROCEDURE 0.025 296 .0001 22 MATRIX x PROCEDURE x N x p,,,,, x Pl2 X p R 0.010 6 .0001 Using robust standard errors the Joreskog and Yang procedure had a near zero Type one error rate. Proportions of models in which Ho: yi=0 was rejected for other procedures were very similar at approximately .06 when robust standard errors were used. Bollen's procedure was associated with slightly fewer Type 1 errors (See Figure 32). Using ordinary standard errors, Type 1 error rates for y, were similar across procedures at about 6%. Bollen's procedure was associated with slightly fewer Type 1 errors for y, than the other procedures. The interaction MATRIX x PROCEDURE x N x 2 2 Pin,,,1,2 X P12 X pR was also significant but not interpretable. Type I error rate by these 63 variables is presented in Appendix 6. The interaction N x p,,,, x p2 x p2 is not discussed as it is a subset of this higher order interaction. -o u 0.1 0.075 -o o 0.05 o 0.025 0 1t 0 01 MATRIX I Robust M Ordinary Figure 32. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 1 Gamma 2 Five factors were associated with whether the hypothesis test Ho: Y2=0 was significant and were associated with at least .5% of the mean square and variance component sum. These are presented in Table 14. The significant interaction of MATRIX, PROCEDURE, and p,,, showed the sensitivity of the Joreskog and Yang procedure to robust standard errors and changes in p,,.. Due to overestimated standard errors, Type 1 error rate was very low for the Joreskog and Yang procedure using robust standard errors. Rejections for other procedures using robust standard errors were about .06. Bollen's procedure had slightly fewer Type 1 errors (See Figure 33). Increased p,,. 64 was associated with increased Type 1 error rate for all but the Joreskog and Yang procedure. Table 14 Proportions of Mean Square and Variance Component Sum for y) Type 1 Error Rate Proportion of Effect Sum F p PROCEDURE 0.019 138 .0001 PROCEDURE x N P, x Pl2 X PR 0.009 4 .0022 x2 2 PROCEDURE x Nx P,,c,4, X p12 x PR X P,' 0.015 3 .0052 MATRIX x PROCEDURE 0.020 241 .0001 MATRIX x PROCEDURE x Pir 0.007 41 .0001 0.08 .0 0.07 0.06 rj 0 0.05 0.04 o 0.03 .2 0.02 t 0 0.01 0 S 0 PH,. 00.7 *0.9 0 0 0 ~ prJ 0 .. 00 -v ,.- 00 . CL CL Figure 33. Type 1 Error Rate by MATRIX and p,, for Gamma 2 Using ordinary standard errors, the Joreskog and Yang procedure had the fewest Type One errors for y2 at about 3% when p, =.70. However, this increased to 7% when p,, =.90. A Type 1 error rate of about 6% was observed for the Ping, Kenny - Judd, Revised Ping, and Revised Joreskog-Yang procedures for Y2. Bollen's procedure was associated with slightly fewer Type 1 errors for test of Ho: y2 = 0 at about 5%. Type 1 error rate for these procedures increased by about 1% with increased p,,. The 2 2 PROCEDURE x N x x p12 x p x p,, interaction was not interpretable and is presented in Appendix 7. Other effects are not discussed as higher level interactions are present. Gamma 3 Several factors were significant in predicting power for testing Ho: Y3 = 0 and accounted for at least .5% of the mean square and variance component sum. These are 22 presented in Table 15. An interaction of MATRIX, PROCEDURE, N, pc,,, and p, suggested low power to detect covariate effects using robust standard errors and the Joreskog - Yang model. For the Joreskog - Yang model, power improved for hypothesis tests of y3 when reliability for the observed variables was low or when sample size was large (See Figure 34). Power was also greater in the P,,, =.05 condition. However, power was always poor using robust standard errors and the Joreskog-Yang method. Power was greater than 98% for other procedures irrespective of MATRIX. Other effects are not discussed as higher level interactions are present. Fit Statistics Chi-squared Test of Exact Fit Seven effects were significantly associated with the X2 Exact Fit statistic and were associated with at least .5% of the mean square and variance component sum. These are Table 15 Proportions of Mean Square and Variance Component Sum for y3 Power Effect MATRIX PROCEDURE PROCEDURE x p,,,2 2 PROCEDURE x Nx p,,4,1 2 2 PROCEDURE x p,4,, 2x PR MATRIX x PROCEDURE MATRIX x PROCEDURE x N MATRIX x PROCEDURE x P MATRIX x PROCEDURE x Np ,., 2 MATRIX x PROCEDURE x N x Pc.,42 MATRIX x PROCEDUREx P, MATRIX x PROCEDUREx Pc., x 2 MATRIX x PROCEDURE x Nx p 2, MATRIX x PROCEDURE x N x p ,,C, ,4 x Pj Proportion of Sum 0.083 0.251 0.006 0.005 0.006 0.501 0.005 0.013 0.010 0.006 0.011 0.007 N=175 N=400 0.4 0.2 Multiple Correlation Figure 34. Gamma 3 Power for the Joreskog-Yang Model and Robust Standard Errors F 170218 167138 1455 565 617 170218 927 1484 574 1054 627 200 p .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 .0001 2 M 0.05 *0.1 0.2 presented in Table 16. An interaction of p,,,, sample size, and PROCEDURE was observed for the X2 test of Exact Fit statistic which accounted for 5.2% of the mean square and variance components sum. Chi-squared statistics evaluating Ping's model were much larger than those evaluating other procedures with a sample size of 175 (See Figure 35). When N=400 chi-squared statistics increased compared to when N=175 for the Ping procedure. In both cases, increased p,,. was associated with increased X2. However the effect of increased p,,. was much more evident in the N=400 condition using the Ping procedure. The Kenny-Judd, Joreskog - Yang and Revised Joreskog Yang procedures were similar to one another with chi-squared statistics slightly decreasing with increased sample size. The Revised Ping approach had a slightly increased X2 with increased sample size. None were clearly influenced by changes in p,,.. Other effects are not discussed as they are lower level interactions subsumed by the PROCEDURE x N x p,,. interaction. Table 16 Proportions of Mean Square and Variance Component Sum for X2 Effect Proportion of F p Sum N 0.028 23095 .0001 p,, 0.030 24675 .0001 p;, x N 0.008 3599 .0001 PROCEDURE 0.545 676980 .0001 PROCEDURE xN 0.162 100784 .0001 PROCEDURE x p,,. 0.170 105674 .0001 PROCEDURE x N x p,, 0.052 16240 .0001 68 1000 1000 N=175 N=400 800 . ., S600 0 0.7 S400 0.9 200 0 Figure 35. Chi-Squared by Procedure, N, and p,,. Comparative Fit Index Three effects were significantly associated with changes in the Comparative Fit Index and were associated with at least .5% of the mean square and variance component sum. These are presented in Table 17. A significant interaction of PROCEDURE and 2 P,, ,4 was observed for the Comparative Fit Index indicating that a low value for the Comparative Fit Index was observed using the Ping approach which was reduced further when P2, ,,.00 (See Figure 36). High values for the Comparative Fit Index were observed using the other procedures that did not change with increased p Table 17 Proportions of Mean Square and Variance Component Sum for CFI Effect Proportion of Sum F p P2., 0.007 2716 .0001 PROCEDURE 0.947 516707 .0001 PROCEDURE x p2 0.034 6227 .0001 1.02 1 2 Ptn., , , 0.98 0.96 ---- - 0.00 S0.94 -M.. 0.05 U 0.92 - 00.10 0.9 0.88 0.86 Joreskog Revised Ping Kenny- Revised Ping Judd JY Figure 36. CFI by PROCEDURE and Pc,, Non-Normed Fit Index Three significant effects were observed for the Non-Normed Fit Index (NNFI) that were associated with at least .5% of the mean square and variance component sum. These are presented in Table 18. An interaction of PROCEDURE and p,,, was observed indicating that the Ping approach, which had the lowest values for NNFI 2 overall, was associated with further diminished NNFI when P,2., =0 (See Figure 37). Table 18 Proportions of Mean Square and Variance Component Sum for NNFI Effect Proportion of Sum F p 2 0.006 1754 .0001 PROCEDURE 0.948 552150 .0001 PROCEDURE x p2 0.034 6632 .0001 1.02 0.98 0.96 0.94 O 0.00 0.92 E..0.05 0.9 ...- 1 00.10 0.88 0.86 0.84 0.82 Joreskog Revised Ping Kenny- Revised Ping Judd JY Figure 37. NNFI by P,,,, and PROCEDURE Standardized Root Mean Squared Residual Six factors were identified that were significantly associated with Standardized Root Mean Squared Residual and associated with at least .5% of the mean square and variance component sum. These are presented in Table 19. A significant interaction of sample size and p,, was observed for Standardized Root Mean Squared Residual indicating that while reduced sample size and reduced reliability were associated with larger Standardized Root Mean Square Residual, the effect of reduced reliability was larger when N=175. (See Figure 38). An interaction of PROCEDURE and p,,. was observed for Standardized Root Mean Squared Residual. Standardized Root Mean Squared Residual was larger for the Ping approach and the Kenny-Judd approach. The Revised Ping approach had the smallest Root Mean Squared Residual. Standardized Root Mean Squared Residual 71 diminished with increased p,, (See Figure 39). . This effect was larger for the Ping and the Revised Ping approaches. Table 19 Proportions of Mean Square and Variance Component Sum for Standardized Root Mean Squared Residual Effect Proportion of Sum F p N 0.471 7138 .0001 pit, 0.111 1685 .0001 N xp,, 0.007 53 .0001 PROCEDURE 0.364 16750 .0001 PROCEDURE xN 0.023 526 .0001 PROCEDURE x p,. 0.015 336 .0001 0.06 0.05 0.04 S0.03 S0.02 0.01 0 P' 00.7 *0.9 N=175 N=400 Figure 38. Standardized Root Mean Squared Residual by N and p,,. Similarly, an interaction of PROCEDURE, and Sample Size was observed for Standardized Root Mean Squared Residual. Decreases in Standardized Root Mean Squared Residual associated with increased sample size were larger for the Ping and Kenny - Judd approaches compared to the other approaches (See Figure 40). 0.06 0.05 0.04 0.03 0.02 0.01 0 Pl, 00.7 00.9 0 ~1Q~ o cJQ Figure 39. Standardized Root Mean Squared Residual by PROCEDURE and p,,. 0.06 0.05 0.04 0.03 0.02 0.01 0 N O 175 M400 Joreskog Revised Ping Ping Kenny- Revised Judd JY Figure 40. Standardized Root Mean Squared Residual by PROCEDURE and Sample Size DISCUSSION A wide array of studies have been conducted on methods for estimating and testing latent variable interaction. However, many of these have provided results obtained for a single method and, therefore, did not allow for a general comparison of the available methods. Therefore, the primary objective of the current study was to compare estimates produced by Kenny-Judd, Joreskog-Yang, and Revised Joreskog-Yang models; two-step estimation of the Kenny-Judd (i.e., Ping's procedure) and the Joreskog-Yang model (a revised Ping procedure), and Bollen's two stage least squares method for the latent interaction model. Of particular interest was how well the results of the two-step and two stage least squares procedures compare to the remaining procedures, which are more difficult to implement. In comparing the six methods examined here, one is struck by the comparability of the various procedures. Even in situations where factors in the study had significant results, this was generally due to one procedure and differences among the remaining procedures tended to be small and the procedures performed quite well. However, limitations of some procedures were observed that suggest caution in the use of these procedures. An overview of the results of this study suggesting these limitations is presented in Table 20. Bias, robust and ordinary standard error ratios, Type I error rate, power, and mean squared error are presented for 74 Standard error ratios are also Table 20 Overview of Effects 74 7'4 SE Ratio Y4 Type 1 Error y4 Power Y3 SE Ratio CFI MSE Bias Method Bias Ordinary Robust Ordinary Robust Ordinary Robust Ordinary Robust Joreskog-Yang .004 .865 1.313 .074 .029 .956 .856 .482 14.292 .999 .061 Revised Ping .010 .912 .871 .059 .083 .953 .958 .989 .979 .998 .062 Bollen .046 1.000 1.000 .038 .038 .886 .886 1.007 1.007 - .083 Ping .021 .954 .934 .051 .067 .948 .952 .989 .979 .911 .043 Kenny-Judd .005 .948 .933 .051 .068 .950 .953 .981 .971 1.001 .062 Revised Revs.004 .947 .933 .050 .065 .950 .953 .989 .979 .999 .061 Joreskog-Yang presented for y3 and CFI is presented as a measure of fit. In each case, values are averages calculated over all the conditions in the study. Fit Structural equation modeling differs from other types of modeling such as regression in that fit statistics are used in evaluating whether a particular model is appropriate. Obtaining a model that fits the data is by many considered a prerequisite for conducting hypothesis tests which are often not considered for models with evidence of misfit. For that reason, the fact that the Ping procedure is associated with lower values of CFI and NNFI is problematic. This means that using Ping's method, researchers concerned with having adequate evidence of fit often conclude that the model does not fit the data. The average Standardized Root Mean Squared Residual was higher for the Kenny-Judd and Ping procedures than for the other procedures. However, typically this fit statistic met the commonly used criteria for adequate fit. With the exception of Bollen's procedure which does not calculate the types of fit statistics evaluated here, each of the other procedures provided evidence that tests of fit that were reasonably good from a Type 1 error rate perspective. Hypothesis Testing For most researchers, the most important information provided by a structural equation model will come from the hypothesis tests. Clearly, testing for significance is a central activity for many researchers. In this study, it was clear that the choice of method strongly influenced whether or not significance tests were appropriate. The Joreskog Yang method did not provide accurate hypothesis tests for the interaction regardless of whether ordinary or robust standard errors were used. When robust standard errors were used, the hypothesis test for the interaction was too conservative. When ordinary standard errors were used, the hypothesis test for the interaction was too liberal. Hypothesis tests for y3 using the Joreskog-Yang method were similarly influenced leading to a situation in which a researcher using this method would be unable to add or remove variables from the latent variable model. That is, if robust standard errors are used, meaningful predictors will be non-significant due to low power and meaningful predictors could be removed. If ordinary standard errors are used, meaningless predictors will retained too often due to a high Type 1 error rate. The Revised Ping approach provided hypothesis tests for the interaction with high Type 1 error rates no matter which kind of standard errors were used indicating that the approach will too often identify an interaction when one is not present. Bollen's two stage least squares approach exhibited low Type 1 error rate and was underpowered in tests of the interaction and well as for other tests in the model. The Ping, Kenny-Judd, and Revised Joreskog-Yang procedures all provided hypothesis tests with acceptable Type 1 error rates when ordinary standard errors were used and slightly too large when robust standard errors were used. Each of these procedures also had high levels of power in tests of the interaction so that each would be appropriate for hypothesis testing. Confidence Intervals Confidence intervals provide more information than do hypothesis tests affording researchers additional information to use in evaluating theories. In order to reap the benefits of this information, it is necessary that both parameter estimates and standard errors of these estimates are accurate. This study revealed that estimates of the interaction were inaccurate for the Bollen, Ping, and Revised Ping approaches. For the 77 Ping and Revised Ping procedures, the degree of inaccuracy was a function of reliability such that increased measurement error resulted in less accurate estimates. Estimates of the interaction using the Bollen procedure were inaccurate on the whole. Results for standard errors resemble the results for hypothesis tests for obvious reasons. Standard errors for the Joreskog-Yang procedure were too large when robust standard errors were used and too small when ordinary standard errors were used. Standard errors for the Bollen procedure were too large overall. Finally, both types of standard errors were too small for the interaction using the Revised Ping approach. The Kenny-Judd and the Revised Joreskog-Yang approach provided accurate estimates and standard errors making them the appropriate procedures to use on data like that simulated here when confidence intervals are of interest. Conclusion There are many purposes for conducting an analysis using latent variable interaction. In this study, the Kenny-Judd and the Revised Joreskog-Yang procedures recovered the correct parameter values with enough accuracy to provide answers to many of the questions that researchers are likely to ask under the experimental situation described here. Unfortunately, none of the simpler approaches are can provide appropriate fit statistics, hypothesis tests, and confidence intervals. However, if one is interested only in hypothesis tests, the simpler Ping procedure can be used. Limitations Like all studies, this study could not manipulate all variables that could possibly influence the identification of the best method for evaluating latent variable interaction. Some factors that could have been varied are observed variable means, number of 78 observed variables defining each latent independent variable, number of combinations of variables defining the interaction, and method of estimation. The study was also limited in using a laboratory situation in which no data were missing and the model fit the data perfectly. In addition, all variables except the product variables were normally distributed. Future research should examine the effects of these variables on performance of the procedures studies in this dissertation. APPENDIX SUPPLEMENTARY TABLES Table 21 Bias in Estimates of y4 N 2 P12 p2 A Bias 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.40 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.009 0.009 -0.006 -0.012 -0.014 -0.006 -0.003 -0.002 -0.009 0.002 -0.003 -0.017 0.016 -0.038 -0.070 Table 21--Continued N Pin, pl2 p2 P,1 Bias 175 0.05 0.40 0.40 0.90 -0.033 175 0.10 0.20 0.20 0.70 0.062 175 0.10 0.20 0.20 0.90 -0.001 175 0.10 0.20 0.40 0.70 -0.007 175 0.10 0.20 0.40 0.90 0.012 175 0.10 0.40 0.20 0.70 0.006 175 0.10 0.40 0.20 0.90 -0.031 175 0.10 0.40 0.40 0.70 -0.023 175 0.10 0.40 0.40 0.90 0.002 400 0.00 0.20 0.20 0.70 0.003 400 0.00 0.20 0.20 0.90 0.004 400 0.00 0.20 0.40 0.70 0.001 400 0.00 0.20 0.40 0.90 0.000 400 0.00 0.40 0.20 0.70 -0.003 400 0.00 0.40 0.20 0.90 0.000 400 0.00 0.40 0.40 0.70 -0.001 400 0.00 0.40 0.40 0.90 -0.002 400 0.05 0.20 0.20 0.70 0.019 400 0.05 0.20 0.20 0.90 -0.018 400 0.05 0.20 0.40 0.70 -0.039 Table 21--Continued N P p2 p2 p , Bias 400 0.05 0.20 0.40 0.90 -0.002 400 0.05 0.40 0.20 0.70 -0.034 400 0.05 0.40 0.20 0.90 -0.021 400 0.05 0.40 0.40 0.70 -0.010 400 0.05 0.40 0.40 0.90 -0.006 400 0.10 0.20 0.20 0.70 -0.030 400 0.10 0.20 0.20 0.90 0.000 400 0.10 0.20 0.40 0.70 -0.018 400 0.10 0.20 0.40 0.90 -0.004 400 0.10 0.40 0.20 0.70 0.007 400 0.10 0.40 0.20 0.90 -0.004 400 0.10 0.40 0.40 0.70 0.004 400 0.10 0.40 0.40 0.90 0.026 175 0.00 0.20 0.20 0.70 0.903 175 0.00 0.20 0.20 0.90 0.950 175 0.00 0.20 0.40 0.70 1.064 175 0.00 0.20 0.40 0.90 1.075 175 0.00 0.40 0.20 0.70 0.930 175 0.00 0.40 0.20 0.90 1.052 0.40 0.70 1.038 0.00 0.40 Table 22 Standard Error Ratios for 4 N c, P2 P Pi P1 2" 0.00 0.20 0.20 0.90 1.005 400 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 400 0.00 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.00 0.40 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.40 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.40 0.20 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 SE Ratio 1.061 0.903 0.969 0.945 0.989 0.899 0.887 0.923 0.979 0.938 0.882 0.981 0.877 0.957 0.896 0.970 1.036 0.938 Table 22--Continued N 2 2 SE Pinc, P2 PR Pl, SEatio ____ ____ ____ ____ ____ ___Ratio 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 400 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.10 0.10 0.10 0.10 0.10 0.20 0.20 0.40 0.40 0.40 0.40 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.40 0.20 0.20 0.20 0.20 0.40 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.20 0.40 0.40 0.20 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.90 0.70 0.968 1.064 0.975 0.924 1.018 0.979 0.990 0.878 0.986 1.058 0.953 1.021 0.909 0.983 0.992 0.941 0.938 0.950 0.952 Table 22-Continued N 2 2 ii SE P,n,, ï¿½ PR P,; SE AM P12 PRRatio 400 0.10 0.40 0.20 0.90 1.104 400 0.10 0.40 0.40 0.70 0.963 400 0.10 0.40 0.40 0.90 0.844 Table 23 Standard Error Ratios for Psi PROCEDURE N 2 PR 12 , Joreskog Revised Ping Kenny Revised Ping Judd JYP 2 Ping Judd jY 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 175 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.10 1.463 1.317 1.594 1.199 1.470 1.256 1.644 1.310 1.316 1.023 1.302 1.057 1.404 1.059 1.433 1.058 1.652 0.965 1.041 1.018 0.973 0.916 0.967 1.022 1.003 0.961 0.946 0.920 0.962 1.024 0.967 0.991 0.965 1.002 0.966 1.042 1.017 0.974 0.916 0.966 1.020 1.003 0.956 0.947 0.921 0.967 1.014 0.971 1.002 0.962 0.992 0.965 1.042 1.018 0.974 0.917 0.002 1.018 1.004 0.960 0.947 0.915 0.961 1.022 0.970 1.000 0.966 0.992 0.964 1.042 1.017 0.974 0.917 0.967 1.018 1.004 0.958 0.948 0.914 0.961 1.018 0.970 1.000 0.965 0.989 Table 23-Continued PROCEDURE N c,, PR P12 ii, Joreskog Revised Ping Kenny Revised pPing Judd JY P Ping Judd JY 175 175 175 175 175 175 175 400 400 400 400 400 400 400 400 400 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.05 0.2 0.2 0.2 0.4 0.4 0.4 0.4 0.2 0.2 0.2 0.2 0.4 0.4 0.4 0.4 0.2 1.251 1.749 1.137 1.575 1.211 1.661 1.254 1.304 1.081 1.227 1.084 1.527 1.088 1.365 1.196 1.551 1.079 1.017 0.944 0.926 1.035 0.929 1.033 1.036 0.963 0.972 0.958 1.088 0.932 0.994 1.021 1.126 1.074 1.015 0.953 0.922 1.032 0.933 1.034 1.035 0.963 0.971 0.959 1.089 0.932 0.994 1.021 1.127 1.007 1.071 1.018 0.952 0.926 1.034 0.943 0.854 1.035 0.963 0.973 0.959 1.087 0.932 0.994 1.021 1.123 1.072 1.020 0.952 0.923 1.034 0.944 1.027 1.035 0.963 0.973 0.959 1.087 0.932 0.994 1.021 1.122 400 0.05 0.2 0.2 0.9 1.079 1.006 1.007 1.007 Table 23--Continued PROCEDURE N P,, P R P12 p,,' Joreskog Revised Ping Kenny Revised Ping Judd JY 400 400 400 400 400 400 400 400 400 400 400 400 400 400 0.05 0.05 0.05 0.05 0.05 0.05 0.10 0.10 0.10 0.10 0.10 0.10 0.10 0.10 1.323 1.067 1.338 1.030 1.359 1.063 1.562 1.065 1.654 1.227 1.715 1.297 1.648 1.192 0.948 0.981 0.982 0.965 0.967 0.975 0.974 0.921 0.975 1.030 1.016 1.107 0.952 0.992 0.950 0.981 0.987 0.963 0.991 0.973 0.982 0.925 0.985 1.036 1.034 1.106 0.937 0.995 0.948 0.983 0.984 0.964 0.964 0.975 0.973 0.925 0.969 1.035 1.020 1.107 0.957 0.995 0.948 0.983 0.986 0.964 0.964 0.976 0.973 0.925 0.971 1.036 1.019 1.108 0.957 0.995 Table 24 Root Mean Squared Error for Psi N P~,~< p2 p Joreskog Revised Ping Kenny Revised Ping Judd JY 175 0 0.2 0.2 175 0.05 0.2 0.2 175 0.1 0.2 0.2 175 0 0.40.2 175 0.05 0.40.2 175 0.1 0.4 0.2 175 0 0.2 0.4 175 0.05 0.20.4 175 0.1 0.2 0.4 175 0 0.4 0.4 175 0.05 0.4 0.4 175 0.1 0.4 0.4 400 0 0.2 0.2 400 0.05 0.2 0.2 400 0.1 0.2 0.2 400 0 0.4 0.2 400 0.05 0.4 0.2 400 0.1 0.4 0.2 0.204 78.174 16.833 0.211 94.839 20.605 0.119 40.905 8.664 0.114 50.996 10.771 0.194 73.780 15.638 0.192 89.809 0.204 78.022 16.773 0.211 94.503 20.427 0.119 40.674 8.628 0.114 0.203 75.844 15.777 0.210 92.239 19.441 0.118 39.254 7.977 0.113 50.777 49.339 10.710 10.002 0.194 0.194 73.675 71.560 15.602 14.721 0.192 0.192 89.593 87.479 20.011 19.940 18.901 0.200 78.190 16.860 1516.950 94.950 20.620 0.120 40.920 8.670 0.110 51.010 10.830 0.190 73.770 15.650 0.190 89 840 0.204 78.184 16.835 0.211 94 880 20.615 0.119 40.914 8.664 0.114 51.008 10.781 0.194 73.780 15.639 0.192 89.824 20.030 20.016 Table 24-Continued N P2,. Pl2 p Joreskog Revised Ping Kenny Revised Ping Judd JY 400 0 0.20.4 0.110 0.110 0.109 0.110 0.110 400 0.05 0.2 0.4 39.753 39.686 38.296 39.760 39.760 400 0.1 0.2 0.4 8.300 8.276 7.616 8.300 8.300 400 0 0.40.4 0.110 0.110 0.110 0.110 0.110 400 0.05 0.40.4 48.025 47.820 46.258 48.020 48.030 400 0.1 0.4 0.4 9.854 9.785 9.081 9.860 9.856 |

Full Text |

PAGE 1 AN EMPIRICAL COMPARISON OF METHODS FOR ESTIMATING LATENT VARIABLE INTERACTIONS By BRADLEY C. MOULDER A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2000 PAGE 2 ACKNOWLEDGMENTS I would like to thank some of the people who have provided support for my education which have made this paper possible. I would first like to thank my wife, Kim, who has made so many sacrifices over the last several years. 1 have appreciated each and every one. I would also like to thank my parents whose words were always supportive and whose examples attest to a belief in the value of education that was never far from my mind. I would like to thank all the members of my committee. Each of them is exceptional and I have taken something from each of their examples. I would like to thank James Algina. His love of research is infectious and his example as both a researcher and instructor set a truly high mark. 1 would like to thank Margaret Bradley. The experiences I had working with her have been a valuable resource providing a grounding in both practical analysis and experimental design. I would like to thank Linda Crocker. In the years I worked with her, I very much appreciated that she always acted as though I was a student first. I also appreciate the many opportunities she made available to me. I would also like to thank David Miller. His very approachable manner has made a valuable resource of the breadth of knowledge and experience he possesses. ii PAGE 3 TABLE OF CONTENTS page ACKNOWLEDGMENTS ii LIST OF TABLES v LIST OF FIGURES vi INTRODUCTION 1 Measurement Error ^ Multiple Regression and Structural Equation Modeling 2 Latent Variable Interaction 4 Indicant Product Approaches 4 Two-Step Approaches 13 Two-Stage Least Squares Approach 15 Statement of the Problem 18 METHOD 19 Data Simulation 19 Effect Size 19 Squared Multiple Correlation 19 Correlation Among Variables 20 Reliability of Observed Variables 20 Sample Size 20 Simulation Proper 21 Parameter Recovery 26 Comparison of Methods 28 RESULTS '. '. 30 Bias 32 Standard Error Ratios 33 Gamma 4 33 Psi 39 Gamma 1 and Gamma 2 41 Gamma 3 47 iii PAGE 4 Mean Squared Error ^ Gamma 4 51 Psi 53 Gamma 1, 2, and 3 53 Type 1 Error Rate and Power 54 Gamma 4 54 Gamma 1 62 Gamma 2 63 Gamma 3 65 Fit Statistics 65 Chi-squared test of Exact Fit 65 Comparative Fit Index 68 Non-Normed Fit Index 69 Standardized Root Mean Squared Residual 70 DISCUSSION 73 Fit 75 Hypothesis Testing 75 Confidence Intervals 76 Conclusion 77 Limitations 77 APPENDIX SUPPLEMENTARY TABLES 79 REFERENCES : 98 BIOGRAPHICAL SKETCH 101 iv PAGE 5 LIST OF TABLES Table page 1 . Kenny and Judd Parameters and Estimates 6 2. Performance of the Kenny-Judd Model in the Jaccard and Wan Study 7 3 . YangJonsson Parameters and Estimates 10 4. Comparison of Parameters and Estimates for Joreskog-Yang Approaches 13 5. Ping's Estimates of the Kenny-Judd Parameters 14 6. BoUen's Estimates of the Kenny-Judd Parameters 18 7. Simulation Parameter Values 25 8. Proportions of Mean Square and Variance Component Sum for 74 Standard Error Ratios 34 9. Proportions of Mean Square and Variance Component Sum for \j/ Standard Error Ratios 39 10. Proportions of Mean Square and Variance Component Sum for 73 Standard Error Ratios 48 1 1 . Proportions of Mean Square and Variance Component Sum for 73 Power 55 1 2. Proportions of Mean Square and Variance Component Sum for 74 Power 56 13. Proportions of Mean Square and Variance Component Sum for 71 Type 1 Error Rate 62 14. Proportions of Mean Square and Variance Component Sum for 72 Type 1 Error Rate 64 15. Proportions of Mean Square and Variance Component Sum for 73 Power 66 16. Proportions of Mean Square and Variance Component Sum for 67 PAGE 6 1 7. Proportions of Mean Square and Variance Component Sum for CFI 68 1 8. Proportions of Mean Square and Variance Component Sum for NNFI 69 19. Proportions of Mean Square and Variance Component Sum for Standardized Root Mean Squared Residual 71 20. Overview of Effects 74 21. Bias in Estimates of Gamma 4 79 22. Standard Error Ratios for Gamma 4 82 23. Standard Error Ratios for Psi 85 ?4.,^ , Root Mean Squared Error for vj/ 88 25. Type 1 Error Rate for Gamma 4 90 Type 1 Error Rate for Gamma 1 92 27. Type 1 Error Rate for Gamma 2 95 vi PAGE 7 LIST OF FIGURES Figure ^ ms. 1. Bias in Estimates of Gamma 4 33 2. Standard Error Ratios by Reliability for Gamma 4: Robust Standard Errors... 35 3. Standard Error Ratios by Reliability for Gamma 4: Ordinary Standard Errors 35 4. Standard Error Ratios by Multiple Correlation for Gamma 4: Robust Standard Errors 36 5. Standard Error Ratios by Multiple Correlation for Gamma 4: Ordinary Standard Errors 37 6. Standard Error Ratios by Sample Size for Gamma 4: Robust Standard Errors 38 7. Standard Error Ratios by Sample Size for Gamma 4: Ordinary Standard Errors 38 8. Standard Error Ratios by Reliability for Psi: Robust Standard Errors 40 9. Standard Error Ratios by Reliability for Psi: Ordinary Standard Errors 41 10. Standard Error Ratios by Reliability for Gamma 1~ Expanded View: Robust Standard Errors 42 11. Standard Error Ratios for Gamma 2~Expanded View: Robust Standard Errors 43 12. Standard Error Ratios by Reliability for Gamma 1 : Robust Standard Errors 43 13. Standard Error Ratios for Gamma 2: Robust Standard Errors 44 vii PAGE 8 14. Standard Error Ratios by Reliability for Gamma 1 : Ordinary Standard Errors 45 15. 16. Standard Error Ratios for Gamma 2: Ordinary Standard Errors 45 17. Standard Error Ratios by p]Â„,. , . for Gamma 1 46 18. Standard Error Ratios by yO,i . for Gamma 2 46 19. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors .. 48 20. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors .. 49 21. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors 49 21. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors 50 22. Standard Error Ratios by p]Â„^ for Gamma 3 51 23. Mean Squared Errors for Gamma 4 by Sample Size 52 24. Mean Squared Errors for Gamma 4 by Procedure 52 25. Mean Squared Errors for Gamma 1 and Gamma 2 53 26. Mean Squared Errors for Gamma 3 54 27. Type 1 Error Rate for Gamma 4 by MATRIX 55 28. Power by , and N for Gamma 4: Robust Standard Errors 57 29. Power by p,,, and N for Gamma 4: Ordinary Standard Errors 59 30. Power by aL.^.^j and N for Gamma 4: Robust Standard Errors 60 3 1 . Power by aL.^.cj ^"'^ ^ fo'" Gamma 4: Ordinary Standard Errors 61 32. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 1 63 33. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 2 64 viii PAGE 9 34. Gamma 3 Power for the Joreskog-Yang Model and Robust Standard Errors . 66 35. Chi-Squared by Procedure, N, and /?,, 68 36. CFI by PROCEDURE and p,'Â„, 69 37. NNFI by p,^,^,^^ and PROCEDURE 70 38. Standardized Root Mean Squared Residual by N and pÂ„ 71 39. Standardized Root Mean Squared Residual by PROCEDURE and pÂ„ 72 40. Standardized Root Mean Squared Residual by MATRIX, PROCEDURE, and Sample size 72 ix PAGE 10 Abstract of Dissertation Presented to the Graduate School in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy AN EMPIRICAL COMPARISON OF METHODS FOR ESTIMATING LATENT VARIABLE INTERACTIONS By Bradley C. Moulder December 2000 Chair: James J. Algina Major Department: Educational Psychology Researchers in the social sciences and elsewhere use multiple regression to evaluate relationships among observed variables. In simple linear regression, it is well known that the presence of measurement error tends to decrease power in tests of the P coefficient. The influence of measurement error becomes more troublesome with multiple regression as p coefficients are unpredictably changed, sometimes overestimating and other times underestimating population Ps. One solution for this problem is to evaluate relationships among latent variables using structural equation modeling. When researchers hypothesize the presence of interaction among variables, this solution has until recently not been available. This is ironic given the particular sensitivity interaction terms have shown to measurement error. A number of approaches to interaction using structural equation modeling have recently been proposed. However, X PAGE 11 these studies have generally failed to either consider multiple methods or multiple experimental parameters. In this study, the approaches of Bollen, Ping, Kenny and Judd, Joreskog and Yang, as well as two new approaches, were evaluated in a study which considers several of the experimental parameters likely to vary in experimentation. These are (1) the effect size of the interaction, (2) the multiple correlation of the latent variable model, (3) the sample size, (4) the reliability of the observed variables, and (5) the correlation between the latent variables included in the interaction. The study revealed large differences in the ability of procedures to detect interaction and to properly estimate the associated parameter. Bollen's two stage least squares procedure exhibited a particular lack of power. Joreskog and Yang's procedure exhibited a particular sensitivity to a number of parameters likely to vary in experimentation. One of the new approaches, a revision of the Joreskog and Yang approach, proved both accurate and robust leading to the recommendation that it be the procedure of choice. xi PAGE 12 INTRODUCTION Measurement Error Measurement is the assignment of a quantitative value to a sample of behavior from a specified domain (Crocker and Algina, 1986). In practical situations, the value assigned to a particular sample of behavior is a composite of both the trait of interest and other random influences. For example, a student's score on a math test is due to the student's knowledge of math as well as such things as hours of sleep the previous night, illness, and many other factors. Spearman (1907) conceptualized this relationship in saying that an observed variable x is made up of two parts x=t + e, where t is the latent or "true" portion of the score and e is measurement error. If an average is taken over multiple samples from the same domain, the consistency this average reflects is the "true" portion of the score. That is, E{x) = t (2) because t is a constant influence, and E{e) = 0. (3) If it were possible to give the same student the same test repeatedly without exposure influencing the outcome, for example, the resulting average would reflect only the 1 PAGE 13 2 influence of the student's math knowledge. The proportion of observed variance made up of true score is often referred to as reliability. Measurement error is the remaining inconsistent part of the score. , Using the muhiple regression model, the value of a particular variable y is predicted as a function of several other variables. That is, A problem with this approach is that in multiple regression one estimates relationships among observed variables contaminated by measurement error and may not clearly reflect relationships among latent traits that underlie the observed variables. In the case of simple linear regression, the familiar correction for correlation coefficient attenuation, indicates that the effect of measurement error is to reduce the strength of association between two variables (Crocker & Algina, 1 986). In multiple regression, the effect of measurement error is more complex. Like the simple correlation, the multiple correlation is diminished when variables include measurement error. Similarly, a P, will be underestimated when xj is the only variable containing measurement error in a model. However, when all variables contain measurement error, (3 coefficients may be larger than, smaller than, or even a different sign than the corresponding coefficients in a model for variables measured without error (Bollen, 1989). Staiclural equation modeling affords a solution in providing a framework for evaluating relationships among latent traits or true scores. It does so by joining the Multiple Regression and Structural Equation Modeling y = a + P\X^+ p^Xi + PiX^ + ... + Pi^Xi^ + Â€ (4) (5) PAGE 14 3 methodologies of factor analysis and multiple regression. In factor analysis, a model we will call the measurement model for x variables x^H,^ Pin^\+S,,i=\,...J^ (6) Â• . _Â„ Â• . I--. . x,-M,=^A+^ni= h-i + i-Jk , (8) and also for y variables yj-Mi=^iri\+Â£jJ = (9) is used to decompose the observed scores x, for example, into latent variables (or common factors) ^ and residuals 5 which reflect measurement error. By grouping terms, one can clearly see Spearman's original formulation. For example, X,^,+d, = {X,^,) + d=t,+e,. (10) Combining the measurement models in equations 6-9 with the multiple regression model in equation 4 yields the structural equation model ;7 = ar + ;i,^, +... + 4^^ (11) allowing the prediction of values of t] as a function of 4i to . One limitation of structural equation modeling approach has been that it does not include a straightforward way to model the familiar idea of interaction among latent variables. Of course, for those using structural equation modeling the ability to evaluate PAGE 15 4 interaction is important as many theories predict such effects. Moreover, it is also of practical concern to researchers using multiple regression because products of observed variables often include more measurement error than do their component variables (Busemeyer & Jones, 1983). In fact, the reliability of the products of two observed variables is and reduces to the product of reliabilities of the component terms when xi and X2 are uncorrelated. This means, for example, that if xi and X2 have reliabilities of .7 and are uncorrelated, the reliability of xiX2 will be .49. Correlation between xi and X2 increases the reliability of the product. However, the reliability of X1X2 will always be less than the more reliable of xi and X2. Latent Variable Interaction Indicant Product Approaches Kenny and Judd (1984) devised a latent variable interaction model and used COS AN (McDonald, 1978) to compute maximum likelihood estimates of the model parameters. The structural equation in the model specified the relationship of one latent dependent variable to the latent independent variables ( ^| and ) and their product: The measurement models for the latent variables had the usual form: x,-iu,=X,^,+5,J=\....J (14) PAGE 16 5 Xj-^=^pA + S^,j = l+\...,J (15) yk-l^k=Kv^Â£,^k = \,...,K (16) The measurement model for the product of the latent variables was defined by taking products of indicator variables for the latent x variables, {x,-^i.){x,-^^) = ^,Xa^,^, +^,^A-^^P-^2S.^5^Â„ (17) so that the interaction loading is and the measurement error variance is Kenny and Judd simulated a set of data to provide an example for the use of their technique. In the model they used to simulate the data, Kenny and Judd replaced equation 1 3 by and eliminated equation 16. A single replication of 500 cases was generated. Estimates were similar to simulated values and therefore supported the technique (see Table 1). " However, the use of a single parameter set with only one replication of data means that nothing can be said about standard errors of structural coefficients, and therefore about hypothesis tests involving them. Subsequently Hayduk (1987) provided a demonstration of how to estimate the Kenny-Judd model by using the releases of LISREL available in ' 1987. PAGE 17 6 Table 1 Kenny and Judd Parameters and Estimates Parameter Simulated Value Estimated Value -.150 -.169 72 .350 .321 7i .700 .710 .160 .265 .600 .646 .700 .685 In 1995, Jaccard and Wan conducted an extensive study estimating the KennyJudd model using the non-linear constraints that can be implemented in LISREL8. In the study, Jaccard and Wan investigated the influence of sample size, reliability of indicator variables, method of estimation, proportion of variance associated with the interaction term, correlation between component variables of the interaction term, and multiple correlation on evaluation of latent variable interaction. A summary of these results is presented in Table 2. In the model they used to simulate data, = 1.0 in all conditions. They reported many of the expected findings. As expected, estimates for the interaction (X,) improved as sample size increased. Kenny and Judd also found the expected negative bias in estimation of using multiple regression and much more accurate estimates using structural equation modeling procedures that take measurement error into PAGE 18 Table 2 Performance of the KennyJudd Model in the Jaccard and Wan Study Regression SEM Condition' Mean RMSE Mean RMSE Mean Mean Proportion of Est.SE Est. SE Tests Rejected^ RMSE .00,175,.70 .00 .10 -.01 .14 .135 94.6 .05 .00,175,.90 .00 .10 .00 .10 .103 100.0 .04 .00,400,.70 .01 .06 .00 .09 .086 92.0 .06 nn Ann on .Ul .U / no .UU .U / .UO / yj.O AO .08 .05,175,.70 .51 .56 1.01 .46 .396 85.9 .50 .05,175,.90 .82 .33 1.00 .31 .297 95.8 .83 .05,400,.70 .54 .50 1.02 .30 .252 84.4 .85 .05,400,.90 .84 .25 1.02 .21 .194 90.4.94 .10.175,.70 .51 .53 1.02 .35 .287 82.4 .74 .10,175,.90 .83 .25 1.00 .22 .212 94.4 .98 .10,400,.70 .54 .49 1.02 .23 .187 80.9 .97 .10,400,.90 .84 .22 1.01 .15 .138 88.7 1.00 1 . Entries in the condition column are the proportion of variance accounted for by the interaction, sample size, and the reliability of x and y variables. 2. Probability of rejecting Ho: y^ = 0 PAGE 19 8 account. Estimation also improved when reliability increased. Manipulating the effect size of the interaction did not influence accuracy using structural equation modeling and effects for the other variables were unfortunately not reported. As 150 replications of simulated data were used, an estimate of the true standard error of was possible by calculating the standard deviation of the estimates. Because root mean squared error is comprised of both bias and standard deviation, the relatively unbiased estimates of using structural equation modeling allow a similar comparison between estimated standard errors and root mean squared error. Root mean squared error suggested that average standard errors tended to underestimate their true value overall. Bias in estimated standard errors for the interaction increased as effect size increased, increased as sample size increased, and decreased as reliability increased. Despite these . findings, observed Type One errors were at acceptable levels and power was a minimum of .85 for a sample size of 400. ' ; j Taking expectations of the left and right side of the equation 20, Joreskog and Yang (1996) noted that it was misspecified because the intercept will not necessarily be zero even though the model is formulated for variables in population mean centered form. More specifically, when and are bivariate normal the expectation of the right side of equation 13 is rAi' (21) where ^z),, is the covariance of 4, and . This is problematic as ti is specified to have an expectation of zero in the KennyJudd model. To correct this problem, Joreskog and PAGE 20 9 Yang reformulated the model to include intercepts and to use variables not deviated from their means: Tj= a + rA + 72^2+ + ^ (22) x, = T,+AJ^+SÂ„i=\,...,I (23) Xj=Tj+A^,^, + Sj,j^I + l,...,J (24) yk = '^k + \^ + Â£k'k = \,...,K (25) X,X,=(T,+A,^,+S,)iT,+^,2^2+^:) (26) Using a different set of simulated data than that used by Kenny and Judd ( 1 984), Joreskog and Yang (1996) and later Yang-Jonsson (1997) studied the precision of the estimates in this model and an alternative model in which only a single pair of observed variables was used an as indicator of (^.(^^ Â• Kermy and Judd (1984), equation 22 was replaced by y^a + r,<^,+r,j2 + rA'^2+C (27) and equation 25 was eliminated. Both studies evaluated the accuracy of estimated parameters in the models under various estimation procedures. Maximum likelihood provided more accurate estimates than either weighted least squares or weighted least squares using the augmented moment matrix. Estimates produced by maximum PAGE 21 10 likelihood estimation were most accurate when all pairs of observed variables were used as indicators of 4^, ^2 ^^^^ estimates produced by weighted least squares using the augmented moment matrix. Weighted least squares estimates were less accurate when all indicators of ^^(^2 ^^""^ used. Both studies suggest that using the three estimation procedures with the model described by equations 22 and 24, 26 and 27 and the alternative model, the Joreskog-Yang method provides estimates which are consistent. A subset of the results from the Yang-Jonsson (1997) study, for a sample size of 400, all pairs of observed variables, and maximum likelihood estimation are presented in Table 3. Average estimates of the standard error were smaller than the standard deviations of the estimates suggesting that Type I error rate should be a problem when maximum likelihood estimation is used. Table 3 Yang-Jonsson Parameters and Estimates Parameter Simulated Value Estimated Value SD of Estimate Mean Est.SE Mean Est.SE SD of Estimate a 1.0 1.000 .049 .039 .80 ri 0.2 .210 .103 .079 .77 0.4 .401 .079 .064 .81 Yi 0.7 .715 .167 .115 .69 .02 .010 .051 .040 .78 0.6 .600 .120 .079 .66 0.7 .695 .087 .060 .69 PAGE 22 11 Joreskog and Yang (1996) as well as Yang-Jonsson (1997) provided valuable information about how many pairs of observed variables should be used as well as which estimation method should be used. However, these studies were not designed to compare the resuhs of their method with results from the Kemiy-Judd model. The lack of a common set of simulated data across the studies means that this information provides little insight into whether the Joreskog and Yang model or the Kemiy and Judd model should be used. Joreskog and Yang (1996) and Yang-Jonsson (1997) set r, = = 0 in the model they used to simulate the data. Algina and Moulder (in press) allowed r, and r, to be non-zero but otherwise used the same parameter values used by Joreskog and Yang and Yang-Jonsson. Algina and Moulder showed that the maximum likelihood procedure frequently did not converge when the intercepts in equations 23 and 24 were non-zero. That is, the estimation procedure was unable to find an optimal solution for the system of equations. An alternative method was implemented that did not encounter convergence problems under the conditions cited above. This was done by revising the JoreskogYang procedure as follows: r^=a + Y,^,+YA^yM2+^ ^^^^ x,-//,=AÂ„^,+c^Â„/=l,...,/ (28) (30) PAGE 23 12 (31) (32) That is, deviation scores were used while allowing for a non-zero intercept in the structural equation model. Algina and Moulder compared estimates of the parameters in the revised model to estimates of the parameters in the Joreskog and Yang model under conditions in which the Joreskog and Yang model converged. Estimates were comparable and unbiased. Standard errors were similar using both approaches and were much smaller than the standard deviation of the estimates in both cases. Robust standard errors were also reported. These are standard errors calculated in a way that allows for non-normality (Chou, Bentler, & Satorra, 1991). Robust standard errors for the revised model proved nearer the standard deviation of the estimates for this model. However, robust standard errors for the JoreskogYang model were much larger than the standard deviation of the estimates using the Joreskog and Yang model. A summary of this comparison is presented for a sample size of 500 in Table 4. Two-Step Approaches Application of the previous methods is difficult because they require knowledge of matrix algebra and careful specification of non-linear constraints. Ping (1996) suggested a two-step approach that avoided the need to impose non-linear constraints on model parameters in equation 17. In the first step, one estimates the measurement models in equations 14 and 15. Estimates calculated in the first step are substituted into equations 18 and 19 to determine the measurement model for the latent product variable. PAGE 24 13 Table 4 Comparison of Parameters and Estimates for JoreskogYang Approaches Estimates Mean SE Mean Robust SE SD of Estimates SD of Estimates Paramete r Simulated Value Original Revised Original Revised Original Revised h -.15 .212 .213 .627 .627 1.582 0.818 n .35 .401 .400 .675 .675 1.588 0.875 .70 .706 .706 .787 .800 1.191 0.906 .16 .195 .194 .806 .803 1.153 0.930 .60 .594 .594 .775 .781 1.636 0.992 A4 .70 .705 .706 .875 .875 1.400 1.050 . These estimates for factor loadings and measurement error variances are treated as known parameters in a subsequent analysis of the structural equation model using equations 20 and 1 7. The principal rationale for this is that it avoids much of the complex programming entailed in the previous approaches and can be implemented in programs without the non-linear constraints feature of LISREL8. However, this method misspecifies the model in the second step by treating estimated parameters as known. The impact of this is not known. Ping (1996) used Kenny and Judd's (1984) parameter values to simulate data and estimate parameters in the resulting interaction model. Therefore, a comparison of the two methods can be made. Ping's (1996) study suggests that estimates of produced by PAGE 25 14 his method tend to underestimate the interaction effect (see Table 5). A comparison of estimates produced by Ping's procedure with those produced by Kenny and Judd's procedure suggests that the two sets of estimates are comparable with the latter being slightly more accurate. A reduction in accuracy may not be unexpected given the misspecification previously described. However, some minimal decline in accuracy may be acceptable given the gain in ease of use. The extent of this inaccuracy remains Tables Ping's Estimates of the KennyJudd Parameters Parameter Simulated Value Estimated Value h -.150 -.132 .350 .318 Yi .700 .666 .160 .237 .600 .599 .700 .737 difficult to evaluate as only one replication using a sample size of 500 was simulated which did not allow for a comparison of standard errors. It is possible that the two methods have quite different standard errors which would focus interest on a comparison of Type I error rate and power. Because Joreskog and Yang's ( 1 996) inclusion of means is compatible with the Ping approach, a two-step procedure for estimating the JoreskogYang model is also possible. Estunates calculated in the first step are substituted into equations 18 and 19 to PAGE 26 15 estimate the measurement model for the latent product variable. These estimates for factor loadings and measurement error variances are treated as known parameters in a subsequent analysis of the structural equation model using equations 27 and 17. Nothing is known about the accuracy of this procedure. However, removing the misspecification identified by Joreskog and Yang has the potential to make this approach superior to Ping's original formulation. Two-Stage Least Squares Approach Bollen (1996) suggested a two stage least squares approach for evaluating latent interaction. In this approach measurement models are defined such that each latent variable and its first observed variable are on the same scale. This is done by setting X equal to 1 for one observed variable (or indicator) for each latent variable as in equations 33, 35, and 37. (33) (34) ^/H -Mj =^2 + ^1 (35) (36) (37) PAGE 27 16 y,Hk^Kn + Â£k,k = 2,...,K (38) The structural equation model is the same as that used in the other deviation score models. Using the measurement models in Equations 3 1 -36, one can substitute values for r\ and the ^ variables in equation 34 so that (^'i -Mk)-S\= r , ((x, -M,)-S{) + y, ((X/Â„ -fij)-5,,,) (40) n,{{x, -M)-<^,)((X;Â„ -n^)-5,^,) + C which can be rearranged to form y\ =y\{xx -M,)+yM,x r 3(^1 -/",)+", (4i) where the residual u is a composite of the additional terms. Ordinary least squares regression should not be used to estimate Equation 41 because the predictors and the residual u are correlated. Assuming three observed variables per latent variable, one can use the observed variables not included in Equation 40 (i.e., X2, X3, X5, and xe), to predict values for x, , and x^x^ . x^M.=a + MxiMXi-jU,)+/3v(xs-Mj) + (42) + Ai(^2 M, Mj) + Ai(^2 M )(^5 -/",) + Ai (^3 >",)(^5 y",) x,-^, = a+ p,,{x, -M,) + fi2, (x, -//,) + fi,,{x, -ju^) + p,,{x, ^) (43) + ^4(^2 M,)ix, ) + P,,{x, ^Xh l^j) + ^4(^3 M, )(^5 ) + A4(^3->",)K-A) PAGE 28 17 (x, ju,){i4 -Mj)^oc + /?,( ,4,(^2 -M,)+/^o^) ('^3 + A(i4 )(^5 ) (44) + A(14)U6 +A(14)(^2 /^JUs -y";) + /^6( 14,(^2 -/",)( ^6 " ) + A( 14) (^3 M, )ixs Mj) + /?8( i4)(^3 M, )ixe Mj) The predicted values can be inserted into a regression equation resulting in: yi = Yx M ) + ^2 (^4 ->">) + Yt.{X\ /",)(J?4 /^v ) + " Â• (45) The predictor variables are no longer correlated with the residual, and estimatators of the ys in Equation 45 are consistent for the ys in Equation 39 (i.e., the mean approaches the parameter value and the standard error approaches zero as the sample size becomes indefinitely large). BoUen (1996) applied his method to data simulated by using Kenny and Judd's (1984) parameter values and reported y estimates similar to those obtained by Kenny and Judd with one replication of the data (see Table 6). However, comparison between procedures was again difficult since Kermy and Judd did not provide standard errors for the ys making any usefiil comparative statement regarding impact on hypothesis tests or confidence intervals impossible. Furthermore, standard errors provided by BoUen are suspect as there is no way to verify their relation with true values. Table 6 Bollen's Estimates of the KennyJudd Parameters Parameter Simulated Estimated Est.SE Value Value Yx -.150 -.160 .052 Y2 .350 .360 .054 Ys .700 .710 .053 PAGE 29 18 Statement of the Problem Comparisons of indicant product methods, two-step methods, and the two-stage least squares approach have been limited. Jaccard and Wan (1995) conducted an extensive simulation of the KennyJudd model. But, as their work preceded Joreskog and Yang (1996), Ping (1996), and Bollen (1996) a comparison to these methods was not possible. Ping used simulated data to compare his method to the Kenny-Judd method, implemented in COSAN and in LISREL7 (using Hayduk's method). However, Ping used one set of parameter values and generated only one replication of sample data. Joreskog and Yang (1996) and later Yang-Jonsson (1997) did not compare empirical results of their methods and the other methods. Algina and Moulder (1999) focused on a comparison of the JoreskogYang and the Revised JoreskogYang models using . maximum likelihood estimation. Therefore, the primary objective of the current study is to compare estimates produced by Kenny-Judd, JoreskogYang, and Revised JoreskogYang models; two-step estimation of the Kenny-Judd (i.e., Ping's procedure) and the JoreskogYang model ( a revised Ping procedure), and Bollen' s two stage least squares method for the latent interaction model. Of particular interest is how well the results of the two-step and two stage least squares procedures compare to the remaining procedures, which are more difficult to implement. PAGE 30 METHOD Data Simulation In simulating data for model comparisons, ensuring that relevant factors are manipulated and that levels of these factors are similar to those observed in research is of primary importance. To that end, five factors were manipulated using values typical of those observed in the practical literature. Effect Size Champoux and Peters (1987) and later Chaplin (1991) compiled proportions of variance reported in studies using multiple regression. They found that interaction accounts for average of between 3 and 8% of or', in multiple regression models. Since these proportions of variance were for observed variables, they are below those values which would be observed for error free variables and the values of were manipulated to account for 0%, 5%, and 10% of crj. Squared Multiple Correlation Based on a survey of all APA journal articles that were published in 1992 and reported multiple regression results, Jaccard and Wan found the median squared multiple correlation in these studies to be .30. The 75"" percentile was .50. Based on these results, data for 4} and were simulated so that the squared multiple correlation for a model 19 PAGE 31 20 including , '^1^2 ^ covariate (1^3) fell between .20 and .50. This was accomplished by having a model that includes and 1^3 accounted for 20 or 40% of (J^ . This squared multiple correlation is denoted by p\ where R stands for reduced model. Correlation Among Variables As described previously, correlation among variables has a strong influence on the reliability of the product between them. That is, increases in correlation among component variables are associated with decreased measurement error for the product term. Jaccard and Wan (1996) found that the median correlation among variables for studies using multiple regression in 1992 AP A journals was .20 and the 75"^ percentile was .40. Values of correlation between ^1 and ^2 were .20 or .40. Reliability of Observed Variables Because structural equation modeling works to distinguish true score from measurement error, the degree of reliability (or the degree to which measurement error is absent) is important in determining the best procedure. As reliabilities of .70 are widely viewed as minimal (cf. Litwin, 1995), low and high reliabilities of observed variables were .70 and .90. Sample Size Accuracy of estimation improves as sample size increases. In a previous study, Yang-Jonsson (1997) used samples sizes from 100 to 3,200 to evaluate the impact of sample size. While this approach provided valuable information in determining the accuracy of statistical estimation procedures, one is rarely in the position of using sample PAGE 32 21 sizes on the order of thousands in the social sciences. Jaccard and Wan (1996) reported median sample size to be 175 and 75"^ percentile to be 400 in the study of APA journals described previously. Therefore, sample sizes of 200 and 500 were used. Simulation Proper The design of the study was a 2 (sample size; N) x 2 (latent variable correlation; p,2 ) X 2 (squared multiple correlation; /?Â«) x 3 (proportion of variance associated with the interaction; aL,^,|j ) ^ 2 (level of reliability of the observed variables; /9Â„ .) completely crossed factorial design. PRELIS (Joreskog & Sorbom, 1989) was used to generate 250 replications for each of the 48 conditions. The model for generating the dependent latent variable (r\) was rj = a + rA + ^2^2 + rA +rA<^2 +C(46) Without loss of generality, the following specifications were made: r, =^2=0, (47) ^, =^22 = PAGE 33 22 In addition, <^3 was assumed to be uncorrelated with and . As a consequence, ^3 was uncorrelated with . Because the expected value of 4i was zero and its variance was one, it could be generated as a random normal deviate. The variable ^, was generated by using The variable z is a standard normal random deviate. With this calculation and the previous assumptions, terms associated with equation 46 were determined with the exception of and . With the specifications that = = ^11 ~ ^22 ~ ^33 ~ '^n~^ the variance of r\ in equation 46 is: . al^l = y';+yii\ + ;,) + ,^ l-'' (51) and the variance of the residual y/ is unknown. By virtue of the non-zero correlation between 1^3 and and the specfication y^ = y2-0, y\ = pI ' ' (52) W = P\Plc,iyi^ ' (53) and 2 _ n(i+^r,) (54) because g\ = 1 PAGE 34 23 To facilitate comparison of f 4 in conditions where f}^^^ ^^^^ > 0, it was convenient to set = \. This was accompHshed by dividing the left and right side of equation 46 by resulting in /3 = Pr P and v^A^-pI p p, The expected value of r\ was which simplifies to (56) (57) (58) When constructing structural equation models, one defines the meaning of a latent variable in the selection of indicators. It is common for latent variables to have three indicators, which was the number used in this research. Measurement models describing the twelve variables were defined as: (60) (61) ^3 =^3+^1^1 + ^3 (62) PAGE 35 24 ^4 =^4+^42^2 + '^4 (63) X,=T,+ ^52^2 + ^5 (64) ^6=7-6+ /l62<^2 + ^6 (65) ^7=77+^73^3+^^7 (66) ^8=78+^83^+^^8 (67) = r, +/l93^3 +^ (68) y\ =^y}+^U^ +^, (69) >'2 =7,, +>^2,/7 +^2 (70) >'3 = 7^3+'^3l'7 +^3 (71) where each X was defined as the square root of the reliability for the associated observed variable. The variance for the measurement model residuals (5s) for Equations 60 to 68 were ^.,=1-^? (72) =1-^7 (73) because each ^ had a variance of one. Residuals for Equations 69 to 71 were PAGE 36 25 0,. =1-/1: (74) when pI^^^^^^ = 0 and 0^ =\-Xl (75) (1 + ^,2) otherwise. Equation 75 is a result of the decision to make y, equal to one when > were 0. The parameter values produced in this way are presented in Table 7. A further decision which had to be made was how many pairs of indicators used to define the interaction. Joreskog and Yang (1996) noted that their model is identified with only a single pair of indicators for the interaction. However, the use of all Table 7 Simulation Parameter Values Pmc,y y I. lie,)'. X3 P,,.=.70 .10 .05 .00 .10 .05 .00 .40 .20 .40 .20 .40 .20 .40 .20 .40 .20 .40 .20 Pi2=.20 2.03961 1.44222 2.88444 2.03961 0.63246 0.44721 P,2-40 2.15407 1.52315 3.04631 2.15407 0.63246 0.44721 5.20 7.28 11.44 15.60 0.60 0.80 5.80 8.12 12.76 17.40 0.60 0.80 0.25944 0.25944 0.18345 0.18345 0.83666 0.83666 0.24565 0.24565 0.17370 0.17370 0.83666 0.83666 0.29417 0.29417 0.20801 0.20801 0.94868 0.94868 0.27854 0.27854 0.19696 0.19696 0.94868 0.94868 PAGE 37 26 indicators for the interaction satisfies the practical desire to use all available information. The primary concern is that each additional product indicator introduces non-normality which biases standard errors and chi-square model fit statistics (Chou. Bentler, & Satorra, 1991 ). Following Jaccard and Wan (1996), a model including four indicators for the interaction instead of the nine possible was used as a compromise between adequately defining variables and minimizing influences of non-normality. Parameter Recovery Estimation . The most common method of estimation in structural equation modeling is maximum likelihood estimation (ML; Kline, 1998). If the observed variables are multivariate normal, the maximum likelihood procedure has many desireable characteristics. Maximum likelihood estimators are consistent, efficient, and asymptotically normal. The maximum likelihood procedure allows a goodness of tit statistic which is the foundation of many goodness of fit indices. However, even when observed variables for both ^, and are normal, observed variables for as well as for Ti will not be normal. Boomsma (1983) reported that ML estimators of the parameters in structural equation modeling are consistent when the observed variables are not normally distributed. However, standard errors of parameter estimates tend to be too small and chi-square statistics of model fit are too large (Chou, Bentler, & Satorra, 1991; Yang-Jonsson, 1997). One solution to this problem is to use Browne's Asymptotic Distribution Free estimator (ADF; 1 984). Another approach is to use Weighted Least Squares using the augmented moment matrix. However, both of these approaches provide correct PAGE 38 27 estimates, standard errors, and chi-square statistics only with extremely large sample sizes (Chou, Bentler, & Satorra, 1991; Yang-Jonsson, 1997; Jaccard & Wan, 1996). As was previously suggested, sample sizes likely to be used will be relatively small in the social sciences, that is, between 200 and 500. Another approach to the problem of incorrect standard errors and chi-square statistics is to use the asymptotic covariance matrix provided by PRELIS (Joreskog & Sorbom, 1987) in association with maximum likelihood estimation to correct the estimated standard errors and chi-square fit statistic. These are referred to as robust standard errors and the Satorra-Bentler chi-square fit statistic respectively. Chou, Bentler, and Satorra (1991) compared this method to the ADF method using data with skew and/or kurtosis. They reported that the robust standard errors and and SatorraBentler chi-square fit statistics were comparable to or better than those of the ADF method with sample sizes of 200 and 400. In this study, uncorrected and corrected standard errors and chi-square tests are reported. Non-convereence . Pilot study suggested that estimation procedures would fail to converge in some proportion of the replications. Therefore, the methods were compared by using the results for the first 200 replications on which all estimation procedures converged. That is, results were removed for any replication on which any method fails to converge and for any replication beyond the first 200 on which all methods converge. This includes estimation of measurement models in the first step of the two stage procedures. Heywood cases (i.e., negative esfimated variances) and other improper estimates did not result in deletion of results for a replication. Despite the simulation of additional replications, higher rates of non-convergence than expected were observed for PAGE 39 28 the KennyJudd, Joreskog Yang, and Revised Joreskog-Yang methods. The criterion for convergence was changed from the default Â£=.0000001 to s=.0001 to increase rates of convergence. Comparison of Methods A number of approaches were used to evaluate estimates produced for each parameter and each method. Accuracy of estimates . Average estimates were compared to their parameter values in each of the 48 conditions. Accuracy of standard errors . For each method and each parameter, averaged estimated standard errors were computed in each of the 48 conditions. Since estimates will be available for 200 replications of each condition, the standard deviation of the estimates can be used as an estimate of the parameter value for the standard error (Babakus, Ferguson, &. Joreskog, 1987). This standard deviation is called an empirical standard error. To simplify interpretation, the average estimated standard error from the 200 replications was divided by the empirical standard error to obtain a measure of the proportional underor over-estimation of standard errors. Combined comparison of accuracy . To provide a convenient single value describing the degree to which each parameter is accurately estimated, mean squared errors were compared across the models for each parameter in each of the 48 conditions. Mean squared error is the average squared deviation of an estimate from its respective parameter value. This value reflects both parameter bias and standard error of parameter estimates. PAGE 40 29 Type 1 error rates and power . Operating characteristics for the hypothesis test of Y coefficients in the latent variable model were estimated by calculating the proportion of replications that were significant. Type 1 error rate was estimated when there is no interaction. Power was estimated when the interaction accounted for 5 or 1 0% of the variance. Fit statistics . The Non-Normed Fit Index (NNFI; Bentler & Bonnet, 1980), x' test of exact fit, Comparative Fit Index (CFI; Bentler, 1990) and standardized root mean squared residual were used to assess the degree of correspondence between the models and the data. As the simulation ensured that the models adequately fit the data, a consistent lack of fit for a particular method (e.g.. Ping's method) would suggest that the method poorly recovers the simulated effect. Comparisons of fit indices are also reported within each method. These provide a measure of the utility of fit indices for individual methods. Finally, each of the fit indices makes use of the asymptotic co variance matrix information when available. Therefore, the above comparisons of fit indices were repeated including the asymptotic covariance matrix. PAGE 41 RESULTS After data were simulated and the procedures under study were applied, the difference between the estimate and parameter value and the ratio of observed to empirical standard error were calculated for each parameter in each replication. Significance was also determined for each parameter as well as for fit statistics. Analysis of Variance (ANOVA) was applied to the 1 15,200 cases produced by the simulation (200 replications, 2 levels of p,^ , 2 levels of yo^ 3 levels of pl^^^^^ , 2 levels of p,, , 2 levels of N, 6 procedures, and 2 levels of asymptotic covariance matrix use) for all measures except for the difference between estimates and parameters. In that case, the use of the asymptotic covariance matrix was irrelevant. ANOVA was, therefore, applied to the 57,600 cases from the no asymptotic covariance matrix condition. The six procedures (PROCEDURE) and whether the asymptotic covariance matrix was used (MATRIX) were treated as within subjects effects. This was because all levels of PROCEDURE and MATRIX were applied to each data set. Other factors were treated as a between subjects effects. This resulted in a model with 7 main effects and 1 19 interactions. The combination of a large number of effects and a large sample size meant that significance was virtually assured for many effects and a large number were significant. Therefore, it was necessary to obtain a measure of influence for each of these effects to select those associated with a meaningful proportion of variance. Mean square 30 PAGE 42 31 components were calculated for each effect with negative components set to zero. These were summed and the ratio of each mean square component to the sum was used to gauge influence. Effects significant at the a=.01 level and that accounted for at least .5% of the mean square and variance component sum were investigated further. The denominator of the proportions described in the previous paragraph is not precisely the total variance because of the error terms in the analyses. These error terms were composites of two confounded variances. The expected values of the error terms followed the form where ap^ was the variance associated with the interaction of the repeated factor and REPLICATION, a/ was the residual error variance and k was the product of levels of the repeated measures factors not included in the error term. For example, when the error term was PROCEDURExREPLICATIONS nested in p,, , p\, p^^^^^^ , , and N, k was 2 which is the number of levels for MATRIX. When the error term was REPLICATIONS nested in p^^ , pi , pl^^^^^^ , , and N, k was 12 which is the product of the number of levels for MATRIX and PROCEDURE. Including the error terms in the sum would have meant that in some analyses ap^ would contribute up to 12 times making the sum somewhat overstated as a total of the mean squares and variance components. To avoid this, each error term was divided by k before the summation. This solution * understates the total of the mean squares and the variances by for each error term in the analysis. PAGE 43 32 Bias Analysis was limited to coefficients in the latent variable model 77 = Â« + r,^, + ^2^2 + +rA<^2 + ^ where yi and 72 had parameter values of zero and 74 had a parameter value of one. Parameter values for both 73 and v[/ were varied in order to manipulate f}^ and pi, Â• Therefore, biases associated with 73 and v|/ were evaluated as proportional changes from the parameter value to avoid influences of the parameter value on bias. Of the predictors in the latent variable model, changes in bias associated with manipulated factors were observed for 74. For 74, an interaction of PROCEDURE and pÂ„. was observed for bias that accounted for 1.1% of the mean square and variance component sum. The interaction indicated that estimates of 74 were essentially unbiased for the Joreskog-Yang, KennyJudd, and Revised Joreskog-Yang procedures. However, both the Ping and the Revised Ping procedures were associated with negative bias that was greater as measurement error increased (See Figure 1). Bollen's procedure was associated with positive bias for 74 when yOÂ„. =.70 and negative bias when =.90. A significant interaction of sample size, aL,;,^j > Pn ' Pa ' and /?Â„. was also observed for bias associated with 74 that accounted for 1 .5% of the mean square and variance component sum. However, further examination of this effect did not lead to an interpretation and the effect will not be discussed. Average bias for 74 by sample size, P^mcUi ' ^12 ' A ' A/' is presented in Appendix 1 . PAGE 44 33 0.1 0.075 0.05 0.025 0 -0.025 -0.05 -0.075 1 [T Joreskog Revised BoUen Ping Kenny Revised Ping Judd JY Figure 1 . Bias in Estimates of Gamma 4 There were no significant effects associated with bias for any other coefficient. Average bias was -.004431 foryi, .001151 for 72, .006923 for 73, and -3.5015 for\\i. Standard Error Ratios Gamma 4 A number of factors were associated with changes in standard error ratios for 74. These are presented in Table 8. A significant interaction of MATRIX, PROCEDURE, and p^^, was observed for standard error ratios associated with 74. Mean standard error ratios by these factors are presented in Figure 2. Robust standard errors for 74 for the Joreskog and Yang procedure overestimated true standard errors. Overestimation by the robust standard errors for 74 using the Joreskog and Yang procedure was more severe when observed variables were measured with lower reliability. Standard errors for 74 PAGE 45 34 produced by the Bollen procedure were nearest the true standard error. Robust standard errors produced by the Revised Ping procedure most underestimated true standard errors with a value of approximately .85 across reliability. True standard errors were underestimated using the remaining procedures with standard error ratios of approximately .95. Table 8 Proportions of Mean Square and Variance Component Sum for Standard Error Ratios Effect Proportion of Sum F P 0.008 72 .0001 0.008 78 .0001 0.007 59 .0001 Nx/7,2Xp^X/7Â„. 0.012 157 .0001 NxAU^,Xp,2X/7^XpÂ„, 0.015 69 .0001 MATRIX 0.006 3632 .0001 MATRIX X 0.006 1766 .0001 PROCEDURE 0.015 3220 .0001 PROCEDURE X 0.006 592 .0001 MATRIX X PROCEDURE 0.058 12701 .0001 MATRIX X PROCEDURE xN 0.005 598 .0001 MATRIX X PROCEDURE 0.006 609 .0001 MATRIX X PROCEDURE xAÂ„0.019 2091 .0001 Ordinary standard errors exhibited also changes associated with pÂ„. . In particular, ordinary standard errors produced by the Joreskog and Yang procedure for 74 underestimated the true standard errors when pÂ„' =-70 (See Figure 3). As p,. increased to .90, standard error ratios for the Joreskog and Yang procedure increased indicating the ordinary standard error became a bit more accurate than that for the Revised Ping approach and slightly less accurate than that observed for the other procedures. Ordinary PAGE 46 35 standard errors for the other procedures were more modestly improved by increases in /?Â„, . Estimated standard errors for 74 using Bollen's model were most accurate. Â•2 1.5 I CO o 0.5 a> v> o < m D. 3 OQ DO o_ 3 3 OQ c c/j' ft) o OQ D3 y. 7^ o_ 5' < ?r OQ nn 3 CL 1 c D. 5' OQ < CL Figure 2. Standard Error Ratios by Reliability for Gamma 4: Robust Standard Errors 1.2 Â•I Â•Â• w T3 I 0-8 0.9 0.7 O OQ Ri=-70 CO D <_ m 3 0. 5' OQ 3 OQ GO o OQ 7^ < So' m a. 3 OQ CO 3 3 OQ 7^ ft) c Cl D. fD < to' ft) Figure 3. Standard Error Ratios by Reliability for Gamma 4: Ordinary Standard Errors PAGE 47 36 Increases in overestimation of the true standard error by the robust standard error using the Joreskog and Yang model were associated with increased (See Figure 4). The Bollen model provided standard errors for 74 nearest the true standard error with a standard error ratio of approximately 1 .0. Standard errors produced by the Ping, KennyJudd, and Revised Joreskog Yang approaches had similar accuracy across levels of p]^ and very similar to one another with standard error ratios of .95. The Revised Ping approach produced the least accurate standard errors with robust standard error ratios of less than .90. Increases in pÂ« were associated with slight improvement in ordinary standard errors produced by all procedures except Bollen's (See Figure 5). 1.6 Joreskog Revised Bollen Ping Kenny Revised Ping Judd JY Figure 4. Standard Error Ratios by Multiple Correlation for Gamma 4: Robust Standard Errors An interaction of MATRIX, PROCEDURE, and sample size suggested that standard error ratios were related to sample size and type of standard error differently PAGE 48 37 1.1 Joreskog Revised Bollen Ping KennyRevised Ping Judd JY Figure 5. Standard Error Ratios by Multiple Correlation for Gamma 4: Ordinary Standard Errors across methods. In particular, the overestimation by the robust standard errors for 74 using the Joreskog and Yang approach was reduced with increased sample size. Robust standard errors also improved with increased sample size for the Ping, Revised Ping, Kenny-Judd, and Revised JoreskogYang models. Bollen's approach produced accurate standard errors for 74 with a sample size of 1 75 and this did not change with increased sample size (See Figure 6). Similarly, increased sample size was associated with slightly more accurate ordinary standard errors (See Figure 7). Average standard error ratios by 2 2 PincUi ' P\i ' A' and are presented in Appendix 2 as this effect was not interpretable. Other effects in the analysis are not discussed as higher level interactions involving the same variables are present. PAGE 49 38 Figure 7. Standard Error Ratios by Sample Size for Gamma 4: Ordinary Standard Errors PAGE 50 39 PÂ§i A number of factors were associated with changes in standard error ratios for vj; and accounted for at least .5% of the mean square and variance component sum. These are presented in Table 9. An interaction of MATRIX, PROCEDURE, AL.^,f, ' ^^'^ P,r was observed for This was due to changes in standard error ratios associated with use of the asymptotic covariance matrix. For the Joreskog Yang procedure, robust standard errors were too largean effect that was exacerbated by low levels of reliability (See Figure 8). Standard error ratios for the remaining procedures were similar to one another. Table 9 Proportions of Mean Square and Variance Component Sum for \\i Standard Error Ratios Effect Proportion of Sum F P 0.007 174 .0001 0.008 103 .0001 Ar 0.007 1041 .0001 0.007 87 .0001 MATRIX 0.017 7972 .0001 MATRIX X 0.011 2587 .0001 PROCEDURE 0.054 13851 .0001 PROCEDURE xp,^^ ^^^^ 0.005 437 .0001 PROCEDURExNxp^, ^P\ 0.006 62 .0001 PROCEDURE X p. . 0.026 3416 .0001 PROCEDURE xp,^^^xpÂ„ 0.006 239 .0001 PROCEDURE X N X p,^ . , ""Pn xp;^xpÂ„, 0.006 68 .0001 PROCEDURE xp,^^^xp. i^'pI XA/ 0.005 60 .0001 PROCEDURE X N X ^Pm ^p\^PÂ„' 0.011 60 .0001 MATRIX X PROCEDURE 0.103 13403 .0001 MATRIX X PROCEDURE x p,,, 0.056 3637 .0001 MATRIX X PROCEDURE x 0.008 171 .0001 PAGE 51 40 For the ordinary standard error, standard error ratios for the Joreskog and Yang procedure were quite accurate and comparable to those for the rest of the procedures (See Figure 9). In fact, all procedures were associated with very accurate ordinary standard errors when p,,, =.70. When = .90, standard errors for the KennyJudd model were underestimated. Standard error ratios for the other models approached one. A significant interaction of PROCEDURE xNx pl_^^^^ x p,^ x p\x p,, was observed that was not interpretable. Means by these factors are presented in Appendix 3. Other effects are not discussed due to higher level interactions involving the same variables. Gamma 1 and Gamma 2 Both yi and y2 were zero and and ^2 had the same statistical relationship to ^3 and 4i^2. Thus one would expect results on standard errors to be very similar for y, and y2. This expectation was confirmed by the results. Therefore, results for yi and y2 PAGE 52 41 n o OQ 73 T3 75 7S 5 5' n> m S. OQ (A vise v; 0. tÂ— 1 Pin C <Â— 1 dd OQ o 1 n o OQ ?3 < 3 OQ 3 OQ 7^ (T) CL CL ft) <_ ft CL < Figure 9. Standard Error Ratios by Reliability for Psi: Ordinary Standard Errors standard error ratios will be described together. A significant effect of MATRIX was observed for yi and 72 standard error ratios that accounted for 9.3% of the mean square and variance component sum. Significant effects of PROCEDURE were also observed for Yi and 72 that accounted for 28 and 29% of the mean square and variance component sum, respectively. The interaction of PROCEDURE and MATRIX was significant and accounted for 56% of the sum for both yi andy2. A significant interaction of MATRIX, PROCEDURE, and yo^,, was observed for standard error ratios associated with yi which accounted for 1.1% of the mean square and variance component sum. Mean robust standard error ratios for y, by p... and PROCEDURE are presented in Figure 10. As the interaction of MATRIX, PROCEDURE, and did not account for 1% of the mean square and variance component sum for y2, mean robust standard error ratios for ya by PROCEDURE are presented in Figure 11. For the robust standard error, the standard error ratios for y, and Y2 using the Joreskog and Yang procedure were much too large. As PAGE 53 42 indicated by Figures 10 and 1 1, the size of the ratios for the Joreskog-Yang procedures hides differences among the other procedures. Standard error ratios for the other procedures are re-presented in Figures 12 and 13. Across procedures, increases in reliability were associated with larger standard error ratios indicating that robust standard errors become more accurate. However, increased standard error ratios associated with increased p^,. for the Joreskog Yang model meant more serious overestimation of standard errors for yi when the robust standard error was used. Bollen's procedure more accurately estimated the true standard error than did robust standard errors for the other procedures. It should be noted that Bollen's procedure cannot make use of the asymptotic covariance matrix and, so, results are the same in both MATRIX conditions throughout this study. Standard error ratios for the remaining procedures were comparable and indicated that robust standard errors tended to underestimate the true standard errors by approximately .05 (See Figures 12 and 13). Â•2 30 CO o w 20 I 10 -Â«-Â» 3 p.. =-70 II rv-^ mm rm^ ?"3 CjO 15 ^ ;?o o n n a. 0 OQ P, =.90 /O tj2 y_ ^ "po ft) o OQ <_ 1/5' ft 2. 3 fT ^ ft ft g: ^ Figure 10. Standard Error Ratios by Reliability for Gamma 1-Expanded View: Robust Standard Errors PAGE 54 43 14 I / (Vl 1 r\ 10 i w 8 S3 6 CO 4 2 I 1 Joreskog Revised Ping Bollen Ping KennyRevised JY Judd Figure 1 1 . Standard Error Ratios for Gamma 2--Expanded View: Robust Standard Errors o a; o t w 1.1 B 0.9 00 o Pi 0.8 3 OQ D3 3 3 I ?0 55" n < ft 3 OQ CO ?r 3 3 era 7^ fD 3 3 I 'Â—I C a. D. 70 n ft Cu 'Â—1 Figure 12. Standard Error Ratios by Reliability for Gamma 1 : Robust Standard Errors PAGE 55 44 1 J I 0.98 I 1 w 0.96 I I 0.94 I 1 p 1 I I 00 I 0.92 o 0.9 1 1 -J 1 1 J \ 1 1 1 1 Revised Ping Bollen Ping KennyJudd Revised JY Figure 13. Standard Error Ratios for Gamma 2: Robust Standard Errors Results for ordinary standard error ratios were somewhat different than those for robust standard errors. Bollen's procedure continued to provide the most accurate standard errors overall. However, the Joreskog and Yang procedure provided ordinary standard errors for yi that were indistinguishable from those provided by the Revised Ping, KennyJudd, or the Revised JoreskogYang procedures when =.70 or =.90 (See Figure 14). Ordinary standard errors provided by the Joreskog and Yang procedure were slightly too large for 72 (See Figure 15). Differences between ordinary and robust standard error ratios for yi and y2 tended to be less than .01 across levels of pl^.^^^^^ . A priori expectations that ordinary standard errors produced by maximum likelihood estimation should be underestimated in the presence of non-normality suggested differences in the accuracy of robust and ordinary standard errors as p^^.^^^^ changes as robust standard errors are designed to take PAGE 56 45 1.1 Pi, =.70 Pii' = 90 Â•s o e at -*-Â» 0.9 0.8 O 70 ft < 55' Q. 3 OQ CD ~0 9B' n 7^ n> 3 3 v; I (Â— . c a. n < ft Q. O OQ 73 n < Q. 2? 5' OQ CD -0 2. 5' ST W 7: ft 3 3 I <Â— C D. Cl 73 ft < So" ft D. Figure 14. Standard Error Ratios by Reliability for Gamma 1: Ordinary Standard Errors l-c o C3 u 1.06 1.04 1.02 1 0.98 0.96 0.94 0.92 0.9 Joreskog Revised Bollen Ping KennyRevised J Y Ping Judd Figure 15. Standard Error Ratios for Gamma 2: Ordinary Standard Errors non-normality into account. Means of standard error ratios for robust and ordinary standard errors by /?,^Â„^. ^^^^ were calculated excluding the Joreskog Yang method. These are presented in Figure 16. Ordinary standard errors were more accurate than robust PAGE 57 46 standard errors when aL,^,,^^ =-00' the two were equivalent at p]Â„^ ^^^^ =.05, and robust standard errors tended to be nearer to true standard errors when p]^^ ^^^^ =. 1 0. The pattern was similar for 72 (See Figure 17). Ordinary standard errors for the Joreskog-Yang model were similar across >o,L,^,fj to ordinary standard errors produced by the other procedures. 0.985 7Â— 1 0.95 'J 0 0.05 0.1 Figure 16. Standard Error Ratios by pi,.,, for Gamma 1 1 0.92 0 0.05 0.1 Figure 17. Standard Error Ratios by p^.^^.^ for Gamma 2 PAGE 58 47 Gamma 3 Effects significant at the a=.01 level that accounted for at least .5% of the mean square and variance component sum for 73 standard error ratios are presented in Table 10. Significant effects were subsumed by the interaction of MATRIX, PROCEDURE, pÂ„. , and that was observed for standard error ratios associated with 73. Comparing robust standard errors across methods revealed that the robust standard error again overestimated the true standard error for 73 using the Joreskog and Yang procedure. The overestimation of robust standard errors using the Joreskog and Yang procedure was more serious as p,,. increased (See Figure 18). Overestimation of standard errors using the Joreskog Yang method obscured differences among the other methods. Therefore, the other methods are re-presented in Figure 19. Bollen's procedure compared favorably with the other methods providing standard error ratios approaching one. Results for the other procedures were similar to one another and indicated that robust standard errors slightly underestimated the true standard errors. When =.70 and robust standard errors were calculated, all procedures produced larger standard error ratios as increased. Bollen's procedure produced more accurate standard errors than did the other methods when ordinary standard errors for 73 were calculated. Joreskog and Yang's procedure provided ordinary standard errors that substantially underestimated true standard errors for 73 so long as the interaction accounted for some proportion of the variance in the latent variable model (See Figure 20). The Joreskog and Yang procedure PAGE 59 also underestimated the true standard error for 73 when ^^^^ .00 and pÂ„. -.70. Procedures other than Joreskog Yang are re-presented on a smaller scale in Figure 21 . Table 10 Proportions of Mean Square and Variance Component Sum for Standard Error Ratios Effect Proportion of Sum F P 2 0.017 31252 0.0001 MATRIX 0.049 146304 0.0001 MATRIX xp,^^^ 0.031 30542 0.0001 PROCEDURE 0.128 120222 0.0001 PROCEDURE xp,^^^ 0.106 33248 0.0001 PROCEDUREx p,.. 0.013 6087 0.0001 PROCEDURE xpi^^^^x pÂ„ 0.010 1617 0.0001 MATRIX X PROCEDURE 0.299 148068 0.0000 MATRIX X PROCEDURE x pl^.^^^^ 0.186 30660 0.0000 MATRIX X PROCEDURE x pÂ„, 0.024 6041 0.0000 MATRIX X PROCEDURE x pl^^^^^^ x /?Â„, 0.019 1609 0.0000 I 35 S XUJ T3 Xo C 1/5 3 o 30 25 20 15 10 5 0 Reliability = .70 1 Reliability = .90 1 o 73 n <_ n a. 3 OQ 03 "V ST ''^ o 3 7^ n s 3 v; I c D. a. 73 n < n a. <Â— < 11 0.00 0.05 0.10 O 05 70 n < on" o. 3 00 CD -a 2. 5' ST ^ n 7: ft 3 c a. a. ?3 n < Q. Figure 1 8. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors PAGE 60 49 1.1 T3 C as ^ 0.9 C/J X) o 0.8 LJ-i Reliability = .70 Reliability = .90 Jl n n 72 13 7^ ?3 73 n o_ 5' n m n vis fT cro nil vis vis 3 n> a. a. c Pin dd Â•< TO 0.00 0.05 0.10 Figure 19. Standard Error Ratios by Reliability for Gamma 3: Robust Standard Errors 1.5 1 i u M 0-5 00 Reliability = .70 Id I Reliability = .90 ^2 rmi 0.00 0.05 0.10 tn o TO ?3 S OQ 03 2. 3' ft ^ 75 po no c fD t/3 ST O TO CO 7n 3 TO ii. 3 fH 3 < 3 t/3 Cl C CL CL Figure 20. Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors PAGE 61 50 1.1 1 C/3 0.9 n CL 3 00 Reliability = .70 03 -0 ft 7^ n c a. a. D. Reliability = .90 CL I to ?r 3 3 (TQ 7^ c 73 < CL Figure 21 . Standard Error Ratios by Reliability for Gamma 3: Ordinary Standard Errors Ordinary standard errors for other procedures were comparable to one another and slightly underestimated true standard errors when pÂ„, =.70. With /?Â„. =.90, the KennyJudd procedure produced ordinary standard errors that underestimated the true standard errors more than other procedures when pi,_^^^^ was .00 or . 1 0. When pÂ„. =.70, standard error ratios for ordinary standard errors increased as p^Â„^ ^^^^ increased for all procedures. A comparison of robust and ordinary standard error ratios for 73 revealed that relative accuracy was not dependent upon aL,^,#j as it was for yi and 72. Instead, ordinary standard error ratios were uniformly nearer to one than were robust standard error ratios across pl,^^^^^ (See Figure 22). Other effects in the analysis are not discussed as higher level interactions involving the same variables are present. PAGE 62 51 0.05 0.1 Figure 22. Standard Error Ratios by ^^^^ for Gamma 3 Mean Squared Error Gamma 4 A significant interaction of sample size, pl,^^^^ , and was observed for mean squared errors for 74 that accounted for 1 .2% of the mean square and variance component sum. Average mean squared errors by sample size, , and p,,, are presented in Figure 23. Mean squared errors were largest in the aL,^,^, = 05 condition. Mean squared errors were also larger when N=l 75 compared to the N=400 condition. Reliability influenced mean squared errors for 74 with larger mean squared errors as reliability diminished. The pattern of means across levels of pl, .^.^ did not change with changing sample sizes or with changing levels of p,,, . However, differences between levels of pIcu, were larger with reduced sample sizes and also with lower reliability. A PAGE 63 52 significant interaction of PROCEDURE and /?Â„. was observed for mean squared errors for Y4 that accounted for .6% of the mean square and variance component sum. Bollen's procedure had the largest means squared error when =.70. Ping's procedure had the smallest mean squared error overall. Mean squared errors were similar for the other procedures (See Figure 24). 0.7 0.9 0.7 0.9 Reliability Reliability Figure 23. Mean Squared Errors for Gamma 4 by Sample Size 0.14 0.12 0.1 0.08 0.06 0.04 0.02 0 IlilbliTliI ii i Pa10.7 10.9 00 3 < Co' CO o_ FT 3 3 era = 3 Cl. Figure 24. Mean Squared Errors for Gamma 4 by Procedure PAGE 64 53 An interaction of PROCEDURE, sample size, p,;,,^,^^ , Pn , and p\ was observed for mean squared error for y that accounted for 1% of tlie mean square and variance component sum. This effect was not interpretable and average mean squared error by sample size, pi^,^^ , A2 ' and p\ for v|/ are presented in Appendix 4. Gamma 1 , 2, and 3 The effect of the factors on mean squared errors for y 1 , 72, and 73 were very similar and will be reported together A significant interaction of sample size and pl,^^^.^ was observed for mean squared errors accounting for 4, 5, and 4% of the mean square and variance component sums for yi, yi, and ys respectively. Average mean squared errors by sample size and pl^ ^^^^ for yi and y2 are presented in Figure 25. Mean squared errors were largest in the pl^ ^^^^ =.05 condition. Mean squared errors were also larger when N=175 compared to the N=400 condition. These effects are mirrored for ys in Figure 26. 0.5 Gamma 1 Gamma 2 0.4 0.1 0.2 0.3 0 0.00 0.05 0.10 N=175 N=400 N=175 N=400 Figure 25. Mean Squared Errors for Gamma 1 and Gamma 2 PAGE 65 54 0.4 N=175 N-400 Figure 26. Mean Squared Errors for Gamma 3 Type 1 Error Rate and Power When evaluating rates of significance. Type 1 error rate is relevant yi and 72. Â• When pl^-^^^r^ = .00, Type 1 error rate is also relevant for 74. Power is always relevant for Y3 and it is relevant for 74 when pl^ ^^^^ > .00. Gamma 4 Type 1 error rate . Effects for the Type 1 error rate when testing Ho: 74 = 0 that were significant and associated with at least .5% of the mean square and variance component sum are presented in Table 1 1 . An interaction of MATRIX and PROCEDURE was observed for 74 Type 1 errors. Use of the asymptotic covariance matrix influenced Type 1 error rate for 74 differently depending upon which method was used. Using the Revised Ping, Ping, Kenny-Judd, or Revised Joreskog-Yang approaches, a greater Type 1 error rate was associated with robust standard errors (See Figure 27). PAGE 66 55 Using the Joreskog and Yang approach. Type 1 error rate was greater when using ordinary standard errors. The MATRIXx PROCEDURExN x p^^ x p]^y. p^^. term was not interpretable. Type 1 error rate by these variables are presented in Appendix 5. Table 11 Proportions of Mean Square and Variance Component Sum for 7 4 Type 1 Error Rate Effect Proportion of Sum F P PROCEDURE 0.010 28 .0001 MATRIXx PROCEDURE 0.030 111 .0001 MATRIXx PROCEDURExN x p,, x p', x .0111 4 .0030 MATRIX Robust Ordinary Figure 27. Type 1 Error Rate for Gamma 4 by MATRIX Power . Several effects were significantly associated with significance for and associated with at least .5% of the mean square and variance component sum. These are presented in Table 12. An interaction of PROCEDURE, MATRIX, sample size, and pÂ„. was observed for . Using robust standard errors, power for the test Ho: y^ =0 was PAGE 67 56 nearly 100% for all methods when =.90 (See Figure 28). Power for hypothesis tests of ^4 for both BoUen and Kenny -Judd's methods were reduced by approximately 5% when pÂ„. =.70. Power was reduced for all procedures when N= 175. However, the reduction in power associated with small sample size was more serious for the JoreskogYang and BoUen procedures. When N=l 75, power for all procedures was somewhat reduced when =.70. However, the Joreskog-Yang and Bollen procedures were much more strongly effected than were the other procedures. Table 12 Proportions of Mean Square and Variance Component Sum for 7 4 Power Effect Proportion of ^1 im 0 Ulil r P N 0.153 677 .0001 2 0.099 438 .0001 0.138 307 .0001 Pn0.039 173 .0001 N X 0.045 100 .0001 0.021 47 .0001 0.014 16 .0001 PROCEDURE 0.019 348 .0001 PROCEDURE X N 0.019 172 .0001 PROCEDURE xp^ r mc 2 0.007 61 .0001 PROCEDURE X 0.017 152 .0001 PROCEDURE X N X pÂ„. 0.011 52 .0001 MATRIX X PROCEDURE 0.024 701 .0001 MATRIX X PROCEDURE x N 0.037 534 .0001 MATRIX X PROCEDURE X 0.006 84 .0001 MATRIX X PROCEDURE x N x 0.005 40 .0001 MATRIX X PROCEDURE x pÂ„, 0.020 296 .0001 MATRIX X PROCEDURE x N x /j,. 0.026 189 .0001 PAGE 68 57 Revised Ping Joreskog O 00 o o o o o I I c c3 -Â» I o O Â•a I o PL, 00 PAGE 69 58 Using ordinary standard errors, nearly 1 00% power was observed for all procedures when N-400 (See Figure 29). The single exception was a reduction of 5% for the Bollen procedure when p, =.70. When N=175, a reduction in power was observed in general. Reduced reliability was more problematic for POwer using the Bollen method which exhibited a large reduction in power when =.70. An interaction of MATRIX, PROCEDURE, p^,.^,^^ , and N was observed for y, Type 1 error rate. For robust standard errors and N=400, power was very high for all procedures. Lower power was observed for y, with the Bollen and Joreskog Yang approaches when pl,.^.^_ =.05 (See Figure 30). When N=175, power was reduced for all methods. However, decreases in power for y^ using the Bollen and Joreskog Yang procedures were larger than with the other procedures. Using ordinary standard errors, power for y^ was similar to what was observed using robust standard errors. However, power for y^ using the Joreskog Yang procedure no longer differed from the other procedures and only Bollen' s method was associated with reduced power for tests of y^ (See Figure 31). ... Excluding the Joreskog Yang procedure, power for y^ was very similar using robust and ordinary standard errors. Only the N=l 75, /?Â„.=.70 condition were differences observed. In this condition, using robust standard errors resulted in 1% more rejected models compared to using ordinary standard errors. Other effects in the analysis are not discussed as higher level interactions involving the same variables were present. PAGE 70 59 o o Revised JY Kenny-Judd Ping Bollen Revised Ping Joreskog Revised JY Kenny-Judd Ping Bollen Revised Ping Joreskog Â•9 on T3 o ON 00 o o o o o papafo-y sppopM JO uoiyodojj Â«2 Z On PAGE 71 60 Â«; o " , fe' o d 00 papafa^ sjapoj/^ jo uoijjodojj Revised JY Kenny-Judd Ping BoUen Revised Ping Joreskog Revised JY Kenny-Judd Ping Bollen Revised Ping Joreskog X) o d m 3 00 PAGE 72 61 I o ^ 1 1 : o o Revised JY Kenny-Judd Ping Bollen Revised Ping Joreskog Revised JY Kenny-Judd Ping Bollen Revised Ping Joreskog 00 o o o o papafo^ sjapojAI JO uoiyodoaj o W C/3 O 1 Â£ 3 PAGE 73 62 Gamma 1 Five effects were significantly associated with the proportion of replications for which Ho: yi = 0 was rejected and associated with at least .5% of the mean square and variance component sum. These are presented in Table 1 3. As the parameter value for yi was zero, the proportion of replications for which the Ho: yi = 0 was rejected estimates the Type 1 error rate for y i . A significant interaction of MATRIX and PROCEDURE was observed for the proportion of hypothesis tests rejected for y i . Table 13 Proportions of Mean Square and Variance Component Sum for y i Tvpe 1 Error Rate Proportion of Effect ] Sum F p _ N X aI,^,^, X Pu X Pr 0.071 8 .0040 PROCEDURE 0.012 101 .0001 MATRIX X PROCEDURE 0.025 296 .0001 MATRIX X PROCEDURE x N x pl^^^^ x p, xpl q.OI 0 6 .0001 Using robust standard errors the Joreskog and Yang procedure had a near zero Type one error rate. Proportions of models in which Ho: yi=0 was rejected for other procedures were very similar at approximately .06 when robust standard errors were used. Bollen's procedure was associated with slightly fewer Type 1 errors (See Figure 32). Using ordinary standard errors, Type 1 error rates for yi were similar across procedures at about 6%. Bollen's procedure was associated with slightly fewer Type 1 errors for yi than the other procedures. The interaction MATRIX x PROCEDURE x N x aL^,^2 X A2 X yo^ was also significant but not interpretable. Type 1 error rate by these PAGE 74 63 variables is presented in Appendix 6. The interaction N x p^.^^^.^ x p,, x pÂ„ is not discussed as it is a subset of this higher order interaction. i . * 73 1) 0 0.1 u Â•57 % 0.075 1 0.05 Â° 0.025 0 E Robust ' Ordinary ni CD 35 2S' 70 n vis n> a. e PP Figure 32. Type 1 Error Rate Using Robust and Ordinary Standard Errors for Gamma 1 Gamma 2 ' Five factors were associated with whether the hypothesis test Ho: 72=0 was significant and were associated with at least .5% of the mean square and variance component sum. These are presented in Table 14. The significant interaction of MATRIX, PROCEDURE, and p^,. showed the sensitivity of the Joreskog and Yang procedure to robust standard errors and changes in /?Â„. . Due to overestimated standard errors. Type 1 error rate was very low for the Joreskog and Yang procedure using robust Standard errors. Rejections for other procedures using robust standard errors were about .06. Bollen's procedure had slightly fewer Type 1 errors (See Figure 33). Increased pÂ„. PAGE 75 was associated with increased Type 1 error rate for all but the Joreskog and Yang procedure. Table 14 Proportions of Mean Square and Variance Component Sum for y? Type 1 Error Rate Effect Proportion of Sum F P PROCEDURE 0.019 138 .0001 PROCEDURE X N X pl^^^^^ x p,, x pi 0.009 4 .0022 PROCEDURE X N X p^^^^^^ x p,^ x x pÂ„, 0.015 3 .0052 MATRIX X PROCEDURE 0.020 241 .0001 MATRIX X PROCEDURE x p,,, 0.007 41 .0001 H u u '57 u T3 O 0.08 0.07 0.06 0.05 I Robust Ordinary 2 0.04 o 0.03 4 .1 0.02 o 0.01 0 n O OQ Pi.0.7 0.9 73 n < n Cl 3 OQ 03 rT 3 3 (JO 7^ n 3 3 c D. O. 73 n < (7)' n a. n O era 73 n < n o. CD -a o 5n 3 7Z n 3 3 c D. a. 73 n < a. Figure 33. Type 1 Error Rate by MATRIX and p^^. for Gamma 2 Using ordinary standard errors, the Joreskog and Yang procedure had the fewest Type One errors for 72 at about 3% when /?Â„, =.70. However, this increased to 7% when PAGE 76 65 p^,, =.90. A Type 1 error rate of about 6% was observed for the Ping, Kenny Judd, Revised Ping, and Revised Joreskog-Yang procedures for 72. BoUen's procedure was associated with slightly fewer Type 1 errors for test of Ho: 72 = 0 at about 5%. Type 1 error rate for these procedures increased by about 1% with increased pÂ„. . The PROCEDURE X N X p]Â„^^^^^ x yO^ x PÂ«x P,,' interaction was not interpretable and is presented in Appendix 7. Other effects are not discussed as higher level interactions are present. Gamma 3 Several factors were significant in predicting power for testing Ho: 73 = 0 and accounted for at least .5% of the mean square and variance component sum. These are presented in Table 15. An interaction of MATRIX, PROCEDURE, N, p]Â„^^^^^^ , and p] suggested low power to detect covariate effects using robust standard errors and the Joreskog Yang model. For the Joreskog Yang model, power improved for hypothesis tests of Y3 when reliability for the observed variables was low or when sample size was large (See Figure 34). Power was also greater in the pl^ ^^^^ =.05 condition. However, power was always poor using robust standard errors and the Joreskog-Yang method. Power was greater than 98% for other procedures irrespective of MATRIX. Other effects are not discussed as higher level interactions are present. Fit Statistics Chi-squared Test of Exact Fit Seven effects were significantly associated with the Exact Fit statistic and were associated with at least .5% of the mean square and variance component sum. These are PAGE 77 66 Table 15 Proportions of Mean Square and Variance Component Sum for yi Power Effect Proportion of Sum F P MATRIX PROCEDURE ' .: , 0.083 0.251 170218 167138 .0001 .0001 PROCEDURE X pI^^^^ 0.006 1455 .0001 PROCEDURE X N X p,Â„c 0.005 565 .0001 PROCEDURE X x 0.006 617 .0001 MATRIX X PROCEDURE 0.501 170218 .0001 MATRIX X PROCEDURE x N 0.005 927 .0001 MATRIX X PROCEDURE x p^^^^^ 0.013 1484 .0001 MATRIX X PROCEDURE x N x pi 0.010 574 .0001 MATRIX X PROCEDURExp^ 0.006 1054 .0001 MATRIX X PROCEDUREx p^^^^^ x A 0.011 627 .0001 MATRIX X PROCEDURE x N x pi X 0.007 200 .0001 Â•a O 4> Â•57 CO 13 o o c o E o & o l-l 0H 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 \ 0 N=175 N=400 0.2 0.4 0.2 Multiple Correlation 0.4 0.05 0.1 Figure 34. Gamma 3 Power for the Joreskog-Yang Model and Robust Standard Errors PAGE 78 67 presented in Table 16. An interaction of , sample size, and PROCEDURE was observed for the % test of Exact Fit statistic which accounted for 5.2% of the mean square and variance components sum. Chi-squared statistics evaluating Ping's model were much larger than those evaluating other procedures with a sample size of 1 75 (See Figure 35). When N=400 chi-squared statistics increased compared to when N=175 for the Ping procedure. In both cases, increased pÂ„. was associated with increased y^. However the effect of increased /?Â„. was much more evident in the N=400 condition using the Ping procedure. The Kenny-Judd, Joreskog Yang and Revised Joreskog Yang procedures were similar to one another with chi-squared statistics slightly decreasing with increased sample size. The Revised Ping approach had a slightly increased with increased sample size. None were clearly influenced by changes in yo,,. . Other effects are not discussed as they are lower level interactions subsumed by the PROCEDURE X N X yoÂ„. interaction. Table 16 Proportions of Mean Square and Variance Component Sum for y Effect Proportion of Sum F P N 0.028 23095 .0001 0.030 24675 .0001 0.008 3599 .0001 PROCEDURE 0.545 676980 .0001 PROCEDURE xN 0.162 100784 .0001 PROCEDURE X p,,, 0.170 105674 .0001 PROCEDURE X N X p.,. 0.052 16240 .0001 PAGE 79 68 I u 1000 800 600 400 200 0 N=175 N= 400 n Fi ll n fi n i ^ o n en TT O CTQ Re Ping Ke Re vis nn; vis n '< n (Â—1 o. c y B' dd ^ 13 7^ >0 !^ O OQ OQ n 3 CTQ c C/j" n A,' 0.7^ 0.9 Figure 35. Chi-Squared by Procedure, N, and pÂ„. Comparative Fit Index Three effects were significantly associated with changes in the Comparative Fit Index and were associated with at least .5% of the mean square and variance component sum. These are presented in Table 17. A significant interaction of PROCEDURE and was observed for the Comparative Fit Index indicating that a low value for the Comparative Fit Index was observed using the Ping approach which was reduced further when aL.^.^j ^ (See Figure 36). High values for the Comparative Fit Index were observed using the other procedures that did not change with increased . Table 17 Proportions of Mean Square and Variance Component Sum for CFI Effect Proportion of Sum F P 2 0.007 2716 .0001 PROCEDURE 0.947 516707 .0001 PROCEDURE X 0.034 6227 .0001 PAGE 80 69 Joreskog Revised Ping KennyRevised Ping Judd JY Figure 36. CFI by PROCEDURE and \ ^ ; ' , Non-Normed Fit Index Three significant effects were observed for the Non-Normed Fit Index (NNFI) that were associated with at least .5% of the mean square and variance component sum. These are presented in Table 18. An interaction of PROCEDURE and pic.{,4j observed indicating that the Ping approach, which had the lowest values for NNFI overall, was associated with further diminished NNFI when pl^^ . =0 (See Figure 37). Table 18 Proportions of Mean Square and Variance Component Sum for NNFI Effect Proportion of Sum F P 2 0.006 1754 .0001 PROCEDURE 0.948 552150 .0001 PROCEDURE X o^, . 0.034 6632 .0001 PAGE 81 70 1.02 Joreskog Revised Ping KennyRevised Ping Judd JY Figure 37. NNFI by p,^, ^^^^ and PROCEDURE Standardized Root Mean Squared Residual Six factors were identified that were significantly associated with Standardized Root Mean Squared Residual and associated with at least .5% of the mean square and variance component sum. These are presented in Table 19. A significant interaction of sample size and /?Â„. was observed for Standardized Root Mean Squared Residual indicating that while reduced sample size and reduced reliability were associated with larger Standardized Root Mean Square Residual, the effect of reduced reliability was larger when N=l 75. (See Figure 38). An interaction of PROCEDURE and /?Â„, was observed for Standardized Root Mean Squared Residual. Standardized Root Mean Squared Residual was larger for the Ping approach and the Kermy-Judd approach. The Revised Ping approach had the smallest Root Mean Squared Residual. Standardized Root Mean Squared Residual PAGE 82 71 diminished with increased pÂ„, (See Figure 39). . This effect was larger for the Ping and the Revised Ping approaches. Table 19 Proportions of Mean Square and Variance Component Sum for Standardized Root Mean Squared Residual Effect Proportion of Sum F P N 0.471 7138 .0001 A,' 0.111 1685 .0001 N X pÂ„. 0.007 53 .0001 PROCEDURE 0.364 16750 .0001 PROCEDURE xN 0.023 526 .0001 PROCEDURE X 0.015 336 .0001 0.06 N=175 N=400 Figure 38. Standardized Root Mean Squared Residual by N and p^^ Similarly, an interaction of PROCEDURE, and Sample Size was observed for Standardized Root Mean Squared Residual. Decreases in Standardized Root Mean Squared Residual associated with increased sample size were larger for the Ping and Kenny Judd approaches compared to the other approaches (See Figure 40). PAGE 83 72 0.06 0.05 Figure 39. Standardized Root Mean Squared Residual by PROCEDURE and /?Â„. 0.06 0.05 Joreskog Revised Ping KennyRevised Ping Judd JY Figure 40. Standardized Root Mean Squared Residual by PROCEDURE and Sample Size PAGE 84 DISCUSSION A wide array of studies have been conducted on methods for estimating and testing latent variable interaction. However, many of these have provided results obtained for a single method and, therefore, did not allow for a general comparison of the available methods. Therefore, the primary objective of the current study was to compare estimates produced by Kenny-Judd, Joreskog-Yang, and Revised Joreskog-Yang models; two-step estimation of the Kenny-Judd (i.e.. Ping's procedure) and the Joreskog-Yang model (a revised Ping procedure), and Bollen's two stage least squares method for the latent interaction model. Of particular interest was how well the results of the two-step and two stage least squares procedures compare to the remaining procedures, which are more difficult to implement. In comparing the six methods examined here, one is struck by the comparability of the various procedures. Even in situations where factors in the study had significant results, this was generally due to one procedure and differences among the remaining procedures tended to be small and the procedures performed quite well. However, limitations of some procedures were observed that suggest caution in the use of these procedures. An overview of the results of this study suggesting these limitations is presented in Table 20. Bias, robust and ordinary standard error ratios. Type 1 error rate, power, and mean squared error are presented for 74 Standard error ratios are also 73 PAGE 85 74 CO C/2 o Oh o D. c3 (/3 o e O O CO l-H O X) O c O on O c l-H O OQ o CO H CI w o (L) 'E > o -a o c CO DO o t/) u c OS O ON ON ON ON ON ON C3N ON o IT) ON o o in o ON ON o o G CO ^ o PAGE 86 75 presented for 73 and CFI is presented as a measure of fit. In each case, values are averages calculated over all the conditions in the study. Fit Structural equation modeling differs from other types of modeling such as regression in that fit statistics are used in evaluating whether a particular model is appropriate. Obtaining a model that fits the data is by many considered a prerequisite for conducting hypothesis tests which are often not considered for models with evidence of misfit. For that reason, the fact that the Ping procedure is associated with lower values of CFI and NNFI is problematic. This means that using Ping's method, researchers concerned with having adequate evidence of fit often conclude that the model does not fit the data. The average Standardized Root Mean Squared Residual was higher for the KennyJudd and Ping procedures than for the other procedures. However, typically this fit statistic met the commonly used criteria for adequate fit. With the exception of Bollen's procedure which does not calculate the types of fit statistics evaluated here, each of the other procedures provided evidence that tests of fit that were reasonably good from a Type 1 error rate perspecfive. Hypothesis Testing For most researchers, the most important information provided by a structural equafion model will come fi-om the hypothesis tests. Clearly, testing for significance is a central activity for many researchers. In this study, it was clear that the choice of method strongly influenced whether or not significance tests were appropriate. The Joreskog Yang method did not provide accurate hypothesis tests for the interaction regardless of whether ordinary or robust standard errors were used. When robust standard errors were PAGE 87 76 used, the hypothesis test for the interaction was too conservative. When ordinary standard errors were used, the hypothesis test for the interaction was too liberal. Hypothesis tests for 73 using the Joreskog-Yang method were similarly influenced 7 leading to a situation in which a researcher using this method would be unable to add or remove variables from the latent variable model. That is, if robust standard errors are used, meaningful predictors will be non-significant due to low power and meaningful predictors could be removed. If ordinary standard errors are used, meaningless predictors will retained too often due to a high Type 1 error rate. The Revised Ping approach provided hypothesis tests for the interaction with high Type 1 error rates no matter which kind of standard errors were used indicating that the approach will too often identify an interaction when one is not present. Bollen's two stage least squares approach exhibited low Type 1 error rate and was underpowered in tests of the interaction and well as for other tests in the model. The Ping, Kenny-Judd, and Revised Joreskog-Yang procedures all provided hypothesis tests with acceptable Type 1 error rates when ordinary standard errors were used and slightly too large when robust standard errors were used. Each of these procedures also had high levels of power in tests of the interaction so that each would be appropriate for hypothesis testing. Confidence Intervals Confidence intervals provide more information than do hypothesis tests affording researchers additional information to use in evaluating theories. In order to reap the benefits of this information, it is necessary that both parameter estimates and standard errors of these estimates are accurate. This study revealed that estimates of the interaction were inaccurate for the Bollen, Ping, and Revised Ping approaches. For the PAGE 88 77 Ping and Revised Ping procedures, the degree of inaccuracy was a function of reliability such that increased measurement error resulted in less accurate estimates. Estimates of the interaction using the BoUen procedure were inaccurate on the whole. Results for standard errors resemble the results for hypothesis tests for obvious reasons. Standard errors for the Joreskog-Yang procedure were too large when robust standard errors were used and too small when ordinary standard errors were used. Standard errors for the Bollen procedure were too large overall. Finally, both types of standard errors were too small for the interaction using the Revised Ping approach. The KennyJudd and the Revised Joreskog-Yang approach provided accurate estimates and standard errors making them the appropriate procedures to use on data like that simulated here when confidence intervals are of interest. Conclusion There are many purposes for conducting an analysis using latent variable interaction. In this study, the KennyJudd and the Revised Joreskog-Yang procedures recovered the correct parameter values with enough accuracy to provide answers to many of the questions that researchers are likely to ask under the experimental situation described here. Unfortunately, none of the simpler approaches are can provide appropriate fit statistics, hypothesis tests, and confidence intervals. However, if one is interested only in hypothesis tests, the simpler Ping procedure can be used. Limitations Like all studies, this study could not manipulate all variables that could possibly influence the identification of the best method for evaluating latent variable interaction. Some factors that could have been varied are observed variable means, number of PAGE 89 78 observed variables defining each latent independent variable, number of combinations of variables defining the interaction, and method of estimation. The study was also limited in using a laboratory situation in which no data were missing and the model fit the data perfectly. In addition, all variables except the product variables were normally distributed. Future research should examine the effects of these variables on performance of the procedures studies in this dissertation. PAGE 90 APPENDIX SUPPLEMENTARY TABLES Table 21 Bias in Estimates of y i N Pn P' Bias 175 0.00 0.20 0.20 0.70 0.009 175 0.00 0.20 0.20 0.90 0.009 175 0.00 0.20 0.40 0.70 -0.006 175 0.00 0.20 0.40 0.90 -0.012 175 0.00 0.40 0.20 0.70 -0.014 175 0.00 0.40 0.20 0.90 -0.006 175 0.00 0.40 0.40 0.70 -0.003 175 0.00 0.40 0.40 0.90 -0.002 175 0.05 0.20 0.20 0.70 -0.009 175 0.05 0.20 0.20 0.90 0.002 175 0.05 0.20 0.40 0.70 -0.003 175 0.05 0.20 0.40 0.90 -0.017 175 0.05 0.40 0.20 0.70 0.016 175 0.05 0.40 0.20 0.90 -0.038 175 0.05 0.40 0.40 0.70 -0.070 79 PAGE 91 80 Table 21 Â— Continued N At P Bias 175 0.05 0.40 0.40 0.90 -0.033 175 0.10 0.20 0.20 0.70 0.062 175 0.10 0.20 0.20 0.90 -0.001 175 0.10 0.20 0.40 0.70 -0.007 175 0.10 0.20 0.40 0.90 0.012 175 0.10 0.40 0.20 0.70 0.006 175 0.10 0.40 0.20 0.90 -0.031 175 0.10 0.40 0.40 0.70 -0.023 175 0.10 0.40 0.40 0.90 0.002 400 0.00 0.20 0.20 0.70 0.003 400 0.00 0.20 0.20 0.90 0.004 400 0.00 0.20 0.40 0.70 0.001 400 0.00 0.20 0.40 0.90 0.000 400 0.00 0.40 0.20 0.70 -0.003 400 0.00 0.40 0.20 0.90 0.000 400 0.00 0.40 0.40 0.70 -0.001 400 0.00 0.40 0.40 0.90 -0.002 400 0.05 0.20 0.20 0.70 0.019 400 0.05 0.20 0.20 0.90 -0.018 400 0.05 0.20 0.40 0.70 -0.039 PAGE 92 81 Table 21 Â— Continued N 2 Pu P' Pn' Bias 400 0.05 0.20 0.40 0.90 -0.002 400 0.05 0.40 0.20 0.70 -0.034 400 0.05 0.40 0.20 0.90 -0.021 400 0.05 0.40 0.40 0.70 -0.010 400 0.05 0.40 0.40 0.90 -0.006 400 0.10 0.20 0.20 0.70 -0.030 400 0.10 0.20 0.20 0.90 0.000 400 0.10 0.20 0.40 0.70 -0.018 400 0.10 0.20 0.40 0.90 -0.004 400 0.10 0.40 0.20 0.70 0.007 400 0.10 0.40 0.20 0.90 -0.004 400 0.10 0.40 0.40 0.70 0.004 400 0.10 0.40 0.40 0.90 0.026 175 0.00 0.20 0.20 0.70 0.903 175 0.00 0.20 0.20 0.90 0.950 175 0.00 0.20 0.40 0.70 1.064 1 7'> u.uu u.zu c\ /in A c\r\ U.VU 1.U75 175 0.00 0.40 0.20 0.70 0.930 175 0.00 0.40 0.20 0.90 1.052 175 0.00 0.40 0.40 0.70 1.038 PAGE 93 Table 22 Standard Error Ratios for yi 82 T75 OOO 040 175 0.05 0.20 175 0.05 0.20 175 0.05 0.20 175 0.05 0.20 175 0.05 0.40 175 0.05 0.40 175 0.05 0.40 175 0.05 0.40 175 0.10 0.20 175 0.10 0.20 175 0.10 0.20 175 0.10 0.20 175 0.10 0.40 175 0.10 0.40 175 0.10 0.40 175 0.10 0.40 400 0.00 0.20 400 0.00 0.20 PÂ« Pu Ratio 0.40 0.90 1.061 0.20 0.70 0.903 0.20 0.90 0.969 0.40 0.70 0.945 0.40 0.90 0.989 0.20 0.70 0.899 0.20 0.90 0.887 0.40 0.70 0.923 0.40 0.90 0.979 0.20 0.70 0.938 0.20 0.90 0.882 0.40 0.70 0.981 0.40 0.90 0.877 0.20 0.70 0.957 0.20 0.90 0.896 0.40 0.70 0.970 0.40 0.90 1.036 0.20 0.70 0.938 0.20 0.90 1.005 PAGE 94 83 Table 22 Â— Continued N pL.. Pn pR P,r SE Ratio 400 0.00 0.20 0.40 0.70 0.968 400 0.00 0.20 0.40 0.90 1.064 400 0.00 0.40 0.20 0.70 0.975 400 0.00 0.40 0.20 0.90 0.924 400 0.00 0.40 0.40 0.70 1.018 400 0.00 0.40 0.40 0.90 0.979 400 0.05 0.20 0.20 0.70 0.990 400 0.05 0.20 0.20 0.90 0.878 400 0.05 0.20 0.40 0.70 0.986 400 0.05 0.20 0.40 0.90 1.058 400 0.05 0.40 0.20 0.70 0.953 400 0.05 0.40 0.20 0.90 1.021 400 0.05 0.40 0.40 0.70 0.909 400 0.05 0.40 0.40 0.90 0.983 400 0.10 0.20 0.20 0.70 0.992 400 0.10 0.20 0.20 0.90 0.941 400 0.10 0.20 0.40 0.70 0.938 400 0.10 0.20 0.40 0.90 0.950 400 0.10 0.40 0.20 0.70 0.952 PAGE 95 84 Table 22 Â— Continued N pi,,,,, Pn P\ P. SE Ratio 400 0.10 0.40 " 0.20 0.90 1.104 400 0.10 0.40 0.40 0.70 0.963 400 0.10 0.40 0.40 0.90 0.844 PAGE 96 85 Table 23 Standard Error Ratios for Psi PROCEDURE N P\ r R 1 1 1 PÂ„1 II Joreskog Revised Ping Ping Kenny Judd Revised JY 175 0.00 0.2 0.2 0.7 1.463 0.965 0.966 0.965 0.964 175 0.00 0.2 0.2 0.9 1.317 1.041 1.042 1.042 1.042 175 0.00 0.2 0.4 0.7 1.594 1.018 1.017 1.018 1.017 175 0.00 0.2 0.4 0.9 1.199 0.973 0.974 0.974 0.974 175 0.00 0.4 0.2 0.7 1.470 0.916 0.916 0.917 0.917 175 0.00 0.4 0.2 0.9 1.256 0.967 0.966 0.002 0.967 175 0.00 0.4 0.4 0.7 1.644 1.022 1.020 1.018 1.018 175 0.00 0.4 0.4 0.9 1.310 1.003 1.003 1.004 1.004 175 0.05 0.2 0.2 0.7 1.316 0.961 0.956 0.960 0.958 175 0.05 0.2 0.2 0.9 1.023 0.946 0.947 0.947 0.948 175 0.05 0.2 0.4 0.7 1.302 0.920 0.921 0.915 0.914 175 0.05 0.2 0.4 0.9 1.057 0.962 0.967 0.961 0.961 175 0.05 0.4 0.2 0.7 1.404 1.024 1.014 1.022 1.018 175 0.05 0.4 0.2 0.9 1.059 0.967 0.971 0.970 0.970 175 0.05 0.4 0.4 0.7 1.433 0.991 1.002 1.000 1.000 175 0.05 0.4 0.4 0.9 1.058 0.965 0.962 0.966 0.965 175 0.10 0.2 0.2 0.7 1.652 1.002 0.992 0.992 0.989 PAGE 97 86 Table 23 Â— Continued PROCEDURE N 2 pI Pl2 PnJoreskog Revised Ping Ping Kenny Judd Revised JY 175 0.10 0.2 0.2 0.9 1.251 1.079 1.074 1.071 1.072 175 0.10 0.2 0.4 0.7 1.749 1.017 1.015 1.018 1.020 175 0.10 0.2 0.4 0.9 1.137 0.944 0.953 0.952 0.952 175 0.10 0.4 0.2 0.7 1.575 0.926 0.922 0.926 0.923 175 0.10 0.4 0.2 0.9 1.211 1.035 1.032 1.034 1.034 175 0.10 0.4 0.4 0.7 1.661 0.929 0.933 0.943 0.944 175 0.10 0.4 0.4 0.9 1.254 1.033 1.034 0.854 1.027 400 0.00 0.2 0.2 0.7 1.304 1.036 1.035 1.035 1.035 400 0.00 0.2 0.2 0.9 1.081 0.963 0.963 0.963 0.963 400 0.00 0.2 0.4 0.7 1.227 0.972 0.971 0.973 0.973 400 0.00 0.2 0.4 0.9 1.084 0.958 0.959 0.959 0.959 400 0.00 0.4 0.2 0.7 1.527 1.088 1.089 1.087 1.087 400 0.00 0.4 0.2 0.9 1.088 0.932 0.932 0.932 0.932 400 0.00 0.4 0.4 0.7 1.365 0.994 0.994 0.994 0.994 400 0.00 0.4 0.4 0.9 1.196 1.021 1.021 1.021 1.021 400 0.05 0.2 0.2 0.7 1.551 1.126 1.127 1.123 1.122 400 0.05 0.2 0.2 0.9 1.079 1.006 1.007 1.007 1.007 Â•I PAGE 98 87 Table 23 Â— Continued PROCEDURE N aL,^Â„=, p\ Pn A,' Joreskog Revised Ping Ping Kenny Revised Judd JY 400 0.05 0.2 0.4 0.7 1.323 400 0.05 0.2 0.4 0.9 1.067 400 0.05 0.4 0.2 0.7 1.338 400 0.05 0.4 0.2 0.9 1.030 400 0.05 0.4 0.4 0.7 1.359 400 0.05 0.4 0.4 0.9 1.063 400 0.10 0.2 0.2 0.7 1.562 400 0.10 0.2 0.2 0.9 1.065 400 0.10 0.2 0.4 0.7 1.654 400 0.10 0.2 0.4 0.9 1.227 400 0.10 0.4 0.2 0.7 1.715 400 0.10 0.4 0.2 0.9 1.297 400 0.10 0.4 0.4 0.7 1.648 0.948 0.981 0.965 0.975 0.974 0.921 0.975 1.030 1.016 1.107 0.950 0.981 0.982 0.987 0.963 0.967 0.991 0.973 0.982 0.925 1.036 1.034 1.106 0.948 0.983 0.964 0.975 0.925 0.985 0.969 1.035 1.020 1.107 0.948 0.983 0.984 0.986 0.964 0.964 0.964 0.976 0.973 0.973 0.925 0.971 1.036 1.019 1.108 0.952 0.937 0.957 0.957 400 0.10 0.4 0.4 0.9 1.192 0.992 0.995 0.995 0.995 PAGE 99 Table 24 Root Mean Squared Error for Psi N pI^.^. /7|2 pi Joreskog Revised Ping Kenny Revised * Ping Judd JY T75 0 0.2 0.2 57204 0.204 0203 0.200 0.204 175 0.05 0.2 0.2 78.174 78.022 75.844 78.190 78.184 175 0.1 0.2 0.2 16.833 16.773 15.777 16.860 16.835 175 0 0.4 0.2 0.211 0.211 0.210 1516.950 0.211 175 0.05 0.4 0.2 94.839 94.503 92.239 94.950 94 880 175 0.1 0.4 0.2 20.605 20.427 19.441 20.620 20.615 175 0 0.20.4 0.119 0.119 0.118 0.120 0.119 175 0.05 0.2 0.4 40.905 40.674 39.254 40.920 40.914 175 0.1 0.2 0.4 8.664 8.628 7.977 8:670 8.664 175 0 0.40.4 0.114 0.114 0.113 0.110 0.114 175 0.05 0.4 0.4 50.996 50.777 49.339 51.010 51.008 175 0.1 0.4 0.4 10.771 10.710 10.002 10.830 10.781 400 0 0.2 0.2 0.194 0J94 0.194 0.190 0.194 400 0.05 0.2 0.2 73.780 73.675 71.560 73.770 73.780 400 0 1 0.2 0.2 15.638 15.602 14.721 15.650 15.639 400 0 0.4 0.2 0.192 0.192 0.192 0.190 0.192 400 0.05 0.4 0.2 89.809 89.593 87.479 89 840 89.824 400 0.1 0.4 0.2 20.011 19.940 18.901 20.030 20.016 PAGE 100 89 Table 24 Â— Continued N > 1x1 r K Joreskog Revised Ping Kenny Revised Ping Judd .lY 400 0 0.2 0.4 0.110 0.110 0.109 0.110 0.110 400 0.05 0.2 0.4 ^Q 75"^ J 7.UOU 38.296 39.760 39.760 400 0.1 0.2 0.4 8.300 8.276 7.616 8.300 8.300 400 0 0.4 0.4 0.110 0.110 0.110 0.110 0.110 400 0.05 0.4 0.4 48.025 47.820 46.258 48.020 48.030 400 0.1 0.4 0.4 9.854 9.785 9.081 9.860 9.856 PAGE 101 90 Table 25 Type 1 Error Rate for Gamma 4 PROCEDURE MATRIX N Pn p\ P^ncA^h PÂ«' Joreskog Revised Ping Kenny Revised Joreskog Ping Judd JY 175 0.2 0.2 0 0.7 0.050 0.095 0.055 0.090 0.095 0.080 175 0.2 0.2 0 0.9 0.055 0.105 0.055 0.105 0.105 0.105 175 0.2 0.4 0 0.7 0.010 0.050 0.005 0.040 0.055 0.050 175 0.2 0.4 0 0.9 0.015 0.060 0.050 0.055 0.055 0.055 175 0.4 0.2 0 0.7 0.025 0.125 0.030 0.090 0.095 0.090 175 0.4 0.2 0 0.9 0.020 0.085 0.015 0.055 0.070 0.055 175 0.4 0.4 0 0.7 0.010 0.110 0.040 0.095 0.075 0.075 175 0.4 0.4 0 0.9 0.020 0.095 0.040 0.075 0.065 0.065 400 0.2 0.2 0 0.7 0.030 0.080 0.030 0.060 0.070 0.070 400 0.2 0.2 0 0.9 0.045 0.075 0.050 0.070 0.055 0.055 400 0.2 0.4 0 0.7 0.035 0.060 0.060 0.045 0.040 0.050 400 0.2 0.4 0 0.9 0.005 0.040 0.040 0.035 0.050 0.040 400 0.4 0.2 0 0.7 0.040 0.060 0.030 0.055 0.060 0.060 400 0.4 0.2 0 0.9 0.035 0.120 0.035 0.075 0.085 0.085 400 0.4 0.4 0 0.7 0.015 0.070 0.045 0.040 0.040 0.040 400 0.4 0.4 0 0.9 0.050 0.100 0.030 0.075 0.070 0.070 175 0.2 0.2 0 0.7 0.130 0.065 0.055 0.065 0.070 0.075 PAGE 102 <Â•-"'' , " 91 Table 25 Â— Continued .' PROCEDURE Joreskog Revised Ping Kenny Revised Joreskog 2 Ping Judd JY MATRIX N Pn p\ P<"^U, P.i' 2 175 0.2 0.2 0 0.9 0.095 0.075 0.055 0.070 0.075 0.070 2 175 0.2 0.4 0 0.7 0.090 0.035 0.005 0.030 0.035 0.030 2 175 0.2 0.4 0 0.9 0.030 0.030 0.050 0.035 0.025 0.030 2 175 0.4 0.2 0 0.7 0.125 0.065 0.030 0.055 0.060 0.060 2 175 0.4 0.2 0 0.9 0.020 0.045 0.015 0.030 0.040 0.025 2 175 0.4 0.4 0 0.7 0.075 0.065 0.040 0.045 0.055 0.055 2 175 0.4 0.4 0 0.9 0.045 0.060 0.040 0.055 0.050 0.045 2 400 0.2 0.2 0 0.7 0.115 0.055 0.030 0.065 0.060 0.065 2 400 0.2 0.2 0 0.9 0.065 0.060 0.050 0.060 0.055 0.055 2 400 0.2 0.4 0 0.7 0.105 0.060 0.060 0.055 0.050 0.050 2 400 0.2 0.4 0 0.9 0.030 0.040 0.040 0.035 0.030 0.030 2 400 0.4 0.2 0 0.7 0.080 0.060 0.030 0.055 0.050 0.055 2 400 0.4 0.2 0 0.9 0.060 0.085 0.035 0.055 0.055 0.055 2 400 0.4 0.4 0 0.7 0.065 0.055 0.045 0.035 0.030 0.035 2 400 0.4 0.4 0 0.9 0.060 0.085 0.030 0.075 0.070 0.070 PAGE 103 92 Table 26 Type 1 Error Rate for yi PROCEDURE MATRIX N aL^^ P"' p\ Joreskog Revised Bollen Ping Kenny Revised Ping Judd JY 175 0.00 0.7 0.2 0.000 0.053 0.035 0.050 0.050 0.050 175 0.00 0.7 0.4 0.000 0.065 0.045 0.065 0.065 0.070 175 0.00 0.9 0.2 0.000 0.070 0.063 0.070 0.070 0.063 175 0.00 0.9 0.4 0.000 0.060 0.048 0.058 0.060 0.048 175 0.05 0.7 0.2 0.000 0.070 0.050 0.070 0.058 0.063 175 0.05 0.7 0.4 0.000 0.060 0.050 0.068 0.060 0.060 175 0.05 0.9 0.2 0.000 0.055 0.065 0.058 0.055 0.060 175 0.05 0.9 0.4 0.000 0.068 0.048 0.070 0.065 0.055 175 0.10 0.7 0.2 0.000 0.065 0.045 0.060 0.070 0.060 175 0.10 0.7 0.4 0.000 0.050 0.040 0.050 0.045 0.043 175 0.10 0.9 0.2 0.000 0.105 0.075 0.118 0.113 0.108 175 0.10 0.9 0.4 0.000 0.105 0.075 0.103 0.100 0.090 400 0.00 0.7 0.2 0.000 0.063 0.065 0.060 0.063 0.065 400 0.00 0.7 0.4 0.000 0.063 0.055 0.063 0.060 0.060 400 0.00 0.9 0.2 0.000 0.053 0.058 0.053 0.058 0.055 400 0.00 0.9 0.4 0.000 0.040 0.035 0.040 0.038 0.040 400 0.05 0.7 0.2 0.000 0.068 0.040 0.063 0.060 0.060 PAGE 104 93 Table 26 Â— Continued PROCEDURE MATRIX N pfnc^^ P,r pI Joreskog Revised Bollen Ping Kenny Revised Ping Judd JY 400 0.05 0.7 0.4 0.000 0.055 0.050 0.058 0.060 0.058 400 0.05 0.9 0.2 0.000 0.053 0.045 0.055 0.055 0.058 400 0.05 0.9 0.4 0.000 0.048 0.038 0.048 0.048 0.048 400 0.10 0.7 0.2 0.000 0.058 0.053 0.060 0.058 0.060 400 0.10 0.7 0.4 0.000 0.083 0.093 0.078 0.075 0.083 400 0.10 0.9 0.2 0.000 0.070 0.065 0.068 0.063 0.070 400 0.10 0.9 0.4 0.000 0.080 0.073 0.075 0.075 0.078 2 175 0.00 0.7 0.2 0.058 0.053 0.035 0.050 0.048 0.053 2 175 0.00 0.7 0.4 0.058 0.055 0.045 0.058 0.060 0.063 2 175 0.00 0.9 0.2 0.063 0.065 0.063 0.065 0.073 0.063 2 175 0.00 0.9 0.4 0.050 0.048 0.048 0.050 0.048 0.050 2 175 0.05 0.7 0.2 0.060 0.070 0.050 0.063 0.065 0.065 2 175 0.05 0.7 0.4 0.060 0.050 0.050 0.063 0.055 0.063 2 175 0.05 0.9 0.2 0.053 0.053 0.065 0.058 0.055 0.050 2 175 0.05 0.9 0.4 0.063 0.058 0.048 0.060 0.055 0.050 2 175 0.10 0.7 0.2 0.063 0.053 0.045 0.055 0.063 0.063 2 175 0.10 0.7 0.4 0.048 0.053 0.040 0.053 0.055 0.053 2 175 0.10 0.9 0.2 0.103 0.093 0.075 0.098 0.095 0.093 PAGE 105 94 Table 26 Â— Continued PROCEDURE MATRIX N P\ Joreskog Revised Bollen Ping Kenny Revised Ping Judd JY 2 175 0.10 0.9 0.4 0.088 0.093 0.075 0,090 0.088 0.083 2 400 0.00 0.7 0.2 0.058 0.058 0.065 0.058 0.060 0.060 2 400 0.00 0.7 0.4 0.063 0.053 0.055 0.053 0.050 0.050 2 400 0.00 0.9 0.2 0.060 0.053 0.058 0.050 0.055 0.053 2 400 0.00 0.9 0.4 0.043 0.040 0.035 0.040 0.040 0.040 2 400 0.05 0.7 0.2 0.060 0.068 0.040 0.068 0.068 0.070 2 400 0.05 0.7 0.4 0.060 0.055 0.050 0.060 0.060 0.060 2 400 0.05 0.9 0.2 0.053 0.050 0.045 0.050 0.050 0.050 2 400 0.05 0.9 0.4 0.043 0.048 0.038 0;045 0.043 0.043 2 400 0.10 0.7 0.2 0.073 0.063 0.053 0.063 0.060 0 060 2 400 0.10 0.7 0.4 0.075 0.088 0.093 0.088 0.088 0.083 2 400 0.10 0.9 0.2 0.068 0.060 0.065 0.065 0.063 0.065 2 400 0.10 0.9 0.4 0.078 0.083 0.073 0.075 0.075 0.073 Â• Â• ;1 PAGE 106 95 Table 27 Type 1 Error Rate for Y7 PROCEDURE Joreskog Revised Bollen Ping Kenny Revised N aL,^,^, P,2 A/' Ping Judd JY 175 0.00 0.2 0.2 0.7 0.005 0.063 0.035 0.060 0.055 0.053 175 0.00 0.2 0.2 0.9 0.013 0.045 0.030 0.043 0.043 0.048 175 0.00 0.2 0.4 0.7 0.003 0.075 0.040 0.068 0.045 0.050 175 0.00 0.2 0.4 0.9 0.028 0.063 0.045 0.063 0.060 0.063 175 0.00 0.4 0.2 0.7 0.013 0.053 0.045 0.053 0.043 0.048 175 0.00 0.4 0.2 0.9 0.038 0.033 0.045 0.033 0.043 0.040 175 0.00 0.4 0.4 0.7 0.008 0.033 0.035 0.033 0.030 0.030 175 0.00 0.4 0.4 0.9 0.035 0.063 0.045 0.063 0.058 0.058 175 0.05 0.2 0.2 0.7 0.005 0.068 0.055 0.068 0.068 0.070 175 0.05 0.2 0.2 0.9 0.013 0.035 0.015 0.035 0.033 0.035 175 0.05 0.2 0.4 0.7 0.005 0.065 0.045 0.068 0.065 0.063 175 0.05 0.2 0.4 0.9 0.030 0.088 0.085 0.095 0.085 0.088 175 0.05 0.4 0.2 0.7 0.033 0.088 0.055 0.088 0.083 0.083 175 0.05 0.4 0.2 0.9 0.040 0.065 0.060 0.058 0.058 0.058 175 0.05 0.4 0.4 0.7 0.028 0.065 0.030 0.065 0.060 0.060 175 0.05 0.4 0.4 0.9 0.048 0.073 0.060 0.075 0.070 0.075 175 0.10 0.2 0.2 0.7 0.020 0.090 0.065 0.093 0.090 0.095 PAGE 107 Table 27 Â— Continued 96 PROCEDURE Joreskog Revised Bollen Ping Kenny Revised N PmcM^ Pn P\ Pii' Ping Judd JY 175 0.10 0.2 0.2 0.9 0.033 0.075 0.065 0.068 0.065 0.065 175 0.10 0.2 0.4 0.7 0.008 0.063 0.100 0.055 0.058 0.055 175 0.10 0.2 0.4 0.9 0.033 0.083 0.095 0.085 0.083 0.085 175 0.10 0.4 0.2 0.7 0.025 0.065 0.045 0.070 0.068 0.070 175 0.10 0.4 0.2 0.9 0.045 0.058 0.050 0.058 0.058 0.060 175 0.10 0.4 0.4 0.7 0.023 0.063 0.055 0.073 0.070 0.075 175 0.10 0.4 0.4 0.9 0.068 0.113 0.110 0.118 0.115 0.113 400 0.00 0.2 0.2 0.7 0.005 0.040 0.025 0.040 0.040 0.045 400 0.00 0.2 0.2 0.9 0.018 0.060 0.065 0.060 0.060 0.058 400 0.00 0.2 0.4 0.7 0.010 0.070 0.060 0.065 0.065 0.063 400 0.00 0.2 0.4 0.9 0.025 0.070 0.045 0.070 0.068 0.060 400 0.00 0.4 0.2 0.7 0.020 0.065 0.060 0.065 0.070 0.065 400 0.00 0.4 0.2 0.9 0.025 0.040 0.030 0.040 0.040 0.038 400 0.00 0.4 0.4 0.7 0.028 0.075 0.050 0.073 0.075 0.078 400 0.00 0.4 0.4 0.9 0.040 0.068 0.055 0.068 0.065 0.063 400 0.05 0.2 0.2 0.7 0.005 0.060 0.045 0.055 0.055 0.055 400 0.05 0.2 0.2 0.9 0.033 0.080 0.060 0.075 0.075 0.075 400 0.05 0.2 0.4 0.7 0.013 0.063 0.065 0.060 0.060 0.065 PAGE 108 97 Table 27Â— Continued PROCEDURE Joreskog Revised Bollen Ping Kenny Revised N P'nc4,i, Pn p\ P,r Ping Judd JY 400 0.05 0.2 0.4 0.9 0.040 0.080 0.075 0.078 0.080 0.083 400 0.05 0.4 0.2 0.7 0.013 0.040 0.040 0.038 0.035 0.035 400 0.05 0.4 0.2 0.9 0.055 0.085 0.075 0.085 0.085 0.088 400 0.05 0.4 0.4 0.7 0.023 0.078 0.055 0.078 0.073 0.075 400 0.05 0.4 0.4 0.9 0.055 0.098 0.075 0.098 0.098 0.095 400 0.10 0.2 0.2 0.7 0.013 0.050 0.030 0.053 0.053 0.043 400 0.10 0.2 0.2 0.9 0.023 0.055 0.050 0.055 0.055 0.055 400 0.10 0.2 0.4 0.7 0.008 0.093 0.045 0.080 0.078 0.073 400 0.10 0.2 0.4 0.9 0.040 0.063 0.075 0.068 0.065 0.055 400 0.10 0.4 0.2 0.7 0.018 0.063 0.030 0.063 0.063 0.065 400 0.10 0.4 0.2 0.9 0.043 0.063 0.060 0.065 0.065 0.060 400 0.10 0.4 0.4 0.7 0.020 0.050 0.050 0.040 0.040 0.043 400 0.10 0.4 0.4 0.9 0.048 0.063 0.035 0.060 0.060 0.058 PAGE 109 REFERENCES Algina, J., & Moulder, B. (in press). A note on estimating the Joreskog-Yang model using LISREL. Structural Equation Modeling . Babakus, E., Ferguson, C.E., & Joreskog, K.G. (1987). The sensitivity of confirmatory maximum likelihood factor analysis to violations of measurement scale and distributional assumptions. Journal of Marketing Research, 24 , 222-228. Bentler, P.M. (1990). Comparative fit indices in structural models. Psychological Bulletin, 107 . 238-246. Bentler, P.M., & Bonett, D.G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88 , 588-606. Bollen, K. A. (1989). Structural equations with latent variables . New York: Wiley. Bollen, K. A. (1996). An alternative two-stage least squares (2SLS) esfimate for latent variable equafion. Psychometrika, 61(1 ), 109-121. Bollen, K.A., & Paxton, P. (1998). In G.A. Marcoulides & R.E. Schumaker (Eds.), Interaction and nonlinear effects in structural equation modeling . Mahwah, NJ: Lawrence Erlbaum. Boomsma, A. (1983). On the robustness of LISREL against small sample size and non-normality. Doctoral dissertation, University of Groningen. Browne, M.W. (1984). Asymptotic distribution-free methods for the analysis of covariance structures. BriUsh Journal of Mathematical and Statistical Psychology. 41 . 193-208. Busemeyer, J.R., & Jones, L.E. (1983). Analysis of multiple combinafion rules when the causal variables are measured with error. Psychological Bulletin. 93(3 ). 549562. Champoux, J.E., & Peters, W.S. (1987). Form, effect size, and power in moderated regression analysis. Journal of Occupational Research. 60 . 243-255. 98 PAGE 110 99 Chaplin, W.F. (1991). The next generation of moderation research in personality psychology. Journal of Personality, 59 , 143-178. Chou, C, Bentler, P.M., & Satorra, A. ( 1 99 1 ). Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: A monte carlo study. British Journal of Mathematical and Statistical Psychology, 44. 347-357. Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory . Orlando, Fl: Holt, Rinehart, and Winston. Hayduk, L. (1987). Structural equation modeling with LISREL . Baltimore: Johns Hopkins Press. Jaccard, J., & Wan, C.K. (1995). Measurement error in the analysis of interaction effects between continuous predictors using multiple regression: Multiple indicator and structural equation approaches. Psychological Bulletin, 1 1 6 , 348-357. Joreskog, K.G., & Sorbom, D. (1989). LISREL 7: A guide to the program and applications . Chicago: SPSS Inc. Joreskog, K.G., & Sorbom, D. (1989). SPSS LISREL 7: User's guide and reference . Chicago: SPSS Inc. Joreskog, K.G., & Yang, F. (1996). Nonlinear structural equation models: The KennyJudd model with interaction effects. In G.A. Marcoulides & R.E. Schumaker (Eds.), Advanced structural equation modeling: Issues and techniques . Mahwah, NJ: Lawrence Erlbaum. Kenny, D.A., & Judd, CM. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin. 96 . 201-210. Litwin, M.S. (1995). How to measure survey reliability and validity . Thousand Oaks, California: Sage Publications. McDonald, R.P. (1978). A simple comprehensive model for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology. 31 5972. Ping, R.A. (1996). Latent variable interaction and quadratic effect estimation: A two step technique using structural equation analysis. Psychological Bulletin. 1 19 166175. PAGE 111 100 Spearman, C. (1907). Demonstration of formulae for true measurement of correlation. American Journal of Psychology, 18(2 ), 1 6 1 1 69. Yang-Jonsson, F. (1997). Non-linear structural equation models: Simulation studies of the KennyJudd Model. Unpublished doctoral dissertation, Uppsala University, . Sweden. PAGE 112 BIOGRAPHICAL SKETCH Bradley C. Moulder was bom on March 2, 1971. He received bachelor's and master's degrees in psychology, both at the University of Florida. In the fall of 1997, he began studying for the Ph.D. in the Educational Psychology Department at the University of Florida, majoring in research and evaluation methodology. He will graduate with the Ph.D. degree in December 2000. 101 PAGE 113 I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. s J. Algina, Chai ssor of Educatioijgll Psychology I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Lmda Crocker Professor of Educational Psychology I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a " dissertation for the degree of Doctor of Philosophy. M. David Miller Professor of Educational Psychology I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy. Margaret M. Bradley Associate Scientist of Psychc^logy PAGE 114 This dissertation was submitted to the Graduate Facuhy of the College of Education and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy. December 2000 Chairmanrfeducational Psychology JhamnanTEducational Psycholog Dean, College of Education Dean, Graduate School |