Citation |

- Permanent Link:
- https://ufdc.ufl.edu/UF00097734/00001
## Material Information
- Title:
- A Comparative analysis of some measures of change
- Creator:
- Neel, John Howard, 1944-
- Publication Date:
- 1970
- Copyright Date:
- 1970
- Language:
- English
- Physical Description:
- viii, 54 leaves. : ; 28 cm.
## Subjects
- Subjects / Keywords:
- Covariance ( jstor )
- Educational research ( jstor )
- Error rates ( jstor )
- Group size ( jstor )
- Learning ( jstor )
- Mathematical procedures ( jstor )
- Proportions ( jstor )
- Statistical discrepancies ( jstor )
- Statistics ( jstor )
- T tests ( jstor )
- Dissertations, Academic -- Foundations of Education -- UF ( lcsh )
- Foundations of Education thesis Ph. D ( lcsh )
- Mathematical statistics ( lcsh )
- Probabilities ( lcsh )
- Genre:
- bibliography ( marcgt )
- non-fiction ( marcgt )
## Notes
- Thesis:
- Thesis--University of Florida, 1970.
- Bibliography:
- Bibliography: leaves 51-52.
- Additional Physical Form:
- Also available on World Wide Web
- General Note:
- Manuscript copy.
- General Note:
- Vita.
## Record Information
- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
- Resource Identifier:
- 029482338 ( AlephBibNum )
- AEG9071 ( NOTIS )
- 014346100 ( OCLC )

Full Text |

A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE

By JOHN HOWARD NEEL

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 1970

COPYRIGHT BY JOHN HOWARD NEEL 1970

ACKNOWLEDGMENTS

The writer wishes to thank his committee members for their guidance in this study, especially Dr. Charles M. Bridges, Jr., who suggested the topic and was a constant source of assistance throughout the study. When this study was begun there was a questioning of change scores in the department which was helpful and encouraging. Those responsible for this atmosphere were Dr. Charles M. Bridges, Jr., Dr. Robert S. Soar, Dr. William B. Ware and Mr. Keith Brown. Dr. Vynce A. Hines and Dr. P. V. Rao assisted by each detecting an error in the model presented. Dr. William B. Ware was especially helpful editorially, as was Dr. Wilson H. Guertin.

The writer's wife, Carol, was encouraging and understanding of the time which was necessarily spent away from home.

TABLE OF CONTENTS

- ACKNOWLEDGMENTS
- LIST OF TABLES
- ABSTRACT
- CHAPTER I. INTRODUCTION
  - The Problem of Measuring Change
  - Methods of Analyzing Change
  - The Problem
  - Some Limitations
  - Procedures
  - Significance of the Study
  - Organization of the Study
- CHAPTER II. RELATED LITERATURE
  - Derivation of Lord's True Gain Scores
  - Comparison of Lord's True Gain Scores with Other Scores
  - Comparison of Regressed Gain Scores with Other Scores
  - Another Study and Summary
- CHAPTER III. METHODS AND PROCEDURES
  - Procedures: An Overview
  - Sampling from a Normal Population with Specified Mean and Variance
  - Selecting Reliability
  - Selecting Gain
  - Analysis of the t Values for the Four Methods
- CHAPTER IV. RESULTS, CONCLUSIONS, AND SUMMARY
  - Results
  - Conclusions
  - Discussion
  - A Direction for Future Research
  - Summary
- APPENDIX A. FORTRAN PROGRAM
- APPENDIX B. LIST OF t's FOR THE FOUR METHODS
- BIBLIOGRAPHY
- BIOGRAPHICAL SKETCH

LIST OF TABLES

- Table 1. NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS
- Table 2. NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE

by John Howard Neel

August, 1970

Chairman: Wilson H. Guertin
Co-Chairman: Charles M. Bridges, Jr.
Major Department: Foundations of Education

The purpose of this study was to determine which of four selected methods was most appropriate for measuring change such as gain in achievement. The four selected methods were raw difference, Lord's true gain, regressed gain, and analysis of covariance procedures.

In order to compare the four methods Monte Carlo techniques were employed to generate samples of pre and post scores for two groups. The reliability, variances, and means of the sampled populations were controlled. One hundred samples were generated at each of 20 combinations of five levels of reliability, two levels of group size, and two levels of gain. Using each of the four methods, a t statistic was calculated for each sample to test the null hypothesis of no difference in amount of gain between the two groups. The number of t's significant at the 0.05 level of significance was recorded for each of the four methods.

At each of the 20 combinations a chi square test was used to test the null hypothesis of equal proportions of significant t's among the four methods.
This hypothesis was rejected in each case. It was noted that use of Lord's true gain procedure tended to create a greater significance level than the user would intend. The proportion of significant t's for each of the other three methods of analysis fell reasonably close to the expected values. On this basis use of Lord's true gain procedure was not recommended and, since there was no apparent difference among the remaining methods of analysis, none was recommended above the other.

CHAPTER I
INTRODUCTION

Learning has long been a primary focus of investigation for educators. A definition of learning has been offered by Hilgard (1956):

"Learning is the process by which an activity originates or is changed through reacting to an encountered situation, provided that the change in activity cannot be explained on the basis of native response tendencies, maturation, or temporary states of the organism (e. g. fatigue, drugs, etc.)." (p. 3)

Although Hilgard went on to say that the definition is not perfect, it does illustrate a commonly accepted aspect of learning: learning involves change of the behavior of the organism which learns (Bigge, 1964, p. 1; Skinner, 1968, p. 10; Combs, 1959, p. 88). Educators have been concerned with this change and often have sought to measure the change occurring in some situation.

Several methods of analyzing change have been presented in the literature. A comparison of these methods of analysis of the measurement of change and the accompanying difficulties which arise was the focus of this study. The purpose of the study was to determine which of the several selected measures of change is most appropriate under various conditions.

The Problem of Measuring Change

In all sciences measurement is an approximation. In conducting a first order survey, a surveyor makes three measurements of a distance and takes the average of the three measurements.
Physicists and engineers customarily report the relative size of the error of their measurements. Physical scientists have been fortunate in that the size of the relative error involved in their measurements often has been small, frequently less than 0.01 and sometimes less than 10 [exponent illegible]. Educators are unfortunate in this respect in that if a student's true I. Q. were 100 and an I. Q. score of 88 was observed, the relative error would be 0.14. The size of this error is not uncommon and larger relative errors do occur. The Stanford-Binet intelligence test has a standard error of measurement equal to five I. Q. points (Anastasi, 1961, p. 200). Thus, relative errors of 0.14 or larger will occur 1.64 per cent of the time, assuming a normal distribution of errors.

In measuring change the problem is compounded since there is an error in both the pre score and the post score. Moreover, when the magnitude of the change is small or zero, the magnitude of the error may be larger than that of the change. This possibility makes the change difficult to detect or to separate from the error. The effect of this error of measurement on change scores was noted as early as 1924 by Thorndike:

"When the individuals in a varying group are measured twice in respect to any ability by an imperfect measure (that is one whose self-correlation is below 1.00), the average difference between the two obtained scores will equal the average difference between the true scores that would have been obtained by perfect measures, but for any individual the difference between the two obtained scores will be affected by the error. Individuals who are below the mean of the group will tend by the error to be less far below it in the second, and individuals who are above the mean of the group in the first measurement will tend by the error to be less far above it in the second. The lower the self-correlation, the greater the error and its effect."
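The effect Thorndike describes can be illustrated with a small simulation. The following is only a sketch under assumed values: the true-score standard deviation (10), the error standard deviation (5), the sample size, and the seed are hypothetical choices made for illustration and do not come from the study. Every simulated individual has exactly zero true change, so any apparent change is produced by error alone.

```python
import random
from statistics import mean

rng = random.Random(1924)           # fixed seed so the sketch is repeatable
n = 20000                           # hypothetical number of individuals
true_sd, error_sd = 10.0, 5.0       # hypothetical spreads, not from the study

# One true score per individual; two fallible measurements with no true change.
true_scores = [rng.gauss(100.0, true_sd) for _ in range(n)]
first = [t + rng.gauss(0.0, error_sd) for t in true_scores]
second = [t + rng.gauss(0.0, error_sd) for t in true_scores]

group_mean = mean(first)
low = [i for i in range(n) if first[i] < group_mean]    # below the mean at first testing
high = [i for i in range(n) if first[i] >= group_mean]  # at or above the mean at first testing


def mean_change(indices):
    """Average obtained difference (second minus first) for the given individuals."""
    return mean(second[i] - first[i] for i in indices)


overall_change = mean(s - f for f, s in zip(first, second))
low_change = mean_change(low)
high_change = mean_change(high)

print(f"overall mean change: {overall_change:+.2f}")
print(f"initially low group: {low_change:+.2f}")
print(f"initially high group: {high_change:+.2f}")
```

Under these assumptions the overall mean change stays near zero, while the initially low group appears to gain and the initially high group appears to lose, even though no true score changed.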
Thorndike (1924) went on to show that there was a spurious negative correlation between initial true score and true gain. He then stated that "the equation connecting the relation of obtained initial ability with obtained gain, the unreliability of the measures and the true facts" had not been discovered.

Lord (1956) developed the equation to which Thorndike alluded. Lord made the following assumptions concerning the error of measurement: the errors

i) have zero mean for the groups tested;
ii) have the same variances for both tests;
iii) are uncorrelated with each other and with true score on either test (Lord, 1956).

McNemar (1958) has extended Lord's work to the case of unequal error variances. It should be noted here that McNemar's follow-up of Lord's work is only an extension. When the variances are equal, McNemar's formulas are identical to Lord's (Lord, 1958).

Methods of Analyzing Change

The estimated true gain scores derived from Lord's and McNemar's equations have been used in either t tests or analysis of variance procedures (Soar, 1968; Tillman, 1969).

There are, in addition to Lord's method, three other commonly used methods for analyzing change. One method has been the use of a straight analysis of variance or t test on the raw difference scores as the situation warrants. It is important to note that the raw scores are used in the analysis with no correction for the unreliability of the measures.

A second method is to complete an analysis of covariance on the raw difference scores using the pretest scores as covariates. These procedures are standard statistical techniques and may be found in many texts (Hays, 1963; Snedecor, 1956; Winer, 1962).

The third method of measuring gain has been advocated by Manning and DuBois (1962): the method of residual gain.
In this method the final scores are regressed on the initial scores and the difference between the final score and the score predicted by the regression equation is taken as a measure of gain. This measure is then used in t tests or analysis of variance procedures.

Thus in the case of equal variances four common methods of analyzing change have been identified:

1. Use of raw gain scores in appropriate procedures.
2. Use of Lord's true gain scores in appropriate procedures.
3. Use of Manning and DuBois' regressed gain scores in appropriate procedures.
4. Use of analysis of covariance on raw gain scores with pre scores as covariates.

The Problem

A researcher faced with these different methods and a problem in measuring change is confronted with a second and fundamental methodological problem: which of the methods for measuring change is most appropriate? It is this question that this study sought to answer.

The problem of which method to use is further complicated since different writers have claimed that different techniques were appropriate. "Manning and DuBois and Rankin and Tracy feel that the method of residual gain is more appropriate for correlational procedures since it is metric free" (as quoted in Tillman, 1969, p. 2). Ohnmacht (1968) also suggested that this procedure was the best. Lord (in Harris, 1963, chapter 2) mentioned regressed gain but seemed to advocate his own method as being superior. This position is further supported by Cronbach and Furby (1970).

To determine which of the methods of analysis was most appropriate, an empirical study was conducted to compare the results of each method under known situations.

Some Limitations

This study was limited to the two group situation. To examine more than two groups would have involved such a large number of possibilities as to make the study impractical in terms of time and money.
Thus, this study excluded the more general multigroup comparisons possible with analysis of variance and covariance procedures and was limited to examining t tests and the analysis of covariance using the pre score as covariate. Additionally the case of unequal variances between the two groups was not considered.

A third limitation was that the variance of the true gains was selected a priori to be 3.6. True gain scores with this variance will be such that over 99 per cent will be within five units of the true mean gain. It is the ratio of the variances of the gain and error which is important. Since the reliabilities were varied, as indicated later, this study was conducted utilizing several such ratios.

One of the factors of interest was the reliability of the test used. A second factor of interest was the sample size or the relative power of the procedures under study. There is an infinite number of reliabilities and a decision was made as to the levels of reliability to be investigated: 0.50 to 0.90 in increments of 0.10. Tests with reliabilities lower than 0.50 are rarely used in practice and at best the resulting data would be highly questionable. The lower limit of 0.50 was chosen for this reason. Sample sizes of 25 and 100 per group were chosen as being somewhat representative of sample sizes used in educational research.

Procedures

Two groups were compared under 20 conditions using t tests as follows. There were five levels of reliability (0.50, 0.60, 0.70, 0.80, 0.90) and two levels of sample size (25 and 100) used in this study. Thus there were ten different combinations of sample sizes and reliabilities. For each of these combinations two cases were investigated, one where there was no pre to post test gain in either group and the second where there was a known gain from the pre to the post test for one group. For each of these 20 instances, 100 samples were generated and analyzed using each of the four methods of analysis indicated previously.
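The layout of the design just described can be written out as a short enumeration. This is a minimal sketch; the variable names and the wording of the gain labels are illustrative choices, not taken from the study's FORTRAN program.

```python
from itertools import product

reliabilities = (0.50, 0.60, 0.70, 0.80, 0.90)   # five levels of reliability
group_sizes = (25, 100)                           # two levels of sample size
gain_cases = ("no gain in either group", "known gain in one group")

# Every combination of group size, gain case, and reliability is one condition.
conditions = list(product(group_sizes, gain_cases, reliabilities))

samples_per_condition = 100
methods_per_sample = 4
total_samples = len(conditions) * samples_per_condition
total_t_values = total_samples * methods_per_sample

for size, gain, rel in conditions:
    print(f"n = {size:3d} | reliability = {rel:.2f} | {gain}")
print(len(conditions), "conditions,", total_samples, "samples,", total_t_values, "t values")
```

Counting this way, the 20 conditions imply 2,000 generated samples and 8,000 t values across the four methods.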
Consequently, two questions were to be answered:

1. Does any one of the selected methods yield a disproportionate number of significant t values when there is no difference between the mean gain of the two groups?
2. Is any one of the selected methods more powerful, i. e. more successful in detecting a difference when a difference does exist?

Samples from a normal distribution were generated using techniques described by Rosenthal (1966). The method for generating random numbers was the multiplication by a constant method. With the procedures used, this method will produce 8.5 million numbers before the series repeats. This number was more than sufficient for this study. All generation of the samples and calculation of t values using the various methods of analysis were done on the IBM 360/65 computer at the University of Florida. The significance level used for the t tests was 0.05.

The two research questions generated two null hypotheses:

1. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is no gain in either group.
2. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is gain by one group but not by the other.

These hypotheses were tested at each of ten combinations of reliability and sample size with chi square tests using the 0.05 level of significance.

Significance of the Study

The results of this study should either indicate empirically that one or more methods were superior to the others or that there were no great differences among the methods. If the former were true, then educational researchers may select one of the better methods. If the latter were true, then educational researchers may select any of the methods. In either case the study provides some answer as to how change scores should be analyzed.

Organization of the Study

Chapter I has been the introduction, statement of the problem, limitations, hypotheses, and procedural overview.
Chapter II reviews related literature, essentially the development of the equation and methodology of the various techniques studied. Chapter III describes the procedures and Chapter IV presents the data, conclusions, and summary.

CHAPTER II
RELATED LITERATURE

Much has been said in the literature about measuring change. However, most of this discussion is centered around the four methods investigated in this study: raw gain, Lord's true gain, regressed gain, and analysis of covariance procedures. As pointed out in Chapter I, raw gain and analysis of covariance procedures are discussed in many texts and are therefore not discussed here. Lord's true gain and regressed gain procedures are discussed in this chapter.

Derivation of Lord's True Gain Scores

The following derivation parallels Lord's (1956) development of true gain scores with one exception as noted. Lord gave the following equations as a model for the observed pre and post scores:

$$X = T + E_1 \tag{1}$$

$$Y = T + G + E_2 \tag{2}$$

where X = observed pre score; Y = observed post score; T = true pre score; G = true gain; $E_1$ = error of measurement in pre observed score; $E_2$ = error of measurement in post observed score.

Lord then made the following assumptions concerning $E_1$ and $E_2$, the errors of measurement. The errors

i) have zero mean in the group tested;
ii) have the same variance ($\sigma_e^2$) for both tests;
iii) are uncorrelated with each other and with true score on either test (Lord, 1956).

The derivation can be considerably shortened at this point by examining a standard regression equation which predicts one variable, $X_1$, from two other variables, $X_2$ and $X_3$. The equation is (Tate, 1965, p. 171)

$$\hat{X}_1 = \beta_{12.3}\,\frac{\sigma_1}{\sigma_2}\,(X_2 - \bar{X}_2) + \beta_{13.2}\,\frac{\sigma_1}{\sigma_3}\,(X_3 - \bar{X}_3) + \bar{X}_1 \tag{3}$$

where

$$\beta_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{1 - r_{23}^2} \quad\text{and}\quad \beta_{13.2} = \frac{r_{13} - r_{12}\,r_{23}}{1 - r_{23}^2}.$$

If we let $X_1 = G$ = gain, $X_2 = X$ = observed pre score, and $X_3 = Y$ = observed post score, the following elements in the regression equation can be identified as $\hat{X}_1$ = estimated gain; $\bar{X}_2$ = mean of the observed pre scores; $\bar{X}_3$ = mean of the observed post scores; $\sigma_1$ = standard deviation of the gain scores; $\sigma_2$ = standard deviation of observed pre scores; $\sigma_3$ = standard deviation of observed post scores; $r_{23}$ = correlation of observed pre and post scores. Lord has pointed out that $r_{12}$ and $r_{13}$, the correlations between observed score and true score, are the reliabilities. From (1), (2), and his stated assumptions, Lord writes

$$\sigma_x^2 = \sigma_t^2 + \sigma_e^2 \tag{4}$$

$$\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + 2\sigma_{tg} + \sigma_e^2 \tag{5}$$

$$\sigma_{xy} = \sigma_t^2 + \sigma_{tg} \tag{6}$$

Lord solves these equations to find

$$\sigma_g^2 = \sigma_x^2 + \sigma_y^2 - 2\sigma_e^2 - 2\sigma_{xy} \tag{7}$$

the variance of the true gain scores. At this point the only element in the regression equation which is undefined is $\bar{X}_1$. This element is found by considering the mean of the observed pre scores, which from (1) can be seen to be equal to the mean of the true pre scores plus the mean of the errors of measurement, that is

$$\bar{X} = \bar{T} + \bar{E}_1 \tag{8}$$

But by Lord's assumption (i) $\bar{E}_1 = 0$, therefore

$$\bar{X} = \bar{T} \tag{9}$$

Similarly from (2) Lord shows

$$\bar{Y} = \bar{T} + \bar{G} \tag{10}$$

Then (9) is subtracted from (10) and rewritten to yield

$$\bar{G} = \bar{Y} - \bar{X} \tag{11}$$

Thus all elements are defined and (3) may be rewritten in terms of T, G, X, and Y as

$$\hat{G} = \beta_{12.3}\,\frac{\sigma_g}{\sigma_x}\,(X - \bar{X}) + \beta_{13.2}\,\frac{\sigma_g}{\sigma_y}\,(Y - \bar{Y}) + \bar{Y} - \bar{X} \tag{12}$$

which Lord has asserted to be an estimate of true gain.

It may be noted that no notational scheme or other method has been presented to distinguish between statistics and parameters in the preceding derivation. This lack is in keeping with Lord's derivation.
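The general two-predictor regression equation used in this derivation can be sketched in a few lines. This is only an illustration of the standard regression form, not Lord's procedure as he would compute it: the function and argument names are hypothetical, and the numerical check uses an artificial error-free case (pre and post scores each with standard deviation 10 and covariance 50, so that the gain is exactly Y minus X) because the correct prediction is then known in advance.

```python
def partial_betas(r12, r13, r23):
    """Standardized partial regression weights for predicting X1 from X2 and X3."""
    denom = 1.0 - r23 ** 2
    beta_12_3 = (r12 - r13 * r23) / denom
    beta_13_2 = (r13 - r12 * r23) / denom
    return beta_12_3, beta_13_2


def predict_x1(x2, x3, mean1, mean2, mean3, sd1, sd2, sd3, r12, r13, r23):
    """Predicted X1 from deviations of X2 and X3 about their means."""
    beta_12_3, beta_13_2 = partial_betas(r12, r13, r23)
    return (beta_12_3 * (sd1 / sd2) * (x2 - mean2)
            + beta_13_2 * (sd1 / sd3) * (x3 - mean3)
            + mean1)


# Artificial error-free check (assumed values): var(X) = var(Y) = 100, cov(X, Y) = 50,
# so G = Y - X has variance 100, r(G, X) = -0.5, r(G, Y) = 0.5, and r(X, Y) = 0.5.
betas = partial_betas(r12=-0.5, r13=0.5, r23=0.5)
predicted_gain = predict_x1(x2=48.0, x3=55.0,
                            mean1=3.0, mean2=50.0, mean3=53.0,
                            sd1=10.0, sd2=10.0, sd3=10.0,
                            r12=-0.5, r13=0.5, r23=0.5)
print(betas, predicted_gain)
```

In this error-free case the weights reduce to -1 and +1, and the prediction equals the raw difference 55 - 48 = 7. With fallible scores the weights shrink toward zero, which is what separates an estimated true gain from a raw gain.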
It is assumed here that Lord was referring to parameters until the point at which he obtained the final equation and that he then intended to use sample values to estimate the appropriate parameters in the regression equation.

Comparison of Lord's True Gain Scores with Other Scores

In his original article Lord (1956) made no comparison of his method with any other method. In a subsequent article (1959) Lord again made no mention of other methods. In a chapter written in Problems in Measuring Change (Harris, 1963, Chapter 2) he made reference to regressed gain scores, but no discussion or comparison was presented. In Statistical Theories of Mental Test Scores (Lord and Novick, 1968) no comparison of Lord's true gain scores with other procedures is presented.

Comparison of Regressed Gain Scores with Other Scores

Manning and DuBois (1962) have compared per cent, raw, and residual gain scores. A per cent gain score is the raw gain score divided by the pre score (Manning and DuBois, 1962). The comparisons were made on the bases of metric requirements, reliability, and appropriateness of use in correlation procedures. On each of these bases residual gain scores were recommended over per cent and raw gain scores.

Manning and DuBois pointed out that per cent and raw gain scores require at least equal interval scales on both pre and post scores and that the scales be the same on both pre and post scores, i. e. the same equal interval scale must be used on both tests. According to Manning and DuBois these qualities are not possessed by educational and psychological test scores. In contrast, residual gain scores do not require the same equal interval scales and therefore are appropriate for use with test scores (Manning and DuBois, 1962).

Manning and DuBois summarily list formulas showing that residual gain scores are more reliable and more appropriate measures for correlational procedures than are raw or per cent gain scores.
These formulas were only listed, not derived, and no reference was made to their derivation.

Another Study and Summary

Madansky (1959) reported or derived several methods for fitting straight lines to two variables when both were measured with error. One of these procedures is applicable in the case when the variance of the error of measurement is unknown. However, there has apparently been no attempt to apply the method to the analysis of change.

A search of the literature has revealed no comparative empirical examination of the four methods examined in this study. Further, the advocates and authors of two of the reported procedures, each of whom has been shown to know of the existence of the other procedure, continue to advocate their own method even though they offer no reason or data for this advocacy. This study should provide some knowledge as to any difference in the four methods.

CHAPTER III
METHODS AND PROCEDURES

Procedures: An Overview

As stated in Chapter I, Monte Carlo techniques were employed to generate pre and post test scores for two groups. One group is referred to as the gain group, the other as the no gain group.

The model for the observed pre scores is

$$X = T + E_1 \tag{1}$$

where X is the observed pre score; T is the true pre score; $E_1$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_1}^2$.

The model for the post scores is

$$Y = T + G + E_2 \tag{2}$$

where Y is the observed post score; T is the true pre score; G is the true gain from pre to post score; $E_2$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_2}^2$.

The generated scores were subsequently analyzed for the difference in the amount of gain or change between the two groups. The scores were analyzed by the four selected methods: (1) a t test on the raw difference scores; (2) a t test on Lord's true gain scores; (3) a t test on regressed gain scores; and (4) a t test from an analysis of covariance on the raw difference scores using pre score as covariate.
The results of these analyses were then compared.

For the gain group an appropriate mean gain, $\mu_g$, from pre to post scores was selected: 0.0 in the case of no gain for either group, and of such size as to make the power 0.50 when there was a gain in the gain group. The post scores were generated by adding a random normal gain, G, and a random normal error, $E_2$, to the generated true pre scores. The variables G and $E_2$ had means $\mu_g$ and 0 respectively and variances as discussed later. For the no gain group there was no gain from pre to post scores.

The pre scores for both the gain and the no gain group were taken from a normal population with mean 50.0 and variance 100.0. The mean and variance of the population of post scores for both the gain and the no gain groups were functions of the mean true gain and of the reliability.

After the samples were generated, the hypothesis of no difference in average gain between the two groups was tested using each of the four methods. The t values for each of these tests were recorded. This procedure was repeated over 100 samples for each of the selected reliabilities 0.50, 0.60, 0.70, 0.80 and 0.90. The method for introducing the effect of the selected reliability into each generated score is presented in a following section. Thus 100 t values were calculated and recorded for each method of analysis and at each level of reliability.

This entire procedure was repeated for each of the following conditions:

1. group size = 25, $\mu_g = 0.0$ for both groups;
2. group size = 25, $\mu_g \neq 0.0$ for the gain group and $\mu_g = 0.0$ for the no gain group;
3. group size = 100, $\mu_g = 0.0$ for both groups;
4. group size = 100, $\mu_g \neq 0.0$ for the gain group and $\mu_g = 0.0$ for the no gain group.

Where the gain was not equal to 0.0 it was such that the power of the t tests on the raw difference scores was 0.50, i. e. the expected proportion of rejected null hypotheses was 0.50.
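The generation step in this overview can be sketched in Python. This is a hedged illustration, not a transcription of the Appendix A FORTRAN program: the function names and seed are hypothetical, Python's built-in uniform generator stands in for the multiplicative generator the study used, and the variance choices (pre error variance 100(1 - rel), true score variance 100 minus that, post error variance chosen so the post scores have the same reliability) are assumptions that follow the chapter's later sections as read here. Normal deviates are drawn by passing a uniform random number through the inverse normal distribution function, which is the sampling idea this chapter describes. The mean gain of 5.86 is the value the text reports for group size 25 and reliability 0.50.

```python
import math
import random
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()  # mean 0, variance 1


def normal_draw(rng, mu, var):
    """Inverse-transform sampling: uniform draw -> standard normal -> rescaled."""
    u = rng.random()
    while u == 0.0:          # the inverse function is undefined at exactly 0
        u = rng.random()
    return mu + math.sqrt(var) * STD_NORMAL.inv_cdf(u)


def generate_group(rng, n, rel, mean_gain, var_x=100.0, var_g=3.6):
    """Generate n (pre, post) score pairs under X = T + E1 and Y = T + G + E2."""
    var_e1 = var_x * (1.0 - rel)                    # assumed: pre error variance
    var_t = var_x - var_e1                          # assumed: true score variance
    var_e2 = (var_t + var_g) * (1.0 - rel) / rel    # assumed: post error variance
    pre, post = [], []
    for _ in range(n):
        t = normal_draw(rng, 50.0, var_t)
        g = normal_draw(rng, mean_gain, var_g)
        pre.append(t + normal_draw(rng, 0.0, var_e1))
        post.append(t + g + normal_draw(rng, 0.0, var_e2))
    return pre, post


def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))


rng = random.Random(1970)   # hypothetical seed
gain_pre, gain_post = generate_group(rng, 25, 0.50, mean_gain=5.86)
none_pre, none_post = generate_group(rng, 25, 0.50, mean_gain=0.0)

# Method 1 only: t test on the raw difference scores of the two groups.
d_gain = [y - x for x, y in zip(gain_pre, gain_post)]
d_none = [y - x for x, y in zip(none_pre, none_post)]
t_raw = pooled_t(d_gain, d_none)
print(f"t on raw difference scores: {t_raw:.3f}")
```

Repeating the last block 100 times per condition and counting how often the absolute t exceeds the critical value would give one cell of the kind of tally the study reports. The other three methods are not sketched here.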
The following sections describe in more detail some of the previously mentioned procedures.*

* The reader is also referred to the FORTRAN listing in Appendix A for the exact computer routines by which these procedures were carried out.

Sampling from a Normal Population with Specified Mean and Variance

If F is the cumulative distribution function of a random variable $R_1$, then the random variable $R_2$ defined by

$$R_2 = F(R_1) \tag{3}$$

is uniformly distributed over the interval [0,1] (Meyer, 1965, p. 256, Theorem 13.6). It follows then that $R_1$, where

$$R_1 = F^{-1}(R_2) \tag{4}$$

is normally distributed if $F^{-1}$ is the inverse cumulative distribution function of a normal distribution and if $R_2$ is a uniform random number on the interval [0,1] (Meyer, 1965, pp. 256-257). Thus random samples from a normal distribution may be obtained using uniform random numbers and (4), where F is the cumulative distribution function of a normal distribution.

For a normal distribution, $F^{-1}(R_2)$ must be calculated using numerical approximation methods. This calculation as well as the generation of the uniform random numbers were done using a routine described by Rosenthal (1966, pp. 270, 267). Rosenthal's techniques were adapted to the IBM 360/65 computer installed at the University of Florida (see Appendix A, FUNCTION RAND). The normal population sampled had mean 0 and variance 1.0. If a different mean or variance was required, it was obtained by addition or multiplication by an appropriate constant.

Selecting Reliability

Reliabilities of 0.50, 0.60, 0.70, 0.80 and 0.90 were selected as representative of reliabilities found in test scores. The reliability, rel, of a test may be defined as

$$\text{rel} = 1 - \frac{\sigma_e^2}{\sigma_x^2} \tag{5}$$

(Nunnally, 1967, p. 221), where $\sigma_e^2$ is the error of measurement variance and $\sigma_x^2$ is the observed score variance.
Since c2 had been selected x a priori to be 100.0, we have from (5) 2 (6) a = 100 (1 rel) e1 Moreover, since (7) X = T + E1 and since the error, El, is assumed to be independent of the true score, T, we have (8) o2 = 02 + a2 x t e or, combining (6) and (8) and solving for o2, (9) a2 = 100 o2 t e1 For the post scores the desired variances are also easily found from the model for a post score, (10) Y = T + G + E and for which (11) 2 = 2 + 2 + C2 y t g e2 As stated in Chapter I, a2 was selected to be 3.6. If (5) g y e2 x el tively, then (5) and (11) may be used to find c2 + 2 2 (12) C2 t C e rel t g 2 The effect of the selected reliability may be obtained by selecting the error of measurement variances and the variance of the true scores in accordance with (6), (9) and (12). Thus it is seen that if true scores are selected from 2 a distribution with variance a2 and if the errors of t measurement are selected independently from a distribution 2 2 with variance 0 then by (8) X has variance if (7) holds. e x Selecting Gain When there was no gain in either group the value of g . would then be 0.0. When vt was nonzero for the gain, its value was selected so as to make the power of the t test on the raw difference scores equal to 0.50. The power of 0.50 was selected in order to permit maximum difference between the four methods of analysis. The value of G was determined by examining the difference scores (D). (13) D = Y X, and from (14) D = (T + G + E2) (T + E1) or (15) D = G + E2 El The elements in the right side of (11) are mutually inde- pendent normally distributed random variables whose vari- ances have been found and thus 2 2 2 2 (16) d = oa + a + o d el g e2 Furthermore since the only difference in (15) for the gain and no gain groups is the mean of G, the variance of the 2 difference for the gain group, ad and the variance of g 2 the difference for the no gain group, d are equal, i. e. 
(17)  $\sigma_{d_g}^2 = \sigma_{d_{ng}}^2 = \sigma_d^2$

This common value $\sigma_d^2$ may then be used to determine the appropriate value of $\mu_g$ to produce the desired power of 0.50 for the t test on the raw difference scores. The t test on the raw difference scores is found from the following formula:

(18)  $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\dfrac{(n_g - 1)S_g^2 + (n_{ng} - 1)S_{ng}^2}{n_g + n_{ng} - 2}\left(\dfrac{1}{n_g} + \dfrac{1}{n_{ng}}\right)}}$

If the group size is 25 and reliability 0.50, the value of $\mu_g$ is found as follows (note: rel = 0.50 implies $\sigma_d^2 = 106.72$). Replacing each sample variance by $\sigma_d^2$ and the mean difference of the no gain group by its expected value of 0,

(19)  $t = \dfrac{\bar{D}_g}{\sqrt{\dfrac{24\sigma_d^2 + 24\sigma_d^2}{25 + 25 - 2}\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}} = \dfrac{\bar{D}_g}{\sqrt{2\sigma_d^2 / 25}}$

This value of t is greater than the critical value of t (2.01) only if

(20)  $2.01 < \dfrac{\bar{D}_g}{\sqrt{2\sigma_d^2 / 25}}$

that is, only if

(21)  $5.86 < \bar{D}_g$

Since $\bar{D}_g$ is distributed symmetrically about the mean gain, it exceeds 5.86 half of the time when the mean gain is itself 5.86. Thus if a value of 5.86 is chosen for the mean gain, the power is 0.50. Appropriate values for other group sizes and reliabilities were similarly determined.

Analysis of the t Values for the Four Methods

The number of t's significant at the 0.05 level was recorded for each of the four methods of analysis. These data were recorded for each of the 20 combinations of sample size, reliability and gain. A chi square statistic was calculated for each of these 20 sets to test the null hypothesis of no difference in the proportion of significant t values for the four methods of analysis. These data may be seen in Tables 1 and 2 of Chapter IV.

CHAPTER IV
RESULTS, CONCLUSIONS, AND SUMMARY

Results

The number of significant t's for each method of analysis under the no gain condition is presented in Table 1, and for the gain condition in Table 2. Additionally, the computed chi square statistics for each reliability level are given. In each case the null hypothesis tested was that the proportion of significant t's was the same for each of the four methods of analysis. The chi square values were computed from the 2 x 4 contingency tables implied by the corresponding line of the table.
For example, for group size of 25 and a reliability of 0.50, the 2 x 4 contingency table implied by the first line of Table 1 is:

                   Raw gain    Lord's gain    Regressed gain    Analysis of covariance
Significant        5           58             2                 2
Non significant    95          42             98                98

As may be seen by inspection of Tables 1 and 2, all the chi square values were significant at the 0.05 level and in each case the hypothesis of equal proportion of significant t values for the four methods of analysis was rejected.

TABLE 1
NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS

GROUP SIZE = 25
RELIABILITY    RAW GAIN    LORD'S TRUE GAIN    REGRESSED GAIN    ANALYSIS OF COVARIANCE    CHI SQUARE
0.50           5           58                  2                 2                         163.13
0.60           5           53                  4                 4                         128.98
0.70           7           51                  6                 6                         103.69
0.80           5           24                  5                 5                         30.76
0.90           6           27                  4                 4                         40.95

GROUP SIZE = 100
RELIABILITY    RAW GAIN    LORD'S TRUE GAIN    REGRESSED GAIN    ANALYSIS OF COVARIANCE    CHI SQUARE
0.50                                                                                       242.95
0.60                                                                                       178.28
0.70                                                                                       115.05
0.80                                                                                       111.60
0.90                                                                                       59.61

CHI SQUARE (3,.95) = 7.82

TABLE 2
NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50

GROUP SIZE = 25
RELIABILITY    RAW GAIN    LORD'S TRUE GAIN    REGRESSED GAIN    ANALYSIS OF COVARIANCE    CHI SQUARE
0.50           44          89                  54                54                        48.80
0.60           52          89                  64                64                        33.00
0.70           52          86                  65                66                        28.23
0.80           47          78                  50                51                        25.43
0.90           50          75                  53                52                        16.90

GROUP SIZE = 100
RELIABILITY    RAW GAIN    LORD'S TRUE GAIN    REGRESSED GAIN    ANALYSIS OF COVARIANCE    CHI SQUARE
0.50                                                                                       38.80
0.60                                                                                       39.48
0.70                                                                                       24.45
0.80                                                                                       38.37
0.90                                                                                       21.35

CHI SQUARE (3,.95) = 7.82

Further inspection of Tables 1 and 2 reveals a higher number of significant t's for Lord's true gain procedure than for any of the other methods. Moreover, examination of Table 1 shows that this particular technique gives a considerably greater frequency of significant t values than one would expect by chance. The expected frequency is 5 for the a priori established condition of no actual difference in the two populations sampled.
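As an illustration, the chi square value for the first line of Table 1 can be reproduced from the counts of significant t's, and the statement that Lord's procedure exceeds chance expectation can be checked with a binomial tail probability. The Python sketch below is not part of the original study. The function names `chi_square` and `binomial_upper_tail` are hypothetical, and the binomial check assumes 100 independent samples with a true significance level of 0.05.

```python
from math import comb


def chi_square(table):
    """Pearson chi square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k, n + 1))


# First line of Table 1: group size 25, reliability 0.50, 100 samples per method.
significant = [5, 58, 2, 2]
table = [significant, [100 - count for count in significant]]

chi2 = chi_square(table)
print(round(chi2, 2))  # 163.13, the value listed in Table 1

# Chance of 24 or more significant t's (the smallest Lord count in Table 1
# for group size 25) and of 7 or more (the largest raw gain count).
lord_tail = binomial_upper_tail(24, 100, 0.05)
raw_tail = binomial_upper_tail(7, 100, 0.05)
print(lord_tail, raw_tail)
```

Under these assumptions the tail probability for the Lord counts is vanishingly small, while a count of 7 for raw gain is well within what chance would produce.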
These results indicate that use of Lord's true gain procedure tends to create a higher significance level than the user would intend. If the sample proportion of significant t's found in the analysis is used as an estimate of the significance level, that estimate is 0.58 for the case when the group size was 25. For the same group size the lowest estimate of the significance level is 0.39.

Conclusions

Since the hypothesis of equal proportion of significant t's for the four methods of analysis was rejected in each of the 10 cases where the mean gain was 0.0, and since the use of Lord's true gain scores provided estimated levels of significance which were considerably higher than those intended, the use of Lord's true gain scores is strongly suspect and therefore is not recommended.

No apparent differences were found among the remaining three methods of analysis. However, there is a similarity between the regressed gain scores procedure and the analysis of covariance procedure that should be examined. The data in Table 1 indicate that the same number of significant t's was found by both of these methods in the case where there was no gain for either group. The 100 t values for each of the four methods of analysis when the group size was 25 and there was no gain in either group are presented in Appendix B. Inspection of the t values for the regressed gain procedure and the analysis of covariance procedure reveals a striking similarity between the t values; for each sample the t values are identical to at least the first decimal place. As a descriptive statistic it is noted that the correlation between the t values found by these two methods is 1.00 (rounded to 3 digits). Thus, the two methods are providing very similar results. In contrast, the correlation between the t's for the raw difference and regressed gain procedures is 0.898. The two methods, regressed gain and analysis of covariance, are not entirely similar to the raw difference procedure.
It may also be seen from Appendix B that the signs of the t's from both the regressed gain and analysis of covariance procedures sometimes are opposite from the sign of the t for the raw difference procedure.

Snedecor (1956, pp. 397, 398) has indicated that the regressed gain procedure and the analysis of covariance procedure on the post scores using pre score as covariate are identical procedures. No source was found indicating a similarity between the regressed gain procedure and the analysis of covariance procedure on the difference scores using pre score as covariate. However, the two methods may be shown to be equivalent by writing the linear model for the regressed gain, or, equivalently, for the analysis of covariance on the post scores using pre score as covariate, and the model for the analysis of covariance on the difference scores using pre score as covariate. The model for covariance analysis on the post scores is

(1)  $Y = B_0 + B_1 X + B_2 Z + E$  (Mendenhall, 1968, p. 170),

where X and Y are defined as previously and

Z = 1, if Y is from the gain group,
  = 0, if Y is not from the gain group,
E = a normally distributed random error with mean 0.

The model for covariance analysis on the difference scores is

(2)  $D = B_0 + B_1 X + B_2 Z + E$

where

(3)  $D = Y - X$

and all other elements are defined as in (1). Now if the right side of (3) is substituted into (2) and the resultant equation rearranged to yield

(4)  $Y = B_0 + (B_1 + 1.0)X + B_2 Z + E$

it is seen that (1) and (4) are identical except for the addition of 1.0 to $B_1$ of equation (1), and thus the two methods will yield the same t values for testing the hypothesis that $B_2$ is equal to 0.0.

Since no clear difference was found among the raw gain procedure, the regressed gain procedure and the analysis of covariance procedure, none of these is recommended as more appropriate for the analysis of change than the others. All three of these procedures are recommended above Lord's true gain procedure.
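The equivalence of models (1) and (4) can be checked numerically. The Python sketch below fits both covariance models by ordinary least squares and compares the t statistics for $B_2$. It is an illustration only and not the routine of Appendix A: the helper names `solve` and `group_t`, the seed, and the simulated score parameters are all assumptions made for this example.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def group_t(x, z, y):
    """t statistic for B2 in the model y = B0 + B1*x + B2*z + error."""
    rows = [[1.0, xi, zi] for xi, zi in zip(x, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    mse = sum(e * e for e in residuals) / (len(y) - 3)
    # Diagonal element of the inverse of X'X that belongs to the group indicator z.
    variance_factor = solve(xtx, [0.0, 0.0, 1.0])[2]
    return beta[2] / (mse * variance_factor) ** 0.5


# Simulated pre and post scores for two groups of 25 (illustrative parameters).
random.seed(1970)
x, z, y = [], [], []
for group in (1, 0):
    for _ in range(25):
        true_score = random.gauss(50.0, 7.0)
        pre = true_score + random.gauss(0.0, 7.0)
        post = true_score + random.gauss(0.0, 1.83) + random.gauss(0.0, 7.0)
        x.append(pre)
        z.append(float(group))
        y.append(post)

d = [post - pre for pre, post in zip(x, y)]
t_post = group_t(x, z, y)  # covariance analysis on the post scores, model (1)
t_diff = group_t(x, z, d)  # covariance analysis on the difference scores, model (2)
print(t_post, t_diff)
```

The two statistics agree up to rounding error because subtracting X from Y changes only the coefficient of X and leaves the residuals and the estimate of $B_2$ unchanged.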
Discussion

It is reasonable to ask if there is some questionable logic in Lord's derivation of true gain scores. Two things become apparent upon examination of the derivation. First, the formula which Lord uses to begin his derivation, (3) of Chapter II, requires that the independent variable be known exactly (Madansky, 1959; Scheffé, 1959, p. 4), i.e. without error of measurement. The problem of estimating true gain arises in that the pre and post test scores are not known exactly, but instead the observed scores, or the true scores plus measurement errors, are known, as may be seen from Lord's models of the observed scores, (1) and (2) in Chapter II. If true pre and true post score were known these could be put into the regression equation. However, if true pre score and true post score were known there would be no need for the regression equation to estimate gain. The gain could be obtained simply by subtracting true pre score from true post score. In short, Lord seems to have assumed his conclusion in his derivation.

Second, a look at the basic method for estimating true gain is enlightening. In order to estimate true gain from observed score it is necessary to somehow remove the error of measurement since, assuming the test to be valid, this is the factor that obscures the true score. Mendenhall (1968, p. 1) says that "...statistics is a theory of information...." What information is known concerning the errors of measurement? By Lord's assumption (iii) of Chapter II the errors are uncorrelated with true score and with each other. Thus neither the observed pre score nor the observed post score should provide any information concerning the size of the error. Since this error is random it would seem that it could not be removed from the observed scores.
Consider two equal observed scores, one obtained from a higher true score by the addition of a negative error, the other obtained from a lower true score by the addition of an error of the same size but opposite sign. How does one decide which score is to have a positive correction added and which is to have a negative correction added? It would appear that one cannot make this decision without having some information besides the observed scores.

A Direction for Future Research

Since this study shows no difference among the proportion of significant t's for the raw gain, regressed gain, and the analysis of covariance procedures, it would be interesting to investigate the use of these procedures under assumptions other than the models listed in the first section of Chapter III. Differences may occur, for example, when the gain is a linear function of the pre score.

Summary

This study compared four selected measures of change. The four measures were: raw difference, Lord's true gain, regressed gain, and analysis of covariance procedures. An empirical comparison was made among these four methods. Samples were generated using Monte Carlo techniques and the data in each sample were analyzed by each of the four methods.

It was found that Lord's true gain procedure produced a number of spurious significant t values, greater than would be expected by chance, when there was no real difference in amount of gain between the two populations sampled. No apparent differences were noted among the remaining three methods and these three methods did not appear to have inflated significance levels. With such data, use of Lord's true gain procedure is not recommended, and none of the remaining three methods was recommended over the others.
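The Monte Carlo design summarized above rests on a small amount of arithmetic from Chapter III, which the following Python sketch retraces. It is a hedged illustration, not a translation of Appendix A. The names `design_values`, `normal_deviate` and `uniform_stream` are hypothetical; the gain variance of 3.36 is the value consistent with the difference score variance of 106.72 quoted for a reliability of 0.50; and the critical value 2.01 applies only to a group size of 25, so another value would have to be supplied for other group sizes. The rational approximation and the multiplicative generator use the constants that appear in FUNCTION RAND.

```python
from math import log, sqrt


def design_values(rel, n, var_x=100.0, var_g=3.36, t_crit=2.01):
    """Variances from (6), (9), (12), (16) and the mean gain implied by (20)."""
    var_e1 = var_x * (1.0 - rel)                     # (6) pre score error variance
    var_t = var_x - var_e1                           # (9) true score variance
    var_e2 = (var_t + var_g) / rel - var_t - var_g   # (12) post score error variance
    var_d = var_e1 + var_g + var_e2                  # (16) difference score variance
    mean_gain = t_crit * sqrt(2.0 * var_d / n)       # threshold in (20)
    return var_e1, var_t, var_e2, var_d, mean_gain


def normal_deviate(u):
    """Approximate inverse normal distribution function for 0 < u < 1."""
    p = min(u, 1.0 - u)
    v = sqrt(-2.0 * log(p))
    z = v - (2.515517 + 0.802853 * v + 0.010328 * v ** 2) / (
        1.0 + 1.432788 * v + 0.189269 * v ** 2 + 0.001308 * v ** 3
    )
    return z if u > 0.5 else -z


def uniform_stream(seed):
    """Multiplicative congruential generator: multiplier 5**15, modulus 2**35."""
    while True:
        seed = (seed * 30517578125) % 34359738368
        yield seed / 34359738368


var_e1, var_t, var_e2, var_d, mean_gain = design_values(rel=0.50, n=25)
print(var_d, mean_gain)

# Five true scores with mean 50 and variance var_t, using an odd seed so the
# generator never reaches zero.
stream = uniform_stream(seed=12345)
true_scores = [50.0 + sqrt(var_t) * normal_deviate(next(stream)) for _ in range(5)]
print(true_scores)
```

For a reliability of 0.50 and group size 25 the sketch gives a difference score variance of 106.72 and a mean gain within about 0.02 of the 5.86 reported in Chapter III; the small gap presumably reflects rounding in the critical value.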
APPENDIX A
FORTRAN PROGRAM WHICH PERFORMED THE CALCULATIONS

      DIMENSION X(100,2,2),LORD(100,2),DIFF(100,2)
     1,AMAT(3,3),DIFFX(3),DUM(3)
      DOUBLE PRECISION SEED
      REAL KR21,LORD,MPRG,MPRNG,MPOG,MPONG,MLG,MLNG,MRGG,MRGNG
      READ (5,1) N,SEED,NSAMP,KSAMP,NOPT
    1 FORMAT(I3,F11.0,3I4)
      IF(NOPT.EQ.0) GO TO 3
      READ(5,2) IREL,ISAMP
C     IREL   RESTART RELIABILITY AT IREL FOR ABORTED RUN
C     ISAMP  RESTART SAMPLE NUMBER AT ISAMP FOR ABORTED RUN
    2 FORMAT(2I4)
      GO TO 4
    3 IREL=1
      ISAMP=1
    4 CONTINUE
C
C     N      SIZE OF SAMPLE
C     GAIN   AVERAGE GAIN IN GAIN GROUP
C     SEED   SEED FOR RANDOM NUMBER GENERATOR
C     KSAMP  NUMBER OF SAMPLES TO BE TAKEN AT EACH LEVEL
C            OF RELIABILITY
C     NSAMP  NUMBER OF THE LAST SAMPLE FROM THE PREVIOUS RUN
C            INCREMENTED AND PRINTED OUT AS THE SAMPLE NUMBER
C
      TT=0.0
      DO 1000 IR=IREL,5
      READ(5,14) GAIN
   14 FORMAT(F6.4)
      DO 902 JI=ISAMP,KSAMP
      NSAMP=NSAMP+1
C
C     S      SUM
C     SS     SUM OF SQUARES
C     D      DIFFERENCE SCORE
C     G      GAIN GROUP
C     NG     NO GAIN GROUP
C     PR     PRE SCORE
C     PO     POST SCORE
C     PP     PRE X POST
C     PRDIFF SUM PR X DIFF
C
      SDG   =0.0
      SDNG  =0.0
      SSDG  =0.0
      SSDNG =0.0
      SPRG  =0.0
      SPRNG =0.0
      SSPRG =0.0
      SSPRNG=0.0
      SPOG  =0.0
      SPONG =0.0
      SSPOG =0.0
      SSPONG=0.0
      SSPPG =0.0
      SSPPNG=0.0
      PRDIFF=0.0
      REL=0.40+0.10*IR
C     SE1 FOLLOWS EQUATION (6) OF CHAPTER III
      SE1=SQRT(100.0*(1.0-REL))
      SX=SQRT(100.0-SE1*SE1)
      SE2=SQRT((SE1*SE1+3.36)/REL-3.36-SE1*SE1)
      GAIN=TT*SQRT(2.0*(SE1*SE1+SE2*SE2+3.36)/N)
C
C     X(I,J,K)  I=STUDENT
C               J=1, PRE SCORE
C                =2, POST SCORE
C               K=1, GAIN
C                =2, NO GAIN
C
C     DIFF(I,J) I=STUDENT
C               J=1, GAIN
C                =2, NO GAIN
C
      DO 10 I=1,N
      D1=SX*RAND(SEED)
      D2=SX*RAND(SEED)
      X(I,1,1)=50+D1+SE1*RAND(SEED)
      X(I,2,1)=50+D1+GAIN+1.83*RAND(SEED)+SE2*RAND(SEED)
      X(I,1,2)=50+D2+SE1*RAND(SEED)
      X(I,2,2)=50+D2+1.83*RAND(SEED)+SE2*RAND(SEED)
      DIFF(I,1)=X(I,2,1)-X(I,1,1)
      DIFF(I,2)=X(I,2,2)-X(I,1,2)
      SDG=SDG+DIFF(I,1)
      SDNG=SDNG+DIFF(I,2)
      SSDG=SSDG+DIFF(I,1)*DIFF(I,1)
      SSDNG=SSDNG+DIFF(I,2)*DIFF(I,2)
      SPRG=SPRG+X(I,1,1)
      SPRNG=SPRNG+X(I,1,2)
      SSPRG=SSPRG+X(I,1,1)*X(I,1,1)
      SSPRNG=SSPRNG+X(I,1,2)*X(I,1,2)
      SPOG=SPOG+X(I,2,1)
      SPONG=SPONG+X(I,2,2)
      SSPOG=SSPOG+X(I,2,1)*X(I,2,1)
      SSPONG=SSPONG+X(I,2,2)*X(I,2,2)
      SSPPG=SSPPG+X(I,1,1)*X(I,2,1)
      SSPPNG=SSPPNG+X(I,1,2)*X(I,2,2)
   10 PRDIFF=PRDIFF+DIFF(I,1)*X(I,1,1)+DIFF(I,2)*X(I,1,2)
      VAPRG=(SSPRG-SPRG*SPRG/N)/(N-1)
      VAPRNG=(SSPRNG-SPRNG*SPRNG/N)/(N-1)
      VAPOG=(SSPOG-SPOG*SPOG/N)/(N-1)
      VAPONG=(SSPONG-SPONG*SPONG/N)/(N-1)
      CPPG=(SSPPG-SPRG*SPOG/N)/SQRT((SSPRG-SPRG*SPRG/N)*(SSPOG-
     1SPOG*SPOG/N))
      CPPNG=(SSPPNG-SPRNG*SPONG/N)/SQRT((SSPRNG-SPRNG*SPRNG/N)*(
     1SSPONG-SPONG*SPONG/N))
      DBARG=SDG/N
      DBARNG=SDNG/N
      VADG=(SSDG-SDG*SDG/N)/(N-1)
      VADNG=(SSDNG-SDNG*SDNG/N)/(N-1)
      T1=(DBARG-DBARNG)/SQRT(((N-1)*(VADG+VADNG)/(2*N-2))*(2.0/N
     1))
      B1G=(((1.0-REL)*CPPG*SQRT(VAPOG))/SQRT(VAPRG)-REL+CPPG*CPPG)/
     1(1.0-CPPG*CPPG)
      B2G=(REL-CPPG*CPPG-((1.0-REL)*SQRT(VAPRG)*CPPG)/SQRT(VAPOG
     1))/(1.0-CPPG*CPPG)
      B1NG=(((1.0-REL)*CPPNG*SQRT(VAPONG))/SQRT(VAPRNG)-REL+CPPNG
     1*CPPNG)/(1.0-CPPNG*CPPNG)
      B2NG=(REL-CPPNG*CPPNG-((1.0-REL)*SQRT(VAPRNG)*CPPNG)/SQRT(
     1VAPONG))/(1.0-CPPNG*CPPNG)
      SLG  =0.0
      SSLG =0.0
      SLNG =0.0
      SSLNG=0.0
      MPRG=SPRG/N
      MPOG=SPOG/N
      MPRNG=SPRNG/N
      MPONG=SPONG/N
      DO 110 I=1,N
      LORD(I,1)=DBARG+B1G*(X(I,1,1)-MPRG)+B2G*(X(I,2,1)-MPOG)
      LORD(I,2)=DBARNG+B1NG*(X(I,1,2)-MPRNG)+B2NG*(X(I,2,2)-MPON
     1G)
      SLG=SLG+LORD(I,1)
      SLNG=SLNG+LORD(I,2)
      SSLG=SSLG+LORD(I,1)*LORD(I,1)
  110 SSLNG=SSLNG+LORD(I,2)*LORD(I,2)
      MLG=SLG/N
      MLNG=SLNG/N
      VALG=(SSLG-SLG*SLG/N)/(N-1.0)
      VALNG=(SSLNG-SLNG*SLNG/N)/(N-1.0)
      T2=(MLG-MLNG)/SQRT(((SSLG-SLG*SLG/N)+SSLNG-SLNG*SLNG/N)/(N*
     1(N-1)))
      A=(SSPPG+SSPPNG-(SPRG+SPRNG)*(SPOG+SPONG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      B=(SPOG+SPONG)/(2*N)-A*(SPRG+SPRNG)/(2*N)
      SRGG=0.0
      SSRGG=0.0
      SRGNG=0.0
      SSRGNG=0.0
      DO 210 I=1,N
      RGSG=X(I,2,1)-A*X(I,1,1)-B
      RGSNG=X(I,2,2)-A*X(I,1,2)-B
      SRGG=SRGG+RGSG
      SSRGG=SSRGG+RGSG*RGSG
      SRGNG=SRGNG+RGSNG
  210 SSRGNG=SSRGNG+RGSNG*RGSNG
      MRGG=SRGG/N
      MRGNG=SRGNG/N
      VARGG=(SSRGG-SRGG*SRGG/N)/(N-1)
      VARGNG=(SSRGNG-SRGNG*SRGNG/N)/(N-1)
      T3=(MRGG-MRGNG)/SQRT((SSRGG-SRGG*SRGG/N+SSRGNG-SRGNG*
     1SRGNG/N)/(N*(N-1)))
      AMAT(1,1)=2*N
      AMAT(1,2)=N
      AMAT(1,3)=SPRG+SPRNG
      AMAT(2,1)=AMAT(1,2)
      AMAT(2,2)=N
      AMAT(2,3)=SPRG
      AMAT(3,1)=AMAT(1,3)
      AMAT(3,2)=AMAT(2,3)
      AMAT(3,3)=SSPRG+SSPRNG
  900 CONTINUE
      CALL INV(AMAT)
      DIFFX(1)=SDG+SDNG
      DIFFX(2)=SDG
      DIFFX(3)=PRDIFF
      YXXXXY=0.0
      DO 410 I=1,3
      DUM(I)=0.0
      DO 405 J=1,3
  405 DUM(I)=DUM(I)+DIFFX(J)*AMAT(I,J)
  410 YXXXXY=YXXXXY+DUM(I)*DIFFX(I)
      SSE=SSDG+SSDNG-YXXXXY
      VAACOV=SSE/(2.0*N-3)
      T4=DUM(2)/SQRT(VAACOV*AMAT(2,2))
      AA=(PRDIFF-(SPRG+SPRNG)*(SDG+SDNG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      AMG=DBARG-AA*(MPRG-(MPRG+MPRNG)/2)
      AMNG=DBARNG-AA*(MPRNG-(MPRG+MPRNG)/2)
      KR21=1.0-(MPRG*(100-MPRG))/(100*VAPRG)
      WRITE(6,501) NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,MRGG,MRGNG,AMG,AMNG,SEED,VAPRG,VAPRNG,VAPOG,
     2VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  501 FORMAT(I6,1X,2(F3.2,1X),1X,4(F7.3),5(4X,2F6.2)/5X,F23.11,
     116X,4(4X,2F6.1),8X,F6.2)
      WRITE(7,502)NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,NSAMP,MRGG,MRGNG,AMG,AMNG,VAPRG,VAPRNG,
     2VAPOG,VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  502 FORMAT(I6,2F3.2,4F7.3,6F6.2,3X,'1'/I6,4F6.2,8F5.1,F6.2,3X,'2')
  902 CONTINUE
      ISAMP=1
 1000 CONTINUE
      STOP
      END
      FUNCTION RAND(RO)
      DOUBLE PRECISION RO
      RO=DMOD(RO*30517578125.,34359738368.)
      X=RO/34359738368.
      Y=SIGN(1.0,X-0.5)
      V=SQRT(-2.0*ALOG(0.5*(1.0-ABS(1.0-2.0*X))))
      RAND=Y*(V-(2.515517+0.802853*V+0.010328*
     1V**2)/(1.0+1.432788*V+0.189269*V**2+0.001308*V**3))
      RETURN
      END
      SUBROUTINE INV(A)
C     PROGRAM FOR FINDING THE INVERSE OF A 3X3 MATRIX
      DIMENSION A(3,3),L(3),M(3)
      DATA N/3/
C     SEARCH FOR LARGEST ELEMENT
      DO 80 K=1,N
      L(K)=K
      M(K)=K
      BIGA=A(K,K)
      DO 20 I=K,N
      DO 20 J=K,N
      IF(ABS(BIGA)-ABS(A(I,J))) 10,20,20
   10 BIGA=A(I,J)
      L(K)=I
      M(K)=J
   20 CONTINUE
C     INTERCHANGE ROWS
      J=L(K)
      IF(L(K)-K) 35,35,25
   25 DO 30 I=1,N
      HOLD=-A(K,I)
      A(K,I)=A(J,I)
   30 A(J,I)=HOLD
C     INTERCHANGE COLUMNS
   35 I=M(K)
      IF(M(K)-K) 45,45,37
   37 DO 40 J=1,N
      HOLD=-A(J,K)
      A(J,K)=A(J,I)
   40 A(J,I)=HOLD
C     DIVIDE COLUMN BY MINUS PIVOT
   45 DO 55 I=1,N
   46 IF(I-K) 50,55,50
   50 A(I,K)=A(I,K)/(-A(K,K))
   55 CONTINUE
C     REDUCE MATRIX
      DO 65 I=1,N
      DO 65 J=1,N
   56 IF(I-K) 57,65,57
   57 IF(J-K) 60,65,60
   60 A(I,J)=A(I,K)*A(K,J)+A(I,J)
   65 CONTINUE
C     DIVIDE ROW BY PIVOT
      DO 75 J=1,N
   68 IF(J-K) 70,75,70
   70 A(K,J)=A(K,J)/A(K,K)
   75 CONTINUE
C     CONTINUED PRODUCT OF PIVOTS
C     REPLACE PIVOT BY RECIPROCAL
      A(K,K)=1.0/A(K,K)
   80 CONTINUE
C     FINAL ROW AND COLUMN INTERCHANGE
      K=N
  100 K=(K-1)
      IF(K) 150,150,103
  103 I=L(K)
      IF(I-K) 120,120,105
  105 DO 110 J=1,N
      HOLD=A(J,K)
      A(J,K)=-A(J,I)
  110 A(J,I)=HOLD
  120 J=M(K)
      IF(J-K) 100,100,125
  125 DO 130 I=1,N
      HOLD=A(K,I)
      A(K,I)=-A(J,I)
  130 A(J,I)=HOLD
      GO TO 100
  150 RETURN
      END

APPENDIX B
LISTING OF t's FOR EACH METHOD OF ANALYSIS WITH THE GROUP SIZE 25 AND GAIN 0.0

SAMPLE       RAW    LORD'S REGRESSED ANALYSIS OF
NUMBER      GAIN TRUE GAIN      GAIN  COVARIANCE
     1     0.334     0.935     0.597       0.592
     2    -0.052    -0.175    -0.257      -0.254
     3     1.428     3.180     1.340       1.330
     4     0.888     3.837     1.441       1.453
     5     0.318     0.925    -0.153      -0.152
     6     0.005     0.013     0.903       0.906
     7     2.297    25.767     0.899       0.986
     8     0.288     1.746     0.252       0.249
     9    -0.904    -2.464    -1.383      -1.370
    10    -1.008    -1.691    -1.013      -1.003
    11    -0.186    -0.664    -0.243      -0.241
    12    -0.051    -0.040    -0.255      -0.253
    13    -0.232    -1.325    -0.161      -0.159
    14    -0.585    -2.182    -0.959      -0.958
    15     2.239    10.504     1.702       1.720
    16    -0.575    -2.038     0.143       0.152
    17    -1.297    -2.736    -0.717      -0.725
    18    -0.538    -2.794    -0.269      -0.267
    19     0.674     4.600     0.638       0.632
    20    -0.954    -5.083    -1.758      -1.767
    21     0.235     0.937     0.379       0.375
    22    -0.036    -0.075    -0.867      -0.864
    23     0.451     2.305     0.564       0.558
    24    -1.324    -2.824    -1.661      -1.648
    25    -0.856    -2.220    -0.518      -0.516
    26     1.240     4.498     0.735       0.739
    27     0.221     0.910    -0.112      -0.112
    28     0.006     0.019    -0.163      -0.161
    29     0.165     0.609     0.250       0.247
    30     0.076     0.251     0.164       0.163
    31    -0.610    -2.692    -0.920      -0.912
    32     1.242     2.416     1.441       1.427
    33    -0.696    -2.768    -0.608      -0.603
    34     0.410     2.427     0.225       0.224
    35    -0.076    -0.379    -0.254      -0.252
    36    -0.670    -2.169    -0.894      -0.884
    37     2.034     6.352     1.791       1.790
    38     1.216     7.274     0.966       0.961
    39    -0.029    -0.143     0.813       0.829
    40    -1.205    -3.964    -1.347      -1.334
    41     0.361     0.995     0.729       0.722
SAMPLE       RAW    LORD'S REGRESSED ANALYSIS OF
NUMBER      GAIN TRUE GAIN      GAIN  COVARIANCE
    42     1.042     1.762     0.975       0.966
    43     1.556    16.696     1.728       1.710
    44     0.511     3.038     0.184       0.183
    45    -1.295    -7.504    -1.783      -1.780
    46     0.669     4.977     1.026       1.026
    47     0.382     1.210     0.640       0.634
    48     0.228     0.216    -0.110      -0.110
    49     0.922     5.810     0.943       0.933
    50    -1.263    -3.021    -1.114      -1.106
    51    -0.276    -0.763    -0.941      -0.938
    52     0.496     1.975     0.757       0.749
    53     1.385     6.511     1.583       1.572
    54    -0.087    -0.187    -0.161      -0.160
    55    -1.594    -2.591    -1.646      -1.630
    56     0.489     3.276    -0.233      -0.239
    57    -1.450    -3.217    -1.298      -1.290
    58     1.146     4.044     0.657       0.657
    59     1.754     7.198     1.535       1.530
    60    -0.533    -8.090    -0.255      -0.253
    61    -0.484    -2.971    -0.083      -0.083
    62     1.155     3.037     1.766       1.749
    63     1.321    20.233     1.620       1.604
    64     0.130     1.233    -0.008      -0.003
    65     0.721     2.349     0.514       0.510
    66    -0.556    -1.149    -0.084      -0.084
    67     0.178     0.332     0.434       0.432
    68    -0.146    -0.465    -0.496      -0.493
    69     1.290     4.543     1.526       1.514
    70    -1.904    -4.395    -1.858      -1.844
    71     0.572     5.132     0.470       0.466
    72     0.923     2.998     1.140       1.128
    73    -0.034    -0.106    -0.004      -0.004
    74     0.123     0.546     0.026       0.026
    75    -0.586    -2.404    -0.920      -0.911
    76    -2.622    - .344    -2.917      -2.895
    77    -0.034    -0.078    -1.368      -1.382
    78    -0.259    -0.303    -0.257      -0.254
    79    -0.597    -2.141     0.577       0.605
    80     0.065     0.197    -0.404      -0.412
    81     0.611     1.834     0.162       0.162
    82     0.729     0.579     1.078       1.073
    83    -0.967    -2.813    -1.069      -1.058
    84    -1.275    -5.166    -1.077      -1.071
    85     1.259     4.430     0.797       0.302
    86    -1.736   -16.635    -0.996      -1.010
    87    -0.835    -4.328    -0.406      -0.408
    88    -0.956    -4.608    -0.633      -0.630
    89     0.930     3.109     0.924       0.915
    90     0.551     1.689     0.579       0.573
    91     0.121     0.420     0.167       0.165
    92     1.352     9.886     1.212       1.204
    93     2.337    12.778     2.149       2.145
    94     0.921     2.005     0.815       0.809
    95     0.599     3.848     0.442       0.438
    96     0.756     1.445     1.030       1.020
    97     0.155     0.659    -1.004      -1.019
    98    -0.091    -0.168    -0.162      -0.160
    99    -1.449   -57.894    -0.194      -0.204
   100     0.125     0.638    -0.218      -0.216

BIBLIOGRAPHY

Anastasi, Anne. Psychological Testing. New York: Macmillan, 1961.
Bigge, Morris L. Learning Theories for Teachers.
New York: Harper and Row, 1964.
Combs, A. W., Snygg, Donald. Individual Behavior. New York: Harper and Row, 1959.
Cronbach, Lee J., Furby, Lita. "How Should We Measure Change--or Should We?" Psychological Bulletin, 1970, 74: 68-80.
Hays, William L. Statistics for Psychologists. New York: Holt, Rinehart, and Winston, 1963.
Hilgard, Ernest R. Theories of Learning. New York: Appleton, 1956.
Lord, F. M. "The Measurement of Growth," Educational and Psychological Measurement, 1956, 16: 421-437.
Lord, F. M. "Statistical Inferences about True Scores," Psychometrika, 1959, 24: 1-17.
Lord, F. M. "Elementary Models for Measuring Change," in C. W. Harris, Problems in Measuring Change. Madison, Wisconsin: University of Wisconsin Press, 1963, pp. 21-38.
McNemar, Q. "On Growth Measurement," Educational and Psychological Measurement, 1958, 18: 47-55.
Madansky, Albert. "The Fitting of Straight Lines When Both Variables Are Subject to Error," Journal of the American Statistical Association, 1959, 54: 173-205.
Manning, Winton H., DuBois, Philip H. "Correlational Methods in Research on Human Learning," Perceptual and Motor Skills, 1962, 15: 287-321.
Mendenhall, William. Introduction to Linear Models and the Design and Analysis of Experiments. Belmont, California: Wadsworth, 1968.
Meyer, Paul. Introductory Probability and Statistical Applications. Reading, Massachusetts: Addison-Wesley, 1965.
Nunnally, Jum C. Psychometric Theory. New York: McGraw-Hill, 1967.
Ohnmacht, Fred W. "Correlates of Change in Academic Achievement," Journal of Educational Measurement, 1968, 5: 41-44.
Rosenthal, Myron R. Numerical Methods in Computer Programming. Homewood, Illinois: Irwin, 1966.
Scheffé, Henry. The Analysis of Variance. New York: Wiley, 1959.
Schick, George B. (ed), May, Merril M. (ed). The Psychology of Reading. Milwaukee: The National Reading Conference, Inc., 15th Yearbook 1969, Rankin, Earl F., Jr., Dale, Lothar H. pp. 17-2L.
Skinner, B. F. The Technology of Teaching. New York: Appleton, 1968.
Snedecor, George W. Statistical Methods. Ames, Iowa: Iowa State College Press, 1956.
Soar, Robert S. "Optimum Teacher-Pupil Interaction for Summer Growth," Educational Leadership Research Supplement, 1968, 26(3): 275-280.
Tate, Merle W. Statistics in Education and Psychology. New York: Macmillan, 1965.
Thorndike, E. L. "The Influence of Chance Imperfections of Measures Upon the Relation of Initial Score to Gain or Loss," Journal of Experimental Psychology, 1924, 7: 225-232.
Tillman, Chester E. Crude Gain vs. True Gain: Correlates of Gain in Reading after Remedial Tutoring. Doctoral dissertation, University of Florida, 1969.
Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill, 1962.

BIOGRAPHICAL SKETCH

John Howard Neel was born July 27, 1944, at Waynesburg, Pennsylvania. He graduated from William R. Boone High School, Orlando, Florida, in June, 1962. In August, 1965, he received the degree Bachelor of Arts with a major in mathematics from the University of Florida. He taught algebra and general mathematics at John F. Kennedy Junior High School from September, 1965, until June, 1966. In September, 1966, he enrolled in the College of Education at the University of Florida under a United States Office of Education fellowship program directed by Dr. Wilson H. Guertin. In September, 1968, he accepted a research assistantship under the same program. In June, 1968, he received the degree Master of Arts in Education. He was an instructor in the College of Education at the University of South Florida from September, 1968, until August, 1969, and is currently on leave from that position. In September, 1969, he was appointed Interim Instructor in the College of Education at the University of Florida and he holds that position currently.

John Howard Neel is married to the former Carol Lynn Ramft. They have two daughters, Sarah Elizabeth and Lia Suzanne.
John Howard Neel is a member of the Florida Educational Research Association, the American Educational Research Association, Phi Delta Kappa, and the American Statistical Association.

This dissertation was prepared under the direction of the chairmen of the candidate's supervisory committee and has been approved by all members of that committee. It was submitted to the Dean of the College of Education and to the Graduate Council, and was approved as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August, 1970

Dean, College of Education

Dean, Graduate School

Supervisory Committee:

A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE

By
JOHN HOWARD NEEL

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1970

COPYRIGHT BY JOHN HOWARD NEEL 1970

ACKNOWLEDGMENTS

The writer wishes to thank his committee members for their guidance in this study, especially Dr. Charles M. Bridges, Jr., who suggested the topic and was a constant source of assistance throughout the study.

When this study was begun there was a questioning of change scores in the department which was helpful and encouraging. Those responsible for this atmosphere were Dr. Charles M. Bridges, Jr., Dr. Robert S. Soar, Dr. William B. Ware and Mr. Keith Brown.

Dr. Vynce A. Hines and Dr. P. V. Rao assisted by each detecting an error in the model presented. Dr. William B. Ware was especially helpful editorially, as was Dr. Wilson H. Guertin.

The writer's wife, Carol, was encouraging, and understanding of the time which was necessarily spent away from home.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT
CHAPTER
I. INTRODUCTION
    The Problem of Measuring Change
    Methods of Analyzing Change
    The Problem
    Some Limitations
    Procedures
    Significance of the Study
    Organization of the Study
II. RELATED LITERATURE
    Derivation of Lord's True Gain Scores
    Comparison of Lord's True Gain Scores with Other Scores
    Comparison of Regressed Gain Scores with Other Scores
    Another Study and Summary
III. METHODS AND PROCEDURES
    Procedures: An Overview
    Sampling from a Normal Population with Specified Mean and Variance
    Selecting Reliability
    Selecting Gain
    Analysis of the t Values for the Four Methods
IV. RESULTS, CONCLUSIONS, AND SUMMARY
    Results
    Conclusions
    Discussion
    A Direction for Future Research
    Summary
APPENDIX
A. FORTRAN PROGRAM
B. LIST OF t's FOR THE FOUR METHODS
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
1. NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS
2. NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE

by
John Howard Neel

August, 1970

Chairman: Wilson H. Guertin
Co-Chairman: Charles M. Bridges, Jr.
Major Department: Foundations of Education

The purpose of this study was to determine which of four selected methods was most appropriate for measuring change such as gain in achievement. The four selected methods were raw difference, Lord's true gain, regressed gain, and analysis of covariance procedures.

In order to compare the four methods, Monte Carlo techniques were employed to generate samples of pre and post scores for two groups. The reliability, variances, and means of the sampled populations were controlled. One hundred samples were generated at each of 20 combinations of five levels of reliability, two levels of group size, and two levels of gain. Using each of the four methods, a t statistic was calculated for each sample to test the null hypothesis of no difference in amount of gain between the two groups.
The number of t's significant at the 0.05 level of significance was recorded for each of the four methods.

At each of the 20 combinations a chi square test was used to test the null hypothesis of equal proportions of significant t's among the four groups. This hypothesis was rejected in each case. It was noted that use of Lord's true gain procedure tended to create a greater significance level than the user would intend. The proportion of significant t's for each of the other three methods of analysis fell reasonably close to the expected values. On this basis use of Lord's true gain procedure was not recommended, and since there was no apparent difference among the remaining methods of analysis, none was recommended above the others.

CHAPTER I
INTRODUCTION

Learning has long been a primary focus of investigation for educators. A definition of learning has been offered by Hilgard (1956):

Learning is the process by which an activity originates or is changed through reacting to an encountered situation, provided that the change in activity cannot be explained on the basis of native response tendencies, maturation, or temporary states of the organism (e.g. fatigue, drugs, etc.). (p. 3)

Although Hilgard went on to say that the definition is not perfect, it does illustrate a commonly accepted aspect of learning: learning involves change of the behavior of the organism which learns (Bigge, 1964, p. 1; Skinner, 1968, p. 10; Combs, 1959, p. 88). Educators have been concerned with this change and often have sought to measure the change occurring in some situation. Several methods of analyzing change have been presented in the literature. A comparison of these methods of analysis of the measurement of change and the accompanying difficulties which arise was the focus of this study. The purpose of the study was to determine which of the several selected measures of change is most appropriate under various conditions.
The Problem of Measuring Change

In all sciences measurement is an approximation. In conducting a first order survey, a surveyor makes three measurements of a distance and takes the average of the three measurements. Physicists and engineers customarily report the relative size of the error of their measurements. Physical scientists have been fortunate in that the size of the relative error involved in their measurements often has been small, frequently less than 0.01 and sometimes less than 10" . Educators are unfortunate in this respect in that if a student's true I. Q. were 100 and an I. Q. score of 88 was observed, the relative error would be 0.12. The size of this error is not uncommon and larger relative errors do occur. The Stanford-Binet intelligence test has standard error of measurement equal to five I. Q. points (Anastasi, 1961, p. 200). Thus, relative errors of 0.12 or larger will occur 1.64 per cent of the time, assuming a normal distribution of errors.

In measuring change the problem is compounded since there is an error in both the pre score and the post score. Moreover, when the magnitude of the change is small or zero, the magnitude of the error may be larger than that of the change. This possibility makes the change difficult to detect or to separate from the error. The effect of this error of measurement on change scores was noted as early as 1924 by Thorndike:

When the individuals in a varying group are measured twice in respect to any ability by an imperfect measure (that is one whose self-correlation is below 1.00), the average difference between the two obtained scores will equal the average difference between the true scores that would have been obtained by perfect measures, but for any individual the difference between the two obtained scores will be affected by the error.
Individuals who are below the mean of the group will tend by the error to be less far below it in the second, and individuals who are above the mean of the group in the first measurement will tend by the error to be less far above it in the second. The lower the self-correlation, the greater the error and its effect.

Thorndike (1924) went on to show that there was a spurious negative correlation between initial true score and true gain. He then stated that "the equation connecting the relation of obtained initial ability with obtained gain, the unreliability of the measures and the true facts" had not been discovered.

Lord (1956) developed the equation to which Thorndike alluded. Lord made the following assumptions concerning the error of measurement: the errors
i) have zero mean for the groups tested,
ii) have the same variances for both tests,
iii) are uncorrelated with each other and with true score on either test. (Lord, 1956)

McNemar (1955) has extended Lord's work to the case of unequal error variances. It should be noted here that McNemar's follow-up of Lord's work is only an extension. When the variances are equal, McNemar's formulas are identical to Lord's (Lord, 1958).

Methods of Analyzing Change

The estimated true gain scores derived from Lord's and McNemar's equations have been used in either t tests or analysis of variance procedures (Soar, 1968; Tillman, 1969). There are, in addition to Lord's method, three other commonly used methods for analyzing change. One method has been the use of a straight analysis of variance or t test on the raw difference scores as the situation warrants. It is important to note that the raw scores are used in the analysis with no correction for the unreliability of the measures.
A second method is to complete an analysis of covariance on the raw difference scores using the pretest scores as covariates. These procedures are standard statistical techniques and may be found in many texts (Hays, 1963; Snedecor, 1956; Winer, 1962).

The third method of measuring gain has been advocated by Manning and DuBois (1962): the method of residual gain. In this method the final scores are regressed on the initial scores and the difference between the final score and the score predicted by the regression equation is taken as a measure of gain. This measure is then used in t tests or analysis of variance procedures.

Thus in the case of equal variances four common methods of analyzing change have been identified:
1. Use of raw gain scores in appropriate procedures.
2. Use of Lord's true gain scores in appropriate procedures.
3. Use of Manning and DuBois' regressed gain scores in appropriate procedures.
4. Use of analysis of covariance on raw gain scores with pre scores as covariates.

The Problem

A researcher faced with these different methods and a problem in measuring change is confronted with a second and fundamental methodological problem: which of the methods for measuring change is most appropriate? It is this question that this study sought to answer.

The problem of which method to use is further complicated since different writers have claimed that different techniques were appropriate. "Manning and DuBois and Rankin and Tracy feel that the method of residual gain is more appropriate for correlational procedures since it is metric free" (as quoted in Tillman, 1969, p. 2). Ohnmacht (1963) also suggested that this procedure was the best. Lord (in Harris, 1963, chapter 2) mentioned regressed gain but seemed to advocate his own method as being superior. This position is further supported by Cronbach and Furby (1970).
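The difference between raw gain and residual gain, as described above, can be seen in a short sketch. The function names and the four-student data set are illustrative assumptions and are not taken from the study; which observations enter the regression (for example, both groups pooled) is left to the caller.

```python
from statistics import mean


def raw_gains(pre, post):
    """Raw gain: observed post score minus observed pre score."""
    return [y - x for x, y in zip(pre, post)]


def residual_gains(pre, post):
    """Residual gain: observed post score minus the post score predicted
    from the pre score by an ordinary least-squares regression line."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical scores for four students (not data from the study).
pre = [40.0, 50.0, 60.0, 70.0]
post = [48.0, 52.0, 66.0, 70.0]

print(raw_gains(pre, post))       # raw gains: 8, 2, 6, 0
print(residual_gains(pre, post))  # approximately 1, -3, 3, -1
```

Residual gains always sum to zero within the sample used for the regression, so they describe each score's standing relative to the regression line and not the absolute amount of change.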
To determine which of the methods of analysis was most appropriate, an empirical study was conducted to compare the results of each method under known situations.

Some Limitations

This study was limited to the two group situation. To examine more than two groups would have involved such a large number of possibilities as to make the study impractical in terms of time and money. Thus, this study excluded the more general multigroup comparisons possible with analysis of variance and covariance procedures and was limited to examining t tests and the analysis of covariance using the pre score as covariate. Additionally the case of unequal variances between the two groups was not considered.

A third limitation was that the variance of the true gains was selected a priori to be 3.36. True gain scores with this variance will be such that over 99 per cent will be within five units of the true mean gain. It is the ratio of the variances of the gain and error which is important. Since the reliabilities were varied, as indicated later, this study was conducted utilizing several such ratios.

One of the factors of interest was the reliability of the test used. A second factor of interest was the sample size or the relative power of the procedures under study. There is an infinite number of reliabilities and a decision was made as to the levels of reliability to be investigated: 0.50 to 0.90 in increments of 0.10. Tests with reliabilities lower than 0.50 are rarely used in practice and at best the resulting data would be highly questionable. The lower limit of 0.50 was chosen for this reason.

Sample sizes of 25 and 100 per group were chosen as being somewhat representative of sample sizes used in educational research.

Procedures

Two groups were compared under 20 conditions using t tests as follows. There were five levels of reliability (0.50, 0.60, 0.70, 0.80, 0.90) and two levels of sample size (25 and 100) used in this study.
Thus there were ten different combinations of sample sizes and reliabilities. For each of these combinations two cases were investigated, one where there was no pre to post test gain in either group and the second where there was a known gain from the pre to the post test for one group. For each of these 20 instances, 100 samples were generated and analyzed using each of the four methods of analysis indicated previously.

Consequently, two questions were to be answered:
1. Does any one of the selected methods yield a disproportionate number of significant t values when there is no difference between the mean gain of the two groups?
2. Is any one of the selected methods more powerful, i.e. more successful in detecting a difference when a difference does exist?

Samples from a normal distribution were generated using techniques described by Rosenthal (1966). The method for generating random numbers was the multiplication by a constant method. With the procedures used, this method will produce 8.5 million numbers before the series repeats. This number was more than sufficient for this study. All generation of the samples and calculation of t values using the various methods of analysis were done on the IBM 360/65 computer at the University of Florida. The significance level used for the t tests was 0.05.

The two research questions generated two null hypotheses:
1. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is no gain in either group.
2. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is gain by one group but not by the other.

These hypotheses were tested at each of the ten combinations of reliability and sample size with chi square tests using the 0.05 level of significance.
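The layout of the design described above can be enumerated in a few lines. This is only a bookkeeping sketch; the constant names and method labels are assumptions introduced for illustration.

```python
from itertools import product

# Factor levels as stated in the text.
RELIABILITIES = (0.50, 0.60, 0.70, 0.80, 0.90)
GROUP_SIZES = (25, 100)
GAIN_CONDITIONS = ("no gain in either group", "gain in one group only")
METHODS = ("raw gain", "Lord's true gain", "regressed gain",
           "analysis of covariance")
SAMPLES_PER_CONDITION = 100

# Every combination of group size, gain condition and reliability.
conditions = list(product(GROUP_SIZES, GAIN_CONDITIONS, RELIABILITIES))

# One t value per method per generated sample.
total_t_values = len(conditions) * SAMPLES_PER_CONDITION * len(METHODS)

print(len(conditions))   # 20 conditions
print(total_t_values)    # 8000 t values in all
```

Each of the 20 conditions therefore contributes one 2 x 4 table of significant and non significant t counts, which is the unit on which the chi square tests operate.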
Significance of the Study

The results of this study should either indicate empirically that one or more methods were superior to the others or that there were no great differences among the methods. If the former were true, then educational researchers may select one of the better methods. If the latter were true, then educational researchers may select any of the methods. In either case the study provides some answer as to how change scores should be analyzed.

Organization of the Study

Chapter I has been the introduction, statement of the problem, limitations, hypotheses, and procedural overview. Chapter II reviews related literature, essentially the development of the equation and methodology of the various techniques studied. Chapter III describes the procedures and Chapter IV presents the data, conclusions, and summary.

CHAPTER II

RELATED LITERATURE

Much has been said in the literature about measuring change. However, most of this discussion is centered around the four methods investigated in this study: raw gain, Lord's true gain, regressed gain, and analysis of covariance procedures. As pointed out in Chapter I, raw gain and analysis of covariance procedures are discussed in many texts and therefore are not discussed here. Lord's true gain and regressed gain procedures are discussed in this chapter.

Derivation of Lord's True Gain Scores

The following derivation parallels Lord's (1956) development of true gain scores with one exception as noted. Lord gave the following equations as a model for the observed pre and post scores:

(1) $X = T + E_1$

(2) $Y = T + G + E_2$

where
X = observed pre score;
Y = observed post score;
T = true pre score;
G = true gain;
$E_1$ = error of measurement in pre observed score;
$E_2$ = error of measurement in post observed score.

Lord then made the following assumptions concerning $E_1$ and $E_2$, the errors of measurement. The errors
i) have zero mean in the group tested,
ii) have the same variance ($\sigma_e^2$) for both tests,
iii) are uncorrelated with each other and with true score on either test (Lord, 1956).

The derivation can be considerably shortened at this point by examining a standard regression equation which predicts one variable, $X_1$, from two other variables, $X_2$ and $X_3$. The equation is (Tate, 1965, p. 171)

(3) $\hat{X}_1 = B_{12.3}\,\frac{S_1}{S_2}\,(X_2 - \bar{X}_2) + B_{13.2}\,\frac{S_1}{S_3}\,(X_3 - \bar{X}_3) + \bar{X}_1$

where

$B_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{1 - r_{23}^2}$ and $B_{13.2} = \frac{r_{13} - r_{12}\,r_{23}}{1 - r_{23}^2}$.

If we let
$X_1 = G$ = gain;
$X_2 = X$ = observed pre score;
$X_3 = Y$ = observed post score;
the following elements in the regression equation can be identified as
$\hat{X}_1$ = estimated gain;
$\bar{X}_2$ = mean of the observed pre scores;
$\bar{X}_3$ = mean of the observed post scores;
$S_1$ = standard deviation of the gain scores;
$S_2$ = standard deviation of observed pre scores;
$S_3$ = standard deviation of observed post scores;
$r_{23}$ = correlation of observed pre and post scores.

Lord has pointed out that the correlations between observed score and true score are the reliabilities. From (1) and his stated assumptions, Lord writes

(4) $\sigma_x^2 = \sigma_t^2 + \sigma_e^2$,

(5) $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + \sigma_e^2$,

(6) $\sigma_{xy} = \sigma_t^2$.

Lord solves these equations to find

(7) $\sigma_g^2 = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} - 2\sigma_e^2$,

the variance of the true gain scores.

At this point the only element in the regression equation which is undefined is $\bar{X}_1$, the mean gain. This element is found by considering the mean of the observed pre scores, which from (1) can be seen to be equal to the mean of the true pre scores plus the mean of the errors of measurement, that is

(8) $\bar{X} = \bar{T} + \bar{E}_1$.

But by Lord's assumption (i) $\bar{E}_1 = 0$, therefore

(9) $\bar{X} = \bar{T}$.

Similarly from (2) Lord shows

(10) $\bar{Y} = \bar{T} + \bar{G}$.

Then (9) is subtracted from (10) and rewritten to yield

(11) $\bar{G} = \bar{Y} - \bar{X}$.

Thus all elements are defined and (3) may be rewritten in terms of T, G, X, and Y as

(12) $\hat{G} = B_{12.3}\,\frac{S_g}{S_x}\,(X - \bar{X}) + B_{13.2}\,\frac{S_g}{S_y}\,(Y - \bar{Y}) + \bar{Y} - \bar{X}$

which Lord has asserted to be an estimate of true gain.
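A computational form of the estimate can be sketched as follows. The two regression weights are written the way they appear in the Appendix A program (the quantities named B1G and B2G there), which expresses them through the pre-post correlation, the two standard deviations and a single reliability; the function name, the argument `rel` for that common reliability, and the sample data are assumptions made for illustration.

```python
from statistics import mean, stdev


def lord_true_gains(pre, post, rel):
    """Estimated true gains in the form used by the Appendix A program.

    rel is the reliability assumed common to both tests. The pre-post
    correlation must not equal 1, or the weights are undefined.
    """
    n = len(pre)
    mx, my = mean(pre), mean(post)
    sx, sy = stdev(pre), stdev(post)
    r = sum((x - mx) * (y - my) for x, y in zip(pre, post)) / ((n - 1) * sx * sy)

    # Weights on the pre and post deviations (B1G and B2G in the listing).
    b_pre = ((1.0 - rel) * r * sy / sx - rel + r * r) / (1.0 - r * r)
    b_post = (rel - r * r - (1.0 - rel) * r * sx / sy) / (1.0 - r * r)

    mean_gain = my - mx  # equation (11)
    return [mean_gain + b_pre * (x - mx) + b_post * (y - my)
            for x, y in zip(pre, post)]


# Hypothetical scores for four students (not data from the study).
pre = [40.0, 50.0, 60.0, 70.0]
post = [48.0, 52.0, 66.0, 70.0]
print(lord_true_gains(pre, post, rel=0.80))
```

Two properties make the sketch easy to check: the estimates always average to the mean raw gain, and with a reliability of 1.0 the weights reduce to -1 and +1, so the estimate collapses to the raw difference Y - X.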
It may be noted that no notational scheme or other method has been presented to distinguish between statistics and parameters in the preceding derivation. This lack is in keeping with Lord's derivation. It is assumed here that Lord was referring to parameters until the point at which he obtained the final equation and that he then intended to use sample values to estimate the appropriate parameters in the regression equation.

Comparison of Lord's True Gain Scores with Other Scores

In his original article Lord (1956) made no comparison of his method with any other method. In a subsequent article (1959) Lord again made no mention of other methods. In a chapter written in Problems in Measuring Change (Harris, 1963, Chapter 2) he made reference to regressed gain scores, but no discussion or comparison was presented. In Statistical Theories of Mental Test Scores (Lord and Novick, 1968) no comparison of Lord's true gain scores with other procedures is presented.

Comparison of Regressed Gain Scores with Other Scores

Manning and DuBois (1962) have compared per cent, raw, and residual gain scores. A per cent gain score is the raw gain score divided by the pre score (Manning and DuBois, 1962). The comparisons were made on the bases of metric requirements, reliability, and appropriateness of use in correlation procedures. On each of these bases residual gain scores were recommended over per cent and raw gain scores.

Manning and DuBois pointed out that per cent and raw gain scores require at least equal interval scales on both pre and post scores and that the scales be the same on both pre and post scores, i.e. the same equal interval scale must be used on both tests. According to Manning and DuBois these qualities are not possessed by educational and psychological test scores. In contrast, residual gain scores do not require the same equal interval scales and therefore are appropriate for use with test scores (Manning and DuBois, 1962).
Manning and DuBois summarily list formulas showing that residual gain scores are more reliable and more appropriate measures for correlational procedures than are raw or per cent gain scores. These formulas were only listed, not derived, and no reference was made to their derivation.

Another Study and Summary

Madansky (1959) reported or derived several methods for fitting straight lines to two variables when both were measured with error. One of these procedures is applicable in the case when the variance of the error of measurement is unknown. However, there has apparently been no attempt to apply the method to the analysis of change.

A search of the literature has revealed no comparative empirical examination of the four methods examined in this study. Further, the advocates and authors of two of the reported procedures, each of whom has been shown to know of the existence of the other procedure, continue to advocate their own method even though they offer no reason or data for this advocacy. This study should provide some knowledge as to any difference in the four methods.

CHAPTER III

METHODS AND PROCEDURES

Procedures: An Overview

As stated in Chapter I, Monte Carlo techniques were employed to generate pre and post test scores for two groups. One group is referred to as the gain group, the other as the no gain group. The model for the observed pre scores is

(1) $X = T + E_1$

where
X is the observed pre score;
T is the true pre score;
$E_1$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_1}^2$.

The model for the post scores is

(2) $Y = T + G + E_2$

where
Y is the observed post score;
T is the true pre score;
G is the true gain from pre to post score;
$E_2$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_2}^2$.

The generated scores were subsequently analyzed for the difference in the amount of gain or change between the two groups. The scores were analyzed by the four selected methods:
1.
a t test on the raw difference scores
2. a t test on Lord's true gain scores
3. a t test on regressed gain scores
4. a t test from an analysis of covariance on the raw difference scores using pre score as covariate.

The results of these analyses were then compared.

For the gain group an appropriate mean gain, $\mu_g$, from pre to post scores was selected to be 0.0 in the case of no gain for either group and selected to be of such size as to make the power 0.50 when there was a gain in the gain group. The post scores were generated by adding a random normal gain, G, and a random normal error, $E_2$, to the generated true pre scores. The variables G and $E_2$ had means $\mu_g$ and 0.0 respectively and variances as discussed later. For the no gain group there was no gain from pre to post scores.

The pre scores for both the gain and the no gain group were taken from a normal population with mean 50.0 and variance 100.0. The mean and variance of the population of post scores for both the gain and the no gain groups were functions of the mean true gain and of the reliability.

After the samples were generated, the hypothesis of no difference in average gain between the two groups was tested using each of the four methods. The t values for each of these tests were recorded. This procedure was repeated over 100 samples for each of the selected reliabilities 0.50, 0.60, 0.70, 0.80 and 0.90. The method for introducing the effect of the selected reliability into each generated score is presented in a following section. Thus 100 t values were calculated and recorded for each method of analysis and at each level of reliability.

This entire procedure was repeated for each of the following conditions:
1. group size = 25, $\mu_g$ = 0.0 for both groups;
2. group size = 25, $\mu_g \neq$ 0.0 for the gain group, $\mu_g$ = 0.0 for the no gain group;
3. group size = 100, $\mu_g$ = 0.0 for both groups;
4. group size = 100, $\mu_g \neq$ 0.0 for the gain group, $\mu_g$ = 0.0 for the no gain group.
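The score models (1) and (2) and the generation steps just described might be sketched as follows. The function name, its argument list and the seed are assumptions; the standard deviations in the example call are placeholder values, since the study derives the actual values from the selected reliability, and Python's own generator stands in for the routine used in the study.

```python
import random


def generate_group(n, mean_gain, sd_true, sd_gain, sd_e1, sd_e2, rng):
    """Generate observed pre and post scores for one group of n students.

    Pre:  X = T + E1        (true pre score plus error)
    Post: Y = T + G + E2    (true pre score plus true gain plus error)
    True pre scores are centered on 50.0 as in the text.
    """
    pre, post = [], []
    for _ in range(n):
        true_pre = rng.gauss(50.0, sd_true)
        gain = rng.gauss(mean_gain, sd_gain)
        pre.append(true_pre + rng.gauss(0.0, sd_e1))
        post.append(true_pre + gain + rng.gauss(0.0, sd_e2))
    return pre, post


rng = random.Random(1970)  # arbitrary seed, for repeatability only

# Placeholder standard deviations (assumed values for illustration).
sds = dict(sd_true=50 ** 0.5, sd_gain=3.36 ** 0.5,
           sd_e1=50 ** 0.5, sd_e2=53.36 ** 0.5)

gain_pre, gain_post = generate_group(25, mean_gain=5.86, rng=rng, **sds)
nogain_pre, nogain_post = generate_group(25, mean_gain=0.0, rng=rng, **sds)
print(len(gain_pre), len(nogain_post))
```

The two groups differ only in the mean of the gain term, which is what allows the same error structure to be used for both the no gain and the gain conditions.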
Where the gain was not equal to 0.0 it was such that the power of the t tests on the raw difference scores was 0.50, i.e. the expected proportion of rejected null hypotheses was 0.50.

The following sections describe in more detail some of the previously mentioned procedures.*

* The reader is also referred to the FORTRAN listing in Appendix A for the exact computer routines by which these procedures were carried out.

Sampling from a Normal Population with Specified Mean and Variance

If F is the cumulative distribution function of a random variable $R_1$, then the random variable $R_2$ defined by

(3) $R_2 = F(R_1)$

is uniformly distributed over the interval [0,1] (Meyer, 1965, p. 256, Theorem 13.6). It follows then that $R_1$, where

(4) $R_1 = F^{-1}(R_2)$,

is normally distributed if $F^{-1}$ is the inverse cumulative distribution function of a normal distribution and if $R_2$ is a uniform random number on the interval [0,1] (Meyer, 1965, pp. 256-257). Thus random samples from a normal distribution may be obtained using uniform random numbers and (4), where F is the cumulative distribution function of a normal distribution.

For a normal distribution, $F^{-1}(R_2)$ must be calculated using numerical approximation methods. This calculation as well as the generation of the uniform random numbers were done using a routine described by Rosenthal (1966, pp. 270, 287). Rosenthal's techniques were adapted to the IBM 360/65 computer installed at the University of Florida (see Appendix A, FUNCTION RAND). The normal population sampled had mean 0.0 and variance 1.0. If a different mean or variance was required, it was obtained by addition or multiplication by an appropriate constant.

Selecting Reliability

Reliabilities of 0.50, 0.60, 0.70, 0.80 and 0.90 were selected as representative of reliabilities found in test scores. The reliability, rel, of a test may be defined as

(5) $rel = 1 - \sigma_e^2 / \sigma_x^2$ (Nunnally, 1967, p.
221), where $\sigma_e^2$ is the error of measurement variance and $\sigma_x^2$ is the observed score variance. Since $\sigma_x^2$ had been selected a priori to be 100.0, we have from (5)

(6) $\sigma_{e_1}^2 = 100\,(1 - rel)$.

Moreover, since

(7) $X = T + E_1$

and since the error, $E_1$, is assumed to be independent of the true score, T, we have

(8) $\sigma_x^2 = \sigma_t^2 + \sigma_{e_1}^2$,

or, combining (6) and (8) and solving for $\sigma_t^2$,

(9) $\sigma_t^2 = 100 - \sigma_{e_1}^2$.

For the post scores the desired variances are also easily found from the model for a post score,

(10) $Y = T + G + E_2$,

and for which

(11) $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + \sigma_{e_2}^2$.

As stated in Chapter I, $\sigma_g^2$ was selected to be 3.36. If (5) is rewritten with $\sigma_y^2$ and $\sigma_{e_2}^2$ instead of $\sigma_x^2$ and $\sigma_e^2$, respectively, then (5) and (11) may be used to find

(12) $\sigma_{e_2}^2 = \frac{\sigma_t^2 + \sigma_g^2}{rel} - \sigma_t^2 - \sigma_g^2$.

The effect of the selected reliability may be obtained by selecting the error of measurement variances and the variance of the true scores in accordance with (6), (9) and (12). Thus it is seen that if true scores are selected from a distribution with variance $\sigma_t^2$ and if the errors of measurement are selected independently from a distribution with variance $\sigma_{e_1}^2$, then by (8) X has variance $\sigma_x^2$, if (7) holds.

Selecting Gain

When there was no gain in either group the value of $\mu_g$ would then be 0.0. When $\mu_g$ was nonzero for the gain group its value was selected so as to make the power of the t test on the raw difference scores equal to 0.50. The power of 0.50 was selected in order to permit maximum difference between the four methods of analysis.

The value of the mean gain was determined by examining the difference scores (D):

(13) $D = Y - X$,

and, substituting the models for Y and X,

(14) $D = (T + G + E_2) - (T + E_1)$,

or

(15) $D = G + E_2 - E_1$.
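The variance relations (6), (9) and (12) can be checked numerically, together with the variance of the difference score in (15), which is the sum of the variances of its three independent terms. In this sketch the gain variance is taken as 3.36, the value that appears in the Appendix A program and that reproduces the figure of 106.72 quoted in the text for a reliability of 0.50. The function names are assumptions, and the critical t of 2.01 is passed in as a parameter.

```python
from math import sqrt

OBSERVED_PRE_VARIANCE = 100.0   # selected a priori in the text
TRUE_GAIN_VARIANCE = 3.36       # value used in the Appendix A program


def variance_components(rel):
    """Variances implied by a selected reliability, following (6), (9), (12)."""
    var_e1 = OBSERVED_PRE_VARIANCE * (1.0 - rel)                   # (6)
    var_t = OBSERVED_PRE_VARIANCE - var_e1                         # (9)
    var_g = TRUE_GAIN_VARIANCE
    var_e2 = (var_t + var_g) / rel - var_t - var_g                 # (12)
    var_d = var_e1 + var_g + var_e2     # variance of D = G + E2 - E1
    return {"var_e1": var_e1, "var_t": var_t,
            "var_e2": var_e2, "var_d": var_d}


def gain_for_half_power(rel, n, t_crit):
    """Mean gain at which the expected raw-difference t equals t_crit,
    for two groups of size n with a common difference variance."""
    var_d = variance_components(rel)["var_d"]
    return t_crit * sqrt(2.0 * var_d / n)


print(variance_components(0.50))
print(gain_for_half_power(0.50, n=25, t_crit=2.01))
```

For a reliability of 0.50 and 25 per group the second call gives a value close to the 5.86 reported in the text; the small difference comes from rounding in the intermediate figures.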
The elements in the right side of (15) are mutually independent normally distributed random variables whose variances have been found and thus

(16) $\sigma_d^2 = \sigma_{e_1}^2 + \sigma_g^2 + \sigma_{e_2}^2$.

Furthermore since the only difference in (15) for the gain and no gain groups is the mean of G, the variance of the difference for the gain group, $\sigma_{d_g}^2$, and the variance of the difference for the no gain group, $\sigma_{d_{ng}}^2$, are equal, i.e.

(17) $\sigma_{d_g}^2 = \sigma_{d_{ng}}^2 = \sigma_d^2$.

This common value $\sigma_d^2$ may then be used to determine the appropriate value of $\mu_g$ to produce the desired power of 0.50 for the t test on the raw difference scores.

The t test on the raw difference scores is found from the following formula:

(18) $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\dfrac{(n_g - 1) S_g^2 + (n_{ng} - 1) S_{ng}^2}{n_g + n_{ng} - 2}\left(\dfrac{1}{n_g} + \dfrac{1}{n_{ng}}\right)}}$

If the group size is 25 and the reliability 0.50, the value of $\mu_g$ is found as follows (note: rel = 0.50 implies $\sigma_d^2$ = 106.72):

(19) $t = \dfrac{\bar{D}_g - 0}{\sqrt{\dfrac{24\,\sigma_d^2 + 24\,\sigma_d^2}{25 + 25 - 2}\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}} = \dfrac{\bar{D}_g}{2.9155}$

This value of t is greater than the critical value of t (2.01) only if

(20) $\bar{D}_g > 2.01 \times 2.9155$,

that is, only if

(21) $5.86 < \bar{D}_g$.

Thus if a value of 5.86 is chosen for the mean gain, the power is 0.50. Appropriate values for other group sizes and reliabilities were similarly determined.

Analysis of the t Values

The number of t's significant at the 0.05 level was recorded for each of the four methods of analysis. These data were recorded for each of the 20 combinations of sample size, reliability and gain. A chi square statistic was calculated for each of these 20 sets to test the null hypothesis of no difference in the proportion of significant t values for the four methods of analysis. These data may be seen in Tables 1 and 2 of Chapter IV.

CHAPTER IV

RESULTS, CONCLUSIONS, AND SUMMARY

Results

The number of significant t's for each method of analysis under the no gain condition is presented in Table 1, and for the gain condition in Table 2.
Additionally, the computed chi square statistics for each reliability level are given. In each case the null hypothesis tested was that the proportion of significant t's was the same for each of the four methods of analysis. The chi square values were computed from the 2 x 4 contingency tables implied by the corresponding line of the table. For example, for group size of 25 and a reliability of 0.50, the 2 x 4 contingency table implied by the first line of Table 1 is:

                   Raw     Lord's   Regressed   Analysis of
                   gain    gain     gain        covariance
Significant          5       58         2            2
Non significant     95       42        98           98

As may be seen by inspection of Tables 1 and 2, all the chi square values were significant at the 0.05 level and in each case the hypothesis of equal proportion of significant t values for the four methods of analysis was rejected.

TABLE 1. NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS (group size = 25, by reliability)

TABLE 2. NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50 (group size = 25, by reliability)

Further inspection of Tables 1 and 2 reveals a higher number of significant t's for Lord's true gain procedure than for any other method. Moreover, examination of Table 1 shows that this particular technique gives a considerably greater frequency of significant t values than one would expect by chance. The expected frequency is 5 for the a priori established condition of no actual difference in the two populations sampled. These results indicate that use of Lord's true gain procedure tends to create a higher significance level than the user would intend. If the sample proportion of significant t's found in the analysis is used as an estimate of the significance level, that estimate is 0.58 for the case when the group size was 25.
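The chi square statistic for the 2 x 4 table quoted above can be recomputed directly from the counts. This is a plain Pearson chi square sketch; the function name is an assumption, and the critical value of 7.815 is the standard tabled value for 3 degrees of freedom at the 0.05 level.

```python
def pearson_chi_square(table):
    """Pearson chi square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Counts from the example: raw gain, Lord's gain, regressed gain, covariance.
observed = [
    [5, 58, 2, 2],     # significant
    [95, 42, 98, 98],  # non significant
]

CRITICAL_VALUE_3_DF_05 = 7.815

statistic, dof = pearson_chi_square(observed)
print(round(statistic, 2), dof)            # about 163.13 with 3 df
print(statistic > CRITICAL_VALUE_3_DF_05)  # True: equal proportions rejected
```

Almost all of the statistic comes from the column for Lord's procedure, whose 58 significant values sit far above the roughly 17 expected under equal proportions.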
For the same group size the lowest estimate of the significance level is 0.39.

Conclusions

Since the hypothesis of equal proportion of significant t's for the four methods of analysis was rejected in each of the 10 cases where the mean gain was 0.0 and since the use of Lord's true gain scores provided estimated levels of significance which were considerably higher than those intended, the use of Lord's true gain scores is strongly suspect and therefore is not recommended.

No apparent differences were found among the remaining three methods of analysis. However, there is a similarity between the regressed gain scores procedure and the analysis of covariance procedure that should be examined. The data in Table 1 indicate that the same number of significant t's was found by both of these methods in the case where there was no gain for either group.

The 100 t values for each of the four methods of analysis when the group size was 25 and there was no gain in either group are presented in Appendix B. Inspection of the t values for the regressed gain procedure and the analysis of covariance procedure reveals a striking similarity between the t values; for each sample the t values are identical to at least the first decimal place. As a descriptive statistic it is noted that the correlation between the t values found by these two methods is 1.00 (rounded to 3 digits). Thus, the two methods are providing very similar results.

In contrast the correlation between the t's for the raw difference and regressed gain procedures is 0.898. The two methods, regressed gain and analysis of covariance, are not entirely similar to the raw difference procedure. It may also be seen from Appendix B that the signs of the t's from both the regressed gain and analysis of covariance procedures sometimes are opposite from the sign of the t for the raw difference procedure.

Snedecor (1956, pp.
397, 398) has indicated that the regressed gain procedure and the analysis of covariance procedure on the post scores using pre score as covariate are identical procedures. No source was found indicating a similarity between the regressed gain procedure and the analysis of covariance procedure on the difference scores using pre score as covariate. However, the two methods may be shown to be equivalent by writing the linear model for the regressed gain, or, equivalently, for the analysis of covariance on the post scores using pre score as covariate, and the model for the analysis of covariance on the difference scores using pre score as covariate.

The model for covariance analysis on the post scores is

(1) $Y = B_0 + B_1 X + B_2 Z + E$ (Mendenhall, 1968, p. 170),

where X and Y are defined as previously and
Z = 1, if Y is from the gain group,
Z = 0, if Y is not from the gain group;
E is a normally distributed random error with mean 0.

The model for covariance analysis on the difference scores is

(2) $D = B_0 + B_1 X + B_2 Z + E$,

where

(3) $D = Y - X$

and all other elements are defined as in (1). Now if the right side of (3) is substituted into (2) and the resultant equation rearranged to yield

(4) $Y = B_0 + (B_1 + 1.0) X + B_2 Z + E$,

it is seen that (1) and (4) are identical except for the addition of 1.0 to $B_1$ of equation (1), and thus the two methods will yield the same t values for testing the hypothesis that $B_2$ is equal to 0.0.

Since no clear difference was found among the raw gain procedure, the regressed gain procedure and the analysis of covariance procedure, none of these is recommended as more appropriate for the analysis of change than the others. All of these three procedures are recommended above Lord's true gain procedure.

Discussion

It is reasonable to ask if there is some questionable logic in Lord's derivation of true gain scores. Two things become apparent upon examination of the derivation.
First, the formula which Lord uses to begin his derivation, (3) of Chapter II, requires that the independent variable be known exactly (Madansky, 1959; Scheffé, 1959, p. 4), i.e. without error of measurement. The problem of estimating true gain arises in that the pre and post test scores are not known exactly, but instead the observed scores, or the true scores plus measurement errors, are known, as may be seen from Lord's models of the observed scores, (1) and (2) in Chapter II. If true pre and true post score were known these could be put into the regression equation. However, if true pre score and true post score were known there would be no need for the regression equation to estimate gain. The gain could be obtained simply by subtracting true pre score from true post score. In short, Lord seems to have assumed his conclusion in his derivation.

Second, a look at the basic method for estimating true gain is enlightening. In order to estimate true gain from observed score it is necessary to somehow remove the error of measurement since, assuming the test to be valid, this is the factor that obscures the true score. Mendenhall (1968, p. 1) says that "...statistics is a theory of information...." What information is known concerning the errors of measurement? By Lord's assumption (iii) of Chapter II the errors are uncorrelated with true score and with each other. Thus neither the observed pre score nor the observed post score should provide any information concerning the size of the error. Since this error is random it would seem that it could not be removed from the observed scores. Consider two equal observed scores, one obtained from a higher true score by the addition of a negative error, the other obtained from a lower true score by the addition of an error of the same size but opposite sign of the previous error. How does one decide which score is to have a positive correction added and which is to have a negative correction added?
It would appear that one cannot make this decision without having some information besides the observed scores.

A Direction for Future Research

Since this study shows no difference among the proportion of significant t's for the raw gain, regressed gain, and the analysis of covariance procedures, it would be interesting to investigate the use of these procedures under assumptions other than the models listed in the first section of Chapter III. Differences may occur, for example, when the gain is a linear function of the pre score.

Summary

This study compared four selected measures of change. The four measures were: raw difference, Lord's true gain, regressed gain, and analysis of covariance procedures. An empirical comparison was made among these four methods. Samples were generated using Monte Carlo techniques and the data in each sample were analyzed by each of the four methods.

It was found that Lord's true gain procedure produced a number of spurious significant t values, greater than would be expected by chance, when there was no real difference in amount of gain between the two populations sampled. No apparent differences were noted among the remaining three methods and these three methods did not appear to have inflated significance levels. With such data use of Lord's true gain procedure is not recommended and none of the remaining three methods was recommended over the others.

APPENDIX A

FORTRAN PROGRAM WHICH PERFORMED THE CALCULATIONS

      DIMENSION X(100,2,2),LORD(100,2),DIFF(100,2)
     1,AMAT(3,3),DIFFX(3),DUM(3)
      DOUBLE PRECISION SEED
      REAL KR21,LORD,MPRG,MPRNG,MPOG,MPONG,MLG,MLNG,MRGG,MRGNG
      READ (5,1) N,SEED,NSAMP,KSAMP,NOPT
    1 FORMAT(I3,F11.0,3I4)
      IF(NOPT.EQ.0) GO TO 3
      READ(5,2) IREL,ISAMP
C     IREL   RESTART RELIABILITY AT IREL FOR ABORTED RUN
C     ISAMP  RESTART SAMPLE NUMBER AT ISAMP FOR ABORTED RUN
    2 FORMAT(2I4)
      GO TO 4
    3 IREL =
1 ISAMP=I 't CONTINUE N SIZE OF SAMPLE GAIN AVERAGE GAIN IN GAIN GROUP SEED SEED FOR RANDOM NUMBER GENERATOR KSAMP NUMBER OF SAMPLES TO BE TAKEN AT EACH LEVEL NSAMP NUMBER OF TFIE LAST SAMPLE FROM THE PREVIOUS RUN INCREMENTED AND PRINTED OUT AS THE SAMPLE NUMBER PAGE 44 36 c c c c c c c c c c c c c OF RELIABILITY TT=0.0 DO 1000 IR=IREL,5 READ(5,14) GAIN 14 FORNAK F6.'^) DO 902 J1=ISAMP,KSAMP NSAMP=NSAMP+1 S SUM SS SUV: OF SQUARES D DIFFERENCE SCORE G GAIN GROUP NG NO GAIN GROUP PR PRE SCORE PO POST SCORE PP PRE X POST PRDIFF SUM PR X DIFF SDG =0.0 SONG =0.0 SSDG =0.0 SSDNG =0.0 SPRG =0.0 SPRNG =0.0 SSPRG =0.0 PAGE 45 37 c c c c c c c c c c SSPRNG=0.0 SPGG =0.0 SPCNG =0.0 SSPOG =0.0 SSPCNG=0.0 SSPPG =0.0 SSPPNG=0.0 PRDIFF=0.0 REL=0.40+ 0.10Â»IR SEl= SQRT( 100.0-SEUSEl ) SX = SQRTl 100.0-SE1Â»SE1 ) SE2= SCRT((SElÂ»SEl+3.36)/REL-3.3 6-SEleSEl) GAIW=TTÂ«SGRT{2.0Â«(SEleSEl+SE2*Sc2+3.36)/N) XII, J, K) I-STUDENT J=l, PRE SCORE =2, POST SCORE K=l GAIN =2 NO GAIN DIFF( I , J) 1= STUDENT J=l, GAIN =2, NO GAIN DC 10 I=1,N l)l = SXÂ«RAND(SEEC) D2=SXÂ«RANC(SEED) PAGE 46 38 X(I,1.1)=50+D1+SE1Â»RAND(SEED) X(I,2,1)=50+01+GAIN+1.83Â»RAND(SEE0)+SE2Â»RAND( SEED) X( I,l,2)-50+D2+SEl*RAND(SEED) X( I,2,2)=50+D2+1.83Â»RAND(SEED)+SE2*RAND(SEED) DIFF( I, 1)-X( I ,2, 1 )-X( 1,1,1) DIFF( I,2)=X( I ,2,2)-X{ I, 1,2) SDG=SDG +DIFF( I, 1) SDNG=SDNG +CI FF{ I ,2) SSOG=SSDG+CIFF( I,l)fiCIFF(I,l) SSDNG = SSCMG + DIFF{ l,2) PAGE 47 39 1SP0GÂ»SP0G/N) ) CPPNG= (SSPPNG-SPRNG*SPONG/N )/SQRT( ( SSPRNG-SPRNG* SPRNG/N) * ( 1SSP0NG-SPCNG*SP0NG/N) ) DBARG= SDG/N DBARNG = SDNG/N VADG= (SSDG-SDGÂ»SDG/N)/(N-1) VAUNG =(SSDNG-SDiNG*SDNG/N)/(N-l) Tl= (CBARG-D3ARNG)/SGRT( { ( N1 ) *^ ( VADG + VAONG ) / { 2*N-2 ) )Â»{2.0/N 1)) . 
      B1G=(((1.0-REL)*CPPG*SQRT(VAPOG))/SQRT(VAPRG)-REL+CPPG*CPPG)/
     1(1.0-CPPG*CPPG)
      B2G=(REL-CPPG*CPPG-((1.0-REL)*SQRT(VAPRG)*CPPG)/SQRT(VAPOG
     1))/(1.0-CPPG*CPPG)
      B1NG=(((1.0-REL)*CPPNG*SQRT(VAPONG))/SQRT(VAPRNG)-REL+CPPNG
     1*CPPNG)/(1.0-CPPNG*CPPNG)
      B2NG=(REL-CPPNG*CPPNG-((1.0-REL)*SQRT(VAPRNG)*CPPNG)/SQRT(
     1VAPONG))/(1.0-CPPNG*CPPNG)
      SLG  =0.0
      SSLG =0.0
      SLNG =0.0
      SSLNG=0.0
      MPRG=SPRG/N
      MPOG=SPOG/N
      MPRNG=SPRNG/N
      MPONG=SPONG/N
      DO 110 I=1,N
      LORD(I,1)=DBARG+B1G*(X(I,1,1)-MPRG)+B2G*(X(I,2,1)-MPOG)
      LORD(I,2)=DBARNG+B1NG*(X(I,1,2)-MPRNG)+B2NG*(X(I,2,2)-MPON
     1G)
      SLG=SLG+LORD(I,1)
      SLNG=SLNG+LORD(I,2)
      SSLG=SSLG+LORD(I,1)*LORD(I,1)
  110 SSLNG=SSLNG+LORD(I,2)*LORD(I,2)
      MLG=SLG/N
      MLNG=SLNG/N
      VALG=(SSLG-SLG*SLG/N)/(N-1.0)
      VALNG=(SSLNG-SLNG*SLNG/N)/(N-1.0)
      T2=(MLG-MLNG)/SQRT(((SSLG-SLG*SLG/N)+SSLNG-SLNG*SLNG/N)/(N*
     1(N-1)))
      A=(SSPPG+SSPPNG-(SPRG+SPRNG)*(SPOG+SPONG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      B=(SPOG+SPONG)/(2*N)-A*(SPRG+SPRNG)/(2*N)
C     ... (LISTING INCOMPLETE HERE IN THE SOURCE) ...
      SRGNG=SRGNG+RGSNG
  210 SSRGNG=SSRGNG+RGSNG*RGSNG
      MRGG=SRGG/N
      MRGNG=SRGNG/N
      VARGG=(SSRGG-SRGG*SRGG/N)/(N-1)
      VARGNG=(SSRGNG-SRGNG*SRGNG/N)/(N-1)
      T3=(MRGG-MRGNG)/SQRT((SSRGG-SRGG*SRGG/N+SSRGNG-SRGNG*
     1SRGNG/N)/(N*(N-1)))
      AMAT(1,1)=2*N
      AMAT(1,2)=N
      AMAT(1,3)=SPRG+SPRNG
      AMAT(2,1)=AMAT(1,2)
      AMAT(2,2)=N
      AMAT(2,3)=SPRG
      AMAT(3,1)=AMAT(1,3)
      AMAT(3,2)=AMAT(2,3)
      AMAT(3,3)=SSPRG
C     ... (LISTING INCOMPLETE HERE IN THE SOURCE) ...
  405 DUM(I)=DUM(I)+DIFFX(J)*AMAT(I,J)
  410 YXXXXY=YXXXXY+DUM(I)*DIFFX(I)
      SSE=SSDG+SSDNG-YXXXXY
      VAACOV=SSE/(2.0*N-3)
      T4=DUM(2)/SQRT(VAACOV*AMAT(2,2))
      AA=(PRDIFF-(SPRG+SPRNG)*(SDG+SDNG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      AMG=DBARG-AA*(MPRG-(MPRG+MPRNG)/2)
      AMNG=DBARNG-AA*(MPRNG-(MPRG+MPRNG)/2)
      KR21=1.0-(MPRG*(100-MPRG))/(100*VAPRG)
      WRITE (6,501) NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,MRGG,MRGNG,AMG,AMNG,SEED,VAPRG,VAPRNG,VAPOG,
     2VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  501 FORMAT(I6,1X,2(F3.2,1X),1X,4(F7.3),5(4X,2F6.2)/5X,F23.11,
     116X,4(4X,2F6.1),8X,F6.2)
C     ... (LISTING INCOMPLETE HERE IN THE SOURCE) ...
      DOUBLE PRECISION RC
      RC=DMOD(RC*30517578125.,34359738368.)
      X=RC/34359738368.
      Y=SIGN(1.0,X-0.5)
      V=SQRT(-2.0*ALOG(0.5*(1.0-ABS(1.0-2.0*X))))
      RAND=Y*(V-(2.515517+0.802853*V+.010328*
     1V**2)/(1.0+1.432788*V+0.189269*V**2+0.001308*V**3))
      RETURN
      END
      SUBROUTINE INV(A)
C     PROGRAM FOR FINDING THE INVERSE OF A 3X3 MATRIX
      DIMENSION A(3,3),L(3),M(3)
      DATA N/3/
C     SEARCH FOR LARGEST ELEMENT
      DO 80 K=1,N
      L(K)=K
      M(K)=K
      BIGA=A(K,K)
      DO 20 I=K,N
      DO 20 J=K,N
      IF(ABS(BIGA)-ABS(A(I,J))) 10,20,20
   10 BIGA=A(I,J)
      L(K)=I
      M(K)=J
   20 CONTINUE
C     INTERCHANGE ROWS
      J=L(K)
      IF(L(K)-K) 35,35,25
   25 DO 30 I=1,N
      HOLD=-A(K,I)
      A(K,I)=A(J,I)
   30 A(J,I)=HOLD
C     INTERCHANGE COLUMNS
   35 I=M(K)
      IF(M(K)-K) 45,45,37
   37 DO 40 J=1,N
      HOLD=-A(J,K)
      A(J,K)=A(J,I)
   40 A(J,I)=HOLD
C     DIVIDE COLUMN BY MINUS PIVOT
   45 DO 55 I=1,N
   46 IF(I-K) 50,55,50
   50 A(I,K)=A(I,K)/(-A(K,K))
   55 CONTINUE
C     REDUCE MATRIX
      DO 65 I=1,N
      DO 65 J=1,N
   56 IF(I-K) 57,65,57
   57 IF(J-K) 60,65,60
   60 A(I,J)=A(I,K)*A(K,J)+A(I,J)
   65 CONTINUE
C     DIVIDE ROW BY PIVOT
      DO 75 J=1,N
   68 IF(J-K) 70,75,70
   70 A(K,J)=A(K,J)/A(K,K)
   75 CONTINUE
C     CONTINUED PRODUCT OF PIVOTS
C     REPLACE PIVOT BY RECIPROCAL
      A(K,K)=1.0/A(K,K)
   80 CONTINUE
C     FINAL ROW AND COLUMN INTERCHANGE
      K=N
  100 K=(K-1)
      IF(K) 150,150,103
  103 I=L(K)
      IF(I-K) 120,120,105
  105 DO 110 J=1,N
      HOLD=A(J,K)
      A(J,K)=-A(J,I)
  110 A(J,I)=HOLD
  120 J=M(K)
      IF(J-K) 100,100,125
  125 DO 130 I=1,N
      HOLD=A(K,I)
      A(K,I)=-A(J,I)
  130 A(J,I)=HOLD
      GO TO 100
  150 RETURN
      END

APPENDIX B

LISTING OF t's FOR EACH METHOD OF ANALYSIS WITH THE GROUP SIZE 25 AND GAIN 0.0

LISTING OF t's (CONTINUED)

SAMPLE    RAW       LORD'S      REGRESSED   ANALYSIS OF
NUMBER    GAIN      TRUE GAIN   GAIN        COVARIANCE
  42      1.042       1.762       0.975       0.966
  43      1.556      16.696       1.728       1.710
  44      0.511       3.033       0.184       0.183
  45     -1.295      -7.504      -1.783      -1.780
  46      0.669       4.977       1.026       1.026
  47      0.382       1.210       0.640       0.634
  48      0.228       0.216      -0.110      -0.110
  49      0.922       5.810       0.943       0.933
  50     -1.263      -3.021      -1.114      -1.106
  51     -0.276      -0.763      -0.941      -0.933
  52      0.496       1.975       0.757       0.749
  53      1.385       6.511       1.583       1.572
  54     -0.087      -0.187      -0.161      -0.160
  55     -1.594      -2.591      -1.646      -1.630
  56      0.489       3.276      -0.233      -0.239
  57     -1.450      -3.217      -1.298      -1.290
  58      1.146       4.044       0.657       0.657
  59      1.754       7.198       1.535       1.530
  60     -0.538      -8.090      -0.255      -0.253
  61     -0.434      -2.971      -0.083      -0.083
  62      1.155       3.037       1.766       1.749
  63      1.321      20.233       1.620       1.604
  64      0.130       1.233      -0.008      -0.003
  65      0.721       2.349       0.514       0.510
  66     -0.556      -1.149      -0.084      -0.034
  67      0.173       0.332       0.434       0.432
  68     -0.146      -0.465      -0.496      -0.493
  69      1.290       4.543       1.526       1.514
  70     -1.904      -4.395      -1.858      -1.844
  71      0.572       5.132       0.470       0.466
  72      0.923       2.998       1.140       1.128
  73     -0.034      -0.106      -0.004      -0.004
  74      0.123       0.546       0.026       0.026
  75     -0.586      -2.404      -0.920      -0.911
  76     -2.622      -8.344      -2.917      -2.895
  77     -0.034      -0.078      -1.368      -1.332
  78     -0.259      -0.303      -0.257      -0.254
  79     -0.597      -2.141       0.577       0.605
  80      0.065       0.197      -0.404      -0.412
  81      0.614       1.834       0.162       0.162
  82      0.729       0.579       1.078       1.073
  83     -0.967      -2.813      -1.069      -1.058
  84     -1.275      -5.166      -1.077      -1.071
  85      1.259       4.430       0.797       0.802

BIBLIOGRAPHY

Anastasi, Anne. Psychological Testing. New York: Macmillan, 1961.
Bigge, Morris L. Learning Theories for Teachers. New York: Harper and Row, 1964.
Combs, A. W., Snygg, Donald. Individual Behavior. New York: Harper and Row, 1959.
Cronbach, Lee J., Furby, Lita. "How Should We Measure Change--or Should We?" Psychological Bulletin, 1970, 74: 68-80.
Hays, William L. Statistics for Psychologists. New York: Holt, Rinehart, and Winston, 1963.
Hilgard, Ernest R. Theories of Learning. New York: Appleton, 1956.
Lord, F. M. "The Measurement of Growth," Educational and Psychological Measurement, 1956, 16: 421-437.
Lord, F. M. "Statistical Inferences about True Scores," Psychometrika, 1959, 24: 1-17.
Lord, F. M. "Elementary Models for Measuring Change," in C. W. Harris, Problems in Measuring Change. Madison, Wisconsin: University of Wisconsin Press, 1963, pp. 21-38.
McNemar, Q. "On Growth Measurement," Educational and Psychological Measurement, 1958, 18: 47-55.
Madansky, Albert. "The Fitting of Straight Lines When Both Variables Are Subject to Error," Journal of the American Statistical Association, 1959, 54: 173-205.
Manning, Winton H., DuBois, Philip H. "Correlational Methods in Research on Human Learning," Perceptual and Motor Skills, 1962, 15: 287-321.
Mendenhall, William. Introduction to Linear Models and the Design and Analysis of Experiments. Belmont, California: Wadsworth, 1968.
Meyer, Paul. Introductory Probability and Statistical Application. Reading, Massachusetts: Addison-Wesley.
Nunnally, Jum C. Psychometric Theory. New York: McGraw-Hill, 1967.
Ohnmacht, Fred W. "Correlates of Change in Academic Achievement," Journal of Educational Measurement, 1968, 5: 41-44.
Rosenthal, Myron R. Numerical Methods in Computer Programming. Homewood, Illinois: Irwin, 1966.
Scheffé, Henry. The Analysis of Variance. New York: Wiley, 1959.
Schick, George B. (ed), May, Merril M. (ed). The Psychology of Reading. Milwaukee: The National Reading Conference, Inc., 15th Yearbook, 1969, Rankin, Earl F., Jr., Dale, Lothar H., pp. 17-26.
Skinner, B. F. The Technology of Teaching. New York: Appleton, 1968.
Snedecor, George W. Statistical Methods. Ames, Iowa: Iowa State College Press, 1955.
Soar, Robert S. "Optimum Teacher-Pupil Interaction for Summer Growth," Educational Leadership Research Supplement, 1968, 26(3): 275-280.
Tate, Merle W. Statistics in Education and Psychology. New York: Macmillan, 1965.
Thorndike, E. L. "The Influence of Chance Imperfections of Measures Upon the Relation of Initial Score to Gain or Loss," Journal of Experimental Psychology, 1924, 7: 225-232.
Tillman, Chester E. Crude Gain vs. True Gain: Correlates of Gain in Reading after Remedial Tutoring. Doctoral dissertation, University of Florida, [year illegible].
Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill, 1962.

BIOGRAPHICAL SKETCH

John Howard Neel was born July 27, 1944, at Waynesburg, Pennsylvania. He graduated from William R. Boone High School, Orlando, Florida, in June, 1962. In August, 1965, he received the degree Bachelor of Arts with a major in mathematics from the University of Florida. He taught algebra and general mathematics at John F. Kennedy Junior High School from September, 1965, until June, 1966. In September, 1966, he enrolled in the College of Education at the University of Florida under a United States Office of Education fellowship program directed by Dr. Wilson H. Guertin. In September, 196[illegible], he accepted a research assistantship under the same program. In June, 1968, he received the degree Master of Arts in Education. He was an instructor in the College of Education at the University of South Florida from September, 1968, until August, 1969, and is currently on leave from that position. In September, 1969, he was appointed Interim Instructor in the College of Education at the University of Florida and he holds that position currently.

John Howard Neel is married to the former Carol Lynn Raraft. They have two daughters, Sarah Elizabeth and Lia Suzanne.

John Howard Neel is a member of the Florida Educational Research Association, the American Educational Research Association, Phi Delta Kappa, and the American Statistical Association.

This dissertation was prepared under the direction of the chairmen of the candidate's supervisory committee and has been approved by all members of that committee.
It was submitted to the Dean of the College of Education and to the Graduate Council, and was approved as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August, 1970

Dean, College of Education
Dean, Graduate School

Supervisory Committee:
Chairman
Co-Chairman |