A COMPARATIVE ANALYSIS OF
SOME MEASURES OF CHANGE
By
JOHN HOWARD NEEL
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1970
COPYRIGHT BY
JOHN HOWARD NEEL
1970
ACKNOWLEDGEMENTS
The writer wishes to thank his committee members for
their guidance in this study, especially Dr. Charles M.
Bridges, Jr., who suggested the topic and was a constant
source of assistance throughout the study. When this study
was begun there was a questioning of change scores in the
department which was helpful and encouraging. Those
responsible for this atmosphere were Dr. Charles M. Bridges,
Jr., Dr. Robert S. Soar, Dr. William B. Ware and Mr. Keith
Brown.
Dr. Vynce A. Hines and Dr. P. V. Rao assisted by each
detecting an error in the model presented.
Dr. William B. Ware was especially helpful editorially,
as was Dr. Wilson H. Guertin.
The writer's wife, Carol, was encouraging and understanding
of the time which was necessarily spent away from
home.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT
CHAPTER
I. INTRODUCTION
The Problem of Measuring Change
Methods of Analyzing Change
The Problem
Some Limitations
Procedures
Significance of the Study
Organization of the Study
II. RELATED LITERATURE
Derivation of Lord's True Gain Scores
Comparison of Lord's True Gain Scores with Other Scores
Comparison of Regressed Gain Scores with Other Scores
Another Study and Summary
III. METHODS AND PROCEDURES
Procedures: An Overview
Sampling from a Normal Population with Specified Mean and Variance
Selecting Reliability
Selecting Gain
Analysis of the t Values for the Four Methods
IV. RESULTS, CONCLUSIONS, AND SUMMARY
Results
Conclusions
Discussion
A Direction for Future Research
Summary
APPENDIX
A. FORTRAN PROGRAM
B. LIST OF t's FOR THE FOUR METHODS
BIBLIOGRAPHY
BIOGRAPHICAL SKETCH
LIST OF TABLES
Table
1. NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS
2. NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
A COMPARATIVE ANALYSIS
OF SOME MEASURES OF CHANGE
by
John Howard Neel
August, 1970
Chairman: Wilson H. Guertin
Co-Chairman: Charles M. Bridges, Jr.
Major Department: Foundations of Education
The purpose of this study was to determine which of
four selected methods was most appropriate for measuring
change such as gain in achievement. The four selected
methods were raw difference, Lord's true gain, regressed
gain, and analysis of covariance procedures.
In order to compare the four methods Monte Carlo
techniques were employed to generate samples of pre and
post scores for two groups. The reliability, variances,
and means of the sampled populations were controlled. One
hundred samples were generated at each of 20 combinations
of five levels of reliability, two levels of group size,
and two levels of gain. Using each of the four methods,
a t statistic was calculated for each sample to test the
null hypothesis of no difference in amount of gain between
the two groups. The number of t's significant at the 0.05
level of significance was recorded for each of the four
methods.
At each of the 20 combinations a chi square test was
used to test the null hypothesis of equal proportions of
significant t's among the four methods. This hypothesis
was rejected in each case. It was noted that use of Lord's
true gain procedure tended to create a greater significance
level than the user would intend. The proportion of
significant t's for each of the other three methods of
analysis fell reasonably close to the expected values. On
this basis use of Lord's true gain procedure was not
recommended and since there was no apparent difference among
the remaining methods of analysis, none was recommended
above the other.
CHAPTER I
INTRODUCTION
Learning has long been a primary focus of investigation
for educators. A definition of learning has been
offered by Hilgard (1956):
Learning is the process by which an activity
originates or is changed through reacting to
an encountered situation, provided that the
change in activity cannot be explained on the
basis of native response tendencies, maturation,
or temporary states of the organism (e. g. fatigue,
drugs, etc.). (p. 3)
Although Hilgard went on to say that the definition
is not perfect, it does illustrate a commonly accepted
aspect of learning: learning involves change of the
behavior of the organism which learns (Bigge, 1964, p. 1;
Skinner, 1968, p. 10; Combs, 1959, p. 88).
Educators have been concerned with this change and
often have sought to measure the change occurring in some
situation. Several methods of analyzing change have been
presented in the literature. A comparison of these methods
of analysis of the measurement of change and the accompanying
difficulties which arise was the focus of this study.
The purpose of the study was to determine which of the
several selected measures of change is most appropriate
under various conditions.
The Problem of Measuring Change
In all sciences measurement is an approximation. In
conducting a first order survey, a surveyor makes three
measurements of a distance and takes the average of the
three measurements. Physicists and engineers customarily
report the relative size of the error of their measurements.
Physical scientists have been fortunate in that the size
of the relative error involved in their measurements
often has been small, frequently less than 0.01 and sometimes
less than 10 [exponent illegible]. Educators are unfortunate in this
respect in that if a student's true I. Q. were 100 and an
I. Q. score of 88 was observed, the relative error would
be 0.14. The size of this error is not uncommon and larger
relative errors do occur. The Stanford-Binet intelligence
test has a standard error of measurement equal to five I. Q.
points (Anastasi, 1961, p. 200). Thus, relative errors of
0.14 or larger will occur 1.64 per cent of the time,
assuming a normal distribution of errors.
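The 1.64 per cent figure can be checked with a short calculation. The sketch below assumes, as the 0.14 value suggests, that the relative error is taken relative to the observed score (12/88), and that errors are normal with mean 0 and standard deviation equal to the standard error of measurement. The function name is illustrative only.

```python
import math


def two_sided_tail(error, sem):
    """P(|E| >= error) for a normal error with mean 0 and standard deviation sem."""
    z = error / sem
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal Phi.
    return math.erfc(z / math.sqrt(2.0))


true_score, observed, sem = 100.0, 88.0, 5.0
error = abs(true_score - observed)        # 12 I. Q. points
relative_error = error / observed         # assumption: relative to the observed score
probability = two_sided_tail(error, sem)  # chance of an error at least this large

print(round(relative_error, 2), round(100.0 * probability, 2))
```

An error of 12 points is 2.4 standard errors, so the two-sided tail area of the normal curve beyond 2.4 gives the stated percentage.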
In measuring change the problem is compounded since
there is an error in both the pre score and the post score.
Moreover, when the magnitude of the change is small or zero,
the magnitude of the error may be larger than that of the
change. This possibility makes the change difficult to
detect or to separate from the error. The effect of this
error of measurement on change scores was noted as early
as 1924 by Thorndike:
When the individuals in a varying group are
measured twice in respect to any ability by
an imperfect measure (that is one whose self-correlation
is below 1.00), the average
difference between the two obtained scores
will equal the average difference between
the true scores that would have been obtained
by perfect measures, but for any individual
the difference between the two obtained scores
will be affected by the error. Individuals
who are below the mean of the group will tend
by the error to be less far below it in the
second, and individuals who are above the
mean of the group in the first measurement
will tend by the error to be less far above
it in the second. The lower the self-correlation,
the greater the error and its effect.
Thorndike (1924) went on to show that there was a spurious
negative correlation between initial true score and true
gain. He then stated that "the equation connecting the
relation of obtained initial ability with obtained gain,
the unreliability of the measures and the true facts" had
not been discovered. Lord (1956) developed the equation
to which Thorndike alluded. Lord made the following
assumptions concerning the error of measurement: the
errors
i) have zero mean for the groups tested.
ii) have the same variances for both tests.
iii) are uncorrelated with each other and with
true score on either test. (Lord, 1956)
McNemar (1958) has extended Lord's work to the case of
unequal error variances. It should be noted here that
McNemar's follow-up of Lord's work is only an extension.
When the variances are equal, McNemar's formulas are
identical to Lord's (Lord, 1958).
Methods of Analyzing Change
The estimated true gain scores derived from Lord's and
McNemar's equations have been used in either t tests or
analysis of variance procedures (Soar, 1968; Tillman, 1969).
There are, in addition to Lord's method, three other
commonly used methods for analyzing change.
One method has been the use of a straight analysis
of variance or t test on the raw difference scores as the
situation warrants. It is important to note that the raw
scores are used in the analysis with no correction for the
unreliability of the measures.
A second method is to complete an analysis of covariance
on the raw difference scores using the pretest scores
as covariates. These procedures are standard statistical
techniques and may be found in many texts (Hays, 1963;
Snedecor, 1956; Winer, 1962).
The third method of measuring gain has been advocated
by Manning and DuBois (1962): the method of residual gain.
In this method the final scores are regressed on the initial
scores and the difference between the final score and the
score predicted by the regression equation is taken as a
measure of gain. This measure is then used in t tests or
analysis of variance procedures.
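The residual gain computation described above can be sketched as follows. This is a minimal illustration using ordinary least squares of final on initial scores; the function name is hypothetical, and whether the original analysis fitted the regression within each group or over the pooled groups is not stated at this point, so the sketch simply takes one set of paired scores.

```python
def residual_gains(pre, post):
    """Return post scores minus the values predicted from pre scores by least squares."""
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    # Slope and intercept of the regression of post on pre.
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    # The residual is the observed final score minus the predicted final score.
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


pre_scores = [42.0, 55.0, 48.0, 61.0, 50.0]
post_scores = [47.0, 58.0, 49.0, 66.0, 57.0]
gains = residual_gains(pre_scores, post_scores)
print([round(g, 2) for g in gains])
```

By construction the residuals sum to zero and are uncorrelated with the initial scores, which is the property that distinguishes this measure from the raw difference.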
Thus in the case of equal variances four common
methods of analyzing change have been identified:
1. Use of raw gain scores in appropriate
procedures.
2. Use of Lord's true gain scores in appropriate
procedures.
3. Use of Manning and DuBois' regressed gain
scores in appropriate procedures.
4. Use of analysis of covariance on raw gain
scores with pre scores as covariates.
The Problem
A researcher faced with these different methods and
a problem in measuring change is confronted with a second
and fundamental methodological problem: which of the
methods for measuring change is most appropriate? It is
this question that this study sought to answer. The
problem of which method to use is further complicated since
different writers have claimed that different techniques
were appropriate. "Manning and DuBois and Rankin and Tracy
feel that the method of residual gain is more appropriate
for correlational procedures since it is metric free" (as
quoted in Tillman, 1969, p. 2). Ohnmacht (1968) also suggested
that this procedure was the best. Lord (in Harris,
1963, chapter 2) mentioned regressed gain but seemed to
advocate his own method as being superior. This position
is further supported by Cronbach and Furby (1970).
To determine which of the methods of analysis was
most appropriate, an empirical study was conducted to
compare the results of each method under known situations.
Some Limitations
This study was limited to the two group situation.
To examine more than two groups would have involved such
a large number of possibilities as to make the study
impractical in terms of time and money. Thus, this study
excluded the more general multigroup comparisons possible
with analysis of variance and covariance procedures and
was limited to examining t tests and the analysis of
covariance using the pre score as covariate.
Additionally the case of unequal variances between
the two groups was not considered.
A third limitation was that the variance of the true
gains was selected a priori to be 3.6. True gain scores
with this variance will be such that over 99 per cent will
be within five units of the true mean gain. It is the ratio
of the variances of the gain and error which is important.
Since the reliabilities were varied, as indicated later,
this study was conducted utilizing several such ratios.
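The statement that over 99 per cent of true gains fall within five units of the mean can be verified directly. The sketch below assumes the true gains are normally distributed with variance 3.6; the function name is illustrative.

```python
import math


def proportion_within(distance, variance):
    """Share of a normal population lying within `distance` of its mean."""
    z = distance / math.sqrt(variance)
    # erf(z / sqrt(2)) equals P(|Z| < z) for a standard normal Z.
    return math.erf(z / math.sqrt(2.0))


share = proportion_within(5.0, 3.6)
print(round(share, 4))
```

With variance 3.6 the standard deviation is about 1.9, so five units is roughly 2.6 standard deviations from the mean.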
One of the factors of interest was the reliability of
the test used. A second factor of interest was the sample
size or the relative power of the procedures under study.
There is an infinite number of reliabilities and a decision
was made as to the levels of reliability to be investigated:
0.50 to 0.90 in increments of 0.10. Tests with reliabilities
lower than 0.50 are rarely used in practice and at best the
resulting data would be highly questionable. The lower
limit of 0.50 was chosen for this reason.
Sample sizes of 25 and 100 per group were chosen as
being somewhat representative of sample sizes used in
educational research.
Procedures
Two groups were compared under 20 conditions using t
tests as follows. There were five levels of reliability
(0.50, 0.60, 0.70, 0.80, 0.90) and two levels of sample size
(25 and 100) used in this study. Thus there were ten different
combinations of sample sizes and reliabilities. For each
of these combinations two cases were investigated, one where
there was no pre to post test gain in either group and the
second where there was a known gain from the pre to the post
test for one group. For each of these 20 instances, 100
samples were generated and analyzed using each of the four
methods of analysis indicated previously.
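The design just described can be enumerated mechanically. The sketch below only lists the conditions; the variable names and the labels for the two gain cases are illustrative.

```python
from itertools import product

RELIABILITIES = (0.50, 0.60, 0.70, 0.80, 0.90)
GROUP_SIZES = (25, 100)
GAIN_CASES = ("no gain in either group", "gain in one group")
SAMPLES_PER_CONDITION = 100

# Every combination of group size, gain case, and reliability is one condition.
conditions = list(product(GROUP_SIZES, GAIN_CASES, RELIABILITIES))
total_samples = len(conditions) * SAMPLES_PER_CONDITION

for group_size, gain_case, reliability in conditions:
    print(group_size, gain_case, reliability)
print(len(conditions), total_samples)
```

Each generated sample is analyzed by all four methods, so every condition yields 100 t values per method.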
Consequently, two questions were to be answered:
1. Does any one of the selected methods yield a
disproportionate number of significant t values
when there is no difference between the mean
gain of the two groups?
2. Is any one of the selected methods more powerful,
i. e. more successful in detecting a
difference when a difference does exist?
Samples from a normal distribution were generated using
techniques described by Rosenthal (1966). The method for
generating random numbers was the multiplication by a
constant method. With the procedures used, this method
will produce 8.5 million numbers before the series repeats.
This number was more than sufficient for this study. All
generation of the samples and calculation of t values using
the various methods of analysis were done on the IBM 360/65
computer at the University of Florida. The significance
level used for the t tests was 0.05.
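The multiplication by a constant method can be illustrated with a multiplicative congruential generator. The modulus and multiplier below are assumptions chosen for illustration (a widely used textbook pair); they are not the constants of the routine used in this study, and they give a different period from the 8.5 million quoted above.

```python
MODULUS = 2**31 - 1   # assumed modulus, not taken from the original routine
MULTIPLIER = 16807    # assumed multiplier, not taken from the original routine


def next_state(state):
    """Advance the generator: multiply by a constant and keep the remainder."""
    return (MULTIPLIER * state) % MODULUS


def uniform_stream(seed, count):
    """Return `count` pseudo-random numbers strictly between 0 and 1."""
    state = seed
    values = []
    for _ in range(count):
        state = next_state(state)
        values.append(state / MODULUS)
    return values


print(uniform_stream(1, 3))
```

The series repeats once a state recurs, so the period of such a generator is fixed by the choice of multiplier and modulus.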
The two research questions generated two null hypotheses:
1. The proportion of t's significant at the
0.05 level is the same for each method of
analysis when there is no gain in either
group.
2. The proportion of t's significant at the
0.05 level is the same for each method of
analysis when there is gain by one group
but not by the other.
These hypotheses were tested at each of the ten combinations
of reliability and sample size with chi square tests using
the 0.05 level of significance.
Significance of the Study
The results of this study should either indicate
empirically that one or more methods were superior to the
others or that there were no great differences among the
methods. If the former were true, then educational researchers
may select one of the better methods. If the latter
were true, then educational researchers may select any of the
methods. In either case the study provides some answer as
to how change scores should be analyzed.
Organization of the Study
Chapter I has been the introduction, statement of the
problem, limitations, hypotheses, and procedural overview.
Chapter II reviews related literature, essentially the
development of the equation and methodology of the various
techniques studied. Chapter III describes the procedures
and Chapter IV presents the data, conclusions, and summary.
CHAPTER II
RELATED LITERATURE
Much has been said in the literature about measuring
change. However, most of this discussion is centered
around the four methods investigated in this study: raw
gain, Lord's true gain, regressed gain, and analysis of
covariance procedures. As pointed out in Chapter I, raw
gain and analysis of covariance procedures are discussed in many
texts and are therefore not discussed here. Lord's true gain
and regressed gain procedures are discussed in this chapter.
Derivation of Lord's True Gain Scores
The following derivation parallels Lord's (1956)
development of true gain scores with one exception as noted.
Lord gave the following equations as a model for the observed
pre and post scores:
(1) X = T + E1
(2) Y = T + G + E2
where
X = observed pre score;
Y = observed post score;
T = true pre score;
G = true gain;
E1 = error of measurement in pre observed score;
E2 = error of measurement in post observed score.
Lord then made the following assumptions concerning E1 and
E2, the errors of measurement.
The errors
i) have zero mean in the group tested.
ii) have the same variance ($\sigma_e^2$) for both tests.
iii) are uncorrelated with each other and with
true score on either test (Lord, 1956).
The derivation can be considerably shortened at this point
by examining a standard regression equation which predicts
one variable, X1, from two other variables, X2, X3. The
equation is (Tate, 1965, p. 171)
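(3) $\hat{X}_1 = \beta_{12.3} \frac{S_1}{S_2} (X_2 - \bar{X}_2) + \beta_{13.2} \frac{S_1}{S_3} (X_3 - \bar{X}_3) + \bar{X}_1$
where
$\beta_{12.3} = \frac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2}$
and
$\beta_{13.2} = \frac{r_{13} - r_{12} r_{23}}{1 - r_{23}^2}$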
If we let
X1 = G = gain;
X2 = X = observed pre score;
X3 = Y = observed post score;
the following elements in the regression equation can be
identified as
$\hat{X}_1$ = estimated gain;
$\bar{X}_2$ = mean of the observed pre scores;
$\bar{X}_3$ = mean of the observed post scores;
$S_1$ = standard deviation of the gain scores;
$S_2$ = standard deviation of observed pre scores;
$S_3$ = standard deviation of observed post scores;
r23 = correlation of observed pre and post scores.
Lord has pointed out that r12 and r13, the correlations
between observed score and true score, are the reliabilities.
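From (1), (2), and his stated assumptions, Lord writes
(4) $\sigma_x^2 = \sigma_t^2 + \sigma_e^2$
(5) $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + 2\sigma_{tg} + \sigma_e^2$
(6) $\sigma_{xy} = \sigma_t^2 + \sigma_{tg}$
where $\sigma_{tg}$ is the covariance of true pre score and true gain.
Lord solves these equations to find
(7) $\sigma_g^2 = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} - 2\sigma_e^2$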
the variance of the true gain scores.
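The step from (4), (5), and (6) to (7) can be written out explicitly. The sketch below assumes the model in (1) and (2) together with Lord's assumptions on the errors, and writes $\sigma_{tg}$ for the covariance of true pre score and true gain, a symbol used here for illustration.

```latex
\begin{align*}
\sigma_x^2 + \sigma_y^2 - 2\sigma_{xy}
  &= (\sigma_t^2 + \sigma_e^2)
   + (\sigma_t^2 + \sigma_g^2 + 2\sigma_{tg} + \sigma_e^2)
   - 2(\sigma_t^2 + \sigma_{tg}) \\
  &= \sigma_g^2 + 2\sigma_e^2 , \\
\sigma_g^2
  &= \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} - 2\sigma_e^2 .
\end{align*}
```

The covariance term cancels, so the result holds whether or not true gain is correlated with true pre score.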
At this point the only element in the regression
equation which is undefined is $\bar{X}_1$. This element is found
by considering the mean of the observed pre scores, which
from (1) can be seen to be equal to the mean of the true
scores plus the mean of the errors of measurement, that
is
(8) $\bar{X} = \bar{T} + \bar{E}_1$.
But by Lord's assumption (i) $\bar{E}_1 = 0$, therefore
(9) $\bar{X} = \bar{T}$.
Similarly from (2) Lord shows
(10) $\bar{Y} = \bar{T} + \bar{G}$.
Then (9) is subtracted from (10) and rewritten to yield
(11) $\bar{G} = \bar{Y} - \bar{X}$.
Thus all elements are defined and (3) may be rewritten
in terms of G, X, and Y as
(12) $\hat{G} = \beta_{12.3} \frac{\sigma_g}{\sigma_x} (X - \bar{X}) + \beta_{13.2} \frac{\sigma_g}{\sigma_y} (Y - \bar{Y}) + \bar{Y} - \bar{X}$
which Lord has asserted to be an estimate of true gain.
It may be noted that no notational scheme or other
method has been presented to distinguish between statistics
and parameters in the preceding derivation. This lack is
in keeping with Lord's derivation. It is assumed here
that Lord was referring to parameters until the point at
which he obtained the final equation and that he then
intended to use sample values to estimate the appropriate
parameters in the regression equation.
Comparison of Lord's True Gain Scores with Other Scores
In his original article Lord (1956) made no comparison
of his method with any other method. In a subsequent article
(1959) Lord again made no mention of other methods. In
a chapter written in Problems in Measuring Change (Harris,
1963, Chapter 2) he made reference to regressed gain scores,
but no discussion or comparison was presented. In Statistical
Theories of Mental Test Scores (Lord and Novick,
1968) no comparison of Lord's true gain scores with other
procedures is presented.
Comparison of Regressed Gain Scores with Other Scores
Manning and DuBois (1962) have compared per cent,
raw, and residual gain scores. A per cent gain score is
raw gain score divided by the pre score (Manning and DuBois,
1962). The comparisons were made on the bases of metric
requirements, reliability, and appropriateness of use in
correlation procedures. On each of these bases residual
gain scores were recommended over per cent and raw gain
scores. Manning and DuBois pointed out that per cent and
raw gain scores require at least equal interval scales on
both pre and post scores and that the scales be the same
on both pre and post scores, i. e. the same equal interval
scale must be used on both tests. According to Manning
and DuBois these qualities are not possessed by educational
and psychological test scores. In contrast, residual gain
scores do not require the same equal interval scales and
therefore are appropriate for use with test scores (Manning
and DuBois, 1962). Manning and DuBois summarily list
formulas showing that residual gain scores are more reliable
and more appropriate measures for correlational procedures
than are raw or per cent gain scores. These formulas were
only listed, not derived, and no reference was made to their
derivation.
Another Study and Summary
Madansky (1959) reported or derived several methods
for fitting straight lines to two variables when both were
measured with error. One of these procedures is applicable
in the case when the variance of the error of measurement
is unknown. However, there has apparently been no attempt
to apply the method to the analysis of change.
A search of the literature has revealed no comparative
empirical examination of the four methods examined
in this study. Further, the advocates and authors of two
of the reported procedures, each of whom has been shown to
know of the existence of the other procedure, continue to
advocate their own method even though they offer no reason
or data for this advocacy. This study should provide some
knowledge as to any difference in the four methods.
CHAPTER III
METHODS AND PROCEDURES
Procedures: An Overview
As stated in Chapter I, Monte Carlo techniques were
employed to generate pre and post test scores for two
groups. One group is referred to as the gain group, the
other as the no gain group.
The model for the observed pre scores is
(1) X = T + E1
where
X is the observed pre score;
T is the true pre score;
E1 is a normally distributed random error
with mean 0.0 and variance $\sigma_{e_1}^2$.
The model for the post scores is
(2) Y = T + G + E2
where
Y is the observed post score;
T is the true pre score;
G is the true gain from pre to post score;
E2 is a normally distributed random error
with mean 0.0 and variance $\sigma_{e_2}^2$.
The generated scores were subsequently analyzed for
the difference in the amount of gain or change between the
two groups. The scores were analyzed by the four selected
methods:
1. a t test on the raw difference scores
2. a t test on Lord's true gain scores
3. a t test on regressed gain scores
4. a t test from an analysis of covariance on
the raw difference scores using pre score
as covariate.
The results of these analyses were then compared. For the
gain group an appropriate mean gain, $\mu_g$, from pre to post
scores was obviously selected to be 0.0 in the case of no
gain for either group and selected to be of such size as
to make the power 0.50 when there was a gain in the gain
group. The post scores were generated by adding a random
normal gain, G, and a random normal error, E2, to the
generated true pre scores. The variables G and E2 had
means $\mu_g$ and 0 respectively and variances as discussed
later. For the no gain group there was no gain from pre
to post scores.
The pre scores for both the gain and the no gain
group were taken from a normal population with mean 50.0
and variance 100.0. The mean and variance of the population
of post scores for both the gain and the no gain
groups were functions of the mean true gain and of the
reliability.
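The generation of one group's scores under models (1) and (2) can be sketched as below. This is an illustration, not the FORTRAN routine of Appendix A: it draws normal deviates with Python's standard library, the function and argument names are hypothetical, and the variances in the usage lines are the values implied for a reliability of 0.50 by the section on selecting reliability.

```python
import math
import random


def simulate_group(n, mean_gain, var_t, var_e1, var_e2, var_g, rng, mean_pre=50.0):
    """Generate n pairs of observed pre and post scores for one group."""
    pre, post = [], []
    for _ in range(n):
        t = rng.gauss(mean_pre, math.sqrt(var_t))    # true pre score T
        g = rng.gauss(mean_gain, math.sqrt(var_g))   # true gain G
        e1 = rng.gauss(0.0, math.sqrt(var_e1))       # pre score error E1
        e2 = rng.gauss(0.0, math.sqrt(var_e2))       # post score error E2
        pre.append(t + e1)                           # X = T + E1
        post.append(t + g + e2)                      # Y = T + G + E2
    return pre, post


# Reliability 0.50: true score variance 50, error variances 50 and 53.6, gain variance 3.6.
rng = random.Random(1970)
pre, post = simulate_group(25, 0.0, 50.0, 50.0, 53.6, 3.6, rng)
print(len(pre), len(post))
```

Passing the random generator as an argument keeps the sketch reproducible; the gain group differs from the no gain group only in the value given for mean_gain.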
After the samples were generated, the hypothesis of no
difference in average gain between the two groups was tested
using each of the four methods. The t values for each of
these tests were recorded.
This procedure was repeated over 100 samples for each
of the selected reliabilities 0.50, 0.60, 0.70, 0.80 and
0.90. The method for introducing the effect of the selected
reliability into each generated score is presented in a
following section.
Thus 100 t values were calculated and recorded for
each method of analysis and at each level of reliability.
This entire procedure was repeated for each of the following
conditions:
1. group size = 25, $\mu_g$ = 0.0 for both groups;
2. group size = 25, $\mu_g \neq$ 0.0 for the gain group,
$\mu_g$ = 0.0 for the no gain group;
3. group size = 100, $\mu_g$ = 0.0 for both groups;
4. group size = 100, $\mu_g \neq$ 0.0 for the gain group,
$\mu_g$ = 0.0 for the no gain group.
Where the gain was not equal to 0.0 it was such that the
power of the t tests on the raw difference scores was 0.50,
i. e. the expected proportion of rejected null hypotheses
was 0.50. The following sections describe in more detail
some of the previously mentioned procedures.*
* The reader is also referred to the FORTRAN listing in
Appendix A for the exact computer routines by which these
procedures were carried out.
Sampling from a Normal Population with Specified Mean and
Variance
If F is the cumulative distribution function of a random
variable R1, then the random variable R2 defined by
(3) $R_2 = F(R_1)$
is uniformly distributed over the interval [0,1] (Meyer,
1965, p. 256, Theorem 13.6). It follows
then that R1, where
(4) $R_1 = F^{-1}(R_2)$
is normally distributed if $F^{-1}$ is the inverse cumulative
distribution function of a normal distribution and if R2 is a
uniform random number on the interval [0,1] (Meyer, 1965,
pp. 256-257).
Thus random samples from a normal distribution may be
obtained using uniform random numbers and by (4) where F
is the cumulative distribution function of a normal distribution.
For a normal distribution, $F^{-1}(R_2)$ must be calculated using
numerical approximation methods. This calculation as well
as the generation of the uniform random numbers was done
using a routine described by Rosenthal (1966, pp. 270, 267).
Rosenthal's techniques were adapted to the IBM 360/65
computer installed at the University of Florida (see
Appendix A, FUNCTION RAND). The normal population sampled
had mean 0 and variance 1.0. If a different mean or variance
was required, it was obtained by addition or multiplication
by an appropriate constant.
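The inverse transformation in (4) can be sketched as follows. The sketch relies on the inverse normal routine in Python's standard library rather than the approximation described by Rosenthal, and the function names are illustrative.

```python
import random
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(0.0, 1.0)


def normal_from_uniform(u, mean=0.0, variance=1.0):
    """Map a uniform number u in (0, 1) to a normal deviate with the given mean and variance."""
    z = _STANDARD_NORMAL.inv_cdf(u)       # R1 = F^{-1}(R2) for the standard normal
    return mean + z * variance ** 0.5     # rescale by multiplication, shift by addition


def normal_sample(count, mean, variance, rng):
    """Draw `count` normal deviates from uniform random numbers."""
    values = []
    while len(values) < count:
        u = rng.random()
        if u > 0.0:                       # inv_cdf is undefined at exactly 0
            values.append(normal_from_uniform(u, mean, variance))
    return values


print(normal_sample(3, 50.0, 100.0, random.Random(7)))
```

A uniform number of 0.5 maps to the mean, and a uniform number of 0.975 maps to a point about 1.96 standard deviations above it.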
Selecting Reliability
Reliabilities of 0.50, 0.60, 0.70, 0.80 and 0.90 were
selected as representative of reliabilities found in test
scores. The reliability, rel, of a test may be defined as
(5) $\mathrm{rel} = 1 - \frac{\sigma_e^2}{\sigma_x^2}$ (Nunnally, 1967, p. 221),
where $\sigma_e^2$ is the error of measurement variance and $\sigma_x^2$ is
the observed score variance. Since $\sigma_x^2$ had been selected
a priori to be 100.0, we have from (5)
(6) $\sigma_{e_1}^2 = 100 (1 - \mathrm{rel})$.
Moreover, since
(7) X = T + E1
and since the error, E1, is assumed to be independent of
the true score, T, we have
(8) $\sigma_x^2 = \sigma_t^2 + \sigma_{e_1}^2$
or, combining (6) and (8) and solving for $\sigma_t^2$,
(9) $\sigma_t^2 = 100 - \sigma_{e_1}^2$.
For the post scores the desired variances are also
easily found from the model for a post score,
(10) Y = T + G + E2
and for which
(11) $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + \sigma_{e_2}^2$.
As stated in Chapter I, $\sigma_g^2$ was selected to be 3.6. If (5)
is written with $\sigma_y^2$ and $\sigma_{e_2}^2$ in place of $\sigma_x^2$ and $\sigma_{e_1}^2$, respectively,
then (5) and (11) may be used to find
(12) $\sigma_{e_2}^2 = \frac{\sigma_t^2 + \sigma_g^2}{\mathrm{rel}} - \sigma_t^2 - \sigma_g^2$.
The effect of the selected reliability may be obtained
by selecting the error of measurement variances and the
variance of the true scores in accordance with (6), (9)
and (12).
Thus it is seen that if true scores are selected from
a distribution with variance $\sigma_t^2$ and if the errors of
measurement are selected independently from a distribution
with variance $\sigma_{e_1}^2$, then by (8) X has variance $\sigma_x^2$ if (7) holds.
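Equations (6), (9), and (12) can be evaluated for each selected reliability. The sketch below assumes the observed pre score variance of 100.0 and the true gain variance of 3.6 stated in the text; the function name is illustrative.

```python
def variance_components(rel, var_x=100.0, var_g=3.6):
    """Return (true score, pre error, post error) variances for a given reliability."""
    var_e1 = var_x * (1.0 - rel)                      # equation (6)
    var_t = var_x - var_e1                            # equation (9)
    var_e2 = (var_t + var_g) / rel - (var_t + var_g)  # equation (12)
    return var_t, var_e1, var_e2


for rel in (0.50, 0.60, 0.70, 0.80, 0.90):
    var_t, var_e1, var_e2 = variance_components(rel)
    print(rel, round(var_t, 2), round(var_e1, 2), round(var_e2, 2))
```

Choosing the post score error variance by (12) makes the post scores have the same reliability as the pre scores, even though their total variance is larger because of the added gain.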
Selecting Gain
When there was no gain in either group the value of $\mu_g$
would then be 0.0. When $\mu_g$ was nonzero for the gain group,
its value was selected so as to make the power of the t
test on the raw difference scores equal to 0.50. The
power of 0.50 was selected in order to permit maximum
difference between the four methods of analysis.
The value of $\mu_g$ was determined by examining the
difference scores (D),
(13) D = Y - X,
and from
(14) D = (T + G + E2) - (T + E1)
or
(15) D = G + E2 - E1.
The elements in the right side of (15) are mutually independent
normally distributed random variables whose variances
have been found and thus
(16) $\sigma_d^2 = \sigma_{e_1}^2 + \sigma_g^2 + \sigma_{e_2}^2$.
Furthermore since the only difference in (15) for the gain
and no gain groups is the mean of G, the variance of the
difference for the gain group, $\sigma_{d_g}^2$, and the variance of
the difference for the no gain group, $\sigma_{d_{ng}}^2$, are equal, i. e.
(17) $\sigma_{d_g}^2 = \sigma_{d_{ng}}^2 = \sigma_d^2$.
This common value $\sigma_d^2$ may then be used to determine the
appropriate value of $\mu_g$ to produce the desired power of
0.50 for the t test on the raw difference scores.
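The t test on the raw difference scores is found from
the following formula:
(18) $t = \frac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\frac{(n_g - 1) S_g^2 + (n_{ng} - 1) S_{ng}^2}{n_g + n_{ng} - 2} \left( \frac{1}{n_g} + \frac{1}{n_{ng}} \right)}}$
If the group size is 25 and reliability 0.50, the value of
$\mu_g$ is found as follows
(note: rel = 0.50 implies $\sigma_d^2$ = 106.72):
(19) $t = \frac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\frac{24 \sigma_d^2 + 24 \sigma_d^2}{25 + 25 - 2} \left( \frac{1}{25} + \frac{1}{25} \right)}} = \frac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{2 \sigma_d^2 / 25}}$
This value of t is greater than the critical value of t
(2.01) only if
(20) $2.01 < \frac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{2 \sigma_d^2 / 25}}$
that is, only if
(21) $5.86 < \bar{D}_g - \bar{D}_{ng}$.
Thus if a value of 5.86 is chosen for the mean gain, the
power is .50. Appropriate values for other group sizes
and reliabilities were similarly determined.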
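The reasoning behind this choice is that the difference between the two group means is distributed symmetrically about the true mean gain, so setting the true mean gain equal to the smallest difference that reaches significance makes the test reject about half the time. The sketch below evaluates that threshold. It reads the 106.72 quoted in the note above as the variance of the difference scores, which is an assumption, and the function name is illustrative.

```python
import math


def mean_gain_for_half_power(var_d, group_size, t_critical):
    """Smallest difference in mean gain that reaches the critical t value."""
    # Standard error of the difference between two group means of equal size.
    standard_error = math.sqrt(2.0 * var_d / group_size)
    return t_critical * standard_error


value = mean_gain_for_half_power(106.72, 25, 2.01)
print(round(value, 2))
```

With these inputs the function returns a value close to the 5.86 reported in the text; any small gap presumably reflects rounding in the original calculation.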
Analysis of the t Values for the Four Methods
The number of t's significant at the 0.05 level was
recorded for each of the four methods of analysis. These
data were recorded for each of the 20 combinations of sample
size, reliability and gain. A chi square statistic was
calculated for each of these 20 sets to test the null
hypothesis of no difference in the proportion of significant
t values for the four methods of analysis. These data
may be seen in Tables 1 and 2 of Chapter IV.
CHAPTER IV
RESULTS, CONCLUSIONS, AND SUMMARY
Results
The number of significant t's for each method of
analysis under the no gain condition is presented in
Table 1, and for the gain condition in Table 2. Additionally,
the computed chi square statistics for each
reliability level are given. In each case the null hypothesis
tested was that the proportion of significant t's
was the same for each of the four methods of analysis. The
chi square values were computed from the 2 x 4 contingency
tables implied by the corresponding line of the table. For
example, for group size of 25 and a reliability of 0.50,
the 2 x 4 contingency table implied by the first line of
Table 1 is:
                  Raw    Lord's   Regressed   Analysis of
                  gain   gain     gain        covariance
Significant       5      58       2           2
Non significant   95     42       98          98
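The chi square value for this contingency table can be reproduced with the usual formula, the sum over cells of (observed - expected) squared divided by expected. The sketch below is a plain implementation with an illustrative function name; it is not the routine used in the study.

```python
def chi_square(table):
    """Pearson chi square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under equal proportions across the four methods.
            expected = row_totals[i] * column_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


table = [
    [5, 58, 2, 2],      # significant t's: raw, Lord's, regressed, covariance
    [95, 42, 98, 98],   # non significant t's
]
statistic = chi_square(table)
print(round(statistic, 2))
```

The table has (2 - 1)(4 - 1) = 3 degrees of freedom, so the statistic is compared with the critical value of 7.82 given beneath Tables 1 and 2.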
As may be seen by inspection of Tables 1 and 2, all the
chi square values were significant at the 0.05 level and
in each case the hypothesis of equal proportion of significant
t values for the four methods of analysis was rejected.
TABLE 1
NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN
GAIN WAS 0.0 FOR BOTH GROUPS
GROUP SIZE = 25
RELIABILITY   RAW    LORD'S      REGRESSED   ANALYSIS OF   CHI
              GAIN   TRUE GAIN   GAIN        COVARIANCE    SQUARE
0.50          5      58          2           2             163.13
0.60          5      53          4           4             128.98
0.70          7      51          6           6             103.69
0.80          5      24          5           5             30.76
0.90          6      27          4           4             40.95
GROUP SIZE = 100
RELIABILITY   RAW    LORD'S      REGRESSED   ANALYSIS OF   CHI
              GAIN   TRUE GAIN   GAIN        COVARIANCE    SQUARE
0.50                                                       242.95
0.60                                                       178.28
0.70                                                       115.05
0.80                                                       111.60
0.90                                                       59.61
CHI SQUARE (3, .95) = 7.82
TABLE 2
NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE
t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50
GROUP SIZE = 25
RELIABILITY   RAW    LORD'S      REGRESSED   ANALYSIS OF   CHI
              GAIN   TRUE GAIN   GAIN        COVARIANCE    SQUARE
0.50          44     89          54          54            48.80
0.60          52     89          64          64            33.00
0.70          52     86          65          66            28.23
0.80          47     78          50          51            25.43
0.90          50     75          53          52            16.90
GROUP SIZE = 100
RELIABILITY   RAW    LORD'S      REGRESSED   ANALYSIS OF   CHI
              GAIN   TRUE GAIN   GAIN        COVARIANCE    SQUARE
0.50                                                       38.80
0.60                                                       39.48
0.70                                                       24.45
0.80                                                       38.37
0.90                                                       21.35
CHI SQUARE (3, .95) = 7.82
Further inspection of Tables 1 and 2 reveals a higher
number of significant t's for Lord's true gain procedure
than for any other methods. Moreover, examination of Table
1 shows that this particular technique gives a considerably
greater frequency of significant t values than one would
expect by chance. The expected frequency is 5 for the a
priori established condition of no actual difference in the
two populations sampled. These results indicate that use
of Lord's true gain procedure tends to create a higher
significance level than the user would intend. If the
sample proportion of significant t's found in the analysis
is used as an estimate of the significance level, that
estimate is 0.58 for the case when the group size was 25.
For the same group size the lowest estimate of the
significance level is 0.39.
Conclusions
Since the hypothesis of equal proportions of significant
t's for the four methods of analysis was rejected in
each of the 10 cases where the mean gain was 0.0 and since
the use of Lord's true gain scores provided estimated
levels of significance which were considerably higher than
those intended, the use of Lord's true gain scores is
strongly suspect and therefore is not recommended.
No apparent differences were found among the remaining
three methods of analysis. However, there is a similarity
between the regressed gain scores procedure and the
analysis of covariance procedure that should be examined.
The data in Table 1 indicate that the same number of
significant t's was found by both of these methods in the
case where there was no gain for either group.
The 100 t values for each of the four methods of
analysis when the group size was 25 and there was no gain
in either group are presented in Appendix B. Inspection
of the t values for the regressed gain procedure and the
analysis of covariance procedure reveals a striking
similarity between the t values; for each sample the t values
are identical to at least the first decimal place. As a
descriptive statistic it is noted that the correlation
between the t values found by these two methods is 1.00
(rounded to 3 digits). Thus, the two methods are providing
very similar results.
In contrast, the correlation between the t's for the
raw difference and regressed gain procedures is 0.898. The
two methods, regressed gain and analysis of covariance,
are not entirely similar to the raw difference procedure.
It may also be seen from Appendix B that the signs of the
t's from both the regressed gain and analysis of covariance
procedures sometimes are opposite from the sign of the t
for the raw difference procedure.
Snedecor (1956, pp. 397, 398) has indicated that the
regressed gain procedure and the analysis of covariance
procedure on the post scores using pre score as covariate
are identical procedures. No source was found indicating a
similarity between the regressed gain procedure and the
analysis of covariance procedure on the difference scores
using pre score as covariate. However, the two methods
may be shown to be equivalent by writing the linear model
for the regressed gain, or, equivalently, for the analysis
of covariance on the post scores using pre score as covariate,
and the model for the analysis of covariance on the
difference scores using pre score as covariate. The model for
covariance analysis on the post scores is
(1)  Y = B0 + B1 X + B2 Z + E   (Mendenhall, 1968, p. 170),
where X and Y are defined as previously and
Z = 1, if Y is from the gain group,
  = 0, if Y is not from the gain group,
E = a normally distributed random error with mean 0.
The model for covariance analysis on the difference scores
is
(2)  D = B0 + B1 X + B2 Z + E
where
(3)  D = Y - X
and all other elements are defined as in (1). Now if the
right side of (3) is substituted for D in (2) and the resultant
equation is rearranged to yield
(4)  Y = B0 + (B1 + 1.0) X + B2 Z + E
it is seen that (1) and (4) are identical except for the
addition of 1.0 to B1 of equation (1) and thus the two
methods will yield the same t values for testing the
hypothesis that B2 is equal to 0.0.
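The algebraic argument can also be checked numerically. The Python sketch below is an illustration only and is not the program used in this study (that program is listed in Appendix A). It fits the post score model and the difference score model by ordinary least squares to simulated scores and compares the t values for B2. The function names, the seed, and the score parameters are arbitrary assumptions made for the example.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with row pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_for_b2(response, pre, z):
    """t statistic for B2 in response = B0 + B1*pre + B2*z + E."""
    rows = [(1.0, x, g) for x, g in zip(pre, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(3)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(3)) for i in range(3)]
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, response))
    mse = sse / (len(rows) - 3)          # error mean square on 2N - 3 df
    return beta[2] / math.sqrt(mse * inv[2][2])


rng = random.Random(1970)                # arbitrary seed
pre, post, z = [], [], []
for group in (1.0, 0.0):                 # 1 = gain group, 0 = no gain group
    for _ in range(25):
        true_score = 50.0 + rng.gauss(0.0, 7.0)
        pre.append(true_score + rng.gauss(0.0, 7.0))
        post.append(true_score + rng.gauss(0.0, 7.0))
        z.append(group)
diff = [y - x for x, y in zip(pre, post)]

t_post = t_for_b2(post, pre, z)          # model (1)
t_diff = t_for_b2(diff, pre, z)          # model (2)
print(t_post, t_diff)
```

The two printed values should agree to within rounding error, because changing the response from Y to Y - X shifts only the coefficient of X and leaves the residuals and the estimate of B2 unchanged.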
Since no clear difference was found among the raw gain
procedure, the regressed gain procedure and the analysis
of covariance procedure, none of these is recommended as
more appropriate for the analysis of change than the others.
All of these three procedures are recommended above Lord's
true gain procedure.
Discussion
It is reasonable to ask if there is some questionable
logic in Lord's derivation of true gain scores. Two
things become apparent upon examination of the derivation.
First the formula which Lord uses to begin his derivation,
(3) of Chapter II, requires that the independent variable
be known exactly (Madansky, 1959; Scheffe', 1959, p. 4),
i.e., without error of measurement. The problem of estimating
true gain arises in that the pre and post test scores
are not known exactly, but instead the observed scores, or
the true scores plus measurement errors, are known, as may
be seen from Lord's models of the observed scores, (1) and
(2) in Chapter II. If true pre and true post scores were
known these could be put into the regression equation.
However, if true pre score and true post score were known
there would be no need for the regression equation to
estimate gain. The gain could be obtained simply by subtracting
true pre score from true post score. In short,
Lord seems to have assumed his conclusion in his derivation.
Second, a look at the basic method for estimating true
gain is enlightening. In order to estimate true gain from
observed score it is necessary to somehow remove the error
of measurement since, assuming the test to be valid, this
is the factor that obscures the true score. Mendenhall
(1968, p. 1) says that "...statistics is a theory of
information...." What information is known concerning the errors
of measurement? By Lord's assumption (iii) of Chapter II
the errors are uncorrelated with true score and with each
other. Thus neither the observed pre score nor the observed
post score should provide any information concerning the
size of the error. Since this error is random it would
seem that it could not be removed from the observed scores.
Consider two equal observed scores, one obtained from a
higher true score by the addition of a negative error, the
other obtained from a lower true score by the addition of
an error of the same size but opposite sign of the previous
error. How does one decide which score is to have a positive
correction added and which is to have a negative
correction added? It would appear that one cannot make
this decision without having some information besides the
observed scores.
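The point can be made concrete with a small Python sketch. The scores and the correction rule below are hypothetical values chosen only for illustration; they are not taken from Lord's derivation or from the data of this study.

```python
# Two hypothetical examinees whose observed scores coincide.
examinees = [
    {"true": 65.0, "error": -5.0},   # higher true score, negative error
    {"true": 55.0, "error": +5.0},   # lower true score, positive error
]
observed = [e["true"] + e["error"] for e in examinees]


def corrected(observed_score, group_mean=50.0, weight=0.5):
    """A hypothetical correction that uses only the observed score."""
    return group_mean + weight * (observed_score - group_mean)


estimates = [corrected(score) for score in observed]
print(observed, estimates)
```

Both examinees receive the same corrected value, since any rule that depends only on the observed score cannot separate them, although their true scores differ by 10 points.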
A Direction for Future Research
Since this study shows no difference among the proportions
of significant t's for the raw gain, regressed gain,
and the analysis of covariance procedures, it would be
interesting to investigate the use of these procedures
under assumptions other than the models listed in the first
section of Chapter III. Differences may occur, for example,
when the gain is a linear function of the pre score.
Summary
This study compared four selected measures of change.
The four measures were: raw difference, Lord's true gain,
regressed gain, and analysis of covariance procedures. An
empirical comparison was made among these four methods.
Samples were generated using Monte Carlo techniques and
the data in each sample were analyzed by each of the four
methods.
It was found that Lord's true gain procedure produced
a number of spurious significant t values, greater than
would be expected by chance, when there was no real difference
in amount of gain between the two populations sampled.
No apparent differences were noted among the remaining
three methods and these three methods did not appear to
have inflated significance levels. With such data use of
Lord's true gain procedure is not recommended and none
of the remaining three methods was recommended over the
others.
APPENDIX A
FORTRAN PROGRAM WHICH PERFORMED THE CALCULATIONS
      DIMENSION X(100,2,2),LORD(100,2),DIFF(100,2)
     1,AMAT(3,3),DIFFX(3),DUM(3)
      DOUBLE PRECISION SEED
      REAL KR21,LORD,MPRG,MPRNG,MPOG,MPONG,MLG,MLNG,MRGG,MRGNG
      READ (5,1) N,SEED,NSAMP,KSAMP,NOPT
    1 FORMAT(I3,F11.0,3I4)
      IF(NOPT.EQ.0) GO TO 3
      READ(5,2) IREL,ISAMP
C IREL RESTART RELIABILITY AT IREL FOR ABORTED RUN
C ISAMP RESTART SAMPLE NUMBER AT ISAMP FOR ABORTED RUN
    2 FORMAT(2I4)
      GO TO 4
    3 IREL=1
      ISAMP=1
    4 CONTINUE
C
C N SIZE OF SAMPLE
C GAIN AVERAGE GAIN IN GAIN GROUP
C SEED SEED FOR RANDOM NUMBER GENERATOR
C KSAMP NUMBER OF SAMPLES TO BE TAKEN AT EACH LEVEL
C NSAMP NUMBER OF THE LAST SAMPLE FROM THE PREVIOUS RUN
C INCREMENTED AND PRINTED OUT AS THE SAMPLE NUMBER
C OF RELIABILITY
C
      TT=0.0
      DO 1000 IR=IREL,5
      READ(5,14) GAIN
   14 FORMAT(F6.4)
      DO 902 JI=ISAMP,KSAMP
      NSAMP=NSAMP+1
C
C S SUM
C SS SUM OF SQUARES
C D DIFFERENCE SCORE
C G GAIN GROUP
C NG NO GAIN GROUP
C PR PRE SCORE
C PO POST SCORE
C PP PRE X POST
C PRDIFF SUM PR X DIFF
C
      SDG   =0.0
      SDNG  =0.0
      SSDG  =0.0
      SSDNG =0.0
      SPRG  =0.0
      SPRNG =0.0
      SSPRG =0.0
      SSPRNG=0.0
      SPOG  =0.0
      SPONG =0.0
      SSPOG =0.0
      SSPONG=0.0
      SSPPG =0.0
      SSPPNG=0.0
      PRDIFF=0.0
      REL=0.40+0.10*IR
SEI= SQRT(100.0SE1*SE1)
      SX=SQRT(100.0-SE1*SE1)
      SE2=SQRT((SE1*SE1+3.36)/REL-3.36-SE1*SE1)
      GAIN=TT*SQRT(2.0*(SE1*SE1+SE2*SE2+3.36)/N)
C X(I,J,K) I=STUDENT
C          J=1, PRE SCORE
C           =2, POST SCORE
C          K=1, GAIN
C           =2, NO GAIN
C
C DIFF(I,J) I=STUDENT
C           J=1, GAIN
C            =2, NO GAIN
C
      DO 10 I=1,N
      D1=SX*RAND(SEED)
      D2=SX*RAND(SEED)
      X(I,1,1)=50+D1+SE1*RAND(SEED)
      X(I,2,1)=50+D1+GAIN+1.83*RAND(SEED)+SE2*RAND(SEED)
      X(I,1,2)=50+D2+SE1*RAND(SEED)
      X(I,2,2)=50+D2+1.83*RAND(SEED)+SE2*RAND(SEED)
      DIFF(I,1)=X(I,2,1)-X(I,1,1)
      DIFF(I,2)=X(I,2,2)-X(I,1,2)
      SDG=SDG+DIFF(I,1)
      SDNG=SDNG+DIFF(I,2)
      SSDG=SSDG+DIFF(I,1)*DIFF(I,1)
      SSDNG=SSDNG+DIFF(I,2)*DIFF(I,2)
      SPRG=SPRG+X(I,1,1)
      SPRNG=SPRNG+X(I,1,2)
      SSPRG=SSPRG+X(I,1,1)*X(I,1,1)
      SSPRNG=SSPRNG+X(I,1,2)*X(I,1,2)
      SPOG=SPOG+X(I,2,1)
      SPONG=SPONG+X(I,2,2)
      SSPOG=SSPOG+X(I,2,1)*X(I,2,1)
      SSPONG=SSPONG+X(I,2,2)*X(I,2,2)
      SSPPG=SSPPG+X(I,1,1)*X(I,2,1)
      SSPPNG=SSPPNG+X(I,1,2)*X(I,2,2)
   10 PRDIFF=PRDIFF+DIFF(I,1)*X(I,1,1)+DIFF(I,2)*X(I,1,2)
      VAPRG=(SSPRG-SPRG*SPRG/N)/(N-1)
      VAPRNG=(SSPRNG-SPRNG*SPRNG/N)/(N-1)
      VAPOG=(SSPOG-SPOG*SPOG/N)/(N-1)
      VAPONG=(SSPONG-SPONG*SPONG/N)/(N-1)
      CPPG=(SSPPG-SPRG*SPOG/N)/SQRT((SSPRG-SPRG*SPRG/N)*(SSPOG
     1-SPOG*SPOG/N))
      CPPNG=(SSPPNG-SPRNG*SPONG/N)/SQRT((SSPRNG-SPRNG*SPRNG/N)*(
     1SSPONG-SPONG*SPONG/N))
      DBARG=SDG/N
      DBARNG=SDNG/N
      VADG=(SSDG-SDG*SDG/N)/(N-1)
      VADNG=(SSDNG-SDNG*SDNG/N)/(N-1)
      T1=(DBARG-DBARNG)/SQRT(((N-1)*(VADG+VADNG)/(2*N-2))*(2.0/N
     1))
      B1G=(((1.0-REL)*CPPG*SQRT(VAPOG))/SQRT(VAPRG)-REL+CPPG*CPPG)/
     1(1.0-CPPG*CPPG)
      B2G=(REL-CPPG*CPPG-((1.0-REL)*SQRT(VAPRG)*CPPG)/SQRT(VAPOG
     1))/(1.0-CPPG*CPPG)
      B1NG=(((1.0-REL)*CPPNG*SQRT(VAPONG))/SQRT(VAPRNG)-REL+CPPNG
     1*CPPNG)/(1.0-CPPNG*CPPNG)
      B2NG=(REL-CPPNG*CPPNG-((1.0-REL)*SQRT(VAPRNG)*CPPNG)/SQRT(
     1VAPONG))/(1.0-CPPNG*CPPNG)
      SLG  =0.0
      SSLG =0.0
      SLNG =0.0
      SSLNG=0.0
      MPRG=SPRG/N
      MPOG=SPOG/N
      MPRNG=SPRNG/N
      MPONG=SPONG/N
      DO 110 I=1,N
      LORD(I,1)=DBARG+B1G*(X(I,1,1)-MPRG)+B2G*(X(I,2,1)-MPOG)
      LORD(I,2)=DBARNG+B1NG*(X(I,1,2)-MPRNG)+B2NG*(X(I,2,2)
     1-MPONG)
      SLG=SLG+LORD(I,1)
      SLNG=SLNG+LORD(I,2)
      SSLG=SSLG+LORD(I,1)*LORD(I,1)
  110 SSLNG=SSLNG+LORD(I,2)*LORD(I,2)
      MLG=SLG/N
      MLNG=SLNG/N
      VALG=(SSLG-SLG*SLG/N)/(N-1.0)
      VALNG=(SSLNG-SLNG*SLNG/N)/(N-1.0)
      T2=(MLG-MLNG)/SQRT(((SSLG-SLG*SLG/N)+SSLNG-SLNG*SLNG/N)/(N*
     1(N-1)))
      A=(SSPPG+SSPPNG-(SPRG+SPRNG)*(SPOG+SPONG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      B=(SPOG+SPONG)/(2*N)-A*(SPRG+SPRNG)/(2*N)
      SRGG=0.0
      SSRGG=0.0
      SRGNG=0.0
      SSRGNG=0.0
      DO 210 I=1,N
      RGSG=X(I,2,1)-A*X(I,1,1)-B
      RGSNG=X(I,2,2)-A*X(I,1,2)-B
      SRGG=SRGG+RGSG
      SSRGG=SSRGG+RGSG*RGSG
      SRGNG=SRGNG+RGSNG
  210 SSRGNG=SSRGNG+RGSNG*RGSNG
      MRGG=SRGG/N
      MRGNG=SRGNG/N
      VARGG=(SSRGG-SRGG*SRGG/N)/(N-1)
      VARGNG=(SSRGNG-SRGNG*SRGNG/N)/(N-1)
      T3=(MRGG-MRGNG)/SQRT((SSRGG-SRGG*SRGG/N+SSRGNG-SRGNG*
     1SRGNG/N)/(N*(N-1)))
      AMAT(1,1)=2*N
      AMAT(1,2)=N
      AMAT(1,3)=SPRG+SPRNG
      AMAT(2,1)=AMAT(1,2)
      AMAT(2,2)=N
      AMAT(2,3)=SPRG
      AMAT(3,1)=AMAT(1,3)
      AMAT(3,2)=AMAT(2,3)
      AMAT(3,3)=SSPRG+SSPRNG
  900 CONTINUE
      CALL INV(AMAT)
      DIFFX(1)=SDG+SDNG
      DIFFX(2)=SDG
      DIFFX(3)=PRDIFF
      YXXXXY=0.0
      DO 410 I=1,3
      DUM(I)=0.0
      DO 405 J=1,3
  405 DUM(I)=DUM(I)+DIFFX(J)*AMAT(I,J)
  410 YXXXXY=YXXXXY+DUM(I)*DIFFX(I)
      SSE=SSDG+SSDNG-YXXXXY
      VAACOV=SSE/(2.0*N-3)
      T4=DUM(2)/SQRT(VAACOV*AMAT(2,2))
      AA=(PRDIFF-(SPRG+SPRNG)*(SDG+SDNG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      AMG=DBARG-AA*(MPRG-(MPRG+MPRNG)/2)
      AMNG=DBARNG-AA*(MPRNG-(MPRG+MPRNG)/2)
      KR21=1.0-(MPRG*(100-MPRG))/(100*VAPRG)
      WRITE(6,501) NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,MRGG,MRGNG,AMG,AMNG,SEED,VAPRG,VAPRNG,VAPOG,
     2VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  501 FORMAT(I6,1X,2(F3.2,1X),1X,4(F7.3),5(4X,2F6.2)/5X,F23.11,
     116X,4(4X,2F6.1),8X,F6.2)
      WRITE(7,502)NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,NSAMP,MRGG,MRGNG,AMG,AMNG,VAPRG,VAPRNG,
     2VAPOG,VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  502 FORMAT(I6,2F3.2,4F7.3,6F6.2,3X,'1'/I6,4F6.2,8F5.1,F6.2,3X,
     1'2')
  902 CONTINUE
      ISAMP=1
 1000 CONTINUE
      STOP
      END
      FUNCTION RAND(RO)
      DOUBLE PRECISION RO
      RO=DMOD(RO*30517578125.D0,34359738368.D0)
      X=RO/34359738368.D0
      Y=SIGN(1.0,X-0.5)
      V=SQRT(-2.0*ALOG(0.5*(1.0-ABS(1.0-2.0*X))))
      RAND=Y*(V-(2.515517+0.802853*V+0.010328*
     1V**2)/(1.0+1.432788*V+0.189269*V**2+0.001308*V**3))
      RETURN
      END
      SUBROUTINE INV(A)
C PROGRAM FOR FINDING THE INVERSE OF A 3X3 MATRIX
      DIMENSION A(3,3),L(3),M(3)
      DATA N/3/
C SEARCH FOR LARGEST ELEMENT
      DO 80 K=1,N
      L(K)=K
      M(K)=K
      BIGA=A(K,K)
      DO 20 I=K,N
      DO 20 J=K,N
      IF(ABS(BIGA)-ABS(A(I,J))) 10,20,20
   10 BIGA=A(I,J)
      L(K)=I
      M(K)=J
   20 CONTINUE
C INTERCHANGE ROWS
      J=L(K)
      IF(L(K)-K) 35,35,25
   25 DO 30 I=1,N
      HOLD=-A(K,I)
      A(K,I)=A(J,I)
   30 A(J,I)=HOLD
C INTERCHANGE COLUMNS
   35 I=M(K)
      IF(M(K)-K) 45,45,37
   37 DO 40 J=1,N
      HOLD=-A(J,K)
      A(J,K)=A(J,I)
   40 A(J,I)=HOLD
C DIVIDE COLUMN BY MINUS PIVOT
   45 DO 55 I=1,N
   46 IF(I-K) 50,55,50
   50 A(I,K)=A(I,K)/(-A(K,K))
   55 CONTINUE
C REDUCE MATRIX
      DO 65 I=1,N
      DO 65 J=1,N
   56 IF(I-K) 57,65,57
   57 IF(J-K) 60,65,60
   60 A(I,J)=A(I,K)*A(K,J)+A(I,J)
   65 CONTINUE
C DIVIDE ROW BY PIVOT
      DO 75 J=1,N
   68 IF(J-K) 70,75,70
   70 A(K,J)=A(K,J)/A(K,K)
   75 CONTINUE
C CONTINUED PRODUCT OF PIVOTS
C REPLACE PIVOT BY RECIPROCAL
      A(K,K)=1.0/A(K,K)
   80 CONTINUE
C FINAL ROW AND COLUMN INTERCHANGE
      K=N
  100 K=(K-1)
      IF(K) 150,150,103
  103 I=L(K)
      IF(I-K) 120,120,105
  105 DO 110 J=1,N
      HOLD=A(J,K)
      A(J,K)=-A(J,I)
  110 A(J,I)=HOLD
  120 J=M(K)
      IF(J-K) 100,100,125
  125 DO 130 I=1,N
      HOLD=A(K,I)
      A(K,I)=-A(J,I)
  130 A(J,I)=HOLD
      GO TO 100
  150 RETURN
      END
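For readers more familiar with a current language, the random normal deviate function RAND listed in this appendix can be sketched in Python as follows. This is a reading of the listing and not the original program: the function names are hypothetical, the generator is taken to be a multiplicative congruential generator with multiplier 5**15 and modulus 2**35, and the uniform value is converted to a normal deviate by the rational approximation whose coefficients appear in the listing.

```python
import math

MULTIPLIER = 30517578125      # 5**15, as in the FORTRAN listing
MODULUS = 34359738368         # 2**35, as in the FORTRAN listing


def next_seed(seed):
    """One step of the multiplicative congruential generator."""
    return (seed * MULTIPLIER) % MODULUS


def normal_from_uniform(x):
    """Approximate standard normal deviate for a uniform value 0 < x < 1."""
    sign = 1.0 if x >= 0.5 else -1.0
    tail = 0.5 * (1.0 - abs(1.0 - 2.0 * x))      # distance to the nearer end
    v = math.sqrt(-2.0 * math.log(tail))
    numerator = 2.515517 + 0.802853 * v + 0.010328 * v ** 2
    denominator = 1.0 + 1.432788 * v + 0.189269 * v ** 2 + 0.001308 * v ** 3
    return sign * (v - numerator / denominator)


def rand(seed):
    """Return the updated seed and one approximate normal deviate."""
    seed = next_seed(seed)
    return seed, normal_from_uniform(seed / MODULUS)


seed, deviate = rand(12345)   # arbitrary odd starting seed
print(seed, deviate)
```

An odd starting seed keeps the generator away from zero, for which the logarithm is undefined. Unlike the FORTRAN function, which updates its argument in place, this sketch returns the new seed explicitly.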
APPENDIX B
LISTING OF t's FOR EACH METHOD OF ANALYSIS
WITH THE GROUP SIZE 25 AND GAIN 0.0
SAMPLE    RAW       LORD'S      REGRESSED   ANALYSIS OF
NUMBER    GAIN      TRUE GAIN   GAIN        COVARIANCE
   1      0.334      0.935       0.597       0.592
   2      0.052      0.175       0.257       0.254
   3      1.428      3.180       1.340       1.330
   4      0.888      3.837       1.441       1.453
   5      0.318      0.925       0.153       0.152
   6      0.005      0.013       0.903       0.906
   7      2.297     25.767       0.899       0.986
   8      0.288      1.746       0.252       0.249
   9      0.904      2.464       1.383       1.370
  10      1.008      1.691       1.013       1.003
  11      0.186      0.664       0.243       0.241
  12      0.051      0.040       0.255       0.253
  13      0.232      1.325       0.161       0.159
  14      0.585      2.182       0.959       0.958
  15      2.239     10.504       1.702       1.720
  16      0.575      2.038       0.143       0.152
  17      1.297      2.736       0.717       0.725
  18      0.538      2.794       0.269       0.267
  19      0.674      4.600       0.638       0.632
  20      0.954      5.083       1.758       1.767
  21      0.235      0.937       0.379       0.375
  22      0.036      0.075       0.867       0.864
  23      0.451      2.305       0.564       0.558
  24      1.324      2.824       1.661       1.648
  25      0.856      2.220       0.518       0.516
  26      1.240      4.498       0.735       0.739
  27      0.221      0.910       0.112       0.112
  28      0.006      0.019       0.163       0.161
  29      0.165      0.609       0.250       0.247
  30      0.076      0.251       0.164       0.163
  31      0.610      2.692       0.920       0.912
  32      1.242      2.416       1.441       1.427
  33      0.696      2.768       0.608       0.603
  34      0.410      2.427       0.225       0.224
  35      0.076      0.379       0.254       0.252
  36      0.670      2.169       0.894       0.884
  37      2.034      6.352       1.791       1.790
  38      1.216      7.274       0.966       0.961
  39      0.029      0.143       0.813       0.829
  40      1.205      3.964       1.347       1.334
  41      0.361      0.995       0.729       0.722
LISTING OF t's (CONTINUED)
SAMPLE    RAW       LORD'S      REGRESSED   ANALYSIS OF
NUMBER    GAIN      TRUE GAIN   GAIN        COVARIANCE
  42      1.042      1.762       0.975       0.966
  43      1.556     16.696       1.728       1.710
  44      0.511      3.038       0.184       0.183
  45      1.295      7.504       1.783       1.780
  46      0.669      4.977       1.026       1.026
  47      0.382      1.210       0.640       0.634
  48      0.228      0.216       0.110       0.110
  49      0.922      5.810       0.943       0.933
  50      1.263      3.021       1.114       1.106
  51      0.276      0.763       0.941       0.938
  52      0.496      1.975       0.757       0.749
  53      1.385      6.511       1.583       1.572
  54      0.087      0.187       0.161       0.160
  55      1.594      2.591       1.646       1.630
  56      0.489      3.276       0.233       0.239
  57      1.450      3.217       1.298       1.290
  58      1.146      4.044       0.657       0.657
  59      1.754      7.198       1.535       1.530
  60      0.533      8.090       0.255       0.253
  61      0.484      2.971       0.083       0.083
  62      1.155      3.037       1.766       1.749
  63      1.321     20.233       1.620       1.604
  64      0.130      1.233       0.008       0.003
  65      0.721      2.349       0.514       0.510
  66      0.556      1.149       0.084       0.084
  67      0.178      0.332       0.434       0.432
  68      0.146      0.465       0.496       0.493
  69      1.290      4.543       1.526       1.514
  70      1.904      4.395       1.858       1.844
  71      0.572      5.132       0.470       0.466
  72      0.923      2.998       1.140       1.128
  73      0.034      0.106       0.004       0.004
  74      0.123      0.546       0.026       0.026
  75      0.586      2.404       0.920       0.911
  76      2.622       .344       2.917       2.895
  77      0.034      0.078       1.368       1.382
  78      0.259      0.303       0.257       0.254
  79      0.597      2.141       0.577       0.605
  80      0.065      0.197       0.404       0.412
  81      0.611      1.834       0.162       0.162
  82      0.729      0.579       1.078       1.073
  83      0.967      2.813       1.069       1.058
  84      1.275      5.166       1.077       1.071
  85      1.259      4.430       0.797       0.302
LISTING OF t's (CONTINUED)
SAMPLE    RAW       LORD'S      REGRESSED   ANALYSIS OF
NUMBER    GAIN      TRUE GAIN   GAIN        COVARIANCE
  86      1.736     16.635       0.996       1.010
  87      0.835      4.328       0.406       0.408
  88      0.956      4.608       0.633       0.630
  89      0.930      3.109       0.924       0.915
  90      0.551      1.689       0.579       0.573
  91      0.121      0.420       0.167       0.165
  92      1.352      9.886       1.212       1.204
  93      2.337     12.778       2.149       2.145
  94      0.921      2.005       0.815       0.809
  95      0.599      3.848       0.442       0.438
  96      0.756      1.445       1.030       1.020
  97      0.155      0.659       1.004       1.019
  98      0.091      0.168       0.162       0.160
  99      1.449     57.894       0.194       0.204
 100      0.125      0.638       0.218       0.216
BIBLIOGRAPHY
Anastasi, Anne Psychological Testing. New York:
Macmillan, 1961.
Bigge, Morris L. Learning Theories for Teachers. New York:
Harper and Row, 1964.
Combs, A. W., Snygg, Donald Individual Behavior. New York:
Harper and Row, 1959.
Cronbach, Lee J., Furby, Lita "How Should We Measure
Change--or Should We?" Psychological Bulletin, 1970,
74: 68-80.
Hays, William L. Statistics for Psychologists. New York:
Holt, Rinehart, and Winston, 1963.
Hilgard, Ernest H. Theories of Learning. New York:
Appleton, 1956.
Lord, F. M. "The Measurement of Growth," Educational and
Psychological Measurement, 1956, 16: 421-437.
Lord, F. M. "Statistical Inferences about True Scores,"
Psychometrika, 1959, 24: 1-17.
Lord, F. M. "Elementary Models for Measuring Change," in
C. W. Harris Problems in Measuring Change. Madison,
Wisconsin: University of Wisconsin Press, 1963,
pp. 21-38.
McNemar, Q. "On Growth Measurement," Educational and
Psychological Measurement, 1958, 18: 47-55.
Madansky, Albert "The Fitting of Straight Lines When Both
Variables Are Subject to Error," Journal of the
American Statistical Association, 1959, 54: 173-205.
Manning, Winton H., DuBois, Philip H. "Correlational
Methods in Research on Human Learning," Perceptual
and Motor Skills, 1962, 15: 287-321.
Mendenhall, William Introduction to Linear Models and the
Design and Analysis of Experiments. Belmont,
California: Wadsworth, 1968.
Meyer, Paul Introductory Probability and Statistical
Application. Reading, Massachusetts: Addison-Wesley,
1965.
Nunnally, Jum C. Psychometric Theory. New York:
McGraw-Hill, 1967.
Ohnmacht, Fred W. "Correlates of Change in Academic
Achievement," Journal of Educational Measurement,
1968, 5: 41-44.
Rosenthal, Myron R. Numerical Methods in Computer
Programming. Homewood, Illinois: Irwin, 1966.
Scheffe', Henry The Analysis of Variance. New York: Wiley,
1959.
Schick, George B. (ed), May, Merril M. (ed) The Psychology
of Reading. Milwaukee: The National Reading Conference,
Inc., 15th Yearbook 1969, Ranking, Earl F., Jr., Dale,
Lothar H. pp. 172L.
Skinner, B. F. The Technology of Teaching. New York:
Appleton, 1968.
Snedecor, George W. Statistical Methods. Ames, Iowa:
Iowa State College Press, 1956.
Soar, Robert S. "Optimum Teacher-Pupil Interaction for
Summer Growth," Educational Leadership Research
Supplement, 1968, 26(3): 275-280.
Tate, Merle W. Statistics in Education and Psychology.
New York: Macmillan, 1965.
Thorndike, E. L. "The Influence of Chance Imperfections of
Measures Upon the Relation of Initial Score to Gain
or Loss," Journal of Experimental Psychology, 1924,
7: 225-232.
Tillman, Chester E. Crude Gain VS. True Gain: Correlates
of Gain in Reading after Remedial Tutoring. Doctoral
dissertation, University of Florida, 1969.
Winer, B. J. Statistical Principles in Experimental Design.
New York: McGraw-Hill, 1962.
BIOGRAPHICAL SKETCH
John Howard Neel was born July 27, 1944, at Waynesburg,
Pennsylvania. He graduated from William R. Boone High
School, Orlando, Florida, in June, 1962. In August, 1965,
he received the degree Bachelor of Arts with a major in
mathematics from the University of Florida. He taught
algebra and general mathematics at John F. Kennedy Junior
High School from September, 1965, until June, 1966. In
September, 1966, he enrolled in the College of Education
at the University of Florida under a United States Office
of Education fellowship program directed by Dr. Wilson H.
Guertin. In September, 1968, he accepted a research
assistantship under the same program. In June, 1968, he
received the degree Master of Arts in Education. He was
an instructor in the College of Education at the University
of South Florida from September, 1968, until August, 1969,
and is currently on leave from that position. In September,
1969, he was appointed Interim Instructor in the College of
Education at the University of Florida and he holds that
position currently.
John Howard Neel is married to the former Carol Lynn
Ramft. They have two daughters, Sarah Elizabeth and Lia
Suzanne.
John Howard Neel is a member of the Florida Educational
Research Association, The American Educational Research
Association, Phi Delta Kappa, and the American Statistical
Association.
This dissertation was prepared under the direction of
the chairmen of the candidate's supervisory committee and
has been approved by all members of that committee. It was
submitted to the Dean of the College of Education and to
the Graduate Council, and was approved as partial
fulfillment of the requirements for the degree of Doctor of
Philosophy.
August, 1970
Dean, College of Education
Dean, Graduate School
Supervisory Committee:
