Citation
A Comparative analysis of some measures of change

Material Information

Title:
A Comparative analysis of some measures of change
Creator:
Neel, John Howard, 1944-
Publication Date:
Copyright Date:
1970
Language:
English
Physical Description:
viii, 54 leaves. : ; 28 cm.

Subjects

Subjects / Keywords:
Covariance ( jstor )
Educational research ( jstor )
Error rates ( jstor )
Group size ( jstor )
Learning ( jstor )
Mathematical procedures ( jstor )
Proportions ( jstor )
Statistical discrepancies ( jstor )
Statistics ( jstor )
T tests ( jstor )
Dissertations, Academic -- Foundations of Education -- UF ( lcsh )
Foundations of Education thesis Ph. D ( lcsh )
Mathematical statistics ( lcsh )
Probabilities ( lcsh )
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis--University of Florida, 1970.
Bibliography:
Bibliography: leaves 51-52.
Additional Physical Form:
Also available on World Wide Web
General Note:
Manuscript copy.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
029482338 ( AlephBibNum )
AEG9071 ( NOTIS )
014346100 ( OCLC )


Full Text









A COMPARATIVE ANALYSIS OF

SOME MEASURES OF CHANGE















By
JOHN HOWARD NEEL


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY











UNIVERSITY OF FLORIDA


1970





























COPYRIGHT BY

JOHN HOWARD NEEL

1970












ACKNOWLEDGEMENTS


The writer wishes to thank his committee members for

their guidance in this study, especially Dr. Charles M.

Bridges, Jr., who suggested the topic and was a constant

source of assistance throughout the study. When this study

was begun there was a questioning of change scores in the

department which was helpful and encouraging. Those

responsible for this atmosphere were Dr. Charles M. Bridges,

Jr., Dr. Robert S. Soar, Dr. William B. Ware and Mr. Keith

Brown.

Dr. Vynce A. Hines and Dr. P. V. Rao assisted by each

detecting an error in the model presented.

Dr. William B. Ware was especially helpful editorially,

as was Dr. Wilson H. Guertin.

The writer's wife, Carol, was encouraging, and under-

standing of the time which was necessarily spent away from

home.















TABLE OF CONTENTS


ACKNOWLEDGMENTS

LIST OF TABLES

ABSTRACT

CHAPTER

I. INTRODUCTION

The Problem of Measuring Change
Methods of Analyzing Change
The Problem
Some Limitations
Procedures
Significance of the Study
Organization of the Study

II. RELATED LITERATURE

Derivation of Lord's True Gain Scores
Comparison of Lord's True Gain Scores with Other Scores
Comparison of Regressed Gain Scores with Other Scores
Another Study and Summary

III. METHODS AND PROCEDURES

Procedures: An Overview
Sampling from a Normal Population with Specified Mean and Variance
Selecting Reliability
Selecting Gain
Analysis of the t Values for the Four Methods

IV. RESULTS, CONCLUSIONS, AND SUMMARY

Results
Conclusions
Discussion
A Direction for Future Research
Summary

APPENDIX

A. FORTRAN PROGRAM

B. LIST OF t's FOR THE FOUR METHODS

BIBLIOGRAPHY

BIOGRAPHICAL SKETCH












LIST OF TABLES


Table

NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS

NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50







Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


A COMPARATIVE ANALYSIS
OF SOME MEASURES OF CHANGE

by

John Howard Neel

August, 1970

Chairman: Wilson H. Guertin
Co-Chairman: Charles M. Bridges, Jr.
Major Department: Foundations of Education


The purpose of this study was to determine which of

four selected methods was most appropriate for measuring

change such as gain in achievement. The four selected

methods were raw difference, Lord's true gain, regressed

gain, and analysis of covariance procedures.

In order to compare the four methods Monte Carlo

techniques were employed to generate samples of pre and

post scores for two groups. The reliability, variances,

and means of the sampled populations were controlled. One

hundred samples were generated at each of 20 combinations

of five levels of reliability, two levels of group size,

and two levels of gain. Using each of the four methods,

a t statistic was calculated for each sample to test the

null hypothesis of no difference in amount of gain between

the two groups. The number of t's significant at the 0.05

level of significance was recorded for each of the four

methods.









At each of the 20 combinations a chi square test was

used to test the null hypothesis of equal proportions of

significant t's among the four groups. This hypothesis

was rejected in each case. It was noted that use of Lord's

true gain procedure tended to create a greater significance

level than the user would intend. The proportion of

significant t's for each of the other three methods of

analysis fell reasonably close to the expected values. On

this basis use of Lord's true gain procedure was not

recommended and since there was no apparent difference among

the remaining methods of analysis, none was recommended

above the other.













CHAPTER I
INTRODUCTION


Learning has long been a primary focus of investi-

gation for educators. A definition of learning has been

offered by Hilgard (1956):

Learning is the process by which an activity
originates or is changed through reacting to
an encountered situation, provided that the
change in activity cannot be explained on the
basis of native response tendencies, maturation,
or temporary states of the organism (e. g. fatigue,
drugs, etc.). (p. 3)

Although Hilgard went on to say that the definition

is not perfect, it does illustrate a commonly accepted

aspect of learning: learning involves change of the

behavior of the organism which learns (Bigge, 1964, p. 1;

Skinner, 1968, p. 10; Combs, 1959, p. 88).

Educators have been concerned with this change and

often have sought to measure the change occurring in some

situation. Several methods of analyzing change have been

presented in the literature. A comparison of these methods

of analysis of the measurement of change and the

accompanying difficulties which arise was the focus of this study.

The purpose of the study was to determine which of the

several selected measures of change is most appropriate

under various conditions.







The Problem of Measuring Change

In all sciences measurement is an approximation. In

conducting a first order survey, a surveyor makes three

measurements of a distance and takes the average of the

three measurements. Physicists and engineers customarily

report the relative size of the error of their measurements.

Physical scientists have been fortunate in that the size

of the relative error involved in their measurements

often has been small, frequently less than 0.01 and some-

times less than 10 Educators are unfortunate in this

respect in that if a student's true I. Q. were 100 and an

I. Q. score of 88 was observed, the relative error would

be 0.14. The size of this error is not uncommon and larger

relative errors do occur. The Stanford-Binet intelligence

test has standard error of measurement equal to five I. Q.

points (Anastasi, 1961, p. 200). Thus, relative errors of

0.14 or larger will occur 1.64 per cent of the time,

assuming a normal distribution of errors.
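
The two figures above can be checked with a short calculation. The sketch below assumes, as the text does, normally distributed errors with a standard error of measurement of five points, so that an error of 12 points is 2.4 standard errors. It also assumes the relative error is taken relative to the observed score of 88, which the text does not state explicitly. The function name is illustrative only.

```python
import math

def two_tailed_normal_prob(z):
    """Return P(|Z| >= z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2.0))

sem = 5.0              # standard error of measurement, in I.Q. points
observed = 88.0
error = 100.0 - observed

relative_error = error / observed                         # assumption: base is the observed score
percent_as_large = 100.0 * two_tailed_normal_prob(error / sem)

print(round(relative_error, 2))    # 0.14
print(round(percent_as_large, 2))  # 1.64
```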

In measuring change the problem is compounded since

there is an error in both the pre score and the post score.

Moreover, when the magnitude of the change is small or zero,

the magnitude of the error may be larger than that of the

change. This possibility makes the change difficult to

detect or to separate from the error. The effect of this

error of measurement on change scores was noted as early

as 1924 by Thorndike:







When the individuals in a varying group are
measured twice in respect to any ability by
an imperfect measure (that is one whose self-
correlation is below 1.00), the average
difference between the two obtained scores
will equal the average difference between
the true scores that would have been obtained
by perfect measures, but for any individual
the difference between the two obtained scores
will be affected by the error. Individuals
who are below the mean of the group will tend
by the error to be less far below it in the
second, and individuals who are above the
mean of the group in the first measurement
will tend by the error to be less far above
it in the second. The lower the self-corre-
lation, the greater the error and its effect.

Thorndike (1924) went on to show that there was a spurious

negative correlation between initial true score and true

gain. He then stated that "the equation connecting the

relation of obtained initial ability with obtained gain,

the unreliability of the measures and the true facts" had

not been discovered. Lord (1956) developed the equation

to which Thorndike alluded. Lord made the following

assumptions concerning the error of measurement: the

errors

i) have zero mean for the groups tested.
ii) have the same variances for both tests.
iii) are uncorrelated with each other and with
true score on either test. (Lord, 1956)

McNemar (1958) has extended Lord's work to the case of

unequal error variances. It should be noted here that

McNemar's follow-up of Lord's work is only an extension.

When the variances are equal, McNemar's formulas are

identical to Lord's (Lord, 1958).







Methods of Analyzing Change

The estimated true gain scores derived from Lord's and

McNemar's equations have been used in either t tests or

analysis of variance procedures (Soar, 1968; Tillman, 1969).

There are, in addition to Lord's method, three other

commonly used methods for analyzing change.

One method has been the use of a straight analysis

of variance or t test on the raw difference scores as the

situation warrants. It is important to note that the raw

scores are used in the analysis with no correction for the

unreliability of the measures.

A second method is to complete an analysis of covari-

ance on the raw difference scores using the pretest scores

as covariates. These procedures are standard statistical

techniques and may be found in many texts (Hays, 1963;

Snedecor, 1956; Winer, 1962).

The third method of measuring gain has been advocated

by Manning and DuBois (1962): the method of residual gain.

In this method the final scores are regressed on the initial

scores and the difference between the final score and the

score predicted by the regression equation is taken as a

measure of gain. This measure is then used in t tests or

analysis of variance procedures.
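
The residual gain computation described above can be sketched as follows. This is a minimal illustration that assumes a single ordinary least squares regression of final on initial scores; the source does not say here whether the regression is fitted within groups or on the pooled scores. The function name and the scores are made up for the example.

```python
from statistics import mean

def residual_gains(pre, post):
    """Residual gain: observed post score minus the post score
    predicted from the pre score by a least squares line."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]

pre = [42.0, 50.0, 55.0, 61.0]    # hypothetical initial scores
post = [47.0, 52.0, 60.0, 62.0]   # hypothetical final scores
gains = residual_gains(pre, post)
```

By construction the residual gains sum to zero and are uncorrelated with the initial scores, which is the property that distinguishes them from raw differences.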

Thus in the case of equal variances four common

methods of analyzing change have been identified:

1. Use of raw gain scores in appropriate

procedures.







2. Use of Lord's true gain scores in appro-

priate procedures.

3. Use of Manning and DuBois' regressed gain

scores in appropriate procedures.

4. Use of analysis of covariance on raw gain

scores with pre scores as covariates.

The Problem

A researcher faced with these different methods and

a problem in measuring change is confronted with a second

and fundamental methodological problem: which of the

methods for measuring change is most appropriate? It is

this question that this study sought to answer. The

problem of which method to use is further complicated since

different writers have claimed that different techniques

were appropriate. "Manning and DuBois and Rankin and Tracy

feel that the method of residual gain is more appropriate

for correlational procedures since it is metric free" (as

quoted in Tillman, 1969, p. 2). Ohnmacht (1968) also

suggested that this procedure was the best. Lord (in Harris,

1963, chapter 2) mentioned regressed gain but seemed to

advocate his own method as being superior. This position

is further supported by Cronbach and Furby (1970).

To determine which of the methods of analysis was

most appropriate, an empirical study was conducted to

compare the results of each method under known situations.







Some Limitations

This study was limited to the two group situation.

To examine more than two groups would have involved such

a large number of possibilities as to make the study

impractical in terms of time and money. Thus, this study

excluded the more general multigroup comparisons possible

with analysis of variance and covariance procedures and

was limited to examining t tests and the analysis of

covariance using the pre score as covariate.

Additionally the case of unequal variances between

the two groups was not considered.

A third limitation was that the variance of the true

gains was selected a priori to be 3.6. True gain scores

with this variance will be such that over 99 per cent will

be within five units of the true mean gain. It is the ratio

of the variances of the gain and error which is important.

Since the reliabilities were varied, as indicated later,

this study was conducted utilizing several such ratios.

One of the factors of interest was the reliability of

the test used. A second factor of interest was the sample

size or the relative power of the procedures under study.

There is an infinite number of reliabilities and a decision

was made as to the levels of reliability to be investigated:

0.50 to 0.90 in increments of 0.10. Tests with reliabilities

lower than 0.50 are rarely used in practice and at best the

resulting data would be highly questionable. The lower

limit of 0.50 was chosen for this reason.







Sample sizes of 25 and 100 per group were chosen as

being somewhat representative of sample sizes used in

educational research.


Procedures

Two groups were compared under 20 conditions using t

tests as follows. There were five levels of reliability

(0.50, 0.60, 0.70, 0.80, 0.90) and two levels of sample size

(25 and 100) used in this study. Thus there were ten differ-

ent combinations of sample sizes and reliabilities. For each

of these combinations two cases were investigated, one where

there was no pre to post test gain in either group and the

second where there was a known gain from the pre to the post

test for one group. For each of these 20 instances, 100

samples were generated and analyzed using each of the four

methods of analysis indicated previously.

Consequently, two questions were to be answered:

1. Does any one of the selected methods yield a

disproportionate number of significant t values

when there is no difference between the mean

gain of the two groups?

2. Is any one of the selected methods more power-

ful, i. e. more successful in detecting a

difference when a difference does exist?

Samples from a normal distribution were generated using

techniques described by Rosenthal (1966). The method for

generating random numbers was the multiplication by a

constant method. With the procedures used, this method







will produce 8.5 million numbers before the series repeats.

This number was more than sufficient for this study. All

generation of the samples and calculation of t values using

the various methods of analysis were done on the IBM 360/65

computer at the University of Florida. The significance

level used for the t tests was 0.05.
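
For readers unfamiliar with the "multiplication by a constant" method, the following is a minimal sketch of a multiplicative congruential generator. The multiplier and modulus are the widely used Lehmer "minimal standard" constants and are an assumption for illustration; they are not the constants of the study, whose generator repeated after about 8.5 million numbers.

```python
def multiplicative_generator(seed, multiplier=16807, modulus=2**31 - 1):
    """Yield uniform numbers in (0, 1) from x[k+1] = multiplier * x[k] mod modulus.

    The seed must lie between 1 and modulus - 1.
    """
    x = seed
    while True:
        x = (multiplier * x) % modulus
        yield x / modulus

gen = multiplicative_generator(seed=12345)   # hypothetical seed
sample = [next(gen) for _ in range(5)]
```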

The two research questions generated two null hypo-

theses:

1. The proportion of t's significant at the

0.05 level is the same for each method of

analysis when there is no gain in either

group.

2. The proportion of t's significant at the

0.05 level is the same for each method of

analysis when there is gain by one group

but not by the other.

These hypotheses were tested at each of ten combinations

of reliability and sample size with chi square tests using

the 0.05 level of significance.
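
The chi square test of equal proportions can be sketched as below. This is an illustration only: it treats the four counts of significant t's as a 4 x 2 table (significant, not significant) with 100 samples per method, and the counts shown are hypothetical, not results of the study. The critical value 7.815 is the upper 5 per cent point of chi square with 3 degrees of freedom.

```python
def chi_square_equal_proportions(significant, n_per_method):
    """Pearson chi square statistic for equal proportions of
    significant t's across methods (df = number of methods - 1)."""
    k = len(significant)
    total_sig = sum(significant)
    exp_sig = n_per_method * total_sig / (k * n_per_method)
    exp_non = n_per_method - exp_sig
    if exp_sig == 0 or exp_non == 0:
        return 0.0   # all counts identical at 0 or n; no evidence of a difference
    stat = 0.0
    for s in significant:
        stat += (s - exp_sig) ** 2 / exp_sig
        stat += ((n_per_method - s) - exp_non) ** 2 / exp_non
    return stat

CRITICAL_0_05_DF3 = 7.815

counts = [5, 12, 6, 4]   # hypothetical numbers of significant t's out of 100
stat = chi_square_equal_proportions(counts, 100)
reject = stat > CRITICAL_0_05_DF3
```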


Significance of the Study

The results of this study should either indicate

empirically that one or more methods were superior to the

others or that there were no great differences among the

methods. If the former were true, then educational resear-

chers may select one of the better methods. If the latter

were true, then educational researchers may select any of the








methods. In either case the study provides some answer as

to how change scores should be analyzed.


Organization of the Study

Chapter I has been the introduction, statement of the

problem, limitations, hypotheses, and procedural overview.

Chapter II reviews related literature, essentially the

development of the equation and methodology of the various

techniques studied. Chapter III describes the procedures

and Chapter IV presents the data, conclusions, and summary.













CHAPTER II
RELATED LITERATURE


Much has been said in the literature about measuring

change. However, most of this discussion is centered

around the four methods investigated in this study: raw

gain, Lord's true gain, regressed gain, and analysis of

covariance procedures. As pointed out in Chapter I, raw

gain and analysis of covariance procedures are discussed in many

texts and therefore not discussed here. Lord's true gain

and regressed gain procedures are discussed in this chapter.


Derivation of Lord's True Gain Scores

The following derivation parallels Lord's (1956)

development of true gain scores with one exception as noted.

Lord gave the following equations as a model for the observed

pre and post scores:

(1) X = T + E1

(2) Y = T + G + E2

where

X = observed pre score;

Y = observed post score;

T = true pre score;

G = true gain;

E1 = error of measurement in pre observed score;







E2 = error of measurement in post observed score.

Lord then made the following assumptions concerning E1 and

E2, the errors of measurement.

The errors

i) have zero mean in the group tested.
ii) have the same variance ($\sigma_e^2$) for both tests.
iii) are uncorrelated with each other and with
true score on either test (Lord, 1956).

The derivation can be considerably shortened at this point

by examining a standard regression equation which predicts

one variable, X1, from two other variables, X2, X3. The

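equation is (Tate, 1965, p. 171)

(3) $\hat{X}_1 = B_{12.3} \frac{S_1}{S_2} (X_2 - \bar{X}_2) + B_{13.2} \frac{S_1}{S_3} (X_3 - \bar{X}_3) + \bar{X}_1$

where

$B_{12.3} = \frac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2}$

and

$B_{13.2} = \frac{r_{13} - r_{12} r_{23}}{1 - r_{23}^2}$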

If we let

X1 = G = gain;

X2 = X = observed pre score;

X3 = Y = observed post score;

the following elements in the regression equation can be

identified as

$\hat{X}_1$ = estimated gain;

$\bar{X}_2$ = mean of the observed pre scores;

$\bar{X}_3$ = mean of the observed post scores;

S1 = standard deviation of the gain scores;

S2 = standard deviation of observed pre scores;

S3 = standard deviation of observed post scores;

r23 = correlation of observed pre and post scores.

Lord has pointed out that r12 and r13, the correlations

between observed score and true score, are the reliabilities.

From (1) and his stated assumptions, Lord writes

(4) $\sigma_x^2 = \sigma_t^2 + \sigma_e^2$

(5) $\sigma_y^2 = \sigma_t^2 + 2\sigma_{tg} + \sigma_g^2 + \sigma_e^2$

(6) $\sigma_{xy} = \sigma_t^2 + \sigma_{tg}$

Lord solves these equations to find

(7) $\sigma_g^2 = \sigma_x^2 + \sigma_y^2 - 2\sigma_e^2 - 2\sigma_{xy}$

the variance of the true gain scores.
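
The step from (4), (5) and (6) to (7) can be spelled out as follows. This is a sketch that assumes, as in Lord's model, that true score and true gain may covary, with covariance $\sigma_{tg}$; the covariance cancels, so the result is unchanged if T and G are taken to be uncorrelated.

```latex
\begin{align*}
\sigma_x^2 + \sigma_y^2 - 2\sigma_{xy}
  &= (\sigma_t^2 + \sigma_e^2)
   + (\sigma_t^2 + 2\sigma_{tg} + \sigma_g^2 + \sigma_e^2)
   - 2(\sigma_t^2 + \sigma_{tg}) \\
  &= \sigma_g^2 + 2\sigma_e^2 , \\
\sigma_g^2
  &= \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} - 2\sigma_e^2 .
\end{align*}
```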

At this point the only element in the regression

equation which is undefined is $\bar{X}_1$, the mean gain. This element is found

by considering the mean of the observed pre scores which

from (1) can be seen to be equal to the mean of the true

scores plus the mean of the errors of measurement, that

is


(8) $\bar{X} = \bar{T} + \bar{E}$ .


But by Lord's assumption (i) $\bar{E} = 0$, therefore







(9) $\bar{X} = \bar{T}$ .

Similarly from (2) Lord shows

(10) $\bar{Y} = \bar{T} + \bar{G}$ .

Then (9) is subtracted from (10) and rewritten to yield


(11) $\bar{G} = \bar{Y} - \bar{X}$ .

Thus all elements are defined and (3) may be rewritten

in terms of G, X, and Y as


(12) $\hat{G} = B_{12.3} \frac{\sigma_g}{\sigma_x} (X - \bar{X}) + B_{13.2} \frac{\sigma_g}{\sigma_y} (Y - \bar{Y}) + \bar{Y} - \bar{X}$

which Lord has asserted to be an estimate of true gain.

It may be noted that no notational scheme or other

method has been presented to distinguish between statistics

and parameters in the preceding derivation. This lack is

in keeping with Lord's derivation. It is assumed here

that Lord was referring to parameters until the point at

which he obtained the final equation and that he then

intended to use sample values to estimate the appropriate

parameters in the regression equation.

Comparison of Lord's True Gain Scores with Other Scores

In his original article Lord (1956) made no comparison

of his method with any other method. In a subsequent article

(1959) Lord again made no mention of other methods. In

a chapter written in Problems in Measuring Change (Harris,

1963, Chapter 2) he made reference to regressed gain scores,







but no discussion or comparison was presented. In Statis-

tical Theories of Mental Test Scores (Lord and Novick,

1968) no comparison of Lord's true gain scores with other

procedures is presented.


Comparison of Regressed Gain Scores with Other Scores

Manning and DuBois (1962) have compared per cent,

raw, and residual gain scores. A per cent gain score is

raw gain score divided by the pre score (Manning and DuBois,

1962). The comparisons were made on the bases of metric

requirements, reliability, and appropriateness of use in

correlation procedures. On each of these bases residual

gain scores were recommended over per cent and raw gain

scores. Manning and DuBois pointed out that per cent and

raw gain scores require at least equal interval scales on

both pre and post scores and that the scales be the same

on both pre and post scores, i. e. the same equal interval

scale must be used on both tests. According to Manning

and DuBois these qualities are not possessed by educational

and psychological test scores. In contrast, residual gain

scores do not require the same equal interval scales and

therefore are appropriate for use with test scores (Manning

and DuBois, 1962). Manning and DuBois summarily list

formulas showing that residual gain scores are more reliable

and more appropriate measures for correlational procedures

than are raw or per cent gain scores. These formulas were

only listed, not derived, and no reference was made to their

derivation.







Another Study and Summary

Madansky (1959) reported or derived several methods

for fitting straight lines to two variables when both were

measured with error. One of these procedures is applicable

in the case when the variance of the error of measurement

is unknown. However, there has apparently been no attempt

to apply the method to the analysis of change.

A search of the literature has revealed no compara-

tive empirical examination of the four methods examined

in this study. Further, the advocates and authors of two

of the reported procedures, each of whom has been shown to

know of the existence of the other procedure, continue to

advocate their own method even though they offer no reason

or data for this advocacy. This study should provide some

knowledge as to any difference in the four methods.












CHAPTER III
METHODS AND PROCEDURES


Procedures: An Overview

As stated in Chapter I, Monte Carlo techniques were

employed to generate pre and post test scores for two

groups. One group is referred to as the gain group, the

other as the no gain group.

The model for the observed pre scores is


(1) X = T + E1


where

X is the observed pre score;

T is the true pre score;

E1 is a normally distributed random error
with mean 0.0 and variance $\sigma_{e_1}^2$.
The model for the post scores is


(2) Y = T + G + E2


where

Y is the observed post score;

T is the true pre score;

G is the true gain from pre to post score;

E2 is a normally distributed random error

with mean 0.0 and variance $\sigma_{e_2}^2$.







The generated scores were subsequently analyzed for

the difference in the amount of gain or change between the

two groups. The scores were analyzed by the four selected

methods:

1. a t test on the raw difference scores

2. a t test on Lord's true gain scores

3. a t test on regressed gain scores

4. a t test from an analysis of covariance on

the raw difference scores using pre score

as covariate.

The results of these analyses were then compared. For the

gain group an appropriate mean gain, $\mu_g$, from pre to post

scores was selected: 0.0 in the case of no

gain for either group, and of such size as

to make the power 0.50 when there was a gain in the gain

group. The post scores were generated by adding a random

normal gain, G, and a random normal error, E2, to the

generated true pre scores. The variables G and E2 had

means $\mu_g$ and 0 respectively and variances as discussed

later. For the no gain group there was no gain from pre

to post scores.

The pre scores for both the gain and the no gain

group were taken from a normal population with mean 50.0

and variance 100.0. The mean and variance of the popula-

tion of post scores for both the gain and the no gain

groups were functions of the mean true gain and of the

reliability.
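
The generation of one group's scores under models (1) and (2) might look like the following sketch. It is not the FORTRAN routine of Appendix A. It assumes the variance relations given in the section on selecting reliability (observed pre score variance 100.0, true gain variance 3.6), uses Python's standard random number generator in place of the study's, and the function and parameter names are hypothetical.

```python
import random

def generate_group(n, rel, mean_gain, rng, var_gain=3.6,
                   pre_mean=50.0, pre_var=100.0):
    """Return (pre, post) lists of observed scores for one group."""
    var_e1 = pre_var * (1.0 - rel)                    # pre score error variance
    var_t = pre_var - var_e1                          # true score variance
    var_e2 = (var_t + var_gain) * (1.0 - rel) / rel   # post score error variance
    pre, post = [], []
    for _ in range(n):
        t = rng.gauss(pre_mean, var_t ** 0.5)         # true pre score T
        g = rng.gauss(mean_gain, var_gain ** 0.5)     # true gain G
        pre.append(t + rng.gauss(0.0, var_e1 ** 0.5))       # X = T + E1
        post.append(t + g + rng.gauss(0.0, var_e2 ** 0.5))  # Y = T + G + E2
    return pre, post

rng = random.Random(1970)   # arbitrary seed, for reproducibility only
pre_ng, post_ng = generate_group(25, 0.50, 0.0, rng)   # a no gain group
```

A gain group would be produced by the same call with a nonzero `mean_gain`; in both groups G keeps its variance, so only its mean differs.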







After the samples were generated, the hypothesis of no

difference in average gain between the two groups was tested

using each of the four methods. The t values for each of

these tests were recorded.

This procedure was repeated over 100 samples for each

of the selected reliabilities 0.50, 0.60, 0.70, 0.80 and

0.90. The method for introducing the effect of the selected

reliability into each generated score is presented in a

following section.

Thus 100 t values were calculated and recorded for

each method of analysis and at each level of reliability.

This entire procedure was repeated for each of the following

conditions:

1. group size = 25, $\mu_g$ = 0.0 for both groups;

2. group size = 25, $\mu_g \neq$ 0.0 for the gain group,

$\mu_g$ = 0.0 for the no gain group;

3. group size = 100, $\mu_g$ = 0.0 for both groups;

4. group size = 100, $\mu_g \neq$ 0.0 for the gain group,

$\mu_g$ = 0.0 for the no gain group.

Where the gain was not equal to 0.0 it was such that the

power of the t tests on the raw difference scores was 0.50,

i. e. the expected proportion of rejected null hypotheses

was 0.50. The following sections describe in more detail

some of the previously mentioned procedures.*


* The reader is also referred to the FORTRAN listing in
Appendix A for the exact computer routines by which these
procedures were carried out.







Sampling from a Normal Population with Specified Mean and
Variance

If F is the cumulative density function of a random

variable R1, then the random variable R2 defined by


(3) $R_2 = F(R_1)$

is uniformly distributed over the interval [0,1] (Meyer,

1965, p. 256, Theorem 13.6). It follows

then that R1, where

(4) $R_1 = F^{-1}(R_2)$

is normally distributed if $F^{-1}$ is the inverse cumulative

density function of a normal distribution and if R2 is a

uniform random number on the interval [0,1] (Meyer, 1965,

pp. 256-257).
Thus random samples from a normal distribution may be

obtained using uniform random numbers and by (4) where F

is the cumulative density function of a normal distribution.

For a normal distribution, $F^{-1}(R_2)$ must be calculated using

numerical approximation methods. This calculation as well

as the generation of the uniform random numbers was done

using a routine described by Rosenthal (1966, pp. 270, 267).

Rosenthal's techniques were adapted to the IBM 360/65

computer installed at the University of Florida (see

Appendix A, FUNCTION RAND). The normal population sampled

had mean 0 and variance 1.0. If a different mean or variance





was required, it was obtained by addition or multiplication

by an appropriate constant.
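
The inverse transformation in (4), followed by the rescaling just described, can be sketched as below. This is an illustration under stated assumptions: it uses the inverse normal function from Python's standard library in place of Rosenthal's numerical approximation, and Python's own uniform generator in place of the study's. The function name is hypothetical.

```python
import random
from statistics import NormalDist

def normal_sample(n, mean=0.0, variance=1.0, rng=None):
    """Draw n normal deviates by applying the inverse normal
    cumulative function to uniform random numbers."""
    rng = rng or random.Random()
    standard_normal = NormalDist()       # mean 0, variance 1
    values = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                  # the inverse is undefined at 0
            u = rng.random()
        z = standard_normal.inv_cdf(u)   # equation (4)
        values.append(mean + variance ** 0.5 * z)   # rescale and shift
    return values

pre_true = normal_sample(25, mean=50.0, variance=100.0, rng=random.Random(7))
```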


Selecting Reliability

Reliabilities of 0.50, 0.60, 0.70, 0.80 and 0.90 were

selected as representative of reliabilities found in test

scores. The reliability, rel, of a test may be defined as

(5) $\mathrm{rel} = 1 - \frac{\sigma_e^2}{\sigma_x^2}$ (Nunnally, 1967, p. 221),

where $\sigma_e^2$ is the error of measurement variance and $\sigma_x^2$ is
the observed score variance. Since $\sigma_x^2$ had been selected
a priori to be 100.0, we have from (5)

(6) $\sigma_{e_1}^2 = 100 (1 - \mathrm{rel})$

Moreover, since


(7) X = T + E1

and since the error, E1, is assumed to be independent of

the true score, T, we have


(8) $\sigma_x^2 = \sigma_t^2 + \sigma_{e_1}^2$

or, combining (6) and (8) and solving for $\sigma_t^2$,


(9) $\sigma_t^2 = 100 - \sigma_{e_1}^2$

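For the post scores the desired variances are also
easily found from the model for a post score,

(10)  $Y = T + G + E_2$

and for which

(11)  $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + \sigma_{e_2}^2$

As stated in Chapter I, $\sigma_g^2$ was selected to be 3.36. If
(5) is applied with $\sigma_y^2$ and $\sigma_{e_2}^2$ in place of
$\sigma_x^2$ and $\sigma_{e_1}^2$, respectively, then (5) and (11)
may be used to find

(12)  $\sigma_{e_2}^2 = (\sigma_t^2 + \sigma_g^2) / rel - \sigma_t^2 - \sigma_g^2$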

The effect of the selected reliability may be obtained

by selecting the error of measurement variances and the

variance of the true scores in accordance with (6), (9)

and (12).

Thus it is seen that if true scores are selected from
a distribution with variance $\sigma_t^2$ and if the errors of
measurement are selected independently from a distribution
with variance $\sigma_{e_1}^2$, then by (8) X has variance
$\sigma_x^2$ if (7) holds.
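As a check on (6), (9), (11) and (12), the variances implied by a chosen reliability can be computed directly. The sketch below is not part of the original program; the function name and the returned dictionary keys are hypothetical, and the default gain variance of 3.36 is an assumption taken from the constant that appears in the Appendix A listing.

```python
def selected_variances(rel, var_x=100.0, var_g=3.36):
    """Variances implied by a reliability, following (6), (9), (12) and (11).

    var_x is the observed pre score variance; var_g is the gain variance
    (3.36 is assumed here from the constant in the Appendix A program).
    """
    var_e1 = var_x * (1.0 - rel)                        # equation (6)
    var_t = var_x - var_e1                              # equation (9)
    var_e2 = (var_t + var_g) / rel - var_t - var_g      # equation (12)
    var_y = var_t + var_g + var_e2                      # equation (11)
    return {"e1": var_e1, "t": var_t, "e2": var_e2, "y": var_y}


if __name__ == "__main__":
    for rel in (0.50, 0.60, 0.70, 0.80, 0.90):
        v = selected_variances(rel)
        # By construction the post score reliability equals rel as well.
        print(rel, v, 1.0 - v["e2"] / v["y"])
```

The design choice in (12) is that the post score error variance is inflated just enough to keep the post score reliability equal to the pre score reliability once the gain variance has been added.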

Selecting Gain

When there was no gain in either group the value of $\mu_g$
was 0.0. When $\mu_g$ was nonzero for the gain group,
its value was selected so as to make the power of the t
test on the raw difference scores equal to 0.50. The
power of 0.50 was selected in order to permit maximum
difference between the four methods of analysis.







The value of $\mu_g$ was determined by examining the
difference scores (D),

(13)  $D = Y - X$,

and from

(14)  $D = (T + G + E_2) - (T + E_1)$

or

(15)  $D = G + E_2 - E_1$

The elements in the right side of (15) are mutually
independent normally distributed random variables whose
variances have been found and thus

(16)  $\sigma_d^2 = \sigma_{e_1}^2 + \sigma_g^2 + \sigma_{e_2}^2$

Furthermore, since the only difference in (15) for the gain
and no gain groups is the mean of G, the variance of the
difference for the gain group, $\sigma_{d_g}^2$, and the variance
of the difference for the no gain group, $\sigma_{d_{ng}}^2$, are
equal, i.e.

(17)  $\sigma_{d_g}^2 = \sigma_{d_{ng}}^2 = \sigma_d^2$

This common value $\sigma_d^2$ may then be used to determine the
appropriate value of $\mu_g$ to produce the desired power of
0.50 for the t test on the raw difference scores.

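The t test on the raw difference scores is found from
the following formula:

(18)  $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\dfrac{(n_g - 1) S_g^2 + (n_{ng} - 1) S_{ng}^2}{n_g + n_{ng} - 2} \left( \dfrac{1}{n_g} + \dfrac{1}{n_{ng}} \right)}}$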
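The pooled-variance t statistic of (18) can be written out as a short function. This is a minimal sketch for illustration; the function and argument names are hypothetical and the code is not taken from the Appendix A program.

```python
import math


def pooled_t(gain_diffs, no_gain_diffs):
    """Two-sample t statistic with pooled variance, as in equation (18)."""
    n_g, n_ng = len(gain_diffs), len(no_gain_diffs)
    mean_g = sum(gain_diffs) / n_g
    mean_ng = sum(no_gain_diffs) / n_ng
    # Unbiased sample variances of the difference scores in each group.
    var_g = sum((d - mean_g) ** 2 for d in gain_diffs) / (n_g - 1)
    var_ng = sum((d - mean_ng) ** 2 for d in no_gain_diffs) / (n_ng - 1)
    pooled = ((n_g - 1) * var_g + (n_ng - 1) * var_ng) / (n_g + n_ng - 2)
    return (mean_g - mean_ng) / math.sqrt(pooled * (1.0 / n_g + 1.0 / n_ng))
```

With equal group sizes n, the denominator reduces to the square root of (S_g^2 + S_ng^2) / n, which is the simplification used when the mean gain is selected.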


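If the group size is 25 and reliability 0.50, the value of
$\mu_g$ is found as follows:

(note: rel = 0.50 implies $\sigma_d^2 = 106.72$)

(19)  $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\dfrac{24 \sigma_{d_g}^2 + 24 \sigma_{d_{ng}}^2}{25 + 25 - 2} \left( \dfrac{1}{25} + \dfrac{1}{25} \right)}}$

      $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{2 \sigma_d^2 / 25}}$

Replacing the difference between the sample means by its
expected value, $\mu_g$, this value of t is greater than the
critical value of t (2.01) only if

(20)  $2.01 < \mu_g / \sqrt{2 \sigma_d^2 / 25}$

that is, only if

(21)  $5.86 < \mu_g$

Thus if a value of 5.86 is chosen for the mean gain, the
power is .50. Appropriate values for other group sizes
and reliabilities were similarly determined.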
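The arithmetic behind (16) and (19) through (21) can be reproduced as follows. This is a sketch under stated assumptions: the gain variance of 3.36 is taken from the constant in the Appendix A listing, the critical value 2.01 is the one quoted in the text, and the function name is hypothetical. For group size 25 and reliability 0.50 it gives a difference variance of 106.72 and a mean gain of about 5.87, which agrees with the 5.86 of (21) to within rounding.

```python
import math


def gain_for_half_power(rel, n, t_crit=2.01, var_x=100.0, var_g=3.36):
    """Mean gain that puts the expected t statistic at the critical value.

    Returns (mean_gain, var_d). var_g = 3.36 is an assumption taken from
    the Appendix A program; t_crit = 2.01 is the value quoted for n = 25.
    """
    var_e1 = var_x * (1.0 - rel)                        # equation (6)
    var_t = var_x - var_e1                              # equation (9)
    var_e2 = (var_t + var_g) / rel - var_t - var_g      # equation (12)
    var_d = var_e1 + var_g + var_e2                     # equation (16)
    mean_gain = t_crit * math.sqrt(2.0 * var_d / n)     # from (19) and (20)
    return mean_gain, var_d


if __name__ == "__main__":
    gain, var_d = gain_for_half_power(0.50, 25)
    print(round(var_d, 2), round(gain, 2))
```

For other group sizes the critical value would have to be changed to match the degrees of freedom, 2n - 2; the default here applies only to n = 25.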

Analysis of the t Values for the Four Methods

The number of t's significant at the 0.05 level was
recorded for each of the four methods of analysis. These
data were recorded for each of the 20 combinations of sample
size, reliability and gain. A chi square statistic was
calculated for each of these 20 sets to test the null
hypothesis of no difference in the proportion of
significant t values for the four methods of analysis. These
data may be seen in Tables 1 and 2 of Chapter IV.












CHAPTER IV
RESULTS, CONCLUSIONS, AND SUMMARY


Results

The number of significant t's for each method of

analysis under the no gain condition is presented in

Table 1, and for the gain condition in Table 2. Addi-

tionally, the computed chi square statistics for each

reliability level are given. In each case the null hypo-

thesis tested was that the proportion of significant t's

was the same for each of the four methods of analysis. The

chi square values were computed from the 2 x 4 contingency

tables implied by the corresponding line of the table. For

example, for group size of 25 and a reliability of 0.50,

the 2 x 4 contingency table implied by the first line of

Table 1 is:


                    Raw    Lord's   Regressed   Analysis of
                   gain      gain        gain    covariance

Significant           5        58           2             2

Non significant      95        42          98            98


As may be seen by inspection of Tables 1 and 2, all the
chi square values were significant at the 0.05 level and
in each case the hypothesis of equal proportion of
significant t values for the four methods of analysis was
rejected.
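The chi square statistic for a 2 x 4 table of this kind can be recomputed from the counts of significant t's alone. The sketch below is illustrative and is not the computation used in the study; the function name is hypothetical. It assumes 100 samples per method, as in the example table, and for the counts 5, 58, 2, 2 it reproduces the tabled value of 163.13.

```python
def chi_square_2xk(significant, samples_per_method=100):
    """Pearson chi square for a 2 x k table of significant / non significant counts."""
    k = len(significant)
    grand_total = k * samples_per_method
    total_sig = sum(significant)
    row_totals = (total_sig, grand_total - total_sig)
    chi2 = 0.0
    for count in significant:
        observed = (count, samples_per_method - count)
        for obs, row_total in zip(observed, row_totals):
            # Expected count = row total * column total / grand total.
            expected = row_total * samples_per_method / grand_total
            chi2 += (obs - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    print(round(chi_square_2xk([5, 58, 2, 2]), 2))
```

The statistic has (2 - 1)(4 - 1) = 3 degrees of freedom, which is why the tables compare it with the critical value 7.82.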











TABLE 1


NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN
GAIN WAS 0.0 FOR BOTH GROUPS


GROUP SIZE = 25

RELIABILITY    RAW      LORD'S   REGRESSED   ANALYSIS OF      CHI
              GAIN   TRUE GAIN        GAIN    COVARIANCE   SQUARE

       0.50      5          58           2             2   163.13
       0.60      5          53           4             4   128.98
       0.70      7          51           6             6   103.69
       0.80      5          24           5             5    30.76
       0.90      6          27           4             4    40.95


GROUP SIZE = 100

RELIABILITY    RAW      LORD'S   REGRESSED   ANALYSIS OF      CHI
              GAIN   TRUE GAIN        GAIN    COVARIANCE   SQUARE

       0.50                                                242.95
       0.60                                                178.28
       0.70                                                115.05
       0.80                                                111.60
       0.90                                                 59.61


CHI SQUARE (3,.95) = 7.82











TABLE 2

NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE
t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50


GROUP SIZE = 25

RELIABILITY    RAW      LORD'S   REGRESSED   ANALYSIS OF      CHI
              GAIN   TRUE GAIN        GAIN    COVARIANCE   SQUARE

       0.50     44          89          54            54    48.80
       0.60     52          89          64            64    33.00
       0.70     52          86          65            66    28.23
       0.80     47          78          50            51    25.43
       0.90     50          75          53            52    16.90


GROUP SIZE = 100

RELIABILITY    RAW      LORD'S   REGRESSED   ANALYSIS OF      CHI
              GAIN   TRUE GAIN        GAIN    COVARIANCE   SQUARE

       0.50                                                 38.80
       0.60                                                 39.48
       0.70                                                 24.45
       0.80                                                 38.37
       0.90                                                 21.35


CHI SQUARE (3,.95) = 7.82







Further inspection of Tables 1 and 2 reveals a higher

number of significant t's for Lord's true gain procedure

than for any other methods. Moreover, examination of Table

1 shows that this particular technique gives a considerably

greater frequency of significant t values than one would

expect by chance. The expected frequency is 5 for the a

priori established condition of no actual difference in the

two populations sampled. These results indicate that use

of Lord's true gain procedure tends to create a higher

significance level than the user would intend. If the

sample proportion of significant t's found in the analysis

is used as an estimate of the significance level, that

estimate is 0.58 for the case when the group size was 25.

For the same group size the lowest estimate of the signi-

ficance level is 0.39.
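To see how far such counts lie from what chance alone would produce, one may compute a binomial tail probability. This calculation is not part of the original study; it is a sketch that assumes the 100 samples are independent and that each t test has a true significance level of 0.05, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(k, n + 1))


if __name__ == "__main__":
    n, alpha = 100, 0.05
    print("expected number of significant t's:", n * alpha)
    # Chance probability of at least 58 and of at least 24 significant t's.
    print(binomial_upper_tail(n, alpha, 58))
    print(binomial_upper_tail(n, alpha, 24))
```

Under these assumptions even the smallest count listed for Lord's true gain procedure in Table 1 is far beyond what a nominal 0.05 level would be expected to yield.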


Conclusions

Since the hypothesis of equal proportion of signifi-

cant t's for the four methods of analysis was rejected in

each of the 10 cases where the mean gain was 0.0 and since

the use of Lord's true gain scores provided estimated

levels of significance which were considerably higher than

those intended, the use of Lord's true gain scores is

strongly suspect and therefore is not recommended.

No apparent differences were found among the remaining

three methods of analysis. However, there is a similarity

between the regressed gain scores procedure and the







analysis of covariance procedure that should be examined.

The data in Table 1 indicate that the same number of

significant t's was found by both of these methods in the

case where there was no gain for either group.

The 100 t values for each of the four methods of

analysis when the group size was 25 and there was no gain

in either group are presented in Appendix B. Inspection

of the t values for the regressed gain procedure and the

analysis of covariance procedure reveals a striking simi-

larity between the t values; for each sample the t values

are identical to at least the first decimal place. As a

descriptive statistic it is noted that the correlation

between the t values found by these two methods is 1.00

(rounded to 3 digits). Thus, the two methods are providing

very similar results.

In contrast the correlation between the t's for the

raw difference and regressed gain procedures is 0.898. The

two methods, regressed gain and analysis of covariance,

are not entirely similar to the raw difference procedure.

It may also be seen from Appendix B that the signs of the

t's from both the regressed gain and analysis of covariance

procedures sometimes are opposite from the sign of the t

for the raw difference procedure.

Snedecor (1956, pp. 397, 398) has indicated that the

regressed gain procedure and the analysis of covariance

procedure on the post scores using pre score as covariate

are identical procedures. No source was found indicating a







similarity between the regressed gain procedure and the

analysis of covariance procedure on the difference scores

using pre score as covariate. However, the two methods

may be shown to be equivalent by writing the linear model

for the regressed gain, or, equivalently, for the analysis

of covariance on the post scores using pre score as covariate,

and the model for the analysis of covariance on the differ-

ence scores using pre score as covariate. The model for

covariance analysis on the post scores is


(1)  $Y = B_0 + B_1 X + B_2 Z + E$  (Mendenhall, 1968, p. 170),

where X and Y are defined as previously and

     Z = 1, if Y is from the gain group,
       = 0, if Y is not from the gain group,
     E = a normally distributed random error with mean 0.

The model for covariance analysis on the difference scores
is

(2)  $D = B_0 + B_1 X + B_2 Z + E$

where

(3)  $D = Y - X$

and all other elements are defined as in (1). Now if the
right side of (3) is substituted into (2) and the resultant
equation rearranged to yield

(4)  $Y = B_0 + (B_1 + 1.0) X + B_2 Z + E$

it is seen that (1) and (4) are identical except for the
addition of 1.0 to $B_1$ of equation (1) and thus the two
methods will yield the same t values for testing the
hypothesis that $B_2$ is equal to 0.0.
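The equivalence argued from (1) through (4) can be checked numerically by fitting both models to the same simulated data. The sketch below is an illustration, not the author's procedure: the data generator, its standard deviations, and all function names are assumptions, and the least squares fit is done with a small hand-written solver so that the example is self-contained.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def group_effect_t(x, z, y):
    """Fit y = B0 + B1*x + B2*z by least squares; return (coefficients, t for B2)."""
    rows = [(1.0, xi, zi) for xi, zi in zip(x, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    mse = sse / (len(y) - 3)
    last_col = solve(xtx, [0.0, 0.0, 1.0])   # last column of the inverse of X'X
    return beta, beta[2] / math.sqrt(mse * last_col[2])


def simulate(n=25, seed=1):
    """Hypothetical pre and post scores for a gain group (z=1) and a no gain group (z=0)."""
    rng = random.Random(seed)
    x, z, y = [], [], []
    for group in (1.0, 0.0):
        for _ in range(n):
            true_score = rng.gauss(50.0, 7.0)
            x.append(true_score + rng.gauss(0.0, 7.0))   # pre score
            y.append(true_score + rng.gauss(0.0, 7.0))   # post score
            z.append(group)
    return x, z, y


if __name__ == "__main__":
    x, z, y = simulate()
    d = [yi - xi for xi, yi in zip(x, y)]
    print(group_effect_t(x, z, y))   # covariance analysis on the post scores
    print(group_effect_t(x, z, d))   # covariance analysis on the difference scores
```

In such a run the two fits differ only in the coefficient of X, which is smaller by 1.0 for the difference scores, while the estimate of the group coefficient and its t value coincide up to rounding error.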

Since no clear difference was found among the raw gain

procedure, the regressed gain procedure and the analysis

of covariance procedure, none of these is recommended as

more appropriate for the analysis of change than the other.

All of these three procedures are recommended above Lord's

true gain procedure.


Discussion

It is reasonable to ask if there is some questionable

logic in Lord's derivation of true gain scores. Two

things become apparent upon examination of the derivation.

First the formula which Lord uses to begin his derivation,

(3) of Chapter II, requires that the independent variable

be known exactly (Madansky, 1959; Scheffe', 1959, p. 4),

i. e. without error of measurement. The problem of esti-

mating true gain arises in that the pre and post test scores

are not known exactly, but instead the observed scores, or

the true scores plus measurement errors, are known as may

be seen from Lord's models of the observed scores, (1) and

(2) in Chapter II. If true pre and true post score were

known these could be put into the regression equation.

However, if true pre score and true post score were known

there would be no need for the regression equation to







estimate gain. The gain could be obtained simply by sub-

tracting true pre score from true post score. In short,

Lord seems to have assumed his conclusion in his derivation.

Second, a look at the basic method for estimating true

gain is enlightening. In order to estimate true gain from

observed score it is necessary to somehow remove the error

of measurement since, assuming the test to be valid, this

is the factor that obscures the true score. Mendenhall

(1968, p. 1) says that "...statistics is a theory of infor-

mation...." What information is known concerning the errors

of measurement? By Lord's assumption (iii) of Chapter II

the errors are uncorrelated with true score and with each

other. Thus neither the observed pre score nor the observed

post score should provide any information concerning the

size of the error. Since this error is random it would

seem that it could not be removed from the observed scores.

Consider two equal observed scores, one obtained from a

higher true score by the addition of a negative error, the

other obtained from a lower true score by the addition of

an error of the same size but opposite sign of the previous

error. How does one decide which score is to have a posi-

tive correction added and which is to have a negative

correction added? It would appear that one cannot make

this decision without having some information besides the

observed scores.








A Direction for Future Research

Since this study shows no difference among the propor-

tion of significant t's for the raw gain, regressed gain,

and the analysis of covariance procedures, it would be

interesting to investigate the use of these procedures

under assumptions other than the models listed in the first

section of Chapter III. Differences may occur, for example,

when the gain is a linear function of the pre score.


Summary

This study compared four selected measures of change.

The four measures were: raw difference, Lord's true gain,

regressed gain, and analysis of covariance procedures. An

empirical comparison was made among these four methods.

Samples were generated using Monte Carlo techniques and

the data in each sample were analyzed by each of the four

methods.

It was found that Lord's true gain procedure produced

a number of spurious significant t values, greater than

would be expected by chance, when there was no real differ-

ence in amount of gain between the two populations sampled.

No apparent differences were noted among the remaining

three methods and these three methods did not appear to

have inflated significance levels. With such data use of

Lord's true gain procedure is not recommended and none

of the remaining three methods was recommended over the

others.






























APPENDIX A











FORTRAN PROGRAM WHICH PERFORMED THE CALCULATIONS







      DIMENSION X(100,2,2),LORD(100,2),DIFF(100,2)
     1,AMAT(3,3),DIFFX(3),DUM(3)
      DOUBLE PRECISION SEED
      REAL KR21,LORD,MPRG,MPRNG,MPOG,MPONG,MLG,MLNG,MRGG,MRGNG
      READ (5,1) N,SEED,NSAMP,KSAMP,NOPT
    1 FORMAT(I3,F11.0,3I4)
      IF(NOPT.EQ.0) GO TO 3
      READ(5,2) IREL,ISAMP
C     IREL   RESTART RELIABILITY AT IREL FOR ABORTED RUN
C     ISAMP  RESTART SAMPLE NUMBER AT ISAMP FOR ABORTED RUN
    2 FORMAT(2I4)
      GO TO 4
    3 IREL=1
      ISAMP=1
    4 CONTINUE
C
C     N      SIZE OF SAMPLE
C     GAIN   AVERAGE GAIN IN GAIN GROUP
C     SEED   SEED FOR RANDOM NUMBER GENERATOR
C     KSAMP  NUMBER OF SAMPLES TO BE TAKEN AT EACH LEVEL
C     NSAMP  NUMBER OF THE LAST SAMPLE FROM THE PREVIOUS RUN
C            INCREMENTED AND PRINTED OUT AS THE SAMPLE NUMBER
C            OF RELIABILITY
C
      TT=0.0
      DO 1000 IR=IREL,5
      READ(5,14) GAIN
   14 FORMAT(F6.4)
      DO 902 JI=ISAMP,KSAMP
      NSAMP=NSAMP+1
C
C     S      SUM
C     SS     SUM OF SQUARES
C     D      DIFFERENCE SCORE
C     G      GAIN GROUP
C     NG     NO GAIN GROUP
C     PR     PRE SCORE
C     PO     POST SCORE
C     PP     PRE X POST
C     PRDIFF SUM PR X DIFF
C
      SDG   =0.0
      SDNG  =0.0
      SSDG  =0.0
      SSDNG =0.0
      SPRG  =0.0
      SPRNG =0.0
      SSPRG =0.0
      SSPRNG=0.0
      SPOG  =0.0
      SPONG =0.0
      SSPOG =0.0
      SSPONG=0.0
      SSPPG =0.0
      SSPPNG=0.0
      PRDIFF=0.0
      REL=0.40+0.10*IR
C     SE1, SX, SE2 ARE THE STANDARD DEVIATIONS OF E1, T, E2
      SE1=SQRT(100.0*(1.0-REL))
      SX=SQRT(100.0-SE1*SE1)
      SE2=SQRT((SE1*SE1+3.36)/REL-3.36-SE1*SE1)
      GAIN=TT*SQRT(2.0*(SE1*SE1+SE2*SE2+3.36)/N)
C     X(I,J,K)   I=STUDENT
C                J=1, PRE SCORE
C                 =2, POST SCORE
C                K=1, GAIN
C                 =2, NO GAIN
C
C     DIFF(I,J)  I=STUDENT
C                J=1, GAIN
C                 =2, NO GAIN
C
      DO 10 I=1,N
      D1=SX*RAND(SEED)
      D2=SX*RAND(SEED)











      X(I,1,1)=50+D1+SE1*RAND(SEED)
      X(I,2,1)=50+D1+GAIN+1.83*RAND(SEED)+SE2*RAND(SEED)
      X(I,1,2)=50+D2+SE1*RAND(SEED)
      X(I,2,2)=50+D2+1.83*RAND(SEED)+SE2*RAND(SEED)
      DIFF(I,1)=X(I,2,1)-X(I,1,1)
      DIFF(I,2)=X(I,2,2)-X(I,1,2)
      SDG=SDG+DIFF(I,1)
      SDNG=SDNG+DIFF(I,2)
      SSDG=SSDG+DIFF(I,1)*DIFF(I,1)
      SSDNG=SSDNG+DIFF(I,2)*DIFF(I,2)
      SPRG=SPRG+X(I,1,1)
      SPRNG=SPRNG+X(I,1,2)
      SSPRG=SSPRG+X(I,1,1)*X(I,1,1)
      SSPRNG=SSPRNG+X(I,1,2)*X(I,1,2)
      SPOG=SPOG+X(I,2,1)
      SPONG=SPONG+X(I,2,2)
      SSPOG=SSPOG+X(I,2,1)*X(I,2,1)
      SSPONG=SSPONG+X(I,2,2)*X(I,2,2)
      SSPPG=SSPPG+X(I,1,1)*X(I,2,1)
      SSPPNG=SSPPNG+X(I,1,2)*X(I,2,2)
   10 PRDIFF=PRDIFF+DIFF(I,1)*X(I,1,1)+DIFF(I,2)*X(I,1,2)
      VAPRG=(SSPRG-SPRG*SPRG/N)/(N-1)
      VAPRNG=(SSPRNG-SPRNG*SPRNG/N)/(N-1)
      VAPOG=(SSPOG-SPOG*SPOG/N)/(N-1)
      VAPONG=(SSPONG-SPONG*SPONG/N)/(N-1)
      CPPG=(SSPPG-SPRG*SPOG/N)/SQRT((SSPRG-SPRG*SPRG/N)*(SSPOG-
     1SPOG*SPOG/N))
      CPPNG=(SSPPNG-SPRNG*SPONG/N)/SQRT((SSPRNG-SPRNG*SPRNG/N)*(
     1SSPONG-SPONG*SPONG/N))
      DBARG=SDG/N
      DBARNG=SDNG/N
      VADG=(SSDG-SDG*SDG/N)/(N-1)
      VADNG=(SSDNG-SDNG*SDNG/N)/(N-1)
      T1=(DBARG-DBARNG)/SQRT(((N-1)*(VADG+VADNG)/(2*N-2))*(2.0/N
     1))
      B1G=(((1.0-REL)*CPPG*SQRT(VAPOG))/SQRT(VAPRG)-REL+CPPG*CPP
     1G)/
     1(1.0-CPPG*CPPG)
      B2G=(REL-CPPG*CPPG-((1.0-REL)*SQRT(VAPRG)*CPPG)/SQRT(VAPOG
     1))/(1.0-CPPG*CPPG)
      B1NG=(((1.0-REL)*CPPNG*SQRT(VAPONG))/SQRT(VAPRNG)-REL+CPPNG
     1*CPPNG)/(1.0-CPPNG*CPPNG)
      B2NG=(REL-CPPNG*CPPNG-((1.0-REL)*SQRT(VAPRNG)*CPPNG)/SQRT(
     1VAPONG))/(1.0-CPPNG*CPPNG)
      SLG  =0.0
      SSLG =0.0
      SLNG =0.0
      SSLNG=0.0
      MPRG=SPRG/N
      MPOG=SPOG/N
      MPRNG=SPRNG/N
      MPONG=SPONG/N











      DO 110 I=1,N
      LORD(I,1)=DBARG+B1G*(X(I,1,1)-MPRG)+B2G*(X(I,2,1)-MPOG)
      LORD(I,2)=DBARNG+B1NG*(X(I,1,2)-MPRNG)+B2NG*(X(I,2,2)-MPON
     1G)
      SLG=SLG+LORD(I,1)
      SLNG=SLNG+LORD(I,2)
      SSLG=SSLG+LORD(I,1)*LORD(I,1)
  110 SSLNG=SSLNG+LORD(I,2)*LORD(I,2)
      MLG=SLG/N
      MLNG=SLNG/N
      VALG=(SSLG-SLG*SLG/N)/(N-1.0)
      VALNG=(SSLNG-SLNG*SLNG/N)/(N-1.0)
      T2=(MLG-MLNG)/SQRT(((SSLG-SLG*SLG/N)+SSLNG-SLNG*SLNG/N)/(N*
     1(N-1)))
      A=(SSPPG+SSPPNG-(SPRG+SPRNG)*(SPOG+SPONG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      B=(SPOG+SPONG)/(2*N)-A*(SPRG+SPRNG)/(2*N)
      SRGG=0.0
      SSRGG=0.0
      SRGNG=0.0
      SSRGNG=0.0
      DO 210 I=1,N
      RGSG=X(I,2,1)-A*X(I,1,1)-B
      RGSNG=X(I,2,2)-A*X(I,1,2)-B
      SRGG=SRGG+RGSG
      SSRGG=SSRGG+RGSG*RGSG
      SRGNG=SRGNG+RGSNG
  210 SSRGNG=SSRGNG+RGSNG*RGSNG
      MRGG=SRGG/N
      MRGNG=SRGNG/N
      VARGG=(SSRGG-SRGG*SRGG/N)/(N-1)
      VARGNG=(SSRGNG-SRGNG*SRGNG/N)/(N-1)
      T3=(MRGG-MRGNG)/SQRT((SSRGG-SRGG*SRGG/N+SSRGNG-SRGNG*
     1SRGNG/N)/(N*(N-1)))
      AMAT(1,1)=2*N
      AMAT(1,2)=N
      AMAT(1,3)=SPRG+SPRNG
      AMAT(2,1)=AMAT(1,2)
      AMAT(2,2)=N
      AMAT(2,3)=SPRG
      AMAT(3,1)=AMAT(1,3)
      AMAT(3,2)=AMAT(2,3)
      AMAT(3,3)=SSPRG+SSPRNG
  900 CONTINUE
      CALL INV(AMAT)
      DIFFX(1)=SDG+SDNG
      DIFFX(2)=SDG
      DIFFX(3)=PRDIFF
      YXXXXY=0.0
      DO 410 I=1,3
      DUM(I)=0.0
      DO 405 J=1,3
  405 DUM(I)=DUM(I)+DIFFX(J)*AMAT(I,J)
  410 YXXXXY=YXXXXY+DUM(I)*DIFFX(I)
      SSE=SSDG+SSDNG-YXXXXY
      VAACOV=SSE/(2.0*N-3)
      T4=DUM(2)/SQRT(VAACOV*AMAT(2,2))
      AA=(PRDIFF-(SPRG+SPRNG)*(SDG+SDNG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      AMG=DBARG-AA*(MPRG-(MPRG+MPRNG)/2)
      AMNG=DBARNG-AA*(MPRNG-(MPRG+MPRNG)/2)
      KR21=1.0-(MPRG*(100-MPRG))/(100*VAPRG)
      WRITE(6,501) NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,MRGG,MRGNG,AMG,AMNG,SEED,VAPRG,VAPRNG,VAPOG,
     2VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  501 FORMAT(I6,1X,2(F3.2,1X),1X,4(F7.3),5(4X,2F6.2)/5X,F23.11,
     116X,4(4X,2F6.1),8X,F6.2)
      WRITE(7,502)NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,NSAMP,MRGG,MRGNG,AMG,AMNG,VAPRG,VAPRNG,
     2VAPOG,VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  502 FORMAT(I6,2F3.2,4F7.3,6F6.2,3X,'1'/I6,F6.2,8F5.1,F6.2,3X,'
     12')
  902 CONTINUE
      ISAMP=1
 1000 CONTINUE
      STOP
      END

      FUNCTION RAND(RO)
      DOUBLE PRECISION RO
      RO=DMOD(RO*30517578125.,34359738368.)
      X=RO/34359738368.
      Y=SIGN(1.0,X-0.5)
      V=SQRT(-2.0*ALOG(0.5*(1.0-ABS(1.0-2.0*X))))
      RAND=Y*(V-(2.515517+0.802853*V+0.010328*
     1V**2)/(1.0+1.432788*V+0.189269*V**2+0.001308*V**3))
      RETURN
      END

      SUBROUTINE INV(A)
C     PROGRAM FOR FINDING THE INVERSE OF A 3X3 MATRIX
      DIMENSION A(3,3),L(3),M(3)
      DATA N/3/
C     SEARCH FOR LARGEST ELEMENT
      DO 80 K=1,N
      L(K)=K
      M(K)=K
      BIGA=A(K,K)
      DO 20 I=K,N
      DO 20 J=K,N
      IF(ABS(BIGA)-ABS(A(I,J))) 10,20,20
   10 BIGA=A(I,J)
      L(K)=I
      M(K)=J
   20 CONTINUE
C     INTERCHANGE ROWS
      J=L(K)
      IF(L(K)-K) 35,35,25
   25 DO 30 I=1,N
      HOLD=-A(K,I)
      A(K,I)=A(J,I)
   30 A(J,I)=HOLD
C     INTERCHANGE COLUMNS
   35 I=M(K)
      IF(M(K)-K) 45,45,37
   37 DO 40 J=1,N
      HOLD=-A(J,K)
      A(J,K)=A(J,I)
   40 A(J,I)=HOLD
C     DIVIDE COLUMN BY MINUS PIVOT
   45 DO 55 I=1,N
   46 IF(I-K) 50,55,50
   50 A(I,K)=A(I,K)/(-A(K,K))
   55 CONTINUE
C     REDUCE MATRIX
      DO 65 I=1,N
      DO 65 J=1,N
   56 IF(I-K) 57,65,57
   57 IF(J-K) 60,65,60
   60 A(I,J)=A(I,K)*A(K,J)+A(I,J)
   65 CONTINUE
C     DIVIDE ROW BY PIVOT
      DO 75 J=1,N
   68 IF(J-K) 70,75,70
   70 A(K,J)=A(K,J)/A(K,K)
   75 CONTINUE
C     CONTINUED PRODUCT OF PIVOTS
C     REPLACE PIVOT BY RECIPROCAL
      A(K,K)=1.0/A(K,K)
   80 CONTINUE
C     FINAL ROW AND COLUMN INTERCHANGE
      K=N
  100 K=(K-1)
      IF(K) 150,150,103
  103 I=L(K)
      IF(I-K) 120,120,105
  105 DO 110 J=1,N
      HOLD=A(J,K)
      A(J,K)=-A(J,I)
  110 A(J,I)=HOLD
  120 J=M(K)
      IF(J-K) 100,100,125
  125 DO 130 I=1,N
      HOLD=A(K,I)
      A(K,I)=-A(J,I)
  130 A(J,I)=HOLD
      GO TO 100
  150 RETURN
      END






























APPENDIX B









LISTING OF t's FOR EACH METHOD OF ANALYSIS
WITH THE GROUP SIZE 25 AND GAIN 0.0


SAMPLE        RAW      LORD'S   REGRESSED   ANALYSIS OF
NUMBER       GAIN   TRUE GAIN        GAIN    COVARIANCE

     1      0.334       0.935       0.597         0.592
     2     -0.052      -0.175      -0.257        -0.254
     3      1.428       3.180       1.340         1.330
     4      0.888       3.837       1.441         1.453
     5      0.318       0.925      -0.153        -0.152
     6      0.005       0.013       0.903         0.906
     7      2.297      25.767       0.899         0.986
     8      0.288       1.746       0.252         0.249
     9     -0.904      -2.464      -1.383        -1.370
    10     -1.008      -1.691      -1.013        -1.003
    11     -0.186      -0.664      -0.243        -0.241
    12     -0.051      -0.040      -0.255        -0.253
    13     -0.232      -1.325      -0.161        -0.159
    14     -0.585      -2.182      -0.959        -0.958
    15      2.239      10.504       1.702         1.720
    16     -0.575      -2.038       0.143         0.152
    17     -1.297      -2.736      -0.717        -0.725
    18     -0.538      -2.794      -0.269        -0.267
    19      0.674       4.600       0.638         0.632
    20     -0.954      -5.083      -1.758        -1.767
    21      0.235       0.937       0.379         0.375
    22     -0.036      -0.075      -0.867        -0.864
    23      0.451       2.305       0.564         0.558
    24     -1.324      -2.824      -1.661        -1.648
    25     -0.856      -2.220      -0.518        -0.516
    26      1.240       4.498       0.735         0.739
    27      0.221       0.910      -0.112        -0.112
    28      0.006       0.019      -0.163        -0.161
    29      0.165       0.609       0.250         0.247
    30      0.076       0.251       0.164         0.163
    31     -0.610      -2.692      -0.920        -0.912
    32      1.242       2.416       1.441         1.427
    33     -0.696      -2.768      -0.608        -0.603
    34      0.410       2.427       0.225         0.224
    35     -0.076      -0.379      -0.254        -0.252
    36     -0.670      -2.169      -0.894        -0.884
    37      2.034       6.352       1.791         1.790
    38      1.216       7.274       0.966         0.961
    39     -0.029      -0.143       0.813         0.829
    40     -1.205      -3.964      -1.347        -1.334
    41      0.361       0.995       0.729         0.722
    42      1.042       1.762       0.975         0.966
    43      1.556      16.696       1.728         1.710
    44      0.511       3.038       0.184         0.183
    45     -1.295      -7.504      -1.783        -1.780
    46      0.669       4.977       1.026         1.026
    47      0.382       1.210       0.640         0.634
    48      0.228       0.216      -0.110        -0.110
    49      0.922       5.810       0.943         0.933
    50     -1.263      -3.021      -1.114        -1.106
    51     -0.276      -0.763      -0.941        -0.938
    52      0.496       1.975       0.757         0.749
    53      1.385       6.511       1.583         1.572
    54     -0.087      -0.187      -0.161        -0.160
    55     -1.594      -2.591      -1.646        -1.630
    56      0.489       3.276      -0.233        -0.239
    57     -1.450      -3.217      -1.298        -1.290
    58      1.146       4.044       0.657         0.657
    59      1.754       7.198       1.535         1.530
    60     -0.533      -8.090      -0.255        -0.253
    61     -0.484      -2.971      -0.083        -0.083
    62      1.155       3.037       1.766         1.749
    63      1.321      20.233       1.620         1.604
    64      0.130       1.233      -0.008        -0.003
    65      0.721       2.349       0.514         0.510
    66     -0.556      -1.149      -0.084        -0.084
    67      0.178       0.332       0.434         0.432
    68     -0.146      -0.465      -0.496        -0.493
    69      1.290       4.543       1.526         1.514
    70     -1.904      -4.395      -1.858        -1.844
    71      0.572       5.132       0.470         0.466
    72      0.923       2.998       1.140         1.128
    73     -0.034      -0.106      -0.004        -0.004
    74      0.123       0.546       0.026         0.026
    75     -0.586      -2.404      -0.920        -0.911
    76     -2.622      - .344      -2.917        -2.895
    77     -0.034      -0.078      -1.368        -1.382
    78     -0.259      -0.303      -0.257        -0.254
    79     -0.597      -2.141       0.577         0.605
    80      0.065       0.197      -0.404        -0.412
    81      0.611       1.834       0.162         0.162
    82      0.729       0.579       1.078         1.073
    83     -0.967      -2.813      -1.069        -1.058
    84     -1.275      -5.166      -1.077        -1.071
    85      1.259       4.430       0.797         0.302
    86     -1.736     -16.635      -0.996        -1.010
    87     -0.835      -4.328      -0.406        -0.408
    88     -0.956      -4.608      -0.633        -0.630
    89      0.930       3.109       0.924         0.915
    90      0.551       1.689       0.579         0.573
    91      0.121       0.420       0.167         0.165
    92      1.352       9.886       1.212         1.204
    93      2.337      12.778       2.149         2.145
    94      0.921       2.005       0.815         0.809
    95      0.599       3.848       0.442         0.438
    96      0.756       1.445       1.030         1.020
    97      0.155       0.659      -1.004        -1.019
    98     -0.091      -0.168      -0.162        -0.160
    99     -1.449     -57.894      -0.194        -0.204
   100      0.125       0.638      -0.218        -0.216












BIBLIOGRAPHY


Anastasi, Anne Psychological Testing. New York:
Macmillan, 1961.

Bigge, Morris L. Learning Theories for Teachers. New York:
Harper and Row, 1964.

Combs, A. W., Snygg, Donald Individual Behavior. New York:
Harper and Row, 1959.

Cronbach, Lee J., Furby, Lita "How Should We Measure
Change--or Should We?" Psychological Bulletin, 1970,
74: 68-80.

Hays, William L. Statistics for Psychologists. New York:
Holt, Rinehart, and Winston, 1963.

Hilgard, Ernest R. Theories of Learning. New York:
Appleton, 1956.

Lord, F. M. "The Measurement of Growth," Educational and
Psychological Measurement, 1956, 16: 421-437.

Lord, F. M. "Statistical Inferences about True Scores,"
Psychometrika, 1959, 24: 1-17.

Lord, F. M. "Elementary Models for Measuring Change," in
C. W. Harris Problems in Measuring Change. Madison,
Wisconsin: University of Wisconsin Press, 1963,
pp. 21-38.

McNemar, Q. "On Growth Measurement," Educational and
Psychological Measurement, 1958, 18: 47-55.

Madansky, Albert "The Fitting of Straight Lines When Both
Variables Are Subject to Error," Journal of the
American Statistical Association, 1959, 54: 173-205.

Manning, Winton H., DuBois, Philip H. "Correlational
Methods in Research on Human Learning," Perceptual
and Motor Skills, 1962, 15: 287-321.

Mendenhall, William Introduction to Linear Models and the
Design and Analysis of Experiments. Belmont,
California: Wadsworth, 1968.







Meyer, Paul Introductory Probability and Statistical
Application. Reading, Massachusetts: Addison-Wesley,
1965.

Nunnally, Jum C. Psychometric Theory. New York: McGraw-
Hill, 1967.

Ohnmacht, Fred W. "Correlates of Change in Academic
Achievement," Journal of Educational Measurement,
1968, 5: 41-44.

Rosenthal, Myron R. Numerical Methods in Computer
Programming. Homewood, Illinois: Irwin, 1966.

Scheffe', Henry The Analysis of Variance. New York: Wiley,
1959.
Schick, George B. (ed), May, Merril M. (ed) The Psychology
of Reading. Milwaukee: The National Reading Conference,
Inc., 15th Yearbook 1969, Ranking, Earl F., Jr., Dale,
Lothar H. pp. 17-2L.

Skinner, B. F. The Technology of Teaching. New York:
Appleton, 1968.

Snedecor, George W. Statistical Methods. Ames, Iowa:
Iowa State College Press, 1956.

Soar, Robert S. "Optimum Teacher-Pupil Interaction for
Summer Growth," Educational Leadership Research
Supplement, 1968, 26(3): 275-280.

Tate, Merle W. Statistics in Education and Psychology.
New York: Macmillan, 1965.

Thorndike, E. L. "The Influence of Chance Imperfections of
Measures Upon the Relation of Initial Score to Gain
or Loss," Journal of Experimental Psychology, 1924,
7: 225-232.

Tillman, Chester E. Crude Gain VS. True Gain: Correlates
of Gain in Reading after Remedial Tutoring. Doctoral
dissertation, University of Florida, 1969.

Winer, B. J. Statistical Principles in Experimental Design.
New York: McGraw-Hill, 1962.












BIOGRAPHICAL SKETCH


John Howard Neel was born July 27, 1944, at Waynesburg,

Pennsylvania. He graduated from William R. Boone High

school, Orlando, Florida, in June, 1962. In August, 1965,

he received the degree Bachelor of Arts with a major in

mathematics from the University of Florida. He taught

algebra and general mathematics at John F. Kennedy Junior

High School from September, 1965, until June, 1966. In

September, 1966, he enrolled in the College of Education

at the University of Florida under a United States Office

of Education fellowship program directed by Dr. Wilson H.

Guertin. In September, 1968, he accepted a research

assistantship under the same program. In June, 1968, he

received the degree Master of Arts in Education. He was

an instructor in the College of Education at the University

of South Florida from September, 1968, until August, 1969,

and is currently on leave from that position. In September,

1969, he was appointed Interim Instructor in the College of

Education at the University of Florida and he holds that

position currently.

John Howard Neel is married to the former Carol Lynn

Ramft. They have two daughters, Sarah Elizabeth and Lia

Suzanne.






John Howard Neel is a member of the Florida Educational

Research Association, The American Educational Research

Association, Phi Delta Kappa, and the American Statistical

Association.







This dissertation was prepared under the direction of
the chairmen of the candidate's supervisory committee and
has been approved by all members of that committee. It was
submitted to the Dean of the College of Education and to
the Graduate Council, and was approved as partial fulfill-
ment of the requirements for the degree of Doctor of
Philosophy.
August, 1970





Dean, College of Education



Dean, Graduate School
Supervisory Committee:














A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE

By
JOHN HOWARD NEEL

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1970

COPYRIGHT BY
JOHN HOWARD NEEL
1970

ACKNOWLEDGMENTS

The writer wishes to thank his committee members for their guidance in this study, especially Dr. Charles M. Bridges, Jr., who suggested the topic and was a constant source of assistance throughout the study.

When this study was begun there was a questioning of change scores in the department which was helpful and encouraging. Those responsible for this atmosphere were Dr. Charles M. Bridges, Jr., Dr. Robert S. Soar, Dr. William B. Ware and Mr. Keith Brown.

Dr. Vynce A. Hines and Dr. P. V. Rao assisted by each detecting an error in the model presented. Dr. William B. Ware was especially helpful editorially, as was Dr. Wilson H. Guertin.

The writer's wife, Carol, was encouraging, and understanding of the time which was necessarily spent away from home.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT

CHAPTER

I. INTRODUCTION
     The Problem of Measuring Change
     Methods of Analyzing Change
     The Problem
     Some Limitations
     Procedures
     Significance of the Study
     Organization of the Study

II. RELATED LITERATURE
     Derivation of Lord's True Gain Scores
     Comparison of Lord's True Gain Scores with Other Scores
     Comparison of Regressed Gain Scores with Other Scores
     Another Study and Summary

III. METHODS AND PROCEDURES
     Procedures: An Overview
     Sampling from a Normal Population with Specified Mean and Variance
     Selecting Reliability
     Selecting Gain
     Analysis of the t Values for the Four Methods

IV. RESULTS, CONCLUSIONS, AND SUMMARY
     Results
     Conclusions
     Discussion
     A Direction for Future Research
     Summary

APPENDIX
     A. FORTRAN PROGRAM
     B. LIST OF t's FOR THE FOUR METHODS

BIBLIOGRAPHY
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table

1. NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS

2. NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50

PAGE 7

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy A COMPARATIVE ANALYSIS OF SOME MEASURES OF CHANGE by John Howard Neel August, 1970 Chairman: Wilson H. Guertin Co-Chairman: Charles M. Bridges, Jr. Major Department: Foundations of Education The purpose of this study was to determine which of four selected methods v;as most appropriate for measuring change such as gain in achievement. The four selected methods v.'ere ravi difference. Lord's true gain, regressed gain, and analysis of covariance procedures. In order to compare the four methods Monte Carlo techniques were employed to generate samples of pre and post scores for two groups. The reliability, variances, and means of the sampled populations were controlled. One hundred samples were generated at each of 20 combinations of five levels of reliability, Uio levels of group size, and two levels of gain. Using each of the four methods, a t statistic was calculated for each sample to test the null hypothesis of no difference in amount of gain betv/een the two groups. The number of t's significant at the 0.05 level of significance was recorded for each of the four methods . VI 1

PAGE 8

At each of the 20 combinations a chi square test was used to test the null hypothesis of equal proportions of significant t's among the four methods. This hypothesis was rejected in each case. It was noted that use of Lord's true gain procedure tended to create a greater significance level than the user would intend. The proportion of significant t's for each of the other three methods of analysis fell reasonably close to the expected values. On this basis use of Lord's true gain procedure was not recommended, and since there was no apparent difference among the remaining methods of analysis, none was recommended above the others.

PAGE 9

CHAPTER I
INTRODUCTION

Learning has long been a primary focus of investigation for educators. A definition of learning has been offered by Hilgard (1956):

    Learning is the process by which an activity originates or is changed through reacting to an encountered situation, provided that the change in activity cannot be explained on the basis of native response tendencies, maturation, or temporary states of the organism (e.g. fatigue, drugs, etc.). (p. 3)

Although Hilgard went on to say that the definition is not perfect, it does illustrate a commonly accepted aspect of learning: learning involves change of the behavior of the organism which learns (Bigge, 1964, p. 1; Skinner, 1968, p. 10; Combs, 1959, p. 88). Educators have been concerned with this change and often have sought to measure the change occurring in some situation.

Several methods of analyzing change have been presented in the literature. A comparison of these methods of analysis of the measurement of change and the accompanying difficulties which arise was the focus of this study. The purpose of the study was to determine which of the several selected measures of change is most appropriate under various conditions.

PAGE 10

The Problem of Measuring Change

In all sciences measurement is an approximation. In conducting a first order survey, a surveyor makes three measurements of a distance and takes the average of the three measurements. Physicists and engineers customarily report the relative size of the error of their measurements. Physical scientists have been fortunate in that the size of the relative error involved in their measurements often has been small, frequently less than 0.01 and sometimes less than 10^-[?]. Educators are unfortunate in this respect in that if a student's true I.Q. were 100 and an I.Q. score of 86 was observed, the relative error would be 0.14. The size of this error is not uncommon and larger relative errors do occur. The Stanford-Binet intelligence test has a standard error of measurement equal to five I.Q. points (Anastasi, 1961, p. 200). Thus, relative errors of 0.14 or larger will occur 1.64 per cent of the time, assuming a normal distribution of errors.

In measuring change the problem is compounded since there is an error in both the pre score and the post score. Moreover, when the magnitude of the change is small or zero, the magnitude of the error may be larger than that of the change. This possibility makes the change difficult to detect or to separate from the error. The effect of this error of measurement on change scores was noted as early as 1924 by Thorndike:

PAGE 11

    When the individuals in a varying group are measured twice in respect to any ability by an imperfect measure (that is one whose self-correlation is below 1.00), the average difference between the two obtained scores will equal the average difference between the true scores that would have been obtained by perfect measures, but for any individual the difference between the two obtained scores will be affected by the error. Individuals who are below the mean of the group will tend by the error to be less far below it in the second, and individuals who are above the mean of the group in the first measurement will tend by the error to be less far above it in the second. The lower the self-correlation, the greater the error and its effect.

Thorndike (1924) went on to show that there was a spurious negative correlation between initial true score and true gain. He then stated that "the equation connecting the relation of obtained initial ability with obtained gain, the unreliability of the measures and the true facts" had not been discovered.

Lord (1956) developed the equation to which Thorndike alluded. Lord made the following assumptions concerning the error of measurement: the errors
    i) have zero mean for the groups tested,
    ii) have the same variances for both tests,
    iii) are uncorrelated with each other and with true score on either test. (Lord, 1956)

McNemar (1955) has extended Lord's work to the case of unequal error variances. It should be noted here that McNemar's follow-up of Lord's work is only an extension. When the variances are equal, McNemar's formulas are identical to Lord's (Lord, 1958).
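Thorndike's observation quoted above can be illustrated with a small simulation. The following is only a sketch under assumed values: the function name, the sample size, the seed, and the standard deviations (chosen so that the self-correlation is about 0.50) are hypothetical and do not come from the study. True scores do not change between the two measurements, yet the individuals who fall below the group mean on the first measurement are, on average, less far below it on the second.

```python
import random
import statistics


def regression_effect(n=20000, true_sd=7.0, error_sd=7.0, seed=1):
    """Measure the same unchanged true scores twice with independent errors."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, true_sd) for _ in range(n)]
    first = [t + rng.gauss(0.0, error_sd) for t in true]
    second = [t + rng.gauss(0.0, error_sd) for t in true]

    grand_mean = statistics.fmean(first)
    # Individuals who scored below the group mean on the first measurement.
    low = [i for i in range(n) if first[i] < grand_mean]
    low_first = statistics.fmean(first[i] for i in low)
    low_second = statistics.fmean(second[i] for i in low)
    return grand_mean, low_first, low_second


grand_mean, low_first, low_second = regression_effect()
print(f"group mean on first measurement: {grand_mean:.2f}")
print(f"low scorers, first measurement:  {low_first:.2f}")
print(f"low scorers, second measurement: {low_second:.2f}")
```

Because the selection into the low group depends partly on the first error, the second measurement of the same individuals drifts back toward the group mean even though no true change has occurred.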

PAGE 12

Methods of Analyzing Change

The estimated true gain scores derived from Lord's and McNemar's equations have been used in either t tests or analysis of variance procedures (Soar, 1968; Tillman, 1969). There are, in addition to Lord's method, three other commonly used methods for analyzing change.

One method has been the use of a straight analysis of variance or t test on the raw difference scores as the situation warrants. It is important to note that the raw scores are used in the analysis with no correction for the unreliability of the measures.

A second method is to complete an analysis of covariance on the raw difference scores using the pretest scores as covariates. These procedures are standard statistical techniques and may be found in many texts (Hays, 1963; Snedecor, 1956; Winer, 1962).

The third method of measuring gain has been advocated by Manning and DuBois (1962): the method of residual gain. In this method the final scores are regressed on the initial scores and the difference between the final score and the score predicted by the regression equation is taken as a measure of gain. This measure is then used in t tests or analysis of variance procedures.

Thus in the case of equal variances four common methods of analyzing change have been identified:
1. Use of raw gain scores in appropriate procedures.

PAGE 13

2. Use of Lord's true gain scores in appropriate procedures.
3. Use of Manning and DuBois' regressed gain scores in appropriate procedures.
4. Use of analysis of covariance on raw gain scores with pre scores as covariates.

The Problem

A researcher faced with these different methods and a problem in measuring change is confronted with a second and fundamental methodological problem: which of the methods for measuring change is most appropriate? It is this question that this study sought to answer.

The problem of which method to use is further complicated since different writers have claimed that different techniques were appropriate. "Manning and DuBois and Rankin and Tracy feel that the method of residual gain is more appropriate for correlational procedures since it is metric free" (as quoted in Tillman, 1969, p. 2). Ohnmacht (1963) also suggested that this procedure was the best. Lord (in Harris, 1963, chapter 2) mentioned regressed gain but seemed to advocate his own method as being superior. This position is further supported by Cronbach and Furby (1970).

To determine which of the methods of analysis was most appropriate, an empirical study was conducted to compare the results of each method under known situations.

PAGE 14

Some Limitations

This study was limited to the two group situation. To examine more than two groups would have involved such a large number of possibilities as to make the study impractical in terms of time and money. Thus, this study excluded the more general multigroup comparisons possible with analysis of variance and covariance procedures and was limited to examining t tests and the analysis of covariance using the pre score as covariate.

Additionally the case of unequal variances between the two groups was not considered.

A third limitation was that the variance of the true gains was selected a priori to be 3.6. True gain scores with this variance will be such that over 99 per cent will be within five units of the true mean gain. It is the ratio of the variances of the gain and error which is important. Since the reliabilities were varied, as indicated later, this study was conducted utilizing several such ratios.

One of the factors of interest was the reliability of the test used. A second factor of interest was the sample size or the relative power of the procedures under study. There is an infinite number of reliabilities and a decision was made as to the levels of reliability to be investigated: 0.50 to 0.90 in increments of 0.10. Tests with reliabilities lower than 0.50 are rarely used in practice and at best the resulting data would be highly questionable. The lower limit of 0.50 was chosen for this reason.
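The statement that over 99 per cent of true gains lie within five units of the mean when the variance is 3.6 can be checked with the normal distribution. The sketch below assumes normally distributed gains, as in the models of Chapter III; the function name is illustrative only.

```python
import math


def proportion_within(limit, variance):
    """Share of a normal population lying within +/- limit of its mean."""
    sd = math.sqrt(variance)
    return math.erf(limit / (sd * math.sqrt(2.0)))


share = proportion_within(5.0, 3.6)
print(f"proportion within five units: {share:.4f}")
```

Five units correspond to roughly 2.6 standard deviations here, which is why the proportion comes out a little above 0.99.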

PAGE 15

Sample sizes of 25 and 100 per group were chosen as being somewhat representative of sample sizes used in educational research.

Procedures

Two groups were compared under 20 conditions using t tests as follows. There were five levels of reliability (0.50, 0.60, 0.70, 0.80, 0.90) and two levels of sample size (25 and 100) used in this study. Thus there were ten different combinations of sample sizes and reliabilities. For each of these combinations two cases were investigated, one where there was no pre to post test gain in either group and the second where there was a known gain from the pre to the post test for one group. For each of these 20 instances, 100 samples were generated and analyzed using each of the four methods of analysis indicated previously.

Consequently, two questions were to be answered:
1. Does any one of the selected methods yield a disproportionate number of significant t values when there is no difference between the mean gain of the two groups?
2. Is any one of the selected methods more powerful, i.e. more successful in detecting a difference when a difference does exist?

Samples from a normal distribution were generated using techniques described by Rosenthal (1966). The method for generating random numbers was the multiplication by a constant method. With the procedures used, this method

PAGE 16

will produce 8.5 million numbers before the series repeats. This number was more than sufficient for this study. All generation of the samples and calculation of t values using the various methods of analysis were done on the IBM 360/65 computer at the University of Florida.

The significance level used for the t tests was 0.05. The two research questions generated two null hypotheses:
1. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is no gain in either group.
2. The proportion of t's significant at the 0.05 level is the same for each method of analysis when there is gain by one group but not by the other.

These hypotheses were tested at each of the ten combinations of reliability and sample size with chi square tests using the 0.05 level of significance.

Significance of the Study

The results of this study should either indicate empirically that one or more methods were superior to the others or that there were no great differences among the methods. If the former were true, then educational researchers may select one of the better methods. If the latter were true, then educational researchers may select any of the

PAGE 17

methods. In either case the study provides some answer as to how change scores should be analyzed.

Organization of the Study

Chapter I has been the introduction, statement of the problem, limitations, hypotheses, and procedural overview. Chapter II reviews related literature, essentially the development of the equation and methodology of the various techniques studied. Chapter III describes the procedures and Chapter IV presents the data, conclusions, and summary.

PAGE 18

CHAPTER II
RELATED LITERATURE

Much has been said in the literature about measuring change. However, most of this discussion is centered around the four methods investigated in this study: raw gain, Lord's true gain, regressed gain, and analysis of covariance procedures. As pointed out in Chapter I, raw gain and analysis of covariance procedures are discussed in many texts and therefore are not discussed here. Lord's true gain and regressed gain procedures are discussed in this chapter.

Derivation of Lord's True Gain Scores

The following derivation parallels Lord's (1956) development of true gain scores with one exception as noted. Lord gave the following equations as a model for the observed pre and post scores:

(1)  $X = T + E_1$
(2)  $Y = T + G + E_2$

where
    $X$ = observed pre score;
    $Y$ = observed post score;
    $T$ = true pre score;
    $G$ = true gain;
    $E_1$ = error of measurement in pre observed score;

PAGE 19

    $E_2$ = error of measurement in post observed score.

Lord then made the following assumptions concerning $E_1$ and $E_2$, the errors of measurement. The errors
    i) have zero mean in the group tested,
    ii) have the same variance ($\sigma_e^2$) for both tests,
    iii) are uncorrelated with each other and with true score on either test (Lord, 1956).

The derivation can be considerably shortened at this point by examining a standard regression equation which predicts one variable, $X_1$, from two other variables, $X_2$ and $X_3$. The equation is (Tate, 1965, p. 171)

(3)  $\hat{X}_1 = B_{12.3}\,\frac{S_1}{S_2}\,(X_2 - \bar{X}_2) + B_{13.2}\,\frac{S_1}{S_3}\,(X_3 - \bar{X}_3) + \bar{X}_1$

where

    $B_{12.3} = \frac{r_{12} - r_{13}\,r_{23}}{1 - r_{23}^2}$  and  $B_{13.2} = \frac{r_{13} - r_{12}\,r_{23}}{1 - r_{23}^2}$.

If we let
    $X_1 = G$ = gain;
    $X_2 = X$ = observed pre score;
    $X_3 = Y$ = observed post score;
the following elements in the regression equation can be identified as
    $\hat{X}_1$ = estimated gain;

PAGE 20

    $\bar{X}_2$ = mean of the observed pre scores;
    $\bar{X}_3$ = mean of the observed post scores;
    $S_1$ = standard deviation of the gain scores;
    $S_2$ = standard deviation of observed pre scores;
    $S_3$ = standard deviation of observed post scores;
    $r_{23}$ = correlation of observed pre and post scores.

Lord has pointed out that $r_{12}$ and $r_{13}$, the correlations between observed score and true score, are the reliabilities. From (1) and his stated assumptions, Lord writes

(4)  $\sigma_x^2 = \sigma_t^2 + \sigma_e^2$,
(5)  $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + 2\sigma_{tg} + \sigma_e^2$,
(6)  $\sigma_{xy} = \sigma_t^2 + \sigma_{tg}$.

Lord solves these equations to find

(7)  $\sigma_g^2 = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy} - 2\sigma_e^2$,

the variance of the true gain scores.

At this point the only element in the regression equation which is undefined is $\bar{X}_1$. This element is found by considering the mean of the observed pre scores, which from (1) can be seen to be equal to the mean of the true pre scores plus the mean of the errors of measurement, that is

(8)  $\bar{X} = \bar{T} + \bar{E}_1$.

But by Lord's assumption (i) $\bar{E}_1 = 0$, therefore

PAGE 21

(9)  $\bar{X} = \bar{T}$.

Similarly from (2) Lord shows

(10)  $\bar{Y} = \bar{T} + \bar{G}$.

Then (9) is subtracted from (10) and rewritten to yield

(11)  $\bar{G} = \bar{Y} - \bar{X}$.

Thus all elements are defined and (3) may be rewritten in terms of $G$, $X$, and $Y$ as

(12)  $\hat{G} = B_{12.3}\,\frac{S_g}{S_x}\,(X - \bar{X}) + B_{13.2}\,\frac{S_g}{S_y}\,(Y - \bar{Y}) + \bar{Y} - \bar{X}$

which Lord has asserted to be an estimate of true gain.

It may be noted that no notational scheme or other method has been presented to distinguish between statistics and parameters in the preceding derivation. This lack is in keeping with Lord's derivation. It is assumed here that Lord was referring to parameters until the point at which he obtained the final equation and that he then intended to use sample values to estimate the appropriate parameters in the regression equation.

Comparison of Lord's True Gain Scores with Other Scores

In his original article Lord (1956) made no comparison of his method with any other method. In a subsequent article (1959) Lord again made no mention of other methods. In a chapter written in Problems in Measuring Change (Harris, 1963, Chapter 2) he made reference to regressed gain scores,

PAGE 22

but no discussion or comparison was presented. In Statistical Theories of Mental Test Scores (Lord and Novick, 1968) no comparison of Lord's true gain scores with other procedures is presented.

Comparison of Regressed Gain Scores with Other Scores

Manning and DuBois (1962) have compared per cent, raw, and residual gain scores. A per cent gain score is the raw gain score divided by the pre score (Manning and DuBois, 1962). The comparisons were made on the bases of metric requirements, reliability, and appropriateness of use in correlation procedures. On each of these bases residual gain scores were recommended over per cent and raw gain scores.

Manning and DuBois pointed out that per cent and raw gain scores require at least equal interval scales on both pre and post scores and that the scales be the same on both pre and post scores, i.e. the same equal interval scale must be used on both tests. According to Manning and DuBois these qualities are not possessed by educational and psychological test scores. In contrast, residual gain scores do not require the same equal interval scales and therefore are appropriate for use with test scores (Manning and DuBois, 1962).

Manning and DuBois summarily list formulas showing that residual gain scores are more reliable and more appropriate measures for correlational procedures than are raw or per cent gain scores. These formulas were only listed, not derived, and no reference was made to their derivation.
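The residual gain score described above (post score minus the post score predicted from the pre score by least squares) can be sketched as follows. This is a minimal illustration for a single set of scores; the function name and the five score pairs are hypothetical, and how the regression was fitted across the two groups in the study itself is not stated in this chapter.

```python
def residual_gains(pre, post):
    """Residuals from the least-squares regression of post scores on pre scores."""
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    # Residual gain: observed post score minus the regression prediction.
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


pre_scores = [40.0, 45.0, 50.0, 55.0, 60.0]    # hypothetical data
post_scores = [44.0, 46.0, 55.0, 57.0, 63.0]   # hypothetical data
gains = residual_gains(pre_scores, post_scores)
print([round(g, 3) for g in gains])
```

By construction the residual gains sum to zero and are uncorrelated with the pre scores, which is the sense in which the measure does not depend on the two tests sharing a common scale.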

PAGE 23

Another Study and Summary

Madansky (1959) reported or derived several methods for fitting straight lines to two variables when both were measured with error. One of these procedures is applicable in the case when the variance of the error of measurement is unknown. However, there has apparently been no attempt to apply the method to the analysis of change.

A search of the literature has revealed no comparative empirical examination of the four methods examined in this study. Further, the advocates and authors of two of the reported procedures, each of whom has been shown to know of the existence of the other procedure, continue to advocate their own method even though they offer no reason or data for this advocacy. This study should provide some knowledge as to any difference in the four methods.
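As a companion to the derivation of equation (12) earlier in this chapter, the sketch below computes an estimated true gain for each individual. It is not the routine used in the study. It assumes a single known error variance for both tests (Lord's assumption ii), and it uses the covariance form of the two-predictor regression, which should be algebraically equivalent to the correlation form in (3) and (12): under the model, the covariance of true gain with the observed pre score is $\sigma_{xy} - \sigma_x^2 + \sigma_e^2$, and with the observed post score it is $\sigma_y^2 - \sigma_{xy} - \sigma_e^2$. The function name and data are hypothetical.

```python
def lord_true_gain(pre, post, error_var):
    """Regression estimate of true gain from observed pre and post scores.

    error_var is the (assumed equal) error-of-measurement variance of both tests.
    """
    n = len(pre)
    mx = sum(pre) / n
    my = sum(post) / n
    sxx = sum((x - mx) ** 2 for x in pre) / (n - 1)
    syy = sum((y - my) ** 2 for y in post) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post)) / (n - 1)

    # Covariances of true gain with the observed scores implied by the model.
    c_x = sxy - sxx + error_var
    c_y = syy - sxy - error_var

    # Solve the 2 x 2 normal equations for the two regression weights.
    det = sxx * syy - sxy ** 2
    b_x = (c_x * syy - c_y * sxy) / det
    b_y = (c_y * sxx - c_x * sxy) / det

    mean_gain = my - mx  # equation (11)
    return [mean_gain + b_x * (x - mx) + b_y * (y - my) for x, y in zip(pre, post)]


pre_scores = [40.0, 45.0, 50.0, 55.0, 60.0]    # hypothetical data
post_scores = [44.0, 46.0, 55.0, 57.0, 63.0]   # hypothetical data
estimates = lord_true_gain(pre_scores, post_scores, error_var=10.0)
print([round(g, 3) for g in estimates])
```

When the error variance is set to zero the weights reduce to -1 and +1, so the estimate collapses to the raw difference Y - X; with a positive error variance the individual estimates are pulled toward the mean gain.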

PAGE 24

CHAPTER III
METHODS AND PROCEDURES

Procedures: An Overview

As stated in Chapter I, Monte Carlo techniques were employed to generate pre and post test scores for two groups. One group is referred to as the gain group, the other as the no gain group. The model for the observed pre scores is

(1)  $X = T + E_1$

where
    $X$ is the observed pre score;
    $T$ is the true pre score;
    $E_1$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_1}^2$.

The model for the post scores is

(2)  $Y = T + G + E_2$

where
    $Y$ is the observed post score;
    $T$ is the true pre score;
    $G$ is the true gain from pre to post score;
    $E_2$ is a normally distributed random error with mean 0.0 and variance $\sigma_{e_2}^2$.

PAGE 25

The generated scores were subsequently analyzed for the difference in the amount of gain or change between the two groups. The scores were analyzed by the four selected methods:
1. a t test on the raw difference scores
2. a t test on Lord's true gain scores
3. a t test on regressed gain scores
4. a t test from an analysis of covariance on the raw difference scores using pre score as covariate.
The results of these analyses were then compared.

For the gain group the mean gain, $\mu_g$, from pre to post scores was selected to be 0.0 in the case of no gain for either group, and selected to be of such size as to make the power 0.50 when there was a gain in the gain group. The post scores were generated by adding a random normal gain, $G$, and a random normal error, $E_2$, to the generated true pre scores. The variables $G$ and $E_2$ had means $\mu_g$ and 0.0 respectively and variances as discussed later. For the no gain group there was no gain from pre to post scores.

The pre scores for both the gain and the no gain group were taken from a normal population with mean 50.0 and variance 100.0. The mean and variance of the population of post scores for both the gain and the no gain groups were functions of the mean true gain and of the reliability.

PAGE 26

After the samples were generated, the hypothesis of no difference in average gain between the two groups was tested using each of the four methods. The t values for each of these tests were recorded. This procedure was repeated over 100 samples for each of the selected reliabilities 0.50, 0.60, 0.70, 0.80 and 0.90. The method for introducing the effect of the selected reliability into each generated score is presented in a following section. Thus 100 t values were calculated and recorded for each method of analysis and at each level of reliability.

This entire procedure was repeated for each of the following conditions:
1. group size = 25, $\mu_g = 0.0$ for both groups;
2. group size = 25, $\mu_g \neq 0.0$ for the gain group, $\mu_g = 0.0$ for the no gain group;
3. group size = 100, $\mu_g = 0.0$ for both groups;
4. group size = 100, $\mu_g \neq 0.0$ for the gain group, $\mu_g = 0.0$ for the no gain group.

Where the gain was not equal to 0.0 it was such that the power of the t tests on the raw difference scores was 0.50, i.e. the expected proportion of rejected null hypotheses was 0.50.

The following sections describe in more detail some of the previously mentioned procedures.*

* The reader is also referred to the FORTRAN listing in Appendix A for the exact computer routines by which these procedures were carried out.
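The layout of the design described above (five reliabilities, two group sizes, two gain conditions, 100 samples per combination) can be enumerated directly. The names below are illustrative only and are not taken from the FORTRAN listing.

```python
from itertools import product

RELIABILITIES = (0.50, 0.60, 0.70, 0.80, 0.90)
GROUP_SIZES = (25, 100)
GAIN_CONDITIONS = ("no gain in either group", "gain in one group")
SAMPLES_PER_CELL = 100


def design_cells():
    """One dictionary per combination of group size, gain condition and reliability."""
    return [
        {"group_size": n, "gain": gain, "reliability": rel}
        for n, gain, rel in product(GROUP_SIZES, GAIN_CONDITIONS, RELIABILITIES)
    ]


cells = design_cells()
total_samples = len(cells) * SAMPLES_PER_CELL
print(len(cells), "cells,", total_samples, "generated samples")
```

Each generated sample is analyzed by all four methods, so the design yields four t values per sample.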

PAGE 27

Sampling from a Normal Population with Specified Mean and Variance

If $F$ is the cumulative distribution function of a random variable $R_1$, then the random variable $R_2$ defined by

(3)  $R_2 = F(R_1)$

is uniformly distributed over the interval [0,1] (Meyer, 1965, p. 256, Theorem 13.6). It follows then that $R_1$, where

(4)  $R_1 = F^{-1}(R_2)$,

is normally distributed if $F^{-1}$ is the inverse cumulative distribution function of a normal distribution and if $R_2$ is a uniform random number on the interval [0,1] (Meyer, 1965, pp. 256-257). Thus random samples from a normal distribution may be obtained using uniform random numbers and (4), where $F$ is the cumulative distribution function of a normal distribution.

For a normal distribution, $F^{-1}(R_2)$ must be calculated using numerical approximation methods. This calculation as well as the generation of the uniform random numbers were done using a routine described by Rosenthal (1966, pp. 270, 287). Rosenthal's techniques were adapted to the IBM 360/65 computer installed at the University of Florida (see Appendix A, FUNCTION RAND). The normal population sampled had mean 0.0 and variance 1.0. If a different mean or variance

PAGE 28

was required, it was obtained by addition or multiplication by an appropriate constant.

Selecting Reliability

Reliabilities of 0.50, 0.60, 0.70, 0.80 and 0.90 were selected as representative of reliabilities found in test scores. The reliability, rel, of a test may be defined as

(5)  $\mathrm{rel} = 1 - \frac{\sigma_e^2}{\sigma_x^2}$  (Nunnally, 1967, p. 221),

where $\sigma_e^2$ is the error of measurement variance and $\sigma_x^2$ is the observed score variance. Since $\sigma_x^2$ had been selected a priori to be 100.0, we have from (5)

(6)  $\sigma_{e_1}^2 = 100\,(1 - \mathrm{rel})$.

Moreover, since

(7)  $X = T + E_1$

and since the error, $E_1$, is assumed to be independent of the true score, $T$, we have

(8)  $\sigma_x^2 = \sigma_t^2 + \sigma_{e_1}^2$,

or, combining (6) and (8) and solving for $\sigma_t^2$,

(9)  $\sigma_t^2 = 100 - \sigma_{e_1}^2$.

For the post scores the desired variances are also easily found from the model for a post score,

PAGE 29

(10)  $Y = T + G + E_2$,

and for which

(11)  $\sigma_y^2 = \sigma_t^2 + \sigma_g^2 + \sigma_{e_2}^2$.

As stated in Chapter I, $\sigma_g^2$ was selected to be 3.6. If (5) is rewritten with $\sigma_y^2$ and $\sigma_{e_2}^2$ instead of $\sigma_x^2$ and $\sigma_{e_1}^2$, respectively, then (5) and (11) may be used to find

(12)  $\sigma_{e_2}^2 = \frac{\sigma_t^2 + \sigma_g^2}{\mathrm{rel}} - \sigma_t^2 - \sigma_g^2$.

The effect of the selected reliability may be obtained by selecting the error of measurement variances and the variance of the true scores in accordance with (6), (9) and (12). Thus it is seen that if true scores are selected from a distribution with variance $\sigma_t^2$ and if the errors of measurement are selected independently from a distribution with variance $\sigma_{e_1}^2$, then by (8) $X$ has variance $\sigma_x^2$, if (7) holds.

Selecting Gain

When there was no gain in either group the value of $\mu_g$ would then be 0.0. When $\mu_g$ was nonzero for the gain group its value was selected so as to make the power of the t test on the raw difference scores equal to 0.50. The power of 0.50 was selected in order to permit maximum difference between the four methods of analysis.
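The variance relations (6), (9) and (12), together with the inverse-function sampling of the preceding section, can be sketched as below. This is only an illustration, not the FORTRAN routine of Appendix A: Python's `statistics.NormalDist.inv_cdf` stands in for the numerical approximation of the inverse normal function, Python's own uniform generator stands in for the multiplicative generator, and all function names are hypothetical.

```python
import random
from statistics import NormalDist

OBSERVED_PRE_VAR = 100.0   # variance of observed pre scores, fixed a priori
TRUE_GAIN_VAR = 3.6        # variance of true gains, fixed a priori
STANDARD_NORMAL = NormalDist()


def variance_components(rel, pre_var=OBSERVED_PRE_VAR, gain_var=TRUE_GAIN_VAR):
    """Return (true-score variance, pre error variance, post error variance)."""
    err_pre = pre_var * (1.0 - rel)                                  # equation (6)
    true_var = pre_var - err_pre                                     # equation (9)
    err_post = (true_var + gain_var) / rel - true_var - gain_var     # equation (12)
    return true_var, err_pre, err_post


def normal_draw(rng, mean, variance):
    """Inverse-function sampling: uniform number -> standard normal -> rescale."""
    u = rng.random()
    while u == 0.0:            # inv_cdf is defined only on the open interval (0, 1)
        u = rng.random()
    return mean + variance ** 0.5 * STANDARD_NORMAL.inv_cdf(u)


def generate_group(n, rel, mean_gain, rng, pre_mean=50.0):
    """Generate n observed (pre, post) score pairs under models (7) and (10)."""
    true_var, err_pre, err_post = variance_components(rel)
    pre, post = [], []
    for _ in range(n):
        t = normal_draw(rng, pre_mean, true_var)
        g = normal_draw(rng, mean_gain, TRUE_GAIN_VAR)
        pre.append(t + normal_draw(rng, 0.0, err_pre))
        post.append(t + g + normal_draw(rng, 0.0, err_post))
    return pre, post


rng = random.Random(1970)
pre, post = generate_group(25, rel=0.80, mean_gain=0.0, rng=rng)
print(variance_components(0.80))
```

Choosing the post error variance by (12) makes the reliability of the post scores equal to the selected value even though their total variance is larger than that of the pre scores.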

PAGE 30

The value of the mean gain, $\mu_g$, was determined by examining the difference scores ($D$):

(13)  $D = Y - X$,

and, substituting the models for $Y$ and $X$,

(14)  $D = (T + G + E_2) - (T + E_1)$,

or

(15)  $D = G + E_2 - E_1$.

The elements in the right side of (15) are mutually independent normally distributed random variables whose variances have been found, and thus

(16)  $\sigma_d^2 = \sigma_{e_1}^2 + \sigma_g^2 + \sigma_{e_2}^2$.

Furthermore, since the only difference in (15) for the gain and no gain groups is the mean of $G$, the variance of the difference for the gain group, $\sigma_{d_g}^2$, and the variance of the difference for the no gain group, $\sigma_{d_{ng}}^2$, are equal, i.e.

(17)  $\sigma_{d_g}^2 = \sigma_{d_{ng}}^2 = \sigma_d^2$.

This common value $\sigma_d^2$ may then be used to determine the appropriate value of $\mu_g$ to produce the desired power of 0.50 for the t test on the raw difference scores. The t test on the raw difference scores is found from the following formula:

PAGE 31

(18)  $t = \dfrac{\bar{D}_g - \bar{D}_{ng}}{\sqrt{\dfrac{(n_g - 1)\,S_g^2 + (n_{ng} - 1)\,S_{ng}^2}{n_g + n_{ng} - 2}\left(\dfrac{1}{n_g} + \dfrac{1}{n_{ng}}\right)}}$

If the group size is 25 and reliability 0.50, the value of $\mu_g$ is found as follows (note: rel = 0.50 implies $\sigma_d^2 = 106.72$):

(19)  $t = \dfrac{\bar{D}_g - 0}{\sqrt{\dfrac{24\,\sigma_{d_g}^2 + 24\,\sigma_{d_{ng}}^2}{25 + 25 - 2}\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}}$,

      $t = \dfrac{\bar{D}_g}{2.9155}$.

This value of t is greater than the critical value of t (2.01) only if

(20)  $\dfrac{\bar{D}_g}{2.9155} > 2.01$,

that is, only if

(21)  $5.86 < \bar{D}_g$.

Thus if a value of 5.86 is chosen for the mean gain, the power is 0.50. Appropriate values for other group sizes and reliabilities were similarly determined.

Analysis of the t Values for the Four Methods

The number of t's significant at the 0.05 level was recorded for each of the four methods of analysis. These

PAGE 32

data were recorded for each of the 20 combinations of sample size, reliability and gain. A chi square statistic was calculated for each of these 20 sets to test the null hypothesis of no difference in the proportion of significant t values for the four methods of analysis. These data may be seen in Tables 1 and 2 of Chapter IV.
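The pooled two-sample t statistic of equation (18) and the reasoning of (19) through (21) can be sketched as follows. The function names are hypothetical, and the critical value is passed in rather than computed (the text uses 2.01 for 48 degrees of freedom). The second function follows the argument of the text: when the population mean difference equals the critical value times the standard error of the difference, about half of the sample t values are expected to exceed the critical value.

```python
import math


def pooled_t(diff_gain, diff_no_gain):
    """Pooled-variance t statistic for two independent sets of difference scores."""
    n1, n2 = len(diff_gain), len(diff_no_gain)
    m1 = sum(diff_gain) / n1
    m2 = sum(diff_no_gain) / n2
    ss1 = sum((d - m1) ** 2 for d in diff_gain)
    ss2 = sum((d - m2) ** 2 for d in diff_no_gain)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))


def half_power_gain(var_d, n_per_group, t_crit):
    """Mean gain at which the expected mean difference sits at the critical threshold."""
    standard_error = math.sqrt(var_d * (2.0 / n_per_group))
    return t_crit * standard_error


t_value = pooled_t([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])   # hypothetical difference scores
threshold = half_power_gain(var_d=100.0, n_per_group=25, t_crit=2.0)
print(round(t_value, 4), round(threshold, 4))
```

The values passed to `half_power_gain` here are round illustrative numbers rather than the figures of the study; the variance of the difference scores for a given reliability would come from equation (16).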

PAGE 33

CHAPTER IV
RESULTS, CONCLUSIONS, AND SUMMARY

Results

The number of significant t's for each method of analysis under the no gain condition is presented in Table 1, and for the gain condition in Table 2. Additionally, the computed chi square statistics for each reliability level are given. In each case the null hypothesis tested was that the proportion of significant t's was the same for each of the four methods of analysis. The chi square values were computed from the 2 x 4 contingency tables implied by the corresponding line of the table. For example, for group size of 25 and a reliability of 0.50, the 2 x 4 contingency table implied by the first line of Table 1 is:

                     Raw      Lord's    Regressed    Analysis of
                     gain     gain      gain         covariance
Significant            5        58         2             2
Non significant       95        42        98            98

As may be seen by inspection of Tables 1 and 2, all the chi square values were significant at the 0.05 level and in each case the hypothesis of equal proportion of significant t values for the four methods of analysis was rejected.
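The chi square computation for the 2 x 4 table shown above can be sketched as a standard Pearson test of equal proportions. The function name is illustrative, and the exact computational form used in the study is not given in the text, so the printed value is only what this standard formula yields for the example table.

```python
def chi_square(table):
    """Pearson chi square statistic for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: significant, non significant. Columns: raw, Lord's, regressed, covariance.
example = [[5, 58, 2, 2],
           [95, 42, 98, 98]]
statistic = chi_square(example)
CRITICAL_DF3_05 = 7.815   # chi square critical value, 3 degrees of freedom, 0.05 level
print(round(statistic, 2), statistic > CRITICAL_DF3_05)
```

For this table the statistic is about 163 on 3 degrees of freedom, far beyond the 0.05 critical value, with nearly all of the discrepancy coming from the column for Lord's true gain.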

PAGE 34

TABLE 1
NUMBER OF SIGNIFICANT t's WHEN THE TRUE MEAN GAIN WAS 0.0 FOR BOTH GROUPS

GROUP SIZE = 25
RELIABILITY

PAGE 35

TABLE 2
NUMBER OF SIGNIFICANT t's WHEN THE POWER OF THE t TEST ON THE RAW DIFFERENCE SCORES WAS 0.50

GROUP SIZE = 25
RELIABILITY

PAGE 36

Further inspection of Tables 1 and 2 reveals a higher number of significant t's for Lord's true gain procedure than for any other method. Moreover, examination of Table 1 shows that this particular technique gives a considerably greater frequency of significant t values than one would expect by chance. The expected frequency is 5 for the a priori established condition of no actual difference in the two populations sampled. These results indicate that use of Lord's true gain procedure tends to create a higher significance level than the user would intend. If the sample proportion of significant t's found in the analysis is used as an estimate of the significance level, that estimate is 0.58 for the case when the group size was 25. For the same group size the lowest estimate of the significance level is 0.39.

Conclusions

Since the hypothesis of equal proportion of significant t's for the four methods of analysis was rejected in each of the 10 cases where the mean gain was 0.0, and since the use of Lord's true gain scores provided estimated levels of significance which were considerably higher than those intended, the use of Lord's true gain scores is strongly suspect and therefore is not recommended.

No apparent differences were found among the remaining three methods of analysis. However, there is a similarity between the regressed gain scores procedure and the

PAGE 37

analysis of covariance procedure that should be examined. The data in Table 1 indicate that the same number of significant t's was found by both of these methods in the case where there was no gain for either group. The 100 t values for each of the four methods of analysis when the group size was 25 and there was no gain in either group are presented in Appendix B. Inspection of the t values for the regressed gain procedure and the analysis of covariance procedure reveals a striking similarity between the t values; for each sample the t values are identical to at least the first decimal place. As a descriptive statistic it is noted that the correlation between the t values found by these two methods is 1.00 (rounded to 3 digits). Thus, the two methods are providing very similar results.

In contrast the correlation between the t's for the raw difference and regressed gain procedures is 0.898. The two methods, regressed gain and analysis of covariance, are not entirely similar to the raw difference procedure. It may also be seen from Appendix B that the signs of the t's from both the regressed gain and analysis of covariance procedures sometimes are opposite from the sign of the t for the raw difference procedure.

Snedecor (1956, pp. 397, 398) has indicated that the regressed gain procedure and the analysis of covariance procedure on the post scores using pre score as covariate are identical procedures. No source was found indicating a

PAGE 38

similarity between the regressed gain procedure and the analysis of covariance procedure on the difference scores using pre score as covariate. However, the two methods may be shown to be equivalent by writing the linear model for the regressed gain, or, equivalently, for the analysis of covariance on the post scores using pre score as covariate, and the model for the analysis of covariance on the difference scores using pre score as covariate.

The model for covariance analysis on the post scores is

(1)  $Y = B_0 + B_1 X + B_2 Z + E$  (Mendenhall, 1968, p. 170),

where $X$ and $Y$ are defined as previously and
    $Z$ = 1, if $Y$ is from the gain group,
      = 0, if $Y$ is not from the gain group;
    $E$ is a normally distributed random error with mean 0.

The model for covariance analysis on the difference scores is

(2)  $D = B_0 + B_1 X + B_2 Z + E$,

where

(3)  $D = Y - X$

and all other elements are defined as in (1). Now if the right side of (3) is substituted into (2) and the resultant equation rearranged to yield

(4)  $Y = B_0 + (B_1 + 1.0)\,X + B_2 Z + E$,

PAGE 39

it is seen that (1) and (4) are identical except for the addition of 1.0 to $B_1$ of equation (1), and thus the two methods will yield the same t values for testing the hypothesis that $B_2$ is equal to 0.0.

Since no clear difference was found among the raw gain procedure, the regressed gain procedure and the analysis of covariance procedure, none of these is recommended as more appropriate for the analysis of change than the others. All of these three procedures are recommended above Lord's true gain procedure.

Discussion

It is reasonable to ask if there is some questionable logic in Lord's derivation of true gain scores. Two things become apparent upon examination of the derivation. First, the formula which Lord uses to begin his derivation, (3) of Chapter II, requires that the independent variable be known exactly (Madansky, 1959; Scheffé, 1959, p. 4), i.e. without error of measurement. The problem of estimating true gain arises in that the pre and post test scores are not known exactly, but instead the observed scores, or the true scores plus measurement errors, are known, as may be seen from Lord's models of the observed scores, (1) and (2) in Chapter II. If true pre and true post score were known these could be put into the regression equation. However, if true pre score and true post score were known there would be no need for the regression equation to

estimate gain. The gain could be obtained simply by subtracting true pre score from true post score. In short, Lord seems to have assumed his conclusion in his derivation.

Second, a look at the basic method for estimating true gain is enlightening. In order to estimate true gain from observed score it is necessary to somehow remove the error of measurement since, assuming the test to be valid, this is the factor that obscures the true score. Mendenhall (1968, p. 1) says that "...statistics is a theory of information...." What information is known concerning the errors of measurement? By Lord's assumption (iii) of Chapter II the errors are uncorrelated with true score and with each other. Thus neither the observed pre score nor the observed post score should provide any information concerning the size of the error. Since this error is random it would seem that it could not be removed from the observed scores. Consider two equal observed scores, one obtained from a higher true score by the addition of a negative error, the other obtained from a lower true score by the addition of an error of the same size but opposite sign. How does one decide which score is to have a positive correction added and which is to have a negative correction added? It would appear that one cannot make this decision without having some information besides the observed scores.
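As a numerical check on the equivalence of models (1) and (2) argued at the start of this section, the following sketch fits both models by ordinary least squares to one simulated data set and compares the t statistics for $B_2$. It is an illustration only and is not part of the original study: the sample size, the score distributions, the seed, and the helper names `solve` and `group_t` are all assumptions chosen for the example.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def group_t(x, z, resp):
    """Fit resp = B0 + B1*x + B2*z + E; return (coefficients, t for B2)."""
    rows = [[1.0, xi, zi] for xi, zi in zip(x, z)]
    n, p = len(rows), 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, resp)) for i in range(p)]
    beta = solve(xtx, xty)
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, resp))
    # Element (3,3) of the inverse of X'X, obtained by solving X'X c = e3.
    c22 = solve(xtx, [0.0, 0.0, 1.0])[2]
    return beta, beta[2] / math.sqrt(sse / (n - p) * c22)


# Simulated data (assumed values): 25 students per group, true gain of 2 points.
rng = random.Random(1970)
n = 25
z = [1.0] * n + [0.0] * n                      # 1 = gain group, 0 = no gain group
x = [50.0 + rng.gauss(0.0, 10.0) for _ in z]   # pre scores
y = [xi + 2.0 * zi + rng.gauss(0.0, 5.0) for xi, zi in zip(x, z)]  # post scores
d = [yi - xi for xi, yi in zip(x, y)]          # difference scores, equation (3)

b_post, t_post = group_t(x, z, y)   # model (1): covariance analysis on post scores
b_diff, t_diff = group_t(x, z, d)   # model (2): covariance analysis on differences
print(t_post, t_diff)               # the two t values should agree
print(b_post[1] - b_diff[1])        # the slopes should differ by 1.0
```

Up to rounding error the two t values coincide and the pre-score slopes differ by 1.0, which is what equation (4) predicts.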

A Direction for Future Research

Since this study shows no difference among the proportions of significant t's for the raw gain, regressed gain, and analysis of covariance procedures, it would be interesting to investigate the use of these procedures under assumptions other than the models listed in the first section of Chapter III. Differences may occur, for example, when the gain is a linear function of the pre score.

Summary

This study compared four selected measures of change. The four measures were: raw difference, Lord's true gain, regressed gain, and analysis of covariance procedures. An empirical comparison was made among these four methods. Samples were generated using Monte Carlo techniques and the data in each sample were analyzed by each of the four methods.

It was found that Lord's true gain procedure produced a number of spurious significant t values, greater than would be expected by chance, when there was no real difference in amount of gain between the two populations sampled. No apparent differences were noted among the remaining three methods and these three methods did not appear to have inflated significance levels. With such data use of Lord's true gain procedure is not recommended, and none of the remaining three methods is recommended over the others.
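The kind of Monte Carlo check summarized above can be sketched in a few lines for the raw gain procedure alone. The sketch below is not the program of Appendix A: the score model, the standard deviations `sd_true` and `sd_err`, the number of samples, and the seed are assumptions made for illustration, and Python's built-in generator stands in for the study's random number routine. It estimates how often the pooled t test on difference scores exceeds the two-sided 5 per cent critical value when the two populations have the same gain.

```python
import math
import random

T_CRIT = 2.0106  # approximate two-sided 5% critical value of t with 48 df


def raw_gain_t(n, gain, rng, sd_true=8.0, sd_err=6.0):
    """Pooled two-sample t on difference scores for one simulated sample."""
    def diffs(shift):
        out = []
        for _ in range(n):
            true = 50.0 + rng.gauss(0.0, sd_true)          # true score
            pre = true + rng.gauss(0.0, sd_err)            # observed pre score
            post = true + shift + rng.gauss(0.0, sd_err)   # observed post score
            out.append(post - pre)
        return out

    dg, dng = diffs(gain), diffs(0.0)      # gain group, no gain group
    mg, mng = sum(dg) / n, sum(dng) / n
    vg = sum((v - mg) ** 2 for v in dg) / (n - 1)
    vng = sum((v - mng) ** 2 for v in dng) / (n - 1)
    pooled = (vg + vng) / 2.0              # equal group sizes
    return (mg - mng) / math.sqrt(pooled * 2.0 / n)


def rejection_rate(n_samples=2000, n=25, gain=0.0, seed=1970):
    """Proportion of samples whose |t| exceeds the critical value."""
    rng = random.Random(seed)
    hits = sum(abs(raw_gain_t(n, gain, rng)) > T_CRIT for _ in range(n_samples))
    return hits / n_samples


rate = rejection_rate()
print(rate)  # should lie near the nominal 0.05 when gain is 0.0
```

A procedure with an inflated significance level, as reported here for Lord's true gain, would show a proportion well above the nominal level under the same conditions.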


APPENDIX A

FORTRAN PROGRAM WHICH PERFORMED THE CALCULATIONS

      DIMENSION X(100,2,2),LORD(100,2),DIFF(100,2)
     1,AMAT(3,3),DIFFX(3),DUM(3)
      DOUBLE PRECISION SEED
      REAL KR21,LORD,MPRG,MPRNG,MPOG,MPONG,MLG,MLNG,MRGG,MRGNG
      READ (5,1) N,SEED,NSAMP,KSAMP,NOPT
    1 FORMAT(I3,F11.0,3I4)
      IF(NOPT.EQ.0) GO TO 3
      READ(5,2) IREL,ISAMP
C     IREL   RESTART RELIABILITY AT IREL FOR ABORTED RUN
C     ISAMP  RESTART SAMPLE NUMBER AT ISAMP FOR ABORTED RUN
    2 FORMAT(2I4)
      GO TO 4
    3 IREL=1
      ISAMP=1
    4 CONTINUE
C     N      SIZE OF SAMPLE
C     GAIN   AVERAGE GAIN IN GAIN GROUP
C     SEED   SEED FOR RANDOM NUMBER GENERATOR
C     NSAMP  NUMBER OF THE LAST SAMPLE FROM THE PREVIOUS RUN
C            INCREMENTED AND PRINTED OUT AS THE SAMPLE NUMBER
C     KSAMP  NUMBER OF SAMPLES TO BE TAKEN AT EACH LEVEL
C            OF RELIABILITY
      TT=0.0
      DO 1000 IR=IREL,5
      READ(5,14) GAIN
   14 FORMAT(F6.4)
      DO 902 J1=ISAMP,KSAMP
      NSAMP=NSAMP+1
C     S      SUM
C     SS     SUM OF SQUARES
C     D      DIFFERENCE SCORE
C     G      GAIN GROUP
C     NG     NO GAIN GROUP
C     PR     PRE SCORE
C     PO     POST SCORE
C     PP     PRE X POST
C     PRDIFF SUM PR X DIFF
      SDG   =0.0
      SDNG  =0.0
      SSDG  =0.0
      SSDNG =0.0
      SPRG  =0.0
      SPRNG =0.0
      SSPRG =0.0

      SSPRNG=0.0
      SPOG  =0.0
      SPONG =0.0
      SSPOG =0.0
      SSPONG=0.0
      SSPPG =0.0
      SSPPNG=0.0
      PRDIFF=0.0
      REL=0.40+ 0.10*IR
C     [THE ARGUMENT OF THE NEXT SQRT IS PARTLY ILLEGIBLE IN THE SOURCE
C     TEXT; THE STATEMENT IS SHOWN AS READ]
      SE1= SQRT(100.0-SE1*SE1)
      SX = SQRT(100.0-SE1*SE1)
      SE2= SQRT((SE1*SE1+3.36)/REL-3.36-SE1*SE1)
      GAIN=TT*SQRT(2.0*(SE1*SE1+SE2*SE2+3.36)/N)
C     X(I,J,K)   I=STUDENT
C                J=1, PRE SCORE
C                 =2, POST SCORE
C                K=1  GAIN
C                 =2  NO GAIN
C     DIFF(I,J)  I=STUDENT
C                J=1, GAIN
C                 =2, NO GAIN
      DO 10 I=1,N
      D1=SX*RAND(SEED)
      D2=SX*RAND(SEED)

      X(I,1,1)=50+D1+SE1*RAND(SEED)
      X(I,2,1)=50+D1+GAIN+1.83*RAND(SEED)+SE2*RAND(SEED)
      X(I,1,2)=50+D2+SE1*RAND(SEED)
      X(I,2,2)=50+D2+1.83*RAND(SEED)+SE2*RAND(SEED)
      DIFF(I,1)=X(I,2,1)-X(I,1,1)
      DIFF(I,2)=X(I,2,2)-X(I,1,2)
      SDG=SDG +DIFF(I,1)
      SDNG=SDNG +DIFF(I,2)
      SSDG=SSDG+DIFF(I,1)*DIFF(I,1)
      SSDNG=SSDNG+DIFF(I,2)
C     [SOURCE TEXT BREAKS OFF HERE; THE STATEMENT ABOVE IS CUT SHORT
C     AND THE REST OF THIS PAGE OF THE LISTING IS MISSING]
     1SPOG*SPOG/N))
      CPPNG= (SSPPNG-SPRNG*SPONG/N)/SQRT((SSPRNG-SPRNG*SPRNG/N)*(
     1SSPONG-SPONG*SPONG/N))
      DBARG= SDG/N
      DBARNG= SDNG/N
      VADG= (SSDG-SDG*SDG/N)/(N-1)
      VADNG=(SSDNG-SDNG*SDNG/N)/(N-1)
      T1= (DBARG-DBARNG)/SQRT(((N-1)*(VADG+VADNG)/(2*N-2))*(2.0/N
     1))
      B1G= (((1.0-REL)*CPPG*SQRT(VAPOG))/SQRT(VAPRG)-REL+CPPG*CPP
     1G)/
     1(1.0-CPPG*CPPG)
      B2G= (REL-CPPG*CPPG-((1.0-REL)*SQRT(VAPRG)*CPPG)/SQRT(VAPOG
     1))/(1.0-CPPG*CPPG)
      B1NG= (((1.0-REL)*CPPNG*SQRT(VAPONG))/SQRT(VAPRNG)-REL+CPPNG
     1*CPPNG)/(1.0-CPPNG*CPPNG)
      B2NG= (REL-CPPNG*CPPNG-((1.0-REL)*SQRT(VAPRNG)*CPPNG)/SQRT(
     1VAPONG))/(1.0-CPPNG*CPPNG)
      SLG  =0.0
      SSLG =0.0
      SLNG =0.0
      SSLNG =0.0
      MPRG=SPRG/N
      MPOG=SPOG/N
      MPRNG=SPRNG/N
      MPONG=SPONG/N

      DO 110 I=1,N
      LORD(I,1)=DBARG+B1G*(X(I,1,1)-MPRG)+B2G*(X(I,2,1)-MPOG)
      LORD(I,2)=DBARNG+B1NG*(X(I,1,2)-MPRNG)+B2NG*(X(I,2,2)-MPON
     1G)
      SLG=SLG+LORD(I,1)
      SLNG=SLNG+LORD(I,2)
      SSLG=SSLG+LORD(I,1)*LORD(I,1)
  110 SSLNG=SSLNG+LORD(I,2)*LORD(I,2)
      MLG=SLG/N
      MLNG=SLNG/N
      VALG= (SSLG-SLG*SLG/N)/(N-1.0)
      VALNG=(SSLNG-SLNG*SLNG/N)/(N-1.0)
      T2=(MLG-MLNG)/SQRT(((SSLG-SLG*SLG/N)+SSLNG-SLNG*SLNG/N)/(N*
     1(N-1)))
      A=(SSPPG+SSPPNG-(SPRG+SPRNG)*(SPOG+SPONG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      B=(SPOG+SPONG)/(2*N)-A*(SPRG+SPRNG)/(2
C     [SOURCE TEXT BREAKS OFF HERE; THE STATEMENT ABOVE IS CUT SHORT
C     AND THE REST OF THIS PAGE OF THE LISTING IS MISSING]
      SRGNG=SRGNG+RGSNG
  210 SSRGNG=SSRGNG+RGSNG*RGSNG
      MRGG=SRGG/N
      MRGNG=SRGNG/N
      VARGG=(SSRGG-SRGG*SRGG/N)/(N-1)
      VARGNG=(SSRGNG-SRGNG*SRGNG/N)/(N-1)
      T3= (MRGG-MRGNG)/SQRT((SSRGG-SRGG*SRGG/N+SSRGNG-SRGNG*
     1SRGNG/N)/(N*(N-1)))
      AMAT(1,1)=2*N
      AMAT(1,2)=N
      AMAT(1,3)=SPRG+SPRNG
      AMAT(2,1)=AMAT(1,2)
      AMAT(2,2)=N
      AMAT(2,3)=SPRG
      AMAT(3,1)=AMAT(1,3)
      AMAT(3,2)=AMAT(2,3)
      AMAT(3,3)=SSPRG
C     [SOURCE TEXT BREAKS OFF HERE; THE STATEMENT ABOVE MAY BE CUT
C     SHORT AND THE REST OF THIS PAGE OF THE LISTING IS MISSING]
  405 DUM(I)=DUM(I)+DIFFX(J)*AMAT(I,J)
  410 YXXXXY=YXXXXY+DUM(I)*DIFFX(I)
      SSE=SSDG+SSDNG-YXXXXY
      VAACOV=SSE/(2.0*N-3)
      T4=DUM(2)/SQRT(VAACOV*AMAT(2,2))
      AA=(PRDIFF-(SPRG+SPRNG)*(SDG+SDNG)/(2*N))/
     1(SSPRG+SSPRNG-(SPRG+SPRNG)*(SPRG+SPRNG)/(2*N))
      AMG=DBARG-AA*(MPRG-(MPRG+MPRNG)/2)
      AMNG=DBARNG-AA*(MPRNG-(MPRG+MPRNG)/2)
      KR21=1.0-(MPRG*(100-MPRG))/(100*VAPRG)
      WRITE (6,501) NSAMP,REL,KR21,T1,T2,T3,T4,MPRG,MPRNG,MPOG,
     1MPONG,MLG,MLNG,MRGG,MRGNG,AMG,AMNG,SEED,VAPRG,VAPRNG,VAPOG,
     2VAPONG,VALG,VALNG,VARGG,VARGNG,VAACOV
  501 FORMAT(I6,1X,2(F3.2,1X),1X,4(F7.3),5(4X,2F6.2)/5X,F23.11,
     116X,4(4X,2F6.1),8X,F6.2)
C     [SOURCE TEXT BREAKS OFF HERE; THE REST OF THIS PAGE OF THE
C     LISTING IS MISSING]
      DOUBLE PRECISION RC
      RC=DMOD(RC*30517578125.,34359738368.)
      X=RC/34359738368.
      Y=SIGN(1.0,X-0.5)
      V=SQRT(-2.0*ALOG(0.5*(1.0-ABS(1.0-2.0*X))))
      RAND=Y*(V-(2.515517+0.802853*V+.010328*
     1V**2)/(1.0+1.432788*V+0.189269*V**2+0.001308*V**3))
      RETURN
      END
      SUBROUTINE INV(A)
C     PROGRAM FOR FINDING THE INVERSE OF A 3X3 MATRIX
      DIMENSION A(3,3),L(3),M(3)
      DATA N/3/
C     SEARCH FOR LARGEST ELEMENT
      DO 80 K=1,N
      L(K)=K
      M(K)=K
      BIGA=A(K,K)
      DO 20 I=K,N
      DO 20 J=K,N
      IF(ABS(BIGA)-ABS(A(I,J))) 10,20,20
   10 BIGA=A(I,J)
      L(K)=I
      M(K)=J
   20 CONTINUE
C     INTERCHANGE ROWS

      J=L(K)
      IF(L(K)-K) 35,35,25
   25 DO 30 I=1,N
      HOLD=-A(K,I)
      A(K,I)=A(J,I)
   30 A(J,I)=HOLD
C     INTERCHANGE COLUMNS
   35 I=M(K)
      IF(M(K)-K) 45,45,37
   37 DO 40 J=1,N
      HOLD=-A(J,K)
      A(J,K)=A(J,I)
   40 A(J,I)=HOLD
C     DIVIDE COLUMN BY MINUS PIVOT
   45 DO 55 I=1,N
   46 IF(I-K) 50,55,50
   50 A(I,K)=A(I,K)/(-A(K,K))
   55 CONTINUE
C     REDUCE MATRIX
      DO 65 I=1,N
      DO 65 J=1,N
   56 IF(I-K) 57,65,57
   57 IF(J-K) 60,65,60
   60 A(I,J)=A(I,K)*A(K,J)+A(I,J)
   65 CONTINUE
C     DIVIDE ROW BY PIVOT

      DO 75 J=1,N
   68 IF(J-K) 70,75,70
   70 A(K,J)=A(K,J)/A(K,K)
   75 CONTINUE
C     CONTINUED PRODUCT OF PIVOTS
C     REPLACE PIVOT BY RECIPROCAL
      A(K,K)=1.0/A(K,K)
   80 CONTINUE
C     FINAL ROW AND COLUMN INTERCHANGE
      K=N
  100 K=(K-1)
      IF(K) 150,150,103
  103 I=L(K)
      IF(I-K) 120,120,105
  105 DO 110 J=1,N
      HOLD=A(J,K)
      A(J,K)=-A(J,I)
  110 A(J,I)=HOLD
  120 J=M(K)
      IF(J-K) 100,100,125
  125 DO 130 I=1,N
      HOLD=A(K,I)
      A(K,I)=-A(J,I)
  130 A(J,I)=HOLD
      GO TO 100
  150 RETURN

      END
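The random normal deviate routine near the end of the listing appears to combine a multiplicative congruential generator (multiplier 5^15, modulus 2^35) with a rational approximation to the inverse normal distribution whose constants match formula 26.2.23 of Abramowitz and Stegun. The sketch below restates that routine in Python as an aid to reading the listing. It is an interpretation, not the author's code: the function names, the starting seed, and the use of exact integer arithmetic (the listing works in double precision floating point) are assumptions.

```python
import math

MULTIPLIER = 30517578125   # 5**15, as in the listing
MODULUS = 34359738368      # 2**35, as in the listing


def next_seed(seed):
    """Advance the congruential generator one step (exact integer arithmetic)."""
    return (seed * MULTIPLIER) % MODULUS


def normal_deviate(u):
    """Map a uniform value 0 < u < 1 to an approximate standard normal deviate."""
    sign = 1.0 if u >= 0.5 else -1.0
    tail = 0.5 * (1.0 - abs(1.0 - 2.0 * u))   # equals min(u, 1 - u)
    v = math.sqrt(-2.0 * math.log(tail))
    num = 2.515517 + 0.802853 * v + 0.010328 * v ** 2
    den = 1.0 + 1.432788 * v + 0.189269 * v ** 2 + 0.001308 * v ** 3
    return sign * (v - num / den)


def rand(seed):
    """Return (new_seed, deviate); the seed must be odd so it never reaches 0."""
    seed = next_seed(seed)
    return seed, normal_deviate(seed / MODULUS)


seed = 1234567   # hypothetical odd starting seed
draws = []
for _ in range(10000):
    seed, value = rand(seed)
    draws.append(value)
mean = sum(draws) / len(draws)
sd = math.sqrt(sum((v - mean) ** 2 for v in draws) / (len(draws) - 1))
print(round(mean, 3), round(sd, 3))   # should be close to 0 and 1
```

Because the multiplier is odd and the modulus is a power of two, an odd seed stays odd, so the uniform value passed to the approximation is never exactly 0.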


APPENDIX B

LISTING OF t's FOR EACH METHOD OF ANALYSIS WITH THE GROUP SIZE 25 AND GAIN 0.0

LISTING OF t's (CONTINUED)

SAMPLE      RAW       LORD'S      REGRESSED   ANALYSIS OF
NUMBER      GAIN      TRUE GAIN   GAIN        COVARIANCE

  42        1.042      1.762       0.975       0.966
  43        1.556     16.696       1.728       1.710
  44        0.511      3.033       0.184       0.183
  45       -1.295     -7.504      -1.783      -1.780
  46        0.669      4.977       1.026       1.026
  47        0.382      1.210       0.640       0.634
  48        0.228      0.216      -0.110      -0.110
  49        0.922      5.810       0.943       0.933
  50       -1.263     -3.021      -1.114      -1.106
  51       -0.276     -0.763      -0.941      -0.933
  52        0.496      1.975       0.757       0.749
  53        1.385      6.511       1.583       1.572
  54       -0.087     -0.187      -0.161      -0.160
  55       -1.594     -2.591      -1.646      -1.630
  56        0.489      3.276      -0.233      -0.239
  57       -1.450     -3.217      -1.298      -1.290
  58        1.146      4.044       0.657       0.657
  59        1.754      7.198       1.535       1.530
  60       -0.538     -8.090      -0.255      -0.253
  61       -0.434     -2.971      -0.083      -0.083
  62        1.155      3.037       1.766       1.749
  63        1.321     20.233       1.620       1.604
  64        0.130      1.233      -0.008      -0.003
  65        0.721      2.349       0.514       0.510
  66       -0.556     -1.149      -0.084      -0.034
  67        0.173      0.332       0.434       0.432
  68       -0.146     -0.465      -0.496      -0.493
  69        1.290      4.543       1.526       1.514
  70       -1.904     -4.395      -1.858      -1.844
  71        0.572      5.132       0.470       0.466
  72        0.923      2.998       1.140       1.128
  73       -0.034     -0.106      -0.004      -0.004
  74        0.123       .546       0.026       0.026
  75       -0.586     -2.404      -0.920      -0.911
  76       -2.622     -8.344      -2.917      -2.895
  77       -0.034     -0.078      -1.368      -1.332
  78       -0.259     -0.303      -0.257      -0.254
  79       -0.597     -2.141       0.577       0.605
  80        0.065      0.197      -0.404      -0.412
  81        0.614      1.834       0.162       0.162
  82        0.729      0.579       1.078       1.073
  83       -0.967     -2.813      -1.069      -1.058
  84       -1.275     -5.166      -1.077      -1.071
  85        1.259      4.430       0.797       0.802

LISTING OF t's (CONTINUED)

BIBLIOGRAPHY

Anastasi, Anne. Psychological Testing. New York: Macmillan, 1961.

Bigge, Morris L. Learning Theories for Teachers. New York: Harper and Row, 1964.

Combs, A. W., Snygg, Donald. Individual Behavior. New York: Harper and Row, 1959.

Cronbach, Lee J., Furby, Lita. "How Should We Measure Change--or Should We?" Psychological Bulletin, 1970, 74: 68-80.

Hays, William L. Statistics for Psychologists. New York: Holt, Rinehart, and Winston, 1963.

Hilgard, Ernest R. Theories of Learning. New York: Appleton, 1956.

Lord, F. M. "The Measurement of Growth," Educational and Psychological Measurement, 1956, 16: 421-437.

Lord, F. M. "Statistical Inferences about True Scores," Psychometrika, 1959, 24: 1-17.

Lord, F. M. "Elementary Models for Measuring Change," in C. W. Harris, Problems in Measuring Change. Madison, Wisconsin: University of Wisconsin Press, 1963, pp. 21-38.

McNemar, Q. "On Growth Measurement," Educational and Psychological Measurement, 1958, 18: 47-55.

Madansky, Albert. "The Fitting of Straight Lines When Both Variables Are Subject to Error," Journal of the American Statistical Association, 1959, 54: 173-205.

Manning, Winton H., DuBois, Philip H. "Correlational Methods in Research on Human Learning," Perceptual and Motor Skills, 1962, 15: 287-321.

Mendenhall, William. Introduction to Linear Models and the Design and Analysis of Experiments. Belmont, California: Wadsworth, 1968.

Meyer, Paul. Introductory Probability and Statistical Applications. Reading, Massachusetts: Addison-Wesley.

Nunnally, Jum C. Psychometric Theory. New York: McGraw-Hill, 1967.

Ohnmacht, Fred W. "Correlates of Change in Academic Achievement," Journal of Educational Measurement, 1968, 5: 41-44.

Rosenthal, Myron R. Numerical Methods in Computer Programming. Homewood, Illinois: Irwin, 1966.

Scheffé, Henry. The Analysis of Variance. New York: Wiley, 1959.

Schick, George B. (ed), May, Merril M. (ed). The Psychology of Reading. Milwaukee: The National Reading Conference, Inc., 15th Yearbook, 1969, Rankin, Earl F., Jr., Dale, Lothar H., pp. 17-26.

Skinner, B. F. The Technology of Teaching. New York: Appleton, 1968.

Snedecor, George W. Statistical Methods. Ames, Iowa: Iowa State College Press, 1955.

Soar, Robert S. "Optimum Teacher-Pupil Interaction for Summer Growth," Educational Leadership Research Supplement, 1968, 26(3): 275-280.

Tate, Merle W. Statistics in Education and Psychology. New York: Macmillan, 1965.

Thorndike, E. L. "The Influence of Chance Imperfections of Measures Upon the Relation of Initial Score to Gain or Loss," Journal of Experimental Psychology, 1924, 7: 225-232.

Tillman, Chester E. Crude Gain vs. True Gain: Correlates of Gain in Reading after Remedial Tutoring. Doctoral dissertation, University of Florida, [year illegible].

Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill, 1962.

BIOGRAPHICAL SKETCH

John Howard Neel was born July 27, 1944, at Waynesburg, Pennsylvania. He graduated from William R. Boone High School, Orlando, Florida, in June, 1962. In August, 1965, he received the degree Bachelor of Arts with a major in mathematics from the University of Florida. He taught algebra and general mathematics at John F. Kennedy Junior High School from September, 1965, until June, 1966. In September, 1966, he enrolled in the College of Education at the University of Florida under a United States Office of Education fellowship program directed by Dr. Wilson H. Guertin. In September, 196[illegible], he accepted a research assistantship under the same program. In June, 1968, he received the degree Master of Arts in Education. He was an instructor in the College of Education at the University of South Florida from September, 1968, until August, 1969, and is currently on leave from that position. In September, 1969, he was appointed Interim Instructor in the College of Education at the University of Florida and he holds that position currently.

John Howard Neel is married to the former Carol Lynn Raraft. They have two daughters, Sarah Elizabeth and Lia Suzanne.

John Howard Neel is a member of the Florida Educational Research Association, the American Educational Research Association, Phi Delta Kappa, and the American Statistical Association.

This dissertation was prepared under the direction of the chairmen of the candidate's supervisory committee and has been approved by all members of that committee. It was submitted to the Dean of the College of Education and to the Graduate Council, and was approved as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August, 1970

Dean, College of Education

Dean, Graduate School

Supervisory Committee:

Chairman

Co-Chairman
