A STUDY OF THE POWER OF MULTIVARIATE ANALYSIS OF
VARIANCE ON STANDARDIZED ACHIEVEMENT TESTING
WHEN ESTIMATORS FOR OMISSIONS UTILIZE MEAN
VALUE AND REGRESSION APPROACHES
By
STEPHEN S. SLEDJESKI
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1976
ACKNOWLEDGEMENTS
My appreciation is extended to the members of my
doctoral committee for their contributions to the development
of this dissertation. They are: Drs. Vynce A. Hines
(Chairman), Ira J. Gordon, Zorin R. PopStojanovic, and
Robert S. Soar.
To Dr. Hattie Bessent, no statement can express her
impact and assistance in attaining my educational goals.
Words can be neither sufficient nor appropriate to express
my esteem.
To Drs. Ann Bromley, Molly Harrower, and Wilson H.
Guertin, I present thanks for direction and assistance in
the understanding of my educational commitment.
To my sisters, Helen Brush and Ann Pendzick, and
their families, I can but state our fortuitous interaction
which has allowed not only educational growth but also
complete dispersion while retaining faith in one another's
existence.
To my mother, Helen Sledjeski, and my late father,
Stephen Sledjeski, I wish to express my deepest appreciation
for their successful development of a family unit filled
with motivation, sincerity, trust, and love. This work is
dedicated to their lives and memory.
TABLE OF CONTENTS
Page
ACKNOWLEDGEMENTS ..................................... ii
LIST OF TABLES ............. ...... .... ....... .......... v
ABSTRACT ............................................. vi
Chapter
I. INTRODUCTION ............ ........... ......... 1
Nature of the Study ........... ............ 1
The Problem and the Hypotheses ............. 4
Significance of the Study .................. 5
II. REVIEW OF RELATED LITERATURE ................. 7
Introduction .............................. 7
Historical Overview ........................ 7
Problems of Missing Multiresponse
Observations in Education ................ 13
Direction of Present Research ............. 14
III. DESIGN OF THE STUDY ........... ............... 15
Procedures .. ....... ... ................... 15
Method .. ......... .......... ........... 17
IV. RESULTS ................... ....... 20
Comparison of the Mean Value and the
Regression Estimated Data Sets with
One Another and with the Complete
Data Set at the 2 Percent Level of
Missing Subsamples ................ ....... 22
Comparison of the Mean Value and the
Regression Estimated Data Sets with
One Another and with the Complete
Data Set at the 5 Percent Level of
Missing Subsamples ....................... 24
Comparison of the Mean Value and the
Regression Estimated Data Sets with
One Another and with the Complete
Data Set at the 10 Percent Level of
Missing Subsamples ....................... 26
Comparison of the Mean Value and the
Regression Estimated Data Sets with
One Another and with the Complete
Data Set at the 15 Percent Level of
Missing Subsamples ................ ....... 28
Comparison of the Mean Value and the
Regression Estimated Data Sets with
One Another and with the Complete
Data Set at the 20 Percent Level of
Missing Subsamples ....................... 30
Further Results ............. .. ........ . 32
Summary ............... .......... .......... 34
V. DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS .. 36
Discussion ................................. 36
Conclusions ........... ......... ........... 37
Recommendations ........... ...... .......... 39
REFERENCES ............................................. 41
BIOGRAPHICAL SKETCH ................... ...... ......... 45
LIST OF TABLES
Table Page
1 F-ratios and Complements (P) of the Cumulative
Distribution Function for Fourth- and Fifth-Grade
Samples Having Mean Value and Regression
Estimated Subsamples Consisting of 2
Percent of the Complete Samples ............. 23
2 F-ratios and Complements (P) of the Cumulative
Distribution Function for Fourth- and Fifth-Grade
Samples Having Mean Value and Regression
Estimated Subsamples Consisting of 5
Percent of the Complete Samples ............. 25
3 F-ratios and Complements (P) of the Cumulative
Distribution Function for Fourth- and Fifth-Grade
Samples Having Mean Value and Regression
Estimated Subsamples Consisting of 10
Percent of the Complete Samples ............. 27
4 F-ratios and Complements (P) of the Cumulative
Distribution Function for Fourth- and Fifth-Grade
Samples Having Mean Value and Regression
Estimated Subsamples Consisting of 15
Percent of the Complete Samples ............. 29
5 F-ratios and Complements (P) of the Cumulative
Distribution Function for Fourth- and Fifth-Grade
Samples Having Mean Value and Regression
Estimated Subsamples Consisting of 20
Percent of the Complete Samples ............. 31
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of
Doctor of Philosophy
A STUDY OF THE POWER OF MULTIVARIATE ANALYSIS OF
VARIANCE ON STANDARDIZED ACHIEVEMENT TESTING
WHEN ESTIMATORS FOR OMISSIONS UTILIZE MEAN
VALUE AND REGRESSION APPROACHES
By
Stephen S. Sledjeski
March, 1976
Chairman: Dr. Vynce A. Hines
Major Department: Foundations of Education
The efficacy of utilizing estimators for omissions
in a multiresponse achievement data set which is analyzed
using multivariate analysis of variance (MANOVA) techniques
is the concern of this study. The estimates were determined
employing mean value and regression methods.
Random samples of fourth- and fifth-grade students
were administered the Stanford Achievement Test, Intermediate
Level I and Intermediate Level II, respectively, in the spring
of 1974. Each sample had an n of 193, with two fixed
groups as the independent variables and the achievement
subscores as the dependent variables.
These two samples comprised the complete data sets
from which random subsamples of missing data were removed
from among the dependent variables. The missing subsample
consisted of 2, 5, 10, 15, and 20 percent of the complete
samples, each percent level being investigated five times
for each of the two methods of estimation.
The MANOVA results of the data sets with mean value
and regression estimates were compared to one another and
to the complete data set. The null hypotheses tested were:
There is no difference in MANOVA results for the
complete data set and the mean value estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
There is no difference in MANOVA results for the
complete data set and the regression estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
There is no difference in MANOVA results for the
mean value estimated data set and the regression
estimated data set both with the size of the
missing subsample ranging from 2 to 20 percent
of the complete data set.
The hypotheses were analyzed by comparing the complement
of the cumulative distribution function derived from the
F-ratio of each MANOVA of the complete data set to that of
the estimated data sets. No significant differences were
found for the three hypotheses. Inspection of the results
demonstrated that the regression estimates provide MANOVA
results apparently closer to that of the complete data set
than did mean value estimates.
The research concluded that, within the confines of
this study, one cannot reject the use of mean value and
regression estimates for data sets with missing values which
are to be analyzed using MANOVA.
CHAPTER I
INTRODUCTION
With the increased emphasis on multivariate analysis,
the experimenter has been confronted with multiresponse data
where measurements on all responses are not available for
every experimental unit. Since the time, resources, and
money involved in gathering multiple observations on experimental
subjects are greater than for gathering single
observations, multivariate analysis of variance (MANOVA)
must give attention to missing data. It is the purpose of
this study to consider missing observations in MANOVA
utilizing mean value and regression estimators on a set of
achievement data with subsets of randomly chosen missing
data ranging in size from 2 to 20 percent of the complete
data set. The power of MANOVA results will then be
determined.
Nature of the Study
Missing data estimation has been of interest to
educational and statistical researchers for several decades.
Estimation of uniresponse data has been conducted for various
experimental designs. Baird and Kramer (1960) investigated
the balanced incomplete block design. They developed
formulas through minimization of the error sum of squares
for the special case where missing values are within the
same block or treatment. Their method facilitates calculations
but does nothing to restore missing information.
Kramer and Glass (1960) examined the Latin square
design. In the same manner as Baird and Kramer, they
developed formulas through minimizing of the error sums of
squares for several missing values to restore the balance
of the design. The formulas are for the specific cases
described and not for the completely general case.
Preece (1972) studied the two-way classification
design. He developed a method of estimating block and
treatment parameters from the nonmissing data plus the
estimated data.
Mitra (1959) considered the effect of missing value
estimates on the F-test in analysis of variance (ANOVA).
He demonstrated that the numerator in F (the treatment mean
square) and the denominator (the error mean square) cannot
have the same expected value when missing observations exist.
An examination of various missing data procedures
was performed by Wilkinson (1960). He put forth a method
of solving for estimates through simultaneous equations and
compared it to an iterative least squares method and a
covariance method. His method is preferred since it
requires fewer steps and gives the correct residual sums of
squares directly.
Studies investigating multiresponse data estimators
have been less numerous. The works of Kleinbaum (1970),
Srivastava (1967), and Trawinski (1961) are some examples of
early endeavors in multiresponse data. Kleinbaum looked at
the effect of estimation upon hypothesis testing of generalized
multivariate linear models. In concurrence with Mitra,
who investigated the uniresponse situation, he demonstrated
that hypotheses are rejected with bias when utilizing
estimators for missing values.
Srivastava extended the Gauss-Markov theorem to
multivariate linear models.
Trawinski showed that it is not necessary to collect
data on each characteristic of interest for each experimental
unit. She brought out the important fact that in many situations
one needs to have experiments where observations on
some of the responses are missing not by accident, but by
design.
The relevance and importance of missing observations
were demonstrated by Srivastava and McDonald (1969, 1971).
They established, under realistic conditions, the preference
for the hierarchical incomplete models within the groups of
general incomplete multiresponse models.
Dempster (1971) provided an overview of the problems
involved. He surveyed a cross section of the developing
topics in multivariate analysis of data concentrating on
problems of pragmatic data analysis and not on technical
and mathematical detail.
The Problem and the Hypotheses
The present investigation will attempt to determine
the efficacy of two types of estimates of missing data in
MANOVA. One type of estimate will be the mean value of the
variable for a particular treatment; the other, the regression
of one of the MANOVA dependent variables on the remaining
dependent variables which then act as independent
variables. The results of these MANOVAs will be compared
to MANOVA results of nonmissing data. The hypotheses to
be investigated are:
H1: There is no difference in MANOVA results for the
complete data set and the mean value estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
H2: There is no difference in MANOVA results for the
complete data set and the regression estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
H3: There is no difference in MANOVA results for the
mean value estimated data set and the regression
estimated data set both with the size of the
missing subsample ranging from 2 to 20 percent
of the complete data set.
For each hypothesis, missing subsamples will be randomly
chosen which will comprise 2, 5, 10, 15, and 20 percent
of the original complete sample. Each subsample percent
level will be investigated five times. Estimated values
will then be substituted and be subjected to MANOVA.
F-values from the MANOVA results will be compared
using the cumulative distribution function to determine the
power of the analyses.
Data used in the analysis will consist of achievement
scores as determined on the Stanford Achievement Test collected
in the spring of 1974. Two samples will be investigated:
a fourth-grade sample of 193 students who were
administered the Intermediate I Battery (eight variables)
and a fifth-grade sample of 193 students who were administered
the Intermediate II Battery (seven variables). The
students in each sample were chosen at random from each of
two fixed groups, an experimental group and a control group.
For each MANOVA, the independent variables will be the two
fixed groups.
Significance of the Study
The two types of estimators to be investigated
differ from one another in an important sense. The mean
value estimator considers all nonmissing values of a particular
dependent variable for a specific treatment whereas
the regression estimators consider only those experimental
units with complete data. One approach attempts to utilize
all possible data elements, and the other forms an estimate
based on less information.
Combining the two approaches with varying sizes of
missing subsamples will provide a thorough
look at omissions in multiresponse data taken from an educational
setting. It is hoped that insights will be
developed for future analysis of similar educational data.
This chapter has presented the problem to be investigated
and the nature, significance, and hypotheses of the
study. Chapter II contains a review of literature related
to the problem of the study. The design and procedures
are stated in Chapter III; the results of the study are in
Chapter IV; and the discussion, conclusions, and recommendations
are given in Chapter V.
CHAPTER II
REVIEW OF RELATED LITERATURE
Introduction
Missing data have posed a problem in data analysis
for more than four decades. The initial investigations
involving incomplete data sets concerned univariate statistical
analysis. With the developments in computational
technology in the past quarter century, multivariate data
analysis has become feasible (Dempster, 1971) as has the
investigation of missing data in multivariate analysis.
The initial focus of researchers concerned the
techniques involved in the estimation of parameters when
there existed missing observations in the data set. It was
a question of developing the parameters and then adjusting
these parameters considering the missing data. The direction
taken in the review of the literature which follows is
first, the estimation of the missing observations and
second, the formulation of the parameters required for
analysis.
Historical Overview
The first researcher to develop analysis procedures
by first estimating values for the missing observations was
Wilks (1932). He examined the incomplete bivariate case of
a bivariate normal distribution using sample means for the
missing observations. He found that the optimum method of
determining the variance between the two variables was the
correlation between the two variables which included only
those pairs that were complete.
Wilks' example of a sample of statistical data from
a multivariate population has been popularized in many
related papers. Srivastava and Zaatar (1972) summarized
Wilks' example as:
[T]he situation when the experimental units are
skulls that have been dug out from a certain
graveyard. Since these skulls may be partly
mutilated, the choice as to which characteristics
should be measured on a particular unit is not
entirely in the hand of the investigator. (One
may suggest that in such a situation, we should
restrict ourselves to those skulls on which all
measurements of interest can be obtained. However,
clearly this would in general not be very
proper unless there were a rather large number
of skulls free from any mutilation.) p. 117
Little more was published on incomplete multivariate
data sets until the 1950s when papers began to appear extending
the work of Wilks. Matthai (1951) developed a method to
determine the correlation between two variates with missing
data using the total available data set. He formulated a
solution for the trivariate case using the correlation
estimates. His estimates, he concluded, were inconsistent.
For example, correlation coefficients could exceed unity.
Federspiel et al. (1959) and Glasser (1964)
generalized this situation. They investigated the
correlation matrix of a general number of variates based
on all available paired data. They studied intuitive
approaches for estimating linear regression coefficients
when an unspecified number and pattern of missing values
exist among the independent variables. They showed that the
efficacy of the approaches depends upon the correlations
among the independent variables as well as the proportion
of observations which are missing.
Lord (1955) demonstrated the solutions for the
trivariate case when the dependent variable is recorded
for all experimental units in the sample. Either of the
two independent variables is recorded for all experimental
units, but not both. He showed that, in this instance,
means and regression coefficients can be estimated
accurately.
The trivariate case was studied by Edgett (1956) in
the opposite sense of Lord. He gave attention to the instance
when the dependent variable has missing values and
the two independent variates were complete. Nicholson
(1957) extended Edgett's work to any number of independent
variables. Edgett and Nicholson demonstrated that a maximum
likelihood function for a plausible probability
distribution could provide as good population parameter
estimates as could least squares estimates.
A mode of estimation different from Wilks' method
was provided by Dear (1959). He substituted for each
missing observation of an independent variate the sum of
the values of all observed independent variables divided
by the total number of observations on all observed
independent variables. This corresponds roughly
to the grand mean of all the independent variables. It
is clear that serious difficulties would be incurred when
the independent variables are measured on different scales.
Walsh (1959) and Buck (1960) considered omission
estimates in respect to paired simple linear regression.
Walsh studied the utilization of all data available for a
pair of variables in the simple linear regression computation.
Those experimental units for which no data were
missing were looked at by Buck in the paired regression
analysis. Both Walsh and Buck determined that the average
of values obtained from the simple linear regression provided
suitable estimates for missing responses.
Anderson (1957) investigated a particular pattern
of missing observations called a monotone sample. This is
a sample in which the observations on each variate are a subset
of those on another variate, i.e., each variate is nested within
another variate. He set forth a method of estimation very
similar to Edgett's although greatly simplified in the
amount of necessary mathematical manipulation. Several
writers (Bhargava, 1962; Afifi and Elashoff, 1966, 1967)
have gone beyond the monotone trivariate case of Anderson
and determined solutions for the general variate case.
In addition, Bhargava developed the likelihood ratio tests
for hypotheses dealing with the linear model and equality of
covariance matrices with multivariate monotone samples.
Trawinski and Bargmann (1964) examined a considerably
more complicated pattern of missing data than Anderson (1957),
Bhargava (1962), and Afifi and Elashoff (1966, 1967). The
concern of Trawinski and Bargmann was with observations that
were missing not by accident, but by design. They found that
correlation coefficients were logically consistent estimates
to use with incomplete multivariate data.
Whereas earlier work treated data as missing either by accident or
by design, Hocking and Smith (1968) assumed neither in developing their
analytic procedures. They formulated a procedure to compute
maximum likelihood estimates for parameters but only in the
case of large samples.
Anderson, Trawinski and Bargmann, and Hocking and
Smith used estimates of groups of data. They did not estimate
specific missing observations.
The design of experiments which involve multiresponses
and omissions was considered by Srivastava (1968). He pointed
out that an experimenter must give attention to whether or not
each response on each experimental unit is to be measured. He
provided a discussion of what he called the lack of need of a
regular design. (A regular design is one where all responses
are sought on all experimental units.) Before data collection,
a researcher should set up his design such that the only data
collected will be somewhat convenient or useful.
Haitovsky (1968) compared the methods of Buck and
Walsh. He carried out a simulated data analysis, first
using only complete data, discarding incomplete experimental
units and second, using all available observations
to estimate correlations. He found the former procedure
superior. This is the case when the number of missing
entries is not high.
A comparison of a complete data set and an incomplete
data set which is a subset of the complete set was
conducted by Morrison (1971). He determined that when the
correlations between the complete and incomplete variates
of the data set are small, the multivariate missing value
estimates are less accurate in the estimation of the mean
square error term than the multivariate data set with no
estimates.
An extension of the work of Walsh and Buck was
conducted by Dagenais (1971). He developed a more generalized
method which not only corrects for data omissions but
also provides for additional corrections during data analysis.
His estimates are consistent when the independent variable is
fixed; each observation contains a value for the dependent
variable and at least one of the independent variables; and
some observations are complete.
Srivastava and Zaatar (1972) dealt with the problem
of classifying a future multiresponse observation into one
of two populations given two incomplete multiresponse
samples, one from each population. They developed a rule for
the classification given the fact that the observation did
come from one of the populations.
Investigations of entire sections of missing data
were performed by Hartwell and Gaylor (1973) and Rubin (1974).
The former examined missing cells employing the method of
unweighted means. They provided a method of cell estimation
using estimated variances. Rubin looked at complete blocks
of missing data by decomposing the original estimation problem
into smaller estimation problems using a technique he denotes
as "factorization." This consists of discovering those
subject responses that are complete and using these response
patterns to estimate missing observations of subjects with a
similar response pattern.
Problems of Missing Multiresponse
Observations in Education
In a paper which is an overview of multivariate data
in education, Pruzek (1971) brought both the educational community
and other areas of research face to face with the
problem of incomplete multiresponse data sets and their
investigation employing multivariate analysis of variance
(MANOVA). He outlined two procedures regarding the phenomenon
of missing data in MANOVA applications. The first is the
situation where several scattered responses are missing for
each dependent variable, and the second is where whole vectors
of responses are missing. No proven method of estimation
for omissions was provided.
Raffeld (1973) and Lord (1974) considered missing item
responses and their estimates. Lord examined ability and item
parameters. His emphasis was on the inappropriateness of
scoring an item as incorrect if it were omitted by the subject.
He used probability methods to estimate the omitted
data from a minimum of two or three thousand other subjects.
Raffeld pursued estimates of items on standardized achievement
tests using mean value estimates. He concluded that
for omitted items on a standardized achievement test it is
better to assign a value which is the mean of the alternatives
for that item rather than assigning the mean response for the
group omitting the item. Neither Lord nor Raffeld concerned
himself with subscore estimates.
Direction of Present Research
The above review was concerned either with estimates
of missing data and their parameters or estimates of missing
data without concern for analysis. The intention of this
study is to forego parametric concerns, apply simple methods
of data estimation, analyze the estimated data sets, examine
the results of the analysis, and provide results directly
related to educational research. It will use a frequently
employed educational measurement, the achievement test with
several subscores, and investigate estimation methods understood
by most researchers and students of research.
CHAPTER III
DESIGN OF THE STUDY
The research conducted in this study focused on the
usefulness of the inclusion of multiresponse data, which
consists of several subscores, in a multivariate analysis
of variance as dependent variables when random missing subscores
were estimated using mean value and regression
techniques. The analyses of the data sets formed by the
two methods of estimation were compared to each other and
to the analysis of the complete data set.
The underlying focus of the research concerned the
efficacy of the above method when applied to educationally
related data. Thus the data sets investigated consisted of
achievement scores collected on elementary school students.
Procedures
Two random samples were drawn from two fixed groups.
The first sample consisted of 193 fourth-grade students and
the second of an equal number of fifth-grade students. Both
were administered the Stanford Achievement Test Battery in
the spring of 1974. The fourth-grade sample was given the
Intermediate I Battery and the fifth-grade sample the
Intermediate II Battery providing raw scores for analysis.
In preparing the data for analysis, random subsamples
were drawn comprising 2, 5, 10, 15, and 20 percent of each
of the two original complete data sets. The number of
subjects in each of these subsamples was 5, 10, 20, 29,
and 39, respectively. The subjects in these subsamples
were considered as having missing data. One achievement
subscore was randomly discarded for each subject in each of
the missing subsamples. This procedure was conducted five
times for each of the five percent levels, obtaining five
different random subsamples.
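The deletion step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure actually programmed for the study: the function name `discard_random_subscores`, the fixed seed, and the randomly generated stand-in scores are all hypothetical, and the subsample size is passed in directly (for example, 20 subjects for the 10 percent level).

```python
import random

def discard_random_subscores(scores, n_missing, rng):
    """Return a copy of `scores` in which `n_missing` randomly chosen
    subjects (rows) each have one randomly chosen subscore set to None."""
    incomplete = [list(row) for row in scores]
    # Choose the subjects who will make up the missing subsample.
    chosen = rng.sample(range(len(incomplete)), n_missing)
    for i in chosen:
        # Discard exactly one subscore for each chosen subject.
        j = rng.randrange(len(incomplete[i]))
        incomplete[i][j] = None
    return incomplete

rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
# Hypothetical stand-in for a 193 x 8 matrix of raw subscores.
complete = [[rng.randint(10, 60) for _ in range(8)] for _ in range(193)]
# 20 subjects corresponds to the 10 percent level described above.
incomplete = discard_random_subscores(complete, 20, rng)
```

Repeating the call five times with different random draws would give the five random subsamples used at each percent level.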
Utilizing the subjects without randomly chosen
missing subscores, means on each achievement test variable
were formed. These means were substituted for the randomly
discarded subscore for each subject in each of the missing
subsamples.
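A minimal sketch of mean value substitution follows. It assumes, following the description in Chapter I, that the mean is taken over the nonmissing values of a variable within a subject's own treatment group; if the means were instead formed only from subjects with no missing subscore, the `observed` list would be restricted accordingly. The function name `mean_value_impute` and the small data set are hypothetical.

```python
from statistics import mean

def mean_value_impute(data, groups):
    """Replace each missing subscore (None) with the mean of the observed
    values of that variable within the subject's own group."""
    filled = [list(row) for row in data]
    n_vars = len(data[0])
    for g in set(groups):
        members = [i for i, gi in enumerate(groups) if gi == g]
        for j in range(n_vars):
            missing = [i for i in members if data[i][j] is None]
            if not missing:
                continue  # nothing to estimate for this variable and group
            observed = [data[i][j] for i in members if data[i][j] is not None]
            group_mean = mean(observed)
            for i in missing:
                filled[i][j] = group_mean
    return filled

# Hypothetical two-variable example with an experimental (E) and control (C) group.
data = [[10, 20], [12, None], [14, 26], [30, 40], [None, 44], [34, 48]]
groups = ["E", "E", "E", "C", "C", "C"]
filled = mean_value_impute(data, groups)
```

In this example the missing second subscore in group E is replaced by the mean of 20 and 26, and the missing first subscore in group C by the mean of 30 and 34.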
Likewise, the subjects without randomly chosen
missing subscores were subjected to multiple linear regression
analysis. One achievement test subscore was randomly
chosen as the dependent variable, and the remaining subscores
were the independent variables. The nondiscarded
subscores of each of the subjects with a missing subscore
were substituted in the corresponding resulting regression
equation. The value obtained from the regression equation
was substituted for the randomly discarded subscores.
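The regression substitution can be sketched as follows. This is an illustration under assumptions rather than the program used in the study: the regression is fitted by ordinary least squares on the complete cases through the normal equations, the names `fit_linear_regression` and `regression_impute` are hypothetical, and the small data set is constructed so that the third variable is an exact linear function of the first two. In practice a numerical library would normally be used in place of the hand-written solver.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented normal equations: (A^T A | A^T y).
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]

def regression_impute(data):
    """Fill each single missing subscore (None) from a regression of that
    subscore on the remaining subscores, fitted on the complete cases."""
    complete = [row for row in data if None not in row]
    filled = [list(row) for row in data]
    for row in filled:
        if None in row:
            j = row.index(None)
            X = [[v for k, v in enumerate(c) if k != j] for c in complete]
            y = [c[j] for c in complete]
            b = fit_linear_regression(X, y)
            predictors = [v for k, v in enumerate(row) if k != j]
            row[j] = b[0] + sum(bi * v for bi, v in zip(b[1:], predictors))
    return filled

# Hypothetical data in which the third score equals 2 + 3*x1 - x2 exactly.
data = [[1.0, 2.0, 3.0], [2.0, 1.0, 7.0], [3.0, 5.0, 6.0],
        [4.0, 2.0, 12.0], [0.0, 0.0, 2.0], [2.0, 3.0, None]]
filled = regression_impute(data)
```

Because the example relationship is exact, the estimated value for the last subject should reproduce 2 + 3(2) - 3 = 5 up to rounding error.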
Method
In testing the hypotheses, multivariate analysis of
variance (MANOVA) was conducted on each of the 100 adjusted
samples with missing data and on the complete original sample
with no missing data. The two fixed groups were the independent
variables, and the achievement test subscores were
the dependent variables in each case. The MANOVA results
of the mean value estimates and the multiple linear regression
estimates were compared to the MANOVA results of the
complete original sample and to each other.
The comparisons of the resulting F-ratios were
determined by the evaluation of the complement of the
cumulative distribution function of the variance ratio
distribution. The method consists of the following series
expansion. Let n and m be the first and second number of
degrees of freedom, respectively, and let
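$$a = \tan^{-1}\sqrt{nF/m}$$
where F is the F-ratio value. Then if n is even, the complement
P is defined as
$$P(n,m,F) = \cos^{m} a \left[ 1 + \frac{m}{2}\sin^{2} a + \frac{m(m+2)}{(2)(4)}\sin^{4} a + \cdots + \frac{m(m+2)\cdots(m+n-4)}{(2)(4)\cdots(n-2)}\sin^{n-2} a \right].$$
If m is even,
$$P(n,m,F) = 1 - \sin^{n} a \left[ 1 + \frac{n}{2}\cos^{2} a + \frac{n(n+2)}{(2)(4)}\cos^{4} a + \cdots + \frac{n(n+2)\cdots(n+m-4)}{(2)(4)\cdots(m-2)}\cos^{m-2} a \right].$$
If n and m are both odd,
$$P(n,m,F) = \frac{2}{\pi}\,\frac{(2)(4)\cdots(m-1)}{(3)(5)\cdots(m-2)}\cos^{m} a \sin a \left[ 1 + \frac{m+1}{3}\sin^{2} a + \frac{(m+1)(m+3)}{(3)(5)}\sin^{4} a + \cdots + \frac{(m+1)(m+3)\cdots(m+n-4)}{(3)(5)\cdots(n-2)}\sin^{n-3} a \right]$$
$$\qquad - \frac{2}{\pi}\sin a \cos a \left[ 1 + \frac{2}{3}\cos^{2} a + \frac{(2)(4)}{(3)(5)}\cos^{4} a + \cdots + \frac{(2)(4)\cdots(m-3)}{(3)(5)\cdots(m-2)}\cos^{m-3} a \right] - \frac{2a}{\pi} + 1$$
where, if n = 1, the first series is to be taken as zero, and
if m = 1, the second series is to be taken as zero and the
factor $\frac{(2)(4)\cdots(m-1)}{(3)(5)\cdots(m-2)}$ is to be taken as unity (Hopper, 1970).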
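The series expansion above can be sketched in code as follows. This is an illustrative implementation of the standard finite series for the upper tail of the variance ratio distribution, written from the three cases just described; it is not the routine cited from Hopper (1970), and the names `f_complement` and `_partial_sum` are hypothetical.

```python
import math

def _partial_sum(num0, den0, x, n_terms):
    """1 + (num0/den0) x + (num0 (num0+2))/(den0 (den0+2)) x^2 + ...,
    truncated to n_terms terms (zero terms gives 0)."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= (num0 + 2 * k) / (den0 + 2 * k) * x
    return total

def f_complement(n, m, F):
    """Complement P(n, m, F) of the cumulative distribution function of the
    F distribution with n and m degrees of freedom."""
    a = math.atan(math.sqrt(n * F / m))
    sin_a, cos_a = math.sin(a), math.cos(a)
    s2, c2 = sin_a ** 2, cos_a ** 2
    if n % 2 == 0:
        # Series in sin^2 a, ending at the power n - 2.
        return cos_a ** m * _partial_sum(m, 2, s2, n // 2)
    if m % 2 == 0:
        # Series in cos^2 a, ending at the power m - 2.
        return 1.0 - sin_a ** n * _partial_sum(n, 2, c2, m // 2)
    # Both odd: factor (2)(4)...(m-1) / (3)(5)...(m-2), unity when m = 1.
    factor = 1.0
    for j in range(1, (m - 1) // 2 + 1):
        factor *= (2 * j) / (2 * j - 1)
    # First series has (n-1)/2 terms, so it vanishes when n = 1.
    first = factor * cos_a ** m * sin_a * _partial_sum(m + 1, 3, s2, (n - 1) // 2)
    # Second series has (m-1)/2 terms, so it vanishes when m = 1.
    second = sin_a * cos_a * _partial_sum(2, 3, c2, (m - 1) // 2)
    return (2.0 / math.pi) * (first - second - a) + 1.0
```

Two simple checks follow from the formulas themselves: for n = 2 the even-n series collapses to cos^m a, which equals (1 + 2F/m)^(-m/2), and for n = m the value at F = 1 is one half by symmetry.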
If the complement of the complete data set is greater
than 0.05 and the complement of a data set with an estimated
missing subsample is less than or equal to 0.05, then the
MANOVA results are considered significantly different from
one another. Likewise, if the complement of the complete data
set is less than or equal to 0.05 and the complement of a data
set with an estimated missing subsample is greater than 0.05,
then the MANOVA results are considered significantly different
from one another. If both results are either greater than
0.05 or less than or equal to 0.05, then the MANOVA results
are not considered significantly different from one another.
This method is contingent upon the level of significance
chosen and relies on the fact that the point of significance
is immutable.
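The decision rule just stated reduces to asking whether the two complements fall on opposite sides of the chosen significance level. A minimal sketch, with the hypothetical function name `manova_results_differ` and illustrative P values:

```python
def manova_results_differ(p_complete, p_estimated, alpha=0.05):
    """True when one complement is at or below alpha and the other is above it,
    i.e. when the two MANOVA results lead to different significance decisions."""
    return (p_complete <= alpha) != (p_estimated <= alpha)

# Illustrative values only: both significant, so the results do not differ.
same_decision = manova_results_differ(0.0047, 0.0123)
# One significant and one not, so the results are considered different.
different_decision = manova_results_differ(0.0047, 0.0600)
```

Because the rule depends only on which side of the cutoff each value falls, two complements that differ widely in magnitude can still be classed as not different.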
CHAPTER IV
RESULTS
It has been the experience of the researcher that
when conducting data analysis on achievement tests, he
obtains a list of scores which contains missing subscores.
The data on experimental units with missing subscores must
then be discarded, which results in a loss of information.
The present study questioned the applicability of
using estimates for multiresponse data in multivariate
analysis of variance (MANOVA) when one response of an experimental
unit is missing. Both mean value and regression
estimates were employed for missing data in the manner
reported in Chapter III.
There were three specific questions investigated
in this study: Do mean value estimates provide different
MANOVA results from those obtained when analyzing the complete
data set? Do regression estimates provide different MANOVA
results from those obtained when analyzing the complete data
set? and thus, Do mean value estimates provide different
MANOVA results from regression estimates? Each of these
inquiries was looked at for varying percent levels of missing
data (2, 5, 10, 15, and 20 percent of the total sample).
The five different levels were employed on five different
random subsamples of missing data. This was performed on
two different data sets of fourth and fifthgrade elemen
tary school students for the two types of estimates. This
resulted in 5 x 5 x 2 x 2 random incomplete samples, or a
total of 100 incomplete samples, that were studied and
compared to the two complete data sets of fourth and fifth
grade students.
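The count of 100 incomplete samples follows from crossing the four factors just named. A minimal sketch of that enumeration, with variable names chosen here for illustration:

```python
from itertools import product

percent_levels = (2, 5, 10, 15, 20)        # percent of the sample made missing
random_subsamples = (1, 2, 3, 4, 5)        # five random subsamples per level
grades = ("fourth", "fifth")               # two complete data sets
estimators = ("mean value", "regression")  # two types of estimate

# Every combination corresponds to one incomplete sample that was analyzed.
incomplete_samples = list(
    product(percent_levels, random_subsamples, grades, estimators)
)
print(len(incomplete_samples))  # 100
```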
The presentation of results in this chapter is
according to each of the five percent levels of missing
data for the three aforementioned questions. These three
questions represent the three hypotheses which are stated
as follows:
H1: There is no difference in MANOVA results for the
complete data set and the mean value estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
H2: There is no difference in MANOVA results for the
complete data set and the regression estimated
data set with the size of the missing subsample
ranging from 2 to 20 percent of the complete
data set.
H3: There is no difference in MANOVA results for the
mean value estimated data set and the regression
estimated data set both with the size of the
missing subsample ranging from 2 to 20 percent
of the complete data set.
The MANOVA F-ratios and the corresponding complements of the
cumulative distribution function of the variance ratio
distribution are provided in response to these hypotheses.
MANOVA performed on the complete data set of fourth
graders resulted in an F = 2.8851 with 8 and 185 df (degrees
of freedom); for the fifth graders, the result was
F = 3.3229 with 7 and 185 df. Determining the complement
of the cumulative distribution function, the P value
obtained for the fourth-grade data set was 0.004745 and
that for the fifth-grade data set was 0.002341.
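The complement of the cumulative distribution function is the upper-tail probability of the F distribution. As a rough check, the sketch below evaluates that tail with a closed-form finite sum that is valid only when the numerator degrees of freedom are even, so it covers the fourth-grade case (8 and 185 df) but not the fifth-grade case (7 df). The function name is hypothetical, and this is not the routine used in the study.

```python
def f_upper_tail(f_value, df_num, df_den):
    """P(F > f_value) for an F distribution with (df_num, df_den) df.

    Uses the identity P(F > f) = I_x(df_den/2, df_num/2) with
    x = df_den / (df_den + df_num * f), and the finite-sum form of the
    regularized incomplete beta function for an integer second argument.
    Restricted to even df_num.
    """
    if df_num % 2 != 0:
        raise ValueError("this closed form needs an even numerator df")
    a = df_den / 2.0
    b = df_num // 2
    x = df_den / (df_den + df_num * f_value)
    term = 1.0    # j = 0 term of the sum
    total = 1.0
    for j in range(1, b):
        # term_j = term_(j-1) * (a + j - 1) / j * (1 - x)
        term *= (a + j - 1) / j * (1.0 - x)
        total += term
    return x ** a * total


if __name__ == "__main__":
    # Fourth-grade complete data set: F = 2.8851 with 8 and 185 df.
    # The result is approximately 0.0047, in line with the reported P.
    print(f_upper_tail(2.8851, 8, 185))
```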
Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 2
Percent Level of Missing Subsamples
The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-
grade mean value and regression estimated data sets at the
2 percent level are presented in Table 1. For the fourth-
grade sample, no F-ratio of the mean value estimated data
sets differed from the complete data set's F-ratio by more
than 0.1267. Likewise, for the regression estimated data
sets, no F-ratio differed from the complete data set's
F-ratio by more than 0.0675. Equivalent ranges for the
fifth-grade sample were 0.0329 and 0.0397, respectively.
Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the
mean value estimated data sets differed from the complete
data set's complement by a value greater than 0.001388.
Likewise, for the regression estimated data sets, no
complement differed from the complete data set's complement by
a value greater than 0.000798. Equivalent ranges for the
fifth-grade sample were 0.000196 and 0.000245, respectively.
TABLE 1. F-ratios and Complements (P) of the Cumulative Distribution
Function for Fourth- and Fifth-Grade Samples Having Mean
Value and Regression Estimated Subsamples Consisting of 2
Percent of the Complete Samples
                     Grade Four                 Grade Five
               Mean Value   Regression    Mean Value   Regression
Sample 1   F    2.9708       2.9228        3.3265       3.2832
           P    0.003756     0.004282      0.002323     0.002589
Sample 2   F    2.8974       2.9338        3.3126       3.2907
           P    0.004589     0.004155      0.002406     0.002541
Sample 3   F    2.8796       2.9096        3.3462       3.2865
           P    0.004817     0.004440      0.002212     0.002568
Sample 4   F    3.0118       2.9526        3.2983       3.2852
           P    0.003357     0.003947      0.002493     0.002576
Sample 5   F    2.9590       2.9490        3.3558       3.2953
           P    0.003878     0.003988      0.002158     0.002512
Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses are not rejected at the 2 percent
level of missing subsamples.
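The largest differences quoted for the fourth-grade F-ratios at the 2 percent level can be recomputed from the Table 1 entries. In this sketch the helper name is hypothetical; the numbers are the fourth-grade F-ratios from Table 1 and the complete data set's F-ratio.

```python
COMPLETE_F_GRADE4 = 2.8851  # F-ratio of the complete fourth-grade data set

# Fourth-grade F-ratios from Table 1 (samples 1 through 5).
table1_grade4 = {
    "mean value": [2.9708, 2.8974, 2.8796, 3.0118, 2.9590],
    "regression": [2.9228, 2.9338, 2.9096, 2.9526, 2.9490],
}


def largest_deviation(values, reference):
    """Largest absolute difference between any value and the reference."""
    return max(abs(v - reference) for v in values)


for method, f_ratios in table1_grade4.items():
    print(method, round(largest_deviation(f_ratios, COMPLETE_F_GRADE4), 4))
# mean value 0.1267
# regression 0.0675
```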
Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 5
Percent Level of Missing Subsamples
The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 5 percent
level are presented in Table 2. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.1859. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.0302. Equivalent ranges for the fifth-grade
sample were 0.1268 and 0.1226, respectively.
Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the
mean value estimated data sets differed from the complete
data set's complement by a value greater than 0.001893.
Likewise, for the regression estimated data sets, no
complement differed from the complete data set's complement
by a value greater than 0.000375. Equivalent ranges for the
fifth-grade sample were 0.000875 and 0.000842, respectively.
TABLE 2. F-ratios and Complements (P) of the Cumulative Distribution
Function for Fourth- and Fifth-Grade Samples Having Mean
Value and Regression Estimated Subsamples Consisting of
5 Percent of the Complete Samples
                     Grade Four                 Grade Five
               Mean Value   Regression    Mean Value   Regression
Sample 1   F    2.9982       2.9094        3.3587       3.3830
           P    0.003484     0.004418      0.002143     0.002016
Sample 2   F    2.8943       2.8848        3.2744       3.2745
           P    0.004628     0.004750      0.002647     0.002647
Sample 3   F    2.8706       2.8771        3.3053       3.2786
           P    0.004937     0.004851      0.002450     0.002619
Sample 4   F    3.0710       2.9153        3.2904       3.3363
           P    0.002852     0.004370      0.002543     0.002267
Sample 5   F    2.9555       2.8999        3.1961       3.2003
           P    0.003916     0.004558      0.003219     0.003186
Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses are not rejected at the 5 percent
level of missing subsamples.
Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 10
Percent Level of Missing Subsamples
The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 10 percent
level are presented in Table 3. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.5650. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.1607. Equivalent ranges for the fifth-grade
sample were 0.1006 and 0.0801, respectively.
Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the
mean value estimated data sets differed from the complete
data set's complement by a value greater than 0.003977.
Likewise, for the regression estimated data sets, no
complement differed from the complete data set's complement
by a value greater than 0.001688. Equivalent ranges for the
fifth-grade sample were 0.000523 and 0.000427, respectively.
TABLE 3. F-ratios and Complements (P) of the Cumulative Distribution
Function for Fourth- and Fifth-Grade Samples Having Mean
Value and Regression Estimated Subsamples Consisting of
10 Percent of the Complete Samples
                     Grade Four                 Grade Five
               Mean Value   Regression    Mean Value   Regression
Sample 1   F    3.0076       2.9488        3.4235       3.4030
           P    0.003395     0.003988      0.001821     0.001917
Sample 2   F    2.9682       2.9043        3.2743       3.2802
           P    0.003782     0.004504      0.002648     0.002609
Sample 3   F    2.8678       2.8713        3.3378       3.2773
           P    0.004975     0.004928      0.002259     0.002628
Sample 4   F    3.4501       3.0458        3.2941       3.3524
           P    0.000998     0.003057      0.002520     0.002177
Sample 5   F    3.0149       2.8983        3.2814       3.2859
           P    0.003328     0.004578      0.002601     0.002572
Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses are not rejected at the 10 percent
level of missing subsamples.
Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 15
Percent Level of Missing Subsamples
The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 15 percent
level are presented in Table 4. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.3063. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.1386. Equivalent ranges for the fifth-grade
sample were 0.2364 and 0.0412, respectively.
Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the mean
value estimated data sets differed from the complete data
set's complement by a value greater than 0.002696. Likewise,
for the regression estimated data sets, no complement
differed from the complete data set's complement by a value
greater than 0.001496. Equivalent ranges for the fifth-
grade sample were 0.001050 and 0.000255, respectively.
TABLE 4. F-ratios and Complements (P) of the Cumulative Distribution
Function for Fourth- and Fifth-Grade Samples Having Mean
Value and Regression Estimated Subsamples Consisting of
15 Percent of the Complete Samples
                     Grade Four                 Grade Five
               Mean Value   Regression    Mean Value   Regression
Sample 1   F    2.9470       2.9765        3.5593       3.3263
           P    0.004008     0.003697      0.001294     0.002325
Sample 2   F    2.8829       2.8880        3.2797       3.3013
           P    0.004775     0.004708      0.002612     0.002475
Sample 3   F    2.8862       2.8830        3.4280       3.2971
           P    0.004731     0.004773      0.001801     0.002501
Sample 4   F    3.1914       3.0237        3.2777       3.2899
           P    0.002049     0.003249      0.002625     0.002547
Sample 5   F    3.1742       2.9796        3.3087       3.2817
           P    0.002146     0.003666      0.002430     0.002599
Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses are not rejected at the 15 percent
level of missing subsamples.
Comparison of the Mean Value and the Regression
Estimated Data Sets with One Another and
with the Complete Data Set at the 20
Percent Level of Missing Subsamples
The values of the F-ratio and complement of the
cumulative distribution function for fourth- and fifth-grade
mean value and regression estimated data sets at the 20 percent
level are presented in Table 5. For the fourth-grade
sample, no F-ratio of the mean value estimated data sets
differed from the complete data set's F-ratio by more than
0.3305. Likewise, for the regression estimated data sets,
no F-ratio differed from the complete data set's F-ratio by
more than 0.1237. Equivalent ranges for the fifth-grade
sample were 0.2711 and 0.0479, respectively.
Examining the complement of the cumulative distribution
function for the fourth-grade sample, no P of the
mean value estimated data sets differed from the complete
data set's complement by a value greater than 0.002830.
Likewise, for the regression estimated data sets, no
complement differed from the complete data set's complement by a
value greater than 0.001361. Equivalent ranges for the
fifth-grade sample were 0.001159 and 0.000299, respectively.
TABLE 5. F-ratios and Complements (P) of the Cumulative Distribution
Function for Fourth- and Fifth-Grade Samples Having Mean
Value and Regression Estimated Subsamples Consisting of
20 Percent of the Complete Samples
                     Grade Four                 Grade Five
               Mean Value   Regression    Mean Value   Regression
Sample 1   F    2.9608       2.9272        3.5940       3.3024
           P    0.003859     0.004231      0.001185     0.002468
Sample 2   F    2.8703       2.8637        3.3104       3.2750
           P    0.004941     0.005031      0.002419     0.002643
Sample 3   F    2.9036       2.8916        3.5476       3.3119
           P    0.004513     0.004663      0.001333     0.002410
Sample 4   F    3.0312       2.9180        3.3004       3.3196
           P    0.003183     0.004339      0.002480     0.002364
Sample 5   F    3.2156       3.0088        3.3048       3.2770
           P    0.001915     0.003384      0.002453     0.002630
Since the complement of the complete data set for
both the fourth and fifth grades was less than 0.05 while
at the same time the five complements of the mean value and
the regression estimated data sets were less than 0.05, the
three null hypotheses are not rejected at the 20 percent
level of missing subsamples.
Further Results
To determine which method of estimation investigated
was the stronger, an inspection of the values of the F-ratios
and complements of the cumulative distribution function was
conducted. The closeness of these values for the incomplete
data sets to those of the appropriate complete data set was
observed. For each group of five incomplete data sets at
each percent level, the range of values was found and
its width examined.
The largest range at each percent level of missing
data for the fourth-grade sample with mean value estimates
varied from 0.001388 to 0.003977, whereas, for the regression
estimated samples, it varied from only 0.000375 to
0.001688. For the fifth-grade samples with mean value
estimates, the range varied from 0.000196 to 0.001159. For
regression estimates, it was 0.000245 to 0.000842. Only at
the 2 percent level of missing values, in the fifth-grade
samples, did the mean value complement range not exceed
that of the regression complement range.
A closer examination of the results revealed additional
information. One might presume that the smaller the percent
of estimated data elements, the smaller the range
would be between the value of the F-ratio of the complete
data set and the most distant value of the F-ratio of the
data sets with estimated values. This was consistent neither
across the fourth- and fifth-grade samples nor across the
methods of estimation. Ordering the percent levels of
missing data from the shortest range to the
longest range, the order for the fourth-grade sample with
mean value estimates is 2, 5, 15, 20, 10; for the fourth-
grade sample with regression estimates, 5, 2, 20, 15, 10;
for the fifth-grade sample with mean value estimates, 2,
10, 5, 15, 20; and for the fifth-grade sample with regression
estimates, 2, 15, 20, 10, 5. The same orderings hold
for the complement of the cumulative distribution function.
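The orderings above can be reproduced by sorting the percent levels on the largest F-ratio differences reported earlier in this chapter. The dictionary layout and the function name below are assumptions made for the illustration; the numbers are the reported maximum differences.

```python
# Largest difference between the complete data set's F-ratio and the
# F-ratios of the estimated data sets, by percent level of missing data.
largest_f_difference = {
    ("fourth", "mean value"): {2: 0.1267, 5: 0.1859, 10: 0.5650, 15: 0.3063, 20: 0.3305},
    ("fourth", "regression"): {2: 0.0675, 5: 0.0302, 10: 0.1607, 15: 0.1386, 20: 0.1237},
    ("fifth", "mean value"): {2: 0.0329, 5: 0.1268, 10: 0.1006, 15: 0.2364, 20: 0.2711},
    ("fifth", "regression"): {2: 0.0397, 5: 0.1226, 10: 0.0801, 15: 0.0412, 20: 0.0479},
}


def levels_by_range(differences):
    """Percent levels ordered from the shortest range to the longest."""
    return sorted(differences, key=differences.get)


for key, differences in largest_f_difference.items():
    print(key, levels_by_range(differences))
# ('fourth', 'mean value') [2, 5, 15, 20, 10]
# ('fourth', 'regression') [5, 2, 20, 15, 10]
# ('fifth', 'mean value') [2, 10, 5, 15, 20]
# ('fifth', 'regression') [2, 15, 20, 10, 5]
```

None of the four orderings is the monotone sequence 2, 5, 10, 15, 20, which is the point of the paragraph above.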
Another presumption might be that the value of the
F-ratio of the complete data set would be within the range
of the values of the F-ratios at a particular percent level
of missing data. This is consistent for the fourth- and
fifth-grade samples within a method of estimation but not
between methods of estimation. For both the fourth- and
fifth-grade samples having mean value estimates, the value
of the F-ratio of the complete data set is within the range
of the values of the F-ratios for all percent levels of
missing data. For regression estimated samples, this is
not the case. The fourth-grade samples have F-ratios whose
range does not include the complete data set's F-ratio
at the 2 percent level; for the fifth grade, this occurs at the
2 and 20 percent levels. In these cases, the value of the F-ratio
of the complete data set exceeds the values of the F-ratios in the
fifth-grade sample and precedes the values in the fourth-
grade sample.
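The range check described above can be illustrated with the 2 percent level F-ratios from Table 1. The function name is hypothetical; the data are the tabled values for three of the four grade-by-method groups.

```python
def complete_f_within_range(estimated_f, complete_f):
    """True when the complete data set's F-ratio lies between the smallest
    and largest F-ratios of the estimated data sets."""
    return min(estimated_f) <= complete_f <= max(estimated_f)


GRADE4_COMPLETE_F = 2.8851
GRADE5_COMPLETE_F = 3.3229

# F-ratios at the 2 percent level (Table 1, samples 1 through 5).
grade4_mean_2pct = [2.9708, 2.8974, 2.8796, 3.0118, 2.9590]
grade4_regr_2pct = [2.9228, 2.9338, 2.9096, 2.9526, 2.9490]
grade5_regr_2pct = [3.2832, 3.2907, 3.2865, 3.2852, 3.2953]

print(complete_f_within_range(grade4_mean_2pct, GRADE4_COMPLETE_F))  # True
print(complete_f_within_range(grade4_regr_2pct, GRADE4_COMPLETE_F))  # False
print(complete_f_within_range(grade5_regr_2pct, GRADE5_COMPLETE_F))  # False
```

In the fourth-grade regression case every estimated F-ratio lies above the complete value, and in the fifth-grade regression case every one lies below it.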
Summary
In summary, this chapter has presented the statistical
analysis of the data. The results of the study indicated
that no significant differences exist among the MANOVA
results of data sets having missing subscores estimated by
mean values, data sets having missing subscores estimated by
regression, and the complete data set with no missing values.
This was demonstrated for 100 samples with estimated subscores.
The estimated subsamples consisted of 2, 5, 10,
15, and 20 percent of the complete samples of fourth- and
fifth-grade students.
Since inspection showed that the regression estimated
values provided MANOVA and complement results at each
percent level closer, in all instances but the fifth-grade
sample at the 2 percent level, to those of the
complete data set, it is apparently the stronger of the two
estimation procedures. Both methods of estimation, though,
were demonstrated to provide MANOVA results not significantly
different from the results of the complete data sets.
CHAPTER V
DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS
Discussion
The intention of this study was to examine the
effect of different estimators for missing multiresponse
data on multivariate analysis of variance (MANOVA) results.
Mean value and regression techniques were used in determining
estimates. The MANOVA results for the data sets
which employed the different estimation techniques were
compared to each other and to MANOVA results of the complete
data set.
Specifically investigated were the achievement test
scores of a fourth-grade sample and a fifth-grade sample.
Fifty MANOVAs were conducted on each grade; 25 analyzed the
incomplete data sets with mean value estimates and 25 with
regression estimates. Each group of 25 analyses was subgrouped
into five sets of analyses. Each set contained a different percent
level of missing data. These levels were 2, 5, 10,
15, and 20 percent of the complete sample. Five samples
with different missing subsets of data were analyzed at each
level.
The results of Chapter IV demonstrated that the
MANOVA results of both estimation techniques did not differ
significantly from one another nor from the results obtained
from the complete data set. Inspection of the F-ratios and
complements implied that the regression method was apparently
the stronger estimation technique.
The latter result was determined by the closeness of
the values of the F-ratios and the complements of the cumulative
distribution function for the estimated samples to
those of the complete data set.
In addition, two a posteriori results were observed.
First, a decrease in the percent of estimated data elements
did not necessarily produce a smaller range
between the value of the F-ratio of the complete
data set and the most distant value of the F-ratio of the
data sets with estimated values. This held for
both grades of students and both methods of estimation.
It was likewise true for the complement of the cumulative
distribution function.
A second finding was that the F-ratio of the complete
data set was not within the range of the values of the F-ratios
at every percent level of missing data estimated by regression
techniques. It was within the range for mean value estimated
data sets. The same findings occurred among the complements of the
cumulative distribution function.
Conclusions
Three conclusions were drawn from the present
study:
1. Achievement data with up to 20 percent missing
subscores that are estimated by mean value
techniques when analyzed by MANOVA provide
results which do not differ significantly from
MANOVA results of the same achievement data
without any missing subscores.
2. Achievement data with up to 20 percent missing
subscores that are estimated by regression
techniques when analyzed by MANOVA provide
results which do not differ significantly
from MANOVA results of the same achievement
data without any missing subscores.
3. Achievement data with up to 20 percent missing
subscores that are estimated by mean value
techniques when analyzed by MANOVA provide
results which do not differ significantly
from MANOVA results of achievement data with
up to 20 percent missing subscores that are
estimated by regression techniques.
The above conclusions seem to suggest that there
exist for educators alternatives in data analysis other than
discarding incomplete multiresponse observations. The
alternatives provided here are the two methods of estimation:
mean value and regression. In addition, the mean value
method of estimation was demonstrated to be as appropriate
in MANOVA as the regression method, as indicated by the
nonrejection of the third hypothesis. Further data considerations
revealed that for all levels of missing data, the
F-ratio of the complete data set was located within the
range of the F-values determined for the data sets with
missing subsamples estimated by the mean value method.
This did not hold for the regression method.
Since the mean value method is straightforward
and has been shown to be an appropriate estimation
technique, data formerly lost to analysis can be retained.
No longer must estimates for omissions be avoided because of
complicated data manipulations, time, money, and resources.
Recommendations
The present study has operated under various limitations
which need to be investigated in order to extend
the inferences of this research. Bracht and Glass (1968)
stated:
The intent (sometimes explicitly stated, sometimes
not) of almost all experimenters is to generalize
their findings to some group of subjects and set
of conditions that are not included in the experiment.
To the extent and manner in which the
results of an experiment can be generalized to
different subjects, settings, experimenters, and,
possibly, tests, the experiment possesses
external validity. (pp. 437-438)
The external validity of this study is restricted by the
lack of reported research dealing with statistical analyses
which employ data estimates without parametric estimates.
Areas which require further investigation in reference to
inferential conclusions are presented in the following list:
1. The samples consisted of fourth and fifth
graders. Other educational levels need to
be examined.
2. Achievement scores for two levels of one
standardized achievement test were analyzed.
Other standardized achievement tests need
to be investigated.
3. In addition to achievement tests, other types
of tests which measure not only the cognitive
domain but also the affective domain need to
be studied, such as those dealing with self-concept
and social acceptance.
4. Other methods of estimation need to be considered
in a manner similar to the present
investigation and compared to mean value
methods for accuracy and simplicity.
5. Missing subsamples were determined randomly.
Actual missing subsamples need to be investigated
for possible commonalities.
6. The levels of missing data should be expanded
in order to determine maximum levels of missing
subsamples.
7. More than one missing subscore per experimental
unit needs inspection.
8. Experimental designs requiring analyses different
from multivariate analysis of variance need
probing.
These recommendations are listed not only to provide closure
to the present study but also to indicate the multidirectional
approaches involved in this specific area of research.
Closure is provided with respect to confining the present
research's inferences to the subset of investigations outside
of the above listing. The expanse of additional
approaches is suggested by the list itself. No one item
of the list is more worthy of study than another. All
need investigation in order to advance to the universal
set of estimators for omissions of multiresponse data.
REFERENCES
Afifi, A. and Elashoff, R. M. "Missing observations in
multivariate statistics I. Review of the literature."
Journal of the American Statistical
Association, 1966, 61, 595-604.
Afifi, A. and Elashoff, R. M. "Missing observations in
multivariate statistics II. Point estimation in
simple linear regression." Journal of the
American Statistical Association, 1967, 62,
10-29.
Anderson, T. W. "Maximum likelihood estimates for a multivariate
normal distribution when some observations
are missing." Journal of the American Statistical
Association, 1957, 52, 200-203.
Baird, H. R. and Kramer, C. Y. "Analysis of variance of a
balanced incomplete block design with missing
observations." Applied Statistics, 1960, 9,
189-198.
Bhargava, R. Multivariate tests of hypotheses with incomplete
data. Applied Mathematics and Statistical Laboratories,
Technical Report 3, 1962.
Bracht, G. H. and Glass, G. V. "The external validity of
experiments." American Educational Research
Journal, 1968, 5, 437-474.
Buck, S. F. "A method of estimation of missing values in
multivariate data suitable for use with an electronic
computer." Journal of the Royal Statistical Society,
Series B, 1960, 22, 302-307.
Dagenais, M. G. "Further suggestions concerning the utilization
of incomplete observations in regression
analysis." Journal of the American Statistical
Association, 1971, 66, 93-98.
Dear, R. E. "A principal-component missing-data method for
multiple regression models." SP86, System Development
Corporation, Santa Monica, California, 1959.
Dempster, A. P. "An overview of multivariate data analysis."
Journal of Multivariate Analysis, 1971, 1, 316-346.
Edgett, G. L. "Multiple regression with missing observations
among the independent variables." Journal of
the American Statistical Association, 1956, 51,
122-131.
Federspiel, C. F., Monroe, R. J., and Greenberg, B. G.
"An investigation of some multiple regression
methods for incomplete samples." University of
North Carolina, Institute of Statistics, Mimeo
Series, No. 236, August 1959.
Glasser, M. "Linear regression analysis with missing
observations among the independent variables."
Journal of the American Statistical Association,
1964, 59, 834-844.
Haitovsky, Y. "Missing data in regression analysis."
Journal of the Royal Statistical Society,
Series B, 1968, 30, 67-82.
Hartwell, T. D. and Gaylor, D. W. "Estimating variance
components for two-way disproportionate data with
missing cells by the method of unweighted means."
Journal of the American Statistical Association,
1973, 68, 379-383.
Hocking, R. R. and Smith, W. B. "Estimation of parameters
in the multivariate normal distribution with
missing observations." Journal of the American
Statistical Association, 1968, 63, 159-173.
Hopper, M. J., comp. Harwell Subroutine Library: A
Catalogue of Subroutines. London: Her Majesty's
Stationery Office, State House, 49 High Holborn,
1970.
Kleinbaum, D. G. Estimation and hypothesis testing for
generalized multivariate linear models. Doctoral
dissertation, University of North Carolina, Chapel
Hill, North Carolina, 1970.
Kramer, C. Y. and Glass, S. "Analysis of variance of a
Latin square design with missing observations."
Applied Statistics, 1960, 9, 43-50.
Lord, F. M. "Estimation of parameters from incomplete data."
Journal of the American Statistical Association,
1955, 50, 870-876.
Lord, F. M. "Estimation of latent ability and item parameters
when there are omitted responses."
Psychometrika, 1974, 39, 247-264.
Matthai, A. "Estimation of parameters from incomplete data
with applications to design of sample surveys."
Sankhya, 1951, 2, 145-152.
Mitra, S. K. "Some remarks on the missing plot analysis."
Sankhya, 1959, 21, 337-344.
Morrison, D. F. "Expectations and variances of maximum
likelihood estimates of the multivariate normal
distribution parameters with missing data."
Journal of the American Statistical Association,
1971, 66, 602-604.
Nicholson, G. E., Jr. "Estimation of parameters from
incomplete multivariate samples." Journal of
the American Statistical Association, 1957, 52,
523-526.
Preece, D. A. "Query and answer: Nonadditivity in two-way
classifications with missing values."
Biometrics, 1972, 28, 574-577.
Pruzek, R. M. "Methods and problems in the analysis of
multivariate data." Review of Educational Research,
1971, 41, 163-190.
Raffeld, P. C. The effects of Guttman weights on the
reliability and predictive validity of objective
tests when omissions are not differentially
weighted. Doctoral dissertation, University of
Oregon, 1973.
Rubin, D. B. "Characterizing the estimation of parameters
in incomplete-data problems." Journal of the
American Statistical Association, 1974, 69,
467-474.
Srivastava, J. N. "On the extension of Gauss-Markov theorem
to complex multivariate linear models." The Annals
of the Institute of Statistical Mathematics, 1967,
19, 417-437.
Srivastava, J. N. "On a general class of designs for
multiresponse experiments." The Annals of Mathematical
Statistics, 1968, 39, 1825-1843.
Srivastava, J. N. and McDonald, L. "On the costwise optimality
of hierarchical multiresponse randomized block designs
under the trace criterion." The Annals of the Institute
of Statistical Mathematics, 1969, 21, 507-514.
Srivastava, J. N. and McDonald, L. "On the costwise optimality
of certain hierarchical and standard
multiresponse models under the determinant criterion."
Journal of Multivariate Analysis, 1971, 1,
118-128.
Srivastava, J. N. and Zaatar, M. K. "On the maximum likelihood
classification rule for incomplete multivariate
samples and its admissibility." Journal of
Multivariate Analysis, 1972, 2, 115-126.
Trawinski, I. M. Incomplete-variable designs. Doctoral
dissertation, Virginia Polytechnic Institute,
Blacksburg, Virginia, 1961.
Trawinski, I. M. and Bargmann, R. E. "Maximum likelihood
estimation with incomplete multivariate data."
The Annals of Mathematical Statistics, 1964, 35,
647-657.
Walsh, J. E. "Computer-feasible general method for fitting
and using regression functions when data are
incomplete." SP71, System Development
Corporation, Santa Monica, California, 1959.
Wilkinson, G. N. "Comparison of missing value procedures."
Australian Journal of Statistics, 1960, 2, 53-65.
Wilks, S. S. "Moments and distributions of estimates of
population parameters from fragmentary samples."
The Annals of Mathematical Statistics, 1932, 3,
163-195.
BIOGRAPHICAL SKETCH
Stephen S. Sledjeski was born November 27, 1942, in
Greenport, New York. He graduated from Southold High School,
Southold, New York; the Diocesan Preparatory Seminary,
Buffalo, New York (A.A.); St. Bonaventure University, St.
Bonaventure, New York (B.S.); and the University of Florida,
Gainesville, Florida (M.Ed., Ed.S., Ph.D.).
His educational employment experience consists of
working as a middle school mathematics teacher with the
Alachua County Board of Public Instruction, Gainesville,
Florida; a research associate with Santa Fe Community
College, Gainesville, Florida; supervisor of data processing
as a graduate research assistant with the Florida Parent
Education Model of Project Follow Through, University of
Florida, Gainesville, Florida; and Research Specialist at
P. K. Yonge Laboratory School, Gainesville, Florida. In
addition, he has been a statistical and computer consultant
for doctoral students, the Florida State Department of
Health and Rehabilitation Services, and the Career Opportunities
Program, Richmond, Virginia.
I certify that I have read this study and that in
my opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
Vynce A. Hines, Chairman
Professor of Foundations of Education
I certify that I have read this study and that in
my opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
Ira J. Gordon
Graduate Research Professor of
Foundations of Education
I certify that I have read this study and that in
my opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
Robert S. Soar
Professor of Foundations of
Education
I certify that I have read this study and that in
my opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
Z. R. Pop Stojanovic
Associate Chairman and Professor
of Mathematics
I certify that I have read this study and that in
my opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality, as
a dissertation for the degree of Doctor of Philosophy.
Hattie Bessent
Assistant Professor of Foundations
of Education
This dissertation was submitted to the Graduate Faculty of
the College of Education and to the Graduate Council, and
was accepted as partial fulfillment of the requirements for
the degree of Doctor of Philosophy.
March, 1976
Dean, College of Education
Dean, Graduate School
