Citation
- Permanent Link:
- https://ufdc.ufl.edu/UF00099574/00001
Material Information
- Title:
- The Construct validity of the holistic writing score: an analysis of the essay subtest of the College-Level Academic Skills Test
- Creator:
- West, Gregory K., 1949- ( Dissertant )
Algina, James ( Thesis advisor )
Crocker, Linda ( Thesis advisor )
Nunnery, Michael ( Reviewer )
- Place of Publication:
- Gainesville, Fla.
- Publisher:
- University of Florida
- Publication Date:
- 1988
- Copyright Date:
- 1988
- Language:
- English
- Physical Description:
- vii, 62 leaves : ill. ; 28 cm.
Subjects
- Subjects / Keywords:
- Adverbials ( jstor )
Educational research ( jstor )
Essays ( jstor )
Handwriting ( jstor )
Phrases ( jstor )
Words ( jstor )
Writing ability ( jstor )
Writing instruction ( jstor )
Writing tests ( jstor )
Written composition ( jstor )
Dissertations, Academic -- Foundations of Education -- UF
English language -- Composition and exercises -- Ability testing ( lcsh )
English language -- Rhetoric -- Ability testing ( lcsh )
Foundations of Education thesis Ph.D
Grading and marking (Students) ( lcsh )
City of Tallahassee ( local )
- Genre:
- bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )
- Spatial Coverage:
- United States -- Florida
Notes
- Abstract:
- The purpose of this study was to investigate the construct
validity of holistic scores on large-scale writing assessments using
as the vehicle for the study the essay subtest of the College-Level
Academic Skills Test (CLAST). The construct validity issue focused
on the extent to which holistic writing scores and atomistic skill
scores measured the same underlying writing trait(s).
In this study, 104 CLAST essays were drawn by random sampling
from a frame of 196 essays provided by the College-Level Academic
Skills Program (CLASP) staff from a population of 12,256 CLAST essays
administered in March 1985. Each of the 104 CLAST essays in this
sample was holistically scored by CLASP staff in accordance with the
requirements of Florida law. For each of the 104 CLAST essays, 12
atomistic writing subskill scores were derived: agreement errors,
punctuation errors, spelling errors, capitalization errors, nominals,
adjectivals, adverbials, paragraph coherence, coordination, words per
T-unit, total number of words, and handwriting quality (Atomistic
Writing Subskills).
The holistic scores and 12 Atomistic Writing Subskill scores were
subjected to the principal-axis method of common factor analysis. A
three-factor solution was computed as a result of the application of
the scree test and an examination of the conceptual meaningfulness of
the competing four- and five-factor solutions. The holistic score,
paragraph coherence, and total number of words loaded positively on
the first factor. The holistic score and handwriting quality loaded
negatively on the second factor; agreement errors, punctuation
errors, spelling errors, and capitalization errors loaded positively
on the second factor. Nominals, adjectivals, adverbials, and words
per T-unit loaded positively on the third factor.
The factor structure suggests that the writing construct measured
by holistic scoring encompasses two distinct constructs: one related
to paragraph coherence and to the total number of words, and one
related to the absence of mechanical errors and to handwriting
quality. The factor structure further suggests that there is a
separate writing construct unrelated to holistic scoring which is
composed of syntactic constructions and words per T-unit.
- Thesis:
- Thesis (Ph.D.)--University of Florida, 1988.
- Bibliography:
- Includes bibliographical references.
- General Note:
- Typescript.
- General Note:
- Vita.
- Statement of Responsibility:
- by Gregory K. West.
Record Information
- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
- Resource Identifier:
- 024547254 ( AlephBibNum )
20112225 ( OCLC )
AFL2281 ( NOTIS )
Full Text
THE CONSTRUCT VALIDITY OF THE
HOLISTIC WRITING SCORE:
AN ANALYSIS OF THE ESSAY SUBTEST
OF THE COLLEGE-LEVEL ACADEMIC SKILLS TEST
BY
GREGORY K. WEST
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1988
COPYRIGHT 1988
by
GREGORY K. WEST
ACKNOWLEDGMENTS
I wish to thank Dr. James Algina for his guidance and cooperation
as chairperson of my committee. I also would like to express my
appreciation to the other members of my committee, Dr. Linda Crocker
and Dr. Michael Y. Nunnery, for their direction and editorial
comments.
My sincere appreciation and gratitude are expressed to my wife,
Dr. Susan S. Hill, for her assistance in the data extraction and her
loving encouragement, and to Chad and Michael for their special
encouragement and love.
I would also like to express my appreciation and gratitude to my
parents, Earl and Marjorie West, not only for their love and
encouragement but also for the hand-sewn quilt (queen-size) which
parents traditionally bestow upon an offspring after completion of a
doctoral dissertation.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
CHAPTERS
I INTRODUCTION
  Purpose
  Limitations
  Justification
  Definitions
II REVIEW OF RELATED LITERATURE
  Factor-Analytic Research
  Non-Factor-Analytic Research
  Comparison of Factor-Analytic and Regression Methods of Construct Validation
III PROCEDURES
  Selection of Subjects
  Selection of Atomistic Writing Subskills
  Holistic Scoring
  Atomistic Writing Subskill Scoring
IV ANALYSIS OF THE DATA
  Normality Assumption
  Correlational Validity
  Factorial Validity
V DISCUSSION
  Research Implications
  Implications for Further Research
APPENDIX: ROTATED FACTOR PATTERNS FOR FOUR- AND FIVE-FACTOR SOLUTIONS
REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the
Graduate School of the University of Florida in
Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy
THE CONSTRUCT VALIDITY OF THE
HOLISTIC WRITING SCORE:
AN ANALYSIS OF THE ESSAY SUBTEST
OF THE COLLEGE-LEVEL ACADEMIC SKILLS TEST
By
Gregory K. West
December 1988
Chairperson: James Algina
Major Department: Foundations of Education
The purpose of this study was to investigate the construct
validity of holistic scores on large-scale writing assessments using
as the vehicle for the study the essay subtest of the College-Level
Academic Skills Test (CLAST). The construct validity issue focused
on the extent to which holistic writing scores and atomistic skill
scores measured the same underlying writing trait(s).
In this study, 104 CLAST essays were drawn by random sampling
from a frame of 196 essays provided by the College-Level Academic
Skills Program (CLASP) staff from a population of 12,256 CLAST essays
administered in March 1985. Each of the 104 CLAST essays in this
sample was holistically scored by CLASP staff in accordance with the
requirements of Florida law. For each of the 104 CLAST essays, 12
atomistic writing subskill scores were derived: agreement errors,
punctuation errors, spelling errors, capitalization errors, nominals,
adjectivals, adverbials, paragraph coherence, coordination, words per
T-unit, total number of words, and handwriting quality (Atomistic
Writing Subskills).
The holistic scores and 12 Atomistic Writing Subskill scores were
subjected to the principal-axis method of common factor analysis. A
three-factor solution was computed as a result of the application of
the scree test and an examination of the conceptual meaningfulness of
the competing four- and five-factor solutions. The holistic score,
paragraph coherence, and total number of words loaded positively on
the first factor. The holistic score and handwriting quality loaded
negatively on the second factor; agreement errors, punctuation
errors, spelling errors, and capitalization errors loaded positively
on the second factor. Nominals, adjectivals, adverbials, and words
per T-unit loaded positively on the third factor.
The factor structure suggests that the writing construct measured
by holistic scoring encompasses two distinct constructs: one related
to paragraph coherence and to the total number of words, and one
related to the absence of mechanical errors and to handwriting
quality. The factor structure further suggests that there is a
separate writing construct unrelated to holistic scoring which is
composed of syntactic constructions and words per T-unit.
CHAPTER I
INTRODUCTION
Purpose
The construct validity of holistic scoring on large-scale writing
assessments was investigated using the essay subtest of the
College-Level Academic Skills Test (CLAST) as the vehicle for the
study. The construct validity issue focused on the extent to which
holistic scores and atomistic skill scores measured the same
underlying writing trait(s). In this study, factor analysis was the
primary method used to investigate the construct validity of holistic
scores. Specifically, the construct validity of holistic scoring was
investigated by factor analyzing the correlations among (a) the
holistically scored CLAST essay subtest scores, (b) grammatical and
spelling errors, (c) syntactic constructions, (d) coherence and
coordination, (e) length, and (f) handwriting quality.
Limitations
This study was limited to a sample of 104 individuals who were
administered the March 1985 CLAST. The sample of 104 examinees was
drawn from a frame of 196 examinees which the College-Level Academic
Skills Program (CLASP) staff had selected from the population of
12,256 examinees administered the March 1985 CLAST. No institutional
or demographic information was released by the CLASP staff for the
frame of 196 examinees. In addition, the CLASP staff provided the
frame of 196 examinees by physically retrieving a set of March 1985
CLAST essays that had been physically returned together from the
examination center and that had been bound together in the CLASP
storage area. Accordingly, the generalizability of this study to the
general population of all examinees who were administered the March
1985 CLAST is uncertain, in that the CLASP staff may not have drawn a
random sample from the population and in that it was not possible to
test whether the frame was representative of, inter alia, gender,
geographical distribution, institutional affiliation, or cultural
background of the population of individuals administered the March
1985 CLAST.
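The second stage of the selection described above, a simple random draw of 104 essays from the frame of 196, can be sketched as follows. This is a minimal illustration only: the sequential essay identifiers, the function name, and the seed are assumptions and are not taken from the study, and the sketch cannot address whether the frame itself was drawn at random from the population.

```python
import random

FRAME_SIZE = 196   # essays supplied by the CLASP staff
SAMPLE_SIZE = 104  # essays analyzed in the study


def draw_sample(frame_ids, k, seed=None):
    """Draw k distinct essay identifiers from the frame without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(frame_ids), k))


# Hypothetical sequential identifiers; the actual essay labels are not reported.
frame_ids = range(1, FRAME_SIZE + 1)
sample_ids = draw_sample(frame_ids, SAMPLE_SIZE, seed=1985)
```

Every essay in the frame has the same chance of selection under this procedure, so the sample represents the frame; its relationship to the population of 12,256 essays remains uncertain for the reasons given above.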
Furthermore, on the March 1985 CLAST essay subtest, examinees
chose to write an essay on one of two possible topics. The examinees
in the sample of 104 individuals in this study all wrote on the same
expository topic on the March 1985 CLAST essay subtest. This study
was, therefore, limited to one of the two topics on the March 1985
CLAST essay subtest. The CLASP staff does not release the topics
used in CLAST essay subtests, and the CLASP staff conditioned the
release of the data in the instant study upon the non-disclosure of
the two alternate topics in the March 1985 CLAST essay subtest. To
score the essay subtest, the CLASP staff also used a method of
holistic rating called "general impression marking" in which the
scorer fits a writing sample into ordered categories on the basis of
the overall impression created by the essay. The general impression
marking procedure was developed and used by the Educational Testing
Service and allows for "wide-open" topics on the assumption that
different discourse modes do not affect the holistic scores. There
is, however, a body of literature (Cooper, 1977; Lloyd-Jones, 1977;
Moss, Cole, & Khampalikit, 1982; Odell & Cooper, 1980; Quellmalz,
Capell, & Chou, 1982) suggesting that different discourse modes and
topic selection are correlated with holistic scores. Because this
study involved March 1985 CLAST subtest scores dealing with a single
expository topic, the relationship of holistic writing scores to
discourse mode or topic was not investigated.
In addition, Freedman (1979) suggested that content and
organization of the writing were significantly correlated with the
holistic score. Veal and Hudson (1983) also found that content was
significantly correlated with the holistic score. In addition,
Benton and Kiewra (1986) determined that organizational ability, as
measured by objective organizational tests, was significantly
correlated with the holistic score. Because of the complexity and
time involved in eliminating confounding variables, the quality of
the content or organization of the overall CLAST essays was not
assessed.
Finally, the universe of all possible writing subskills which may
be correlated with holistic writing scores was not investigated. The
existence of such other writing skills could have significantly
affected the results of this study.
Justification
The construct validity of holistic scoring was investigated,
using as a vehicle for the investigation a test of critical
importance to the pedagogical communities in Florida. The specific
impetus for this study was derived from the paucity of factorial
validity studies of holistic scoring in the research literature.
At the time of this writing, in only two reported factorial
validity studies have the holistic score and atomistic writing
subskill scores been submitted to factor analysis, and in neither
study has the range of atomistic writing subskills investigated in
the instant study been explored. Freedman (1981) found by means of
factor analysis that one trait of compositions was measured by
holistic scores and analytic rating scale scores composed of voice,
development, organization, sentence structure, word choice, and
usage. In another factor-analytic study, Chapman, Fyans, and Kerins
(1984) reported that the following five measures of functional
writing loaded on a single factor: (a) focus, (b) support, (c)
organization, (d) mechanics, and (e) overall.
There has been, however, a significant amount of
non-factor-analytic research literature dealing with the relationship
between writing subskill scores and holistic scores (Charney, 1984;
Crocker, 1987). Generally, the results of these non-factor-analytic
studies have indicated that, "in spite of [scorer] training,
[holistic scores] are strongly influenced by . . . characteristics of
the writing samples." (Charney, 1984, p. 75).
The researchers of syntactic complexity did not conclusively
indicate whether syntactic complexity was significantly correlated
with the holistic score. Stewart and Leaman (1983) found that
syntactic complexity was a poor predictor of holistic scores. In
contrast, Homburg (1984) found that, at least for non-native
speakers of English, syntactic complexity in the form of
subordination and relativization measures was a significant predictor
of holistic scores.
The research results were also inconclusive as to whether
paragraph coherence or sentence level coordination was significantly
correlated with holistic scores. McCulley (1985) suggested that
general coherence was predictive of a primary-trait measure of
persuasive writing quality; Homburg (1984) found that coordination
was predictive of holistic scores, at least for non-native speakers
of English.
The length of the writing sample (Homburg, 1984; Stewart & Grobe,
1979) and the absence of mechanical errors (Grobe, 1981; Homburg,
1984; Stewart & Grobe, 1979; Veal & Hudson, 1983) uniformly were
found to be predictive of holistic scores. The appearance of the
writing sample also was found to correlate with the holistic score,
with papers written in poor handwriting tending to receive lower
holistic scores (Chase, 1986; Hughes, Keeling, & Tuck, 1983; McColly,
1970).
Researchers determined primarily by means of regression analysis
and analysis of variance that the holistic score might be correlated
with mechanical errors, syntactic complexity, coherence and
coordination, length of the essay, and handwriting quality; however,
only two researchers have, at the time of this writing, attempted to
establish the factorial validity of a holistic writing construct by
factor analyzing the holistic score and atomistic writing subskill
scores. In neither reported factor-analytic study was the range of
atomistic writing subskills contained in the instant study
investigated. The results of the instant factor-analytic study may
be useful in expanding and clarifying our understanding of the
construct validity of holistic scores.
Definitions
The following terms are defined as used in this study.
Atomistic Writing Subskills include agreement errors, punctuation
errors, spelling errors, capitalization errors, nominals,
adjectivals, adverbials, paragraph coherence, inter-T-unit
coordination, words per T-unit, total number of words in the essay,
and handwriting quality.
College-Level Academic Skills Program (CLASP) is a faculty
cooperative established in order to "advise the [Florida] Department
of Education to ensure continuing faculty contributions to decisions
concerning skills to be expected of college students, the ways in
which the skills are tested on the CLAST, and the utilization of
CLAST test results." (Florida Department of Education [FDOE], 1984,
p. 21).
College-Level Academic Skills Test (CLAST) is "a test developed
by the [Florida] Department of Education pursuant to Section
229.551(3)(k) [of The Florida School Code (1987)] to measure student
achievement of the skills listed in Rule 6A-10.31 [of the Florida
Administrative Code (1987)]." (FDOE, 1984, p. 21).
College-level academic skills are "the communication and
computational skills adopted by the [Florida] State Board of
Education in Rule 6A-10.31 [of the Florida Administrative Code
(1987)]." (FDOE, 1984, p. 21).
Holistic scoring is the rating of papers on a scale of 1 through
4 based on an overall impression of each essay. CLAST essay subtest
scores are the sum of holistic scores awarded by two readers and,
therefore, are in the range 2 through 8.
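The arithmetic of the CLAST essay subtest score defined above can be sketched as follows. The function name and the validation step are illustrative assumptions; only the 1 through 4 rating scale and the summing of two readers' ratings come from the definition.

```python
def clast_essay_score(reader_1, reader_2):
    """Sum two holistic ratings, each on the 1 through 4 scale."""
    for rating in (reader_1, reader_2):
        if rating not in (1, 2, 3, 4):
            raise ValueError("each holistic rating must be 1, 2, 3, or 4")
    return reader_1 + reader_2


lowest = clast_essay_score(1, 1)   # both readers award the minimum
highest = clast_essay_score(4, 4)  # both readers award the maximum
```

The lowest and highest possible sums are 2 and 8, which is the range stated in the definition.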
Mechanics refers to the basic conventions of writing, including
agreement, punctuation, spelling, and capitalization.
Syntax includes the "ways in which words are put together to form
phrases, clauses, and sentences [and] involves breaking each
composition into its 'T-units' and examining the ways in which
writers embed information in T-units and join T-units together."
(National Assessment of Educational Progress [NAEP], 1980, p. 10).
A T-unit is "a main clause with all its attendant modifying
words, phrases, and dependent clauses." (NAEP, 1980, p. 10).
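As an illustration of the words per T-unit measure, the following sketch computes the mean number of words per T-unit for text that has already been divided into T-units by hand. The example sentences, the whitespace-based word count, and the choice to count a leading coordinating conjunction as part of the T-unit are all assumptions made for illustration, not scoring rules taken from this study.

```python
def words_per_t_unit(t_units):
    """Mean number of words per T-unit for text already split into T-units."""
    if not t_units:
        raise ValueError("at least one T-unit is required")
    total_words = sum(len(unit.split()) for unit in t_units)
    return total_words / len(t_units)


# Hypothetical passage, segmented by hand into two T-units.
example_t_units = [
    "The dog barked because it was hungry",  # main clause plus dependent clause
    "and the cat ran away",                  # second main clause
]
mean_length = words_per_t_unit(example_t_units)
```

Here the two T-units contain 7 and 5 words, so the mean is 6.0 words per T-unit.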
CHAPTER II
REVIEW OF RELATED LITERATURE
A significant body of research exists concerning the relationship
between holistic scores and various writing subskill scores,
including the relationship between the holistic score and grammatical
errors, syntactic complexity, paragraph coherence, coordination,
length of syntactic structures, length of the essay, and appearance
of the handwriting. Only two researchers, however, have reported the
results of a factor analysis of the holistic score and atomistic
writing subskill scores. The literature has, instead, primarily been
based either on regression analysis, as a means of differentiating
significant versus non-significant predictors of the holistic score,
or on ANOVA techniques to assess the differing responses of holistic
graders to experimental and control groups.
Factor-Analytic Research
Researchers have reported the results of factor analyzing a
variety of writing qualities (Diederich, French, & Carlton, 1961;
McColly, 1970; Quellmalz et al., 1982); at the time of this writing,
however, in only two reported factor-analytic studies have both the
holistic score and atomistic writing subskill scores been submitted
to factor analysis in order to investigate the construct validity of
holistic scoring. Freedman's (1981) study involved the holistic
evaluation by four readers of 64 essays written by students of four
colleges and the evaluation by two scorers of the 64 essays on an
analytic scale. The analytic scale included scores for voice,
development, organization, sentence structure, word choice, and
usage. Freedman used factor analysis to explore five variables: (a)
the holistic score (HSCORE); (b) the analytic scale score (ASCORE)
composed of the sum of scores on voice, development, organization,
sentence structure, word choice, and usage; (c) the content/style
score (CONTENT/STYLE) composed of the sum of the voice, development,
and organization less the sum of the sentence structure, word choice,
and usage scores; (d) the voice/content score (VOICE/CONTENT)
composed of two times the voice score less the sum of the development
and organization scores; and (e) the usage/style score (USAGE/STYLE)
composed of the sum of the sentence structure and word choice scores
less two times the usage score. Using a principal component factor
analysis with varimax rotation, Freedman determined that the five
contrast scores represented two discrete, independent qualities of
the papers. HSCORE and ASCORE loaded on one factor, and
CONTENT/STYLE and USAGE/STYLE loaded on a second factor. Freedman
concluded that the factor structure suggested that holistic and
analytic scales measured a single writing trait.
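The four composite scores that Freedman derived from the six analytic ratings, as described above, can be written out as follows. The function name, the dictionary keys, and the example ratings are illustrative assumptions; the formulas follow the verbal description in the preceding paragraph. The holistic score (HSCORE) is a separate rating and is not computed here.

```python
def freedman_contrasts(voice, development, organization,
                       sentence_structure, word_choice, usage):
    """Compute the analytic total and three contrast scores from six ratings."""
    return {
        # Sum of all six analytic ratings
        "ASCORE": (voice + development + organization
                   + sentence_structure + word_choice + usage),
        # Content ratings less style ratings
        "CONTENT/STYLE": ((voice + development + organization)
                          - (sentence_structure + word_choice + usage)),
        # Two times voice less development and organization
        "VOICE/CONTENT": 2 * voice - (development + organization),
        # Sentence structure and word choice less two times usage
        "USAGE/STYLE": (sentence_structure + word_choice) - 2 * usage,
    }


# Hypothetical ratings for a single essay.
scores = freedman_contrasts(voice=4, development=3, organization=3,
                            sentence_structure=2, word_choice=2, usage=1)
```

For these hypothetical ratings the analytic total is 15, and the three contrasts are 5, 2, and 2, respectively.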
In another factor-analytic study, Chapman et al. (1984)
determined that five measures of functional writing, one of which was
an overall measure, loaded on a single factor. The data were
collected within the context of the 1983 Illinois Inventory of
Educational Progress (IIEP) writing assessment of fourth-, eighth-,
and eleventh-grade students from 120 schools. Essay raters within
the context of the IIEP assessment evaluated essays on the following
five areas of functional writing: (a) focus, (b) support, (c)
organization, (d) mechanics, and (e) overall. All five measures of
functional writing were rated on a scale ranging from 1 through 6.
Upon factor analyzing the correlation matrix among the five areas of
functional writing, the researchers found factor loadings ranging
from 0.70 to 0.90 on a single factor. They concluded that the
results supported "the aggregation [of the five areas of functional
writing] into one writing ability score." (Chapman et al., 1984,
p. 25).
Non-Factor-Analytic Research
Non-factor-analytic research has been based primarily on either
regression analysis or ANOVA techniques. In the non-factor-analytic
studies, researchers have explored the relationship between holistic
scores and five writing subskill scores: (a) mechanical errors, (b)
syntactic complexity, (c) coherence and coordination, (d) length, and
(e) handwriting quality.
Mechanical Errors
Without exception, in both the correlational and experimental
research, a significant inverse relationship between the number of
mechanical and spelling errors and the holistic scores has been
found. Stewart and Grobe (1979) performed stepwise regression
analysis on eight independent variables relating to mechanics of
writing on different students at grades 5, 8, and 11. They
determined that at all three grade levels there was a positive and
significant correlation between holistic quality ratings and the
absence of spelling errors. Using 30 freshman writing essays in the
argument mode which were holistically scored by senior high school
teachers teaching in the areas of (a) business, (b) English and
social studies, and (c) mathematics and science, Stewart and Leaman
(1983) performed stepwise regression analysis with eight independent
variables, finding that both the number of spelling errors and the
number of punctuation errors were significant predictors of the
holistic quality ratings awarded by the teachers in each of the three
curricular areas.
Chou, Kirkland, and Smith (1982) investigated the holistic scores
given by raters to compositions written for the University System of
Georgia's Regents' Testing Program. Chou et al. examined (a) total
number of sentences, (b) average sentence length, (c) shortest
sentence, (d) longest sentence, (e) total words, (f) grammatical
errors, (g) punctuation errors, (h) T-units, (i) words per T-unit,
(j) misspelled words, (k) crossouts, (l) vagueness, and (m) words in
outline. In a sample of 60 essay compositions, they determined that
a significant negative correlation existed between the holistic score
and the number of grammatical errors, the number of punctuation
errors, and the number of misspelled words.
Freedman (1979) performed an experimental study in which she
determined that the mechanics score proved to be a significant
predictor of holistic judgments, although less predictive than scores
for content and organization of the essay. Freedman rewrote essays
of moderate quality to be either stronger or weaker in the four
categories of content, organization, sentence structure, and
mechanics. The mechanics rewriting involved misuse of commas,
quotation marks, possessives, capitalization, underlining, and
spelling. Using analysis of variance, Freedman determined that the
mechanics score was a significant predictor of the holistic score and
that the interaction between mechanics and organization was
significant, such that the mechanics score was statistically
significant only in writings with high organization scores.
The number of mechanical errors has, without exception, been
found to be significantly correlated with holistic
scores in both correlational (Bertrand, 1983; Chou et al., 1982;
Grobe, 1981; Homburg, 1984; Stewart & Grobe, 1979; Stewart & Leaman,
1983; Veal & Hudson, 1983) and experimental (Freedman, 1979) studies.
Syntactic Complexity
Although the weight of the research literature suggests that
syntactic complexity is not significantly correlated to the holistic
score (Freedman, 1979; Neilson & Piche, 1981; Stewart & Grobe, 1979;
Stewart & Leaman, 1983), there is, nevertheless, some research to
support the existence of a significant correlation between the
holistic score and syntactic complexity (Homburg, 1984; Nold &
Freedman, 1977).
Neilson and Piche (1981) manipulated a syntactic variable, headed
nominal complexity, and a semantic variable, lexical choice, in
stimulus passages to determine their effect on scorers' judgments of
writing quality. Eighty inservice and preservice high school English
teachers rated the passages, allowing each of the four versions of
the passage to be rated 20 times. Using the ANOVA technique to test
the differences among the ratings achieved on the four treatment
versions of the passage, Neilson and Piche determined that there was
no significant relationship between syntactic complexity (limited to
headed nominal complexity) and quality of writing.
Homburg (1984), however, determined that the number of dependent
clauses per composition was a significant predictor of holistic
scores. Homburg derived his data from 30 compositions randomly
selected from the composition portion of the Michigan Test of English
Language Proficiency administered during the period from August 1979
to March 1980. The writers were, therefore, non-native speakers of
English who were applying to universities in the United States.
Homburg performed a one-way ANOVA and a discriminant analysis in
order to determine that number of dependent clauses per composition
was a significant predictor of holistic scores.
Coherence and Coordination
The corpus of research literature concerning the relationship
between coherence and holistic scores seems to indicate that both
coherence and coordination are significantly related to holistic
scores. McCulley (1985) used regression analysis on a sample of 120
persuasive papers written by 17-year-olds during the 1978-1979
National Assessment of Educational Progress, correcting for number of
T-units in each writing passage. The results of the study indicated
that general coherence was a significant predictor of a primary-trait
measure of persuasive writing quality and that the lexical cohesive
features, use of synonym, hyponym, and collocation, were also
significant predictors of the primary-trait measure. Homburg (1984),
in his study of 30 writing samples by non-native English speakers,
applied discriminant analysis and found that the number of
coordinating conjunctions per composition was a significant predictor
of holistic scores.
Length
As with syntactic complexity, the research literature regarding
number of words per sentence, number of words per T-unit, number of
sentences in the entire essay, and number of words in the entire
essay is inconclusive as to whether such attributes of length are
predictive of holistic scoring. Nold and Freedman (1977) and Stewart
and Grobe (1979), both by means of stepwise regression analysis,
determined that number of words in an essay was significantly
predictive of holistic scores. In correlational studies, Thomas and
Donlan (1980) and Bertrand (1983) also determined that number of
words in an essay was predictive of the holistic writing score. Chou
et al. (1982) determined that there was a significant relationship
between holistic scores and the number of sentences in an essay.
A number of researchers (Grobe, 1981; Nold & Freedman, 1977;
Stewart & Grobe, 1979; Stewart & Leaman, 1983) indicated that
syntactic complexity, measured by the number of words per T-unit, was
not significantly correlated with holistic scores. Homburg (1984),
by means of discriminant analysis, determined, however, that number
of words in a sentence was significantly predictive of holistic
scores for non-native speakers of English.
Handwriting Quality
Researchers (Chase, 1986; Hughes et al., 1983; McColly, 1970)
have uniformly suggested that handwriting quality is a significant
predictor of the holistic score. Hughes et al. (1983) determined by
means of ANOVA that there was a significant difference between
holistic scores awarded to essays in experimental and control groups
defined in terms of handwriting quality and further found that
handwriting quality did not interact with score or achievement
expectations.
Chase (1986) used an ANOVA technique to analyze the data
collected by having each of 80 readers evaluate a single contrived
student essay. Each of the readers was also given a student record
card which identified the student as either a high or low achiever,
black or white, and male or female. Chase found that there were
complex interactions of expectations, handwriting, and sex within
race. The results indicated that quality of handwriting was
significantly related to the holistic score, but that the
relationship between handwriting quality and the holistic score
interacted with expectations of sex within race.
Comparison of Factor-Analytic and
Regression Methods of Construct Validation
In the instant study, factor analysis was selected as the method
for examining the construct validity of holistic scores. Unlike
regression analysis, factor analysis permits the researcher to
examine multiple writing traits measured by the holistic writing
score, number of grammatical errors, syntactic complexity, paragraph
cohesion, coordination, length of syntactic structures, length of the
essay, and physical appearance of the handwriting. Factor analysis,
therefore, enables the researcher to identify a reduced number of
underlying constructs which describe multiple variables involved in
the evaluation of writing.
Regression analysis, however, permits the distillation of a
number of independent variables into only a single dependent variable
or trait, such as the holistic score. Regression analysis would,
therefore, only identify which of the independent variables was
related to the holistic score and would not differentiate between
various sub-relationships among the independent variables and the
holistic score.
Although the regression analysis studies on holistic scoring
cited in the section entitled "Non-Factor-Analytic Research" in
Chapter II have functioned to identify a range of possible variables
which may be related to holistic scores, factor analysis is the more
appropriate method to investigate construct validation because factor
analysis permits the researcher to determine whether the holistic
score and the various writing subskills previously isolated by
regression analysis are empirically identified as measuring a common
factor. Factor analysis permits the researcher to meaningfully
explain the relationships between holistic scoring and the writing
subskill scores in terms of a few conceptually meaningful, relatively
independent factors.
CHAPTER III
PROCEDURES
The purpose of this study was to investigate the construct
validity of holistic scores using an exploratory factor analysis of
the holistically-scored CLAST essay subtest score (Holistic) and the
12 Atomistic Writing Subskill scores. This chapter contains
information on the research design, selection of subjects, selection
of atomistic writing subskills, and procedures for data-extraction
used in this study.
Selection of Subjects
The subjects in the sample included a total of 104 individuals
who participated in the March 1985 CLAST administration. The CLASP
staff selected 196 individuals from a population of 12,256
individuals who chose to write on the same essay topic and who were
administered the March 1985 CLAST. The CLASP staff provided a sample
of 196 examinees by physically retrieving a set of March 1985 CLAST
essays that had been returned together from the examination center
and were bound together in the CLASP storage area.
From this packet of 196 essays, a sample of approximately 100
individuals was randomly selected to produce a sample size
sufficiently large to produce reliable factor analysis results.
Gorsuch (1974) suggested that the sample for factor analysis should
provide an "absolute minimum ratio [of] five individuals to every
variable, but not less than 100 individuals for any analysis"
(p. 296). In this study the sample size of 104 individuals with 13
variables provided a ratio of 8 individuals to each variable, which
was greater than the minimum suggested by Gorsuch. In
addition, time constraints involved in extracting the writing
subskill scores created a practical limit to the total number of
individuals sampled. Each essay required in excess of 3 hours of
grading time.
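The sample-size check described above is simple arithmetic. The following is a minimal sketch in Python, offered only as an illustration; the function name and its parameters are hypothetical, and the default thresholds are those of the Gorsuch (1974) guideline quoted above.

```python
def check_sample_adequacy(n_individuals, n_variables, min_ratio=5.0, min_n=100):
    """Return (ratio, adequate) under the quoted Gorsuch (1974) guideline.

    Hypothetical helper: the name and signature are illustrative assumptions.
    """
    ratio = n_individuals / n_variables
    # Both conditions must hold: at least five individuals per variable
    # and at least 100 individuals overall.
    adequate = ratio >= min_ratio and n_individuals >= min_n
    return ratio, adequate


# 104 essays and 13 variables (Holistic plus the 12 Atomistic Writing Subskills).
ratio, adequate = check_sample_adequacy(104, 13)
print(ratio, adequate)
```

Under these assumptions the sample of 104 individuals satisfies both parts of the guideline, whereas a sample of 60 individuals with the same 13 variables would satisfy neither.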
For this sample of 104 individuals, the frequency distribution of
the CLAST essay subtest scores is set forth in Table 1. The
descriptive statistics, including the mean, standard deviation, and
range for each of the four CLAST subtests of the 104 individuals
sampled from the March 1985 CLAST administration, are presented in
Table 2. The descriptive statistics, including the mean, standard
deviation, and median for each of the four CLAST subtests of the
total population of 12,256 individuals who took the March 1985 CLAST,
are presented in Table 3.
Selection of Atomistic Writing Subskills
The 12 Atomistic Writing Subskills selected for inclusion in this
study fall in five broad categories: (a) mechanics, (b) syntax, (c)
coherence and coordination, (d) length, and (e) handwriting quality.
The mechanics, syntax, coherence, and length scores were included in
this study because of their inclusion in assessments by the staff of
the National Assessment of Educational Progress and because of their
Table 1. Distribution by essay subtest scores of 104 essays drawn
from frame of 196 essays selected by the CLASP staff from
the population of 12,256 essays from the March 1985 CLAST
administration.
Essay Subtest Score      n    Percentage
         2               3        2.8%
         3               1        1.0
         4              23       22.1
         5              27       26.0
         6              28       26.9
         7              14       13.5
         8               8        7.7
Total:                 104      100.0%
Table 2. Descriptive statistics for four CLAST subtests for 104
individuals sampled from the March 1985 CLAST
administration.
Subtest          n       M      SD   Minimum   Maximum
Computation    104   316.5    28.7     244.0     397.0
Reading        104   322.3    28.4     253.0     405.0
Writing        104   318.7    29.6     252.0     381.0
Essay          104     5.4     1.4       2.0       8.0
Table 3. Descriptive statistics for four CLAST subtests for the
population of 12,256 individuals from the March 1985 CLAST
administration.
Subtest            N       M      SD    Median
Computation   12,256   314.0    30.6     314.0
Reading       12,256   322.0    29.3     324.0
Writing       12,256   315.0    29.8     309.0
Essay         12,256     5.1     1.4       5.0
Source: FDOE, 1986a, p. 44.
inclusion in research studies of holistic scoring cited in the
sections entitled "Factor-Analytic Research" and "Non-Factor-Analytic
Research" in Chapter II. The writing assessments by the staff of the
National Assessment of Educational Progress, however, included six
syntax scores. These were a nominal clause score, a nominal phrase
score, an adjectival clause score, an adjectival phrase score, an
adverbial clause score, and an adverbial phrase score. In the
instant study, in order to maximize the ratio of individuals to
variables (Gorsuch, 1974) for the factor analysis, the nominal clause
and nominal phrase scores were combined into a single nominal score;
the adjectival clause and adjectival phrase scores were combined into
a single adjectival score; and the adverbial clause and adverbial
phrase scores were combined into a single adverbial score.
The coordination score was included in the instant study because
Homburg (1984) suggested that coordination was predictive of holistic
scores, at least for non-native speakers of English. The measure of
handwriting quality was also included because of the body of research
literature indicating that essays written in poor handwriting
received lower holistic scores (Chase, 1986; Hughes et al., 1983;
McColly, 1970).
As set forth in the section entitled "Limitations" in Chapter I,
no writing subskill score was included for discourse mode or topic,
although there was a body of research suggesting that discourse modes
and topic selection were correlated to the holistic score (Cooper,
1977; Lloyd-Jones, 1977; Moss, Cole, & Khampalikit, 1982; Odell &
Cooper, 1980; Quellmalz et al., 1982). This study involved a single
expository topic as administered by the CLASP staff at the March 1985
administration of the CLAST; it was, accordingly, not possible to
obtain different discourse mode and topic selection scores for
inclusion in this study.
Also as set forth in the section entitled "Limitations" in
Chapter I, neither the content nor organization of the writing was
investigated, although Freedman (1979) suggested that content and
organization were significantly correlated to holistic scoring and
although Benton and Kiewra (1986) suggested that organizational
ability was significantly correlated to holistic scoring. The main
reason that this study did not include measures of the quality of the
content or organization of the overall essay was the risk of
confounding with various other writing subskill
scores. In order to assess the overall content and organization of
the writings, either an experimental study would need to be done to
control for confounding, or each essay would need to be "corrected"
for all of the subskill writing scores which were related to holistic
scoring. In a non-experimental study such as this, extraction of
content and organization scores would involve retyping each essay in
its entirety to correct for possible handwriting confounding, making
all mechanical error corrections, and rewriting each essay to
standardize the occurrence of syntactic structures. Because of the
problems associated with neutralizing the other potentially
confounding writing subskills, the variables in this study did not
include measures of overall content and organization.
In summary, the variables examined in this study included four
mechanics scores, three scores of syntactic complexity, one paragraph
coherence score, and two length scores, all of which were
based on writing subskills scores developed by the staff of the
National Assessment of Educational Progress. Because of research
findings in this area, the variables also included two other writing
subskill scores not developed by the staff of the National Assessment
of Educational Progress, namely coordination and handwriting quality.
Holistic Scoring
Each essay from the sample of 104 individuals administered the
March 1985 CLAST was holistically scored by the CLASP staff in
accordance with CLASP holistic scoring procedures (FDOE, 1980). The
procedures employed by the CLASP staff to holistically score the
March 1985 CLAST essays are those propounded by the Florida
Department of Education (1986a).
The CLAST scoring scale reflects the four levels of performance
described below:
Score 1:
Writer includes very little, if any, specific
relevant supporting detail but, instead, uses
generalizations for support. Thesis statement
and organization are vague and/or weak.
Underdeveloped ineffective paragraphs do not
support the thesis. Sentences lack variety,
usually consisting of a series of subject-verb
and, occasionally, complement constructions.
Transitions and coherence devices are not
discernible. Syntactical, mechanical, and usage
errors occur frequently.
Score 2:
Writer employs an inadequate amount of specific
detail relating to the subject. Thesis statement
and organization are unambiguous. Paragraphs
generally follow the organizational plan, and
they are usually sufficiently unified and
developed. Sentence variety is minimal and
constructions lack sophistication. Some
transitions are used and parts are related to
each other in a fairly orderly manner. Some
errors occur in syntax, mechanics, and usage.
Score 3:
Writer presents a considerable variety of
relevant and specific detail in support of the
subject. The thesis statement expresses the
writer's purpose. Reasonably well-developed,
unified paragraphs document the thesis. A
variety of sentence patterns occurs, and sentence
construction indicates that the writer has
facility in the use of language. Effective
transitions are accompanied by sentences
constructed with orderly relationship between
word groups. Syntactical, mechanical, and usage
errors are minor.
Score 4:
Writer uses an abundance of specific, relevant
details, including concrete examples, that
clearly support generalizations. The thesis
statement effectively reflects the writer's
purpose. Body paragraphs carefully follow the
organizational plan stated in the introduction
and are fully developed and tightly controlled.
A wide variety of sentence constructions is
used. Appropriate transitional words and phrases
and effective coherence techniques make the prose
distinctive. Virtually no errors in syntax,
mechanics, and usage occur. (FDOE, 1986a, pp.
36-37).
Atomistic Writing Subskill Scoring
Each essay from the sample of 104 individuals administered the
March 1985 CLAST was evaluated to generate 12 scores in the following
five broad categories: (a) mechanics, (b) syntax, (c) coherence and
coordination, (d) length, and (e) handwriting. The broad category of
mechanics included scores for the number of agreement errors
(Agreement), punctuation errors (Punctuation), spelling errors
(Spelling), and capitalization errors (Capitalization). The broad
category of syntax included scores on the number of nominal clauses
and phrases (Nominal), the number of adjectival clauses and phrases
(Adjectival), and the number of adverbial clauses and phrases
(Adverbial). The broad category of coherence and coordination
included scores for paragraph coherence (Coherence) and inter-T-unit
coordination (Coordination). The broad category of length included
scores for the number of words per T-unit (Words/T-unit) and the
total number of words in the essay (Words). The broad category of
handwriting included the score for handwriting quality (Handwriting).
Each of the 104 CLAST essays in the sample was, therefore,
evaluated to produce scores for the following 12 Atomistic Writing
Subskills: (a) Agreement, (b) Punctuation, (c) Spelling, (d)
Capitalization, (e) Nominal, (f) Adjectival, (g) Adverbial, (h)
Coherence, (i) Coordination, (j) Words/T-unit, (k) Words, and (l)
Handwriting. A description of each of the 12 Atomistic Writing
Subskills follows. With the exception of Coordination and
Handwriting, the method of scoring the Atomistic Writing Subskills
was derived from the procedures set forth in the National Assessment
of Educational Progress scoring guidelines (NAEP, 1980).
Mechanics
The Mechanics scores were derived by determining the number of
mechanical errors in each CLAST essay. For each CLAST essay, one
scorer initially counted the number of mechanical errors and a second
scorer checked the first scorer's count to generate an accurate count
of the number of mechanical errors. Prior to scoring, each scorer
reviewed the rules set forth in the National Assessment of
Educational Progress scoring guide for syntax, cohesion, and
mechanics (Mullis & Mellon, 1980). One score was produced for each
of the four mechanics scores for each CLAST essay.
The guidelines for categorizing mechanics errors as set forth in
the National Assessment of Educational Progress scoring guide (NAEP,
1980) were used to derive the four mechanics scores. Agreement
errors were deemed in this study to be mistakes in subject/verb
agreement, mistakes in pronoun/antecedent agreement, misuse of a
subject/object pronoun, and errors in verb tense. Punctuation errors
included all errors of commission and errors of omission relating to
commas, dashes, quotation marks, semi-colons, apostrophes, and end
marks, using the most informal usage rules. Spelling errors included
misspellings, errors in word divisions at line endings, errors of
writing two words as one or one word as two, extraneous plurals, and
letter groupings not constituting a legitimate word. Capitalization
errors were deemed to be errors in the capitalization of the first
word in a sentence, the failure to capitalize a proper noun or
adjective within a sentence, and the failure to capitalize the
pronoun "I."
Syntax
The following descriptions of each syntactic structure counted in
this study were derived from the National Assessment of Educational
Progress guide for scoring syntax, cohesion, and mechanics (Mullis &
Mellon, 1980). For each essay, one scorer initially counted the
number of syntactic structures and a second scorer checked the first
scorer's count to generate an accurate count of each syntactic
structure. One score was recorded for each syntactic structure for
each essay. Examples of the syntactic structures extracted from the
essays in this study follow.
Nominal. The Nominal score consists of both nominal clauses and
nominal phrases. Nominal clauses occur in two forms: (a)
that-nominal clauses and (b) question-nominal clauses. Both
that-nominals and question-nominal clauses occur in a variety of
nominal positions, including subject of the sentence, object of the
verb, object of a preposition, subject complement, and appositive.
The following are examples of that-nominals and question-nominal
clauses from the CLAST essays in this study:
"We see by this point that fads are not only
non-verbal but also verbal." (that-nominal as
object of a verb)
"Every couple of years, there seems to be a
shift in what young people are doing for fun."
(question-nominal as object of a preposition)
"'To be in' it suddenly matters what brand of
jeans one is wearing or how one greets another."
(two question-nominals as subjects of the verb)
"The point is that the young are
unconformed." (that-nominal as subject
complement)
"The fact that teenagers watch t.v. a great
deal, shows a basis of where they learn
up-to-date fads." (that-nominal as subject of
the verb)
"In a teenagers mind, these fads will
determine if they are 'in' or 'out' of the
crowd." (question-nominal as object of the verb)
Nominal phrases are differentiated from nominal clauses by the
inclusion of uninflected verbs. Nominal phrases occur, however, in
the same nominal positions as do nominal clauses. The two types of
nominal phrases are the infinitive and the gerund. Examples of
nominal phrases from the CLAST essays in this study follow:
"Keeping up with current trends is assumed to
be the only road to popularity and thus offers
some security to the very insecure teenager."
(gerund as subject of verb)
"People consider fads to be a motivational
factor because they make them feel alive and
fresh." (infinitive as object of verb)
"They must learn to develop their minds and
personalities." (infinitive as object of the verb)
"Being able to relate to some thing on one's
own level is of the utmost importance." (gerund
as subject of the verb)
"This ability to be seen and read by all lends
itself perfectly to the spreading of fads across
the nation." (gerund as object of a preposition)
Adjectival. The Adjectival score comprises both adjectival
clauses and adjectival phrases. Adjectival clauses are sometimes
referred to as relative clauses. Three types of adjectival clauses
were scored: restrictive relatives, non-restrictive relatives, and
adverbials of time, place, or manner. The following are examples of
relative clauses from the essays in this study:
"He is constantly testing the boundaries of
acceptable behavior and discovering those actions
which he enjoys." (restrictive relative)
"When many people think about fads they
suddenly think of the way people dress at a
certain point in time." (adverbial clause of
time)
"They are not caught up with society and the
restraints it imposes." (restrictive relative)
Modifying phrases are relative clauses reduced by the deletion of
relative pronouns, subjects, and, in many cases, verbs.
"Teenagers growing up want to make decisions
of their own." (participial phrase)
"Fads are popular characteristics adopted by
members of a given society." (participial phrase)
"Most teenagers are given allowances or have a
part-time job and being provided with the
essentials by their parents, are able to afford
the luxury of fads." (non-restrictive participial
phrase)
"Then there's Roxy and Tom, with their dog
collars on." (prepositional phrase)
"For this reason they will try out many fads
until they manage to find a style unique to
themselves." (reduced relative clause)
"By being caught up, the teenager also becomes
popular and excepted." (adverbial phrase of time,
place or manner)
Adverbial. The Adverbial score consists of adverbial clauses and
adverbial phrases. Adverbial clauses include all adverbial clauses
other than clauses of time, place, and manner. Adverbial clauses
include, therefore, adverbials of cause, purpose, condition, or
concession.
"This difference is very obvious if a person
visits a local shopping mall." (adverbial clause
of condition)
"If a teen is not socially accepted by his
friends, he is cast aside like a dirty rag."
(adverbial clause of condition)
"Fads are especially attractive to teenagers
because teenagers are young." (adverbial clause
of cause)
"Young people buy these 'fads' so they can
express hourly to their society what they are
feeling." (adverbial clause of purpose)
"Though the fads may change, the reasons
never will." (adverbial clause of
concession)
"Since they are too young to obtain a
substantial role, they accept one--fads."
(adverbial clause of cause)
Adverbial phrases are reduced adverbial clauses and also include
adverbials of cause, purpose, condition, or concession.
"They may explore different ideas or values
to discover who they really are." (adverbial
phrase of purpose)
"In conclusion the fact shows that fads can
be harmless if taken as a fad." (adverbial
phrase of condition)
"Many people believe that teenagers follow
fads to make a statement." (adverbial phrase of
purpose)
"To make up for this, teenagers hold their
feelings inside them and don't always express
how they feel towards certain things."
(adverbial phrase of purpose)
Coherence and Coordination
Two separate measures were used to score the way that individuals
create a unity within the essay: Coherence and Coordination. A
description of the scoring procedures for Coherence and Coordination
follows.
Coherence. A Coherence score was produced for each of the 104
CLAST essays in the sample. On each of the 104 CLAST essays in the
sample, the first full paragraph (other than the introductory
paragraph of the essay) which contained three or more complete
sentences was selected for scoring. Each selected paragraph was
typed onto a blank sheet of paper with a randomly-selected
identifying number in order to eliminate any possible handwriting
quality effect. The 104 sheets of paper, each with a typed
paragraph, were randomly ordered and given to two scorers, each of
whom had participated in CLASP holistic scoring sessions on at least
six occasions prior to scoring the paragraph coherence samples in
this study. The two scorers first reviewed the scoring criteria for
each category, using sample CLAST essays for each scored category. A
number of practice essays were scored by both scorers, and
discrepancies were reviewed. Each scorer evaluated each CLAST essay
independently. The scorers periodically checked the consistency of
the scoring. The inter-rater reliability was 0.67. The average of
the two scores, one from each scorer, constituted the Coherence score
for each CLAST essay.
The following are descriptions of the four reference points from
the National Assessment of Educational Progress scoring guide for
syntax, cohesion, and mechanics (Mullis & Mellon, 1980):
1 = Little or no evidence of cohesion. Basically,
clauses and sentences are not connected beyond
pairings.
2 = Attempts at cohesion. There is evidence of
gathering details but little or no evidence that
these details are meaningfully ordered. In other
words, very little seems lost if the details were
rearranged.
3 = Cohesion. Details are both gathered and
ordered. Cohesion is achieved in the ways
illustrated briefly in the definition above
[that is, by lexical cohesion, conjunction,
reference, and substitution, and by syntactic
repetition]. Cohesion does not necessarily lead
to coherence, to the successful binding of parts
so that the sense of the whole discourse is
greater than the sense of the parts. In pieces
of writing that are cohesive rather than
coherent, there are large sections of details
which cohere but these sections stand apart as
sections.
4 = Coherence. While there may be a sense of
sections within the piece of writing, the sheer
number and variety of cohesion strategies bind
the details and sections into a wholeness. This
sense of wholeness can be achieved by a
saturation of syntactic repetition throughout the
piece . . . and/or by closure which
retrospectively orders the entire piece and/or by
general statements which organize the whole
piece. (p. 26).
Coordination. Coordination was considered to be the conjunction
of two T-units by a coordinating conjunction (i.e., "and," "or,"
"nor," "but," "for," or "yet") or by a conjunctive adverb (e.g.,
"however," "therefore"). Each essay was scored by one scorer and
checked by a second scorer to obtain an accurate count of the
Coordination score. One Coordination score was recorded for each
CLAST essay in this study.
Length
Each essay was scored by one scorer and checked by a second
scorer to obtain an accurate count of the number of T-units (T-unit)
and number of words (Words). A T-unit was defined as "one main
clause with all the subordinate clauses attached to it." (Hunt, 1965,
p. 20). Contracted words such as "don't" were scored as two words.
The average number of words per T-unit in an essay (Words/T-unit)
was determined by dividing the total number of words by the total
number of T-units in the essay. The score for Words was the total
number of words in the essay.
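The Length computation can be sketched in a few lines of Python. This is an illustrative sketch only and not the scoring procedure actually used, which was carried out by hand. The tokenizing pattern, the list of contraction suffixes, and both function names are assumptions; the source specifies only that contractions such as "don't" counted as two words and that Words/T-unit is the total number of words divided by the number of T-units.

```python
import re

# Assumed rule: a token ending in one of these suffixes is a contraction and
# counts as two words. The source gives only "don't" as an example.
CONTRACTION_SUFFIXES = ("n't", "'re", "'ve", "'ll", "'m")


def count_words(text):
    """Count words, scoring assumed contractions as two words each."""
    total = 0
    # Assumed tokenizer: runs of letters with at most one internal apostrophe.
    for token in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text):
        total += 2 if token.lower().endswith(CONTRACTION_SUFFIXES) else 1
    return total


def words_per_t_unit(total_words, t_units):
    """Average number of words per T-unit for one essay."""
    if t_units <= 0:
        raise ValueError("an essay must contain at least one T-unit")
    return total_words / t_units


sentence = "They don't always express how they feel."
print(count_words(sentence))
print(words_per_t_unit(count_words(sentence), 1))
```

Identifying T-unit boundaries requires syntactic judgment, so the number of T-units is passed in as an already determined count rather than computed from the text.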
The mean for Words in the sample of 104 CLAST essays was 369.4
(SD = 105.6) words, and the mean for T-units in the sample of 104
CLAST essays was 24.8 (SD = 7.9) T-units. The mean for Words/T-unit
in the sample of 104 CLAST essays was 15.2 (SD = 2.8) words.
Handwriting
A Handwriting score was produced for each of the 104 CLAST essays
in the sample. For each of the 104 CLAST essays in the sample, the
second through the fifth line of the first 7-line paragraph was
photocopied. Lines with no text on the line, including blank lines
and cross-outs of entire lines, were not counted as lines. Each such
photocopied 4-line passage from each of the 104 CLAST essays in the
sample was placed on a blank sheet of paper with a randomly-selected
identifying number. Two identical sets of the 104 sheets of paper
with the 4-line handwriting samples were produced, and each set was
randomly ordered. One set was given to each of two scorers, each of
whom had participated in CLASP holistic scoring sessions on at least
five occasions prior to scoring the handwriting samples in this study.
Each of the scorers rated each handwriting sample on a scale from
1 through 5, with 1 being "most legible" and 5 being "least
legible." The inter-rater reliability determined by the Pearson
Product-Moment correlation was 0.56. The two scores were averaged to
produce a single Handwriting score for each essay.
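The two computations named above, the Pearson product-moment correlation between the two scorers and the averaging of their ratings, can be illustrated as follows. This is a minimal sketch in Python; the function names are hypothetical, and the five pairs of ratings are invented for illustration and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def average_scores(first, second):
    """Average the two scorers' ratings essay by essay."""
    return [(a + b) / 2 for a, b in zip(first, second)]


# Invented ratings on the 1 (most legible) to 5 (least legible) scale.
scorer_one = [1, 2, 3, 4, 5]
scorer_two = [2, 2, 3, 5, 4]

print(round(pearson_r(scorer_one, scorer_two), 2))
print(average_scores(scorer_one, scorer_two))
```

The same two steps would apply to the Coherence ratings, for which the source likewise reports an inter-rater reliability and an average of two scores.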
Correction of Atomistic Writing Scores for Essay Length
The descriptive statistics for the 12 raw Atomistic Writing
Subskill scores are set forth in Table 4. To correct for the varying
lengths of the essays in the sample, each of the raw Mechanics,
Syntax, and Coordination scores was converted, following Mellon's
(1969) guidelines, into a ratio of mechanical errors, syntactic
constructions, or coordinations per 100 T-units. In that the scoring
procedures for Coherence and Handwriting controlled for length of the
essay, neither score was so converted. The descriptive statistics
for the 12 Atomistic Writing Subskill scores, with mechanical errors,
syntactic constructions, and coordinations converted into ratios per
100 T-units, are set forth in Table 5.
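The length correction described above is a simple rescaling. The following minimal Python sketch shows the arithmetic; the function name is hypothetical, and the example counts are invented rather than taken from any essay in the sample.

```python
def per_100_t_units(raw_count, t_units):
    """Convert a raw count into a rate per 100 T-units."""
    if t_units <= 0:
        raise ValueError("an essay must contain at least one T-unit")
    return 100.0 * raw_count / t_units


# Invented example: 3 spelling errors in an essay of 25 T-units.
print(per_100_t_units(3, 25))
```

Expressing each count as a rate keeps a long essay from appearing more error-prone, or more syntactically dense, merely because it contains more T-units.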
Table 4. Descriptive statistics for raw Atomistic Writing Subskill
scores from essays of 104 individuals sampled from the
March 1985 CLAST administration.
Variable            n       M      SD   Minimum   Maximum
Agreement         104     1.0     1.7       0.0      10.0
Punctuation       104     4.8     4.7       0.0      31.0
Spelling          104     2.9     3.5       0.0      21.0
Capitalization    104     0.6     1.3       0.0       8.0
Nominal           104    16.2     7.2       2.0      41.0
Adjectival        104    44.2    15.1      15.0      87.0
Adverbial         104     6.2     3.9       1.0      21.0
Coherence         104     2.8     0.7       1.0       4.0
Coordination      104     3.3     2.4       0.0      10.0
T-units           104    24.8     7.9      11.0      47.0
Words             104   369.4   105.6     173.0     654.0
Handwriting       104     3.3     1.1       1.0       5.0
Table 5. Descriptive statistics for Atomistic Writing Subskill
scores from essays of 104 individuals sampled from March
1985 CLAST administration.
Variable                n       M      SD   Minimum   Maximum
Agreement [b]         104     4.3     6.4       0.0      33.3
Punctuation [b]       104    19.5    16.4       0.0      96.0
Spelling [b]          104    12.0    14.3       0.0      75.0
Capitalization [b]    104     2.3     5.1       0.0      25.8
Nominal [b]           104    66.5    24.5      13.3     165.0
Adjectival [b]        104   184.3    61.7      81.0     425.0
Adverbial [b]         104    26.1    16.3       2.7      84.0
Coherence [a]         104     2.8     0.7       1.0       4.0
Coordination [b]      104    13.4     9.3       0.0      41.2
Words/T-unit [c]      104    15.2     2.8       9.5      24.6
Words [a]             104   369.4   105.6     173.0     654.0
Handwriting [a]       104     3.3     1.2       1.0       5.0
[a] Raw score.
[b] Converted into number per 100 T-units.
[c] Converted into number per one T-unit.
CHAPTER IV
ANALYSIS OF THE DATA
In order to investigate the construct validity of holistic
scores, the correlation matrix for the holistically-scored essay
subtest scores of the March 1985 CLAST and the 12 Atomistic Writing
Subskill scores was first generated. Second, the correlation matrix
for the holistically-scored essay subtest scores of the March 1985
CLAST and the 12 Atomistic Writing Subskill scores was submitted to
factor analysis in order to investigate the factorial validity of the
holistic writing score construct.
Normality Assumption
The descriptive statistics for the Atomistic Writing Subskill
scores set forth in Table 5 suggest extreme positive skewness in the
variables Agreement, Punctuation, Spelling, Capitalization,
Adjectival, and Adverbial. Accordingly, scatterplots were generated
and skewness was calculated for each of the suspect variables. The
scatterplots and skewness suggested that the scores for Agreement,
Punctuation, Spelling, Capitalization, Adjectival, and Adverbial
sufficiently violated the normality assumption to warrant corrective
measures. Although factor analysis is robust to violations of
normality (Gorsuch, 1974), the skewed variables were submitted to a
log transformation to make the skewed distributions approximately
normal. The set of 12 Atomistic Writing Subskill scores, with the
variables Agreement, Punctuation, Spelling, Capitalization,
Adjectival, and Adverbial normalized by the log transformation and
with the variables Nominal, Coherence, Coordination, Words/T-unit,
Words, and Handwriting not submitted to the log transformation,
provided the data for the correlational validity and the factorial
validity analyses.
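The effect of a log transformation on a positively skewed variable can be illustrated with a short Python sketch. The source does not state how scores of zero were handled, and the logarithm of zero is undefined, so adding 1 before taking the logarithm is an assumption made here purely for illustration. The function names and the example values are likewise hypothetical and are not data from this study.

```python
from math import log


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values):
    """Natural log of (value + 1); the +1 offset is an assumption, not the source's."""
    return [log(v + 1.0) for v in values]


# Invented error rates per 100 T-units with one extreme essay.
raw = [0.0, 0.0, 4.0, 4.0, 8.0, 12.0, 33.3]
transformed = log_transform(raw)

print(round(skewness(raw), 2))
print(round(skewness(transformed), 2))
```

For these invented values the transformed scores are far less skewed than the raw scores, which is the behavior the corrective measure relies on.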
Correlational Validity
As set forth in Table 6, there appear to be three general sets of
correlations between Holistic and the 12 Atomistic Writing Subskill
scores. First, Holistic was negatively correlated with the Mechanics
scores, and the Mechanics scores generally appeared to be positively
correlated with each other. Second, the Words/T-unit score was
positively correlated with the Nominal, Adjectival and Adverbial
scores. Third, Holistic was positively correlated with the Coherence
and Words scores.
The inverse correlations between the Holistic score and the
Mechanics scores and the positive correlations among the Mechanics
scores suggested that the quality of writing measured by holistic
scoring included the extent to which the writing was free from
mechanical errors and the extent to which the writer wrote legibly.
The positive correlations among the Words/T-unit, Nominal,
Adjectival, and Adverbial scores suggested that essays containing
longer thought-units had a greater density of nominal, adjectival, and
Table 6. Pearson Product-Moment correlation matrix for essay subtest
scores of March 1985 CLAST and 12 Atomistic Writing
Subskill scores.
Variables                 HOL    AGR    PUN    SPE    CAP    NOM
Holistic (HOL)           1.00
Agreement (AGR)         -0.30   1.00
Punctuation (PUN)       -0.20   0.19   1.00
Spelling (SPE)          -0.24   0.31   0.38   1.00
Capitalization (CAP)    -0.16   0.14   0.32   0.28   1.00
Nominal (NOM)           -0.12   0.00  -0.02   0.05   0.00   1.00
Adjectival (ADJ)         0.23  -0.10   0.07   0.01  -0.02   0.08
Adverbial (ADV)         -0.09   0.21   0.15   0.04   0.00   0.25
Coherence (COH)          0.51  -0.09  -0.06  -0.12   0.02  -0.02
Coordination (COO)      -0.11   0.07   0.14  -0.02  -0.07   0.08
Words/T-unit (W-T)       0.17  -0.02   0.14   0.01   0.02   0.28
Words (WDS)              0.47  -0.05   0.09   0.10   0.19   0.03
Handwriting (HAN)        0.17  -0.14  -0.10  -0.23  -0.16  -0.09
Table 6--Continued.
Variables                 ADJ    ADV    COH    COO    W-T    WDS    HAN
Holistic
Agreement
Punctuation
Spelling
Capitalization
Nominal
Adjectival               1.00
Adverbial                0.06   1.00
Coherence                0.07   0.01   1.00
Coordination            -0.01   0.07  -0.10   1.00
Words/T-unit             0.82   0.40   0.10   0.01   1.00
Words                    0.14   0.01   0.38  -0.01   0.16   1.00
Handwriting              0.04  -0.03   0.16   0.14   0.01  -0.05   1.00
adverbial constructions. The positive correlations among the
Holistic, Coherence, and Words scores suggested that the quality of
writing measured by holistic scoring was related to the writer's
ability to organize and unify a paragraph and to the length of the
essay. The failure of Holistic to correlate with any of the Syntax
scores, other than with Adjectival, or with Words/T-unit suggested
that the construct which the holistic writing score measured was not
strongly related to grammatical complexity or T-unit length.
Factorial Validity
Factor analysis was conducted on the correlation matrix set forth
in Table 6 in order to investigate further the construct validity of
the holistic scores. By submitting Holistic and the 12 Atomistic
Writing Subskill scores to a factor analysis, the 13 writing measures
were reduced to a smaller number of factors or writing constructs.
Extraction of Non-Trivial Factors
The criteria employed for determining the number of factors to be
retained for rotation were (a) the application of Cattell's scree
test (1966, p. 206), (b) careful examination of the size of loadings
on the principal-axis factor matrix, and (c) an examination of the
conceptual meaningfulness of the three-, four-, and five-factor
solutions. Collectively, the results of these efforts suggested that
three salient factors accounted for most of the common variance in
the data.
The scree test is a procedure whereby eigenvalues are plotted
from largest to smallest in order to determine the number of
non-trivial factors. In Table 7, the eigenvalues and corresponding
percentages of common variance for each of the 13 factors are set
forth. To determine the number of non-trivial factors, a straight
line is fitted through the smaller eigenvalues on the scree plot, and
the number of factors whose eigenvalues rise above that line is taken
as the number of non-trivial factors (Gorsuch, 1974). Cattell (1966) originally
suggested that the first factor on the straight line should also be
included in the number of non-trivial factors in order to ensure that
a sufficient number of factors were extracted; Cattell and Jaspers
(1967) subsequently suggested, however, that the number of
non-trivial factors should not include the first factor on the
straight line. The fact that a total of three factors lie above the
straight line in Figure 1 suggested that there were three non-trivial
factors in this study.
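The scree reasoning above can be illustrated with the eigenvalues in Table 7. The Python sketch below is only one way to mimic the visual procedure: it assumes the straight line is a least-squares fit through the ten smallest eigenvalues (factors 4 through 13), a choice made here to check the three-factor conclusion rather than something the study specifies, and it reports how far each of the first three eigenvalues sits above that line.

```python
# Eigenvalues from Table 7, ordered from largest to smallest.
EIGENVALUES = [2.39, 2.21, 1.61, 1.16, 1.09, 0.92, 0.81,
               0.70, 0.67, 0.55, 0.46, 0.33, 0.10]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def heights_above_scree(eigenvalues, n_leading):
    """Fit a line to the trailing eigenvalues and return how far each
    of the first n_leading eigenvalues lies above that line."""
    numbers = list(range(1, len(eigenvalues) + 1))
    slope, intercept = fit_line(numbers[n_leading:], eigenvalues[n_leading:])
    return [ev - (slope * k + intercept)
            for k, ev in zip(numbers[:n_leading], eigenvalues[:n_leading])]


if __name__ == "__main__":
    for k, gap in enumerate(heights_above_scree(EIGENVALUES, 3), start=1):
        print(f"Factor {k}: {gap:+.2f} relative to the fitted line")
```

This is a check on a stated conclusion, not a discovery procedure: the number of leading factors is supplied by the caller, and a different choice of trailing points would move the line.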
As set forth in the Appendix, the fact that the four- and
five-factor solutions produced less meaningful factor patterns was
further evidence for a three-factor solution. In the four-factor
solution, as set forth in Table 10 in the Appendix, an additional
fourth factor with significant positive loadings on only Coordination
and Handwriting was produced. The five-factor solution, as set forth
in Table 11 in the Appendix, resulted in factor two in the
three-factor solution being divided into two separate factors. The
additional factor in each of the four- and five-factor solutions was
Table 7. Eigenvalues and corresponding proportion of common variance.

                        Proportion of
Factor    Eigenvalue    Common Variance
   1         2.39            0.18
   2         2.21            0.17
   3         1.61            0.12
   4         1.16            0.09
   5         1.09            0.08
   6         0.92            0.07
   7         0.81            0.06
   8         0.70            0.05
   9         0.67            0.05
  10         0.55            0.04
  11         0.46            0.04
  12         0.33            0.03
  13         0.10            0.01
[Figure 1. Scree plot of the eigenvalues.]
not conceptually meaningful and was not supported by the research
literature, further suggesting that three factors should be retained
for rotation.
Three-Factor Solution
To determine empirically the factorial validity of the holistic
writing construct, Holistic and the 12 Atomistic Writing Subskill
scores were subjected to the principal-axis method of common factor
analysis by means of the SAS computer procedure Factor (Sarle,
1985). Three factors were then subjected to an oblique promax
rotation and an orthogonal varimax rotation. Examination of the
resulting intercorrelation matrix revealed that the factor
correlations were low (r's < .20), indicating that the results of an
orthogonal solution could be interpreted meaningfully. The final
communality estimate and the uniqueness for each of the 13 variables
submitted to factor analysis are set forth in Table 8.
The rotated factor pattern for the three-factor solution is
displayed in Table 9, with the specific items that had a loading
equal to or greater than the absolute value of 0.40 on any of the
three factors indicated. Gorsuch (1974) suggested that, in order to
obtain a minimum significant correlation coefficient of p < .05 on
factor analysis with a sample of 100 individuals, only variables with
loadings equal to or greater than the absolute value of 0.40 should
be assigned to a factor for interpretation.
Factor one had four writing variables with positive factor
loadings greater than 0.40: Agreement, Punctuation, Spelling, and
Capitalization. Factor one also had one writing variable,
Handwriting, with a negative factor loading less than -0.40.
Holistic had a factor loading of -0.37, which closely approached
significance at the p<.05 level; Holistic was, therefore, included
in this study in factor one. Factor two had four writing variables
with positive factor loadings greater than 0.40: Nominal,
Adjectival, Adverbial, and Words/T-unit. Factor three had three
variables with factor loadings greater than 0.40: Holistic,
Coherence, and Words. Only one variable, Coordination, did not load
significantly on any factor. Factor one was labeled
Holistic/Mechanics/Handwriting; factor two was labeled Syntax/T-unit;
and factor three was labeled Holistic/Coherence/Words.
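The assignment rule described above can be sketched in Python using the rotated loadings reported in Table 9. The function names are hypothetical, and the sketch applies only the 0.40 cutoff; it does not reproduce the additional judgment by which Holistic (loading -0.37) was also included in factor one.

```python
THRESHOLD = 0.40  # cutoff attributed to Gorsuch (1974) in the text

# Rotated loadings (Factor 1, Factor 2, Factor 3) copied from Table 9.
LOADINGS = {
    "Holistic":       (-0.37, 0.06, 0.78),
    "Agreement":      (0.52, 0.03, -0.25),
    "Punctuation":    (0.65, 0.17, 0.01),
    "Spelling":       (0.74, 0.01, -0.03),
    "Capitalization": (0.65, -0.04, 0.20),
    "Nominal":        (0.02, 0.49, -0.20),
    "Adjectival":     (-0.07, 0.76, 0.28),
    "Adverbial":      (0.14, 0.57, -0.19),
    "Coherence":      (-0.11, 0.01, 0.72),
    "Coordination":   (-0.02, 0.16, -0.24),
    "Words/T-unit":   (0.00, 0.93, 0.19),
    "Words":          (0.22, 0.08, 0.75),
    "Handwriting":    (-0.46, 0.03, 0.08),
}


def salient_variables(loadings, threshold=THRESHOLD):
    """Map each factor number to the variables whose |loading| meets the cutoff."""
    n_factors = len(next(iter(loadings.values())))
    result = {f: [] for f in range(1, n_factors + 1)}
    for name, row in loadings.items():
        for f, value in enumerate(row, start=1):
            if abs(value) >= threshold:
                result[f].append(name)
    return result


def unassigned(loadings, threshold=THRESHOLD):
    """Variables that reach the cutoff on no factor."""
    return [name for name, row in loadings.items()
            if all(abs(v) < threshold for v in row)]


if __name__ == "__main__":
    for factor, names in salient_variables(LOADINGS).items():
        print(f"Factor {factor}: {', '.join(names)}")
    print("No salient loading:", ", ".join(unassigned(LOADINGS)))
```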
As set forth in Table 7, the total variance associated with the
three factors was 47.8%. The factor identified as
Holistic/Mechanics/Handwriting accounted for 18.4%; the factor
identified as Syntax/T-unit accounted for 17.0%; and the factor
identified as Holistic/Coherence/Words accounted for 12.4%.
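The percentages above can be reproduced from Table 7. The Python sketch below assumes each tabled proportion is the eigenvalue divided by the sum of all 13 eigenvalues; that sum is 13.00, the number of variables analyzed. The function name is hypothetical.

```python
# Eigenvalues from Table 7, ordered from largest to smallest.
EIGENVALUES = [2.39, 2.21, 1.61, 1.16, 1.09, 0.92, 0.81,
               0.70, 0.67, 0.55, 0.46, 0.33, 0.10]


def variance_proportions(eigenvalues):
    """Share of the summed eigenvalues attributable to each factor."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


if __name__ == "__main__":
    proportions = variance_proportions(EIGENVALUES)
    for number, share in enumerate(proportions[:3], start=1):
        print(f"Factor {number}: {share * 100:.1f}%")
    print(f"First three factors together: {sum(proportions[:3]) * 100:.1f}%")
```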
Table 8. Final communality estimates (h²) and uniqueness (u²)
for essay subtest scores and Atomistic Writing Subskill
scores of 104 individuals sampled from the March 1985 CLAST
administration.

Variable             h²      u²
Holistic            0.75    0.25
Agreement           0.34    0.66
Punctuation         0.46    0.54
Spelling            0.55    0.45
Capitalization      0.47    0.53
Nominal             0.28    0.72
Adjectival          0.56    0.44
Adverbial           0.38    0.62
Coherence           0.54    0.46
Coordination        0.08    0.92
Words/T-unit        0.90    0.10
Words               0.62    0.38
Handwriting         0.21    0.79
Table 9. Rotated factor pattern for three-factor solution.

Variable            Factor 1    Factor 2    Factor 3
Holistic            -0.37**       0.06        0.78*
Agreement            0.52*        0.03       -0.25
Punctuation          0.65*        0.17        0.01
Spelling             0.74*        0.01       -0.03
Capitalization       0.65*       -0.04        0.20
Nominal              0.02         0.49*      -0.20
Adjectival          -0.07         0.76*       0.28
Adverbial            0.14         0.57*      -0.19
Coherence           -0.11         0.01        0.72*
Coordination        -0.02         0.16       -0.24
Words/T-unit         0.00         0.93*       0.19
Words                0.22         0.08        0.75*
Handwriting         -0.46*        0.03        0.08

* p < .05.
** Closely approaches significance at p < .05 level.
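For an orthogonal solution, a variable's communality equals the sum of its squared loadings across the retained factors, and its uniqueness is one minus that sum. The Python sketch below applies this to a few rows of Table 9 and compares the results with Table 8. It assumes Table 9 reports the orthogonal (varimax) solution; because the tabled loadings are rounded to two decimals, only approximate agreement should be expected.

```python
# Selected rotated loadings (Factor 1, Factor 2, Factor 3) from Table 9.
LOADINGS = {
    "Holistic":     (-0.37, 0.06, 0.78),
    "Spelling":     (0.74, 0.01, -0.03),
    "Coordination": (-0.02, 0.16, -0.24),
    "Words/T-unit": (0.00, 0.93, 0.19),
    "Words":        (0.22, 0.08, 0.75),
}

# Final communality estimates for the same variables, from Table 8.
TABLE_8_H2 = {
    "Holistic": 0.75, "Spelling": 0.55, "Coordination": 0.08,
    "Words/T-unit": 0.90, "Words": 0.62,
}


def communality(row):
    """Sum of squared loadings across factors (orthogonal solution)."""
    return sum(value * value for value in row)


def uniqueness(row):
    """Variance not shared with the common factors."""
    return 1.0 - communality(row)


if __name__ == "__main__":
    for name, row in LOADINGS.items():
        print(f"{name:<13} h2 from loadings = {communality(row):.2f}  "
              f"tabled h2 = {TABLE_8_H2[name]:.2f}  u2 = {uniqueness(row):.2f}")
```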
CHAPTER V
DISCUSSION
The results clearly did not support the proposition that holistic
scoring measured a unitary writing trait. Instead, the results
suggested that the holistic score loaded on two uncorrelated
factors: (a) a factor on which the holistic score, paragraph
coherence, and number of words in the essay loaded significantly and
(b) a factor on which writing mechanics and handwriting quality
loaded significantly and on which the holistic score loaded at a
level approaching significance. Furthermore, the results suggested
that the holistic score was unrelated to the factor on which number
of syntactic structures and number of words per T-unit loaded
significantly. Finally, the results suggested that coordination was
unrelated to any of the three writing constructs.
Research Implications
The significant factor loadings of holistic scoring, paragraph
coherence, and total number of words in the essay on the factor
identified as Holistic/Coherence/Words are supported by the research
literature. McCulley (1985) found that general coherence was a
significant predictor of a primary-trait measure of persuasive
writing quality. Bertrand (1983), Nold and Freedman (1977), Stewart
and Grobe (1979), and Thomas and Donlan (1980) determined that the
number of words in an essay was predictive of holistic scores. Chou
et al. (1982) also determined that there was a significant
correlation between number of sentences in an essay and holistic
scores.
The finding that mechanical errors and handwriting quality loaded
significantly and that holistic scores loaded at a level approaching
significance on the factor identified as
Holistic/Mechanics/Handwriting is also consistently supported by the
research literature. The number of mechanical errors was found to be
significantly correlated to holistic scores (Bertrand, 1983;
Freedman, 1979; Grobe, 1981; Homburg, 1984; Stewart & Grobe, 1979;
Stewart & Leaman, 1983; Veal & Hudson, 1983). Likewise, handwriting
quality was determined to be significantly correlated to holistic
scores (Chase, 1986; Hughes et al., 1983; McColly, 1970).
Furthermore, the finding that syntactic units and words per
T-unit, but not holistic scoring, loaded highly on the factor
identified as Syntax/T-unit also comports with the weight of the
research literature. Freedman (1979), Neilson and Piche (1981),
Stewart and Grobe (1979), and Stewart and Leaman (1983) concluded
that syntactic complexity was not significantly correlated to the
holistic score. The results of the instant study did not, however,
substantiate the findings of Homburg (1984) and Nold and Freedman
(1977) that there was a significant correlation between syntactic
complexity and the holistic score. In that the subjects for
Homburg's study were non-native speakers of English and in that the
graders were teachers of non-native speakers of English, Homburg's
study suggested a different agenda for holistic scoring on the part
of teachers of non-native speakers of English. As to the inclusion
of number of words per T-unit in the factor identified as
Syntax/T-unit, the number of words per T-unit was found not to be
significantly correlated to holistic scores (Grobe, 1981; Nold &
Freedman, 1977; Stewart & Grobe, 1979; Stewart & Leaman, 1983).
The finding that Coordination did not load on any of the three
factors in this study runs contrary to the conclusion of Homburg
(1984) in his study of 30 writing samples by non-English speakers.
Homburg found that the number of coordinating conjunctions per
composition was a significant predictor of holistic scores for
non-native English speakers. Homburg's findings again suggested a
different agenda for holistic scoring by teachers of non-native
speakers of English.
The results of the instant study were, therefore, consistent with
reported research literature, at least to the extent that such
research literature was based on writings of native speakers of
English. Because of the reliance on regression analysis, there is an
implication in prior research literature that a unitary writing
construct is measured by the holistic score, which writing construct
is composed of the subskills that correlate with the holistic
scoring. Furthermore, although they investigated a different set of
atomistic writing subskill scores from those included in the instant
study, both Freedman (1981) and Chapman et al. (1984) suggested that
the holistic score loaded on a single factor. The results of the
instant study, however, clearly suggested that holistic scoring
measures not one but two distinct writing traits,
namely the factor identified as Holistic/Coherence/Words and the
factor identified as Holistic/Mechanics/Handwriting.
Implications for Further Research
This factorial validity study suggests that holistic scoring
loads highly on two distinct writing constructs. Factor analysis is,
however, but one method available to investigate construct validity
(Allen & Yen, 1979; Crocker & Algina, 1986). The factor analysis
should be replicated to confirm the factorial validity of the
holistic score. In addition, the relationship of the holistic score
to the three writing traits should be investigated by using other
methods of construct validation, including multitrait-multimethod
matrix analysis, experimental studies, and comparisons of scores of
defined groups.
Furthermore, different discourse modes and topic selection should
be investigated by replicating the factorial validity and by applying
other methods of establishing construct validity. It may be that the
factors in this study identified as Holistic/Coherence/Words and
Holistic/Mechanics/Handwriting are not stable over different
discourse modes and topics (Cooper, 1977; Lloyd-Jones, 1977; Moss et
al., 1982; Odell & Cooper, 1980; Quellmalz et al., 1982).
Further studies of construct validity should also be done to
determine whether the results of this study on college freshmen and
sophomores are stable across different age groups. Additionally, in
that Benton and Kiewra (1986) and Freedman (1979) suggested that
overall content and overall organizational ability were significantly
correlated to holistic scoring, researchers should attempt to
incorporate overall content and overall organization into future
construct validity studies. Finally, in that Holistic appeared to be
more strongly related to the factor identified as
Holistic/Coherence/Words (Holistic factor loading = 0.78) than to the
factor identified as Holistic/Mechanics/Handwriting (Holistic factor
loading = -0.37), future researchers should determine whether the
different levels of factor loadings reported in the instant study are
the result of greater variation in the former than in the latter
variables or whether the CLASP scorers in fact paid more attention to
the former than to the latter variables.
APPENDIX
ROTATED FACTOR PATTERNS FOR FOUR- AND
FIVE-FACTOR SOLUTIONS
Table 10. Rotated factor pattern (varimax rotation) for four-factor
solution.

Variable            Factor 1    Factor 2    Factor 3    Factor 4
Holistic            -0.34         0.80*       0.05       -0.05
Agreement            0.55*       -0.22        0.00        0.19
Punctuation          0.70*        0.04        0.13        0.16
Spelling             0.72*       -0.07        0.02       -0.16
Capitalization       0.63*        0.15       -0.04       -0.22
Nominal              0.02        -0.18        0.49*       0.06
Adjectival          -0.08         0.25        0.78*      -0.18
Adverbial            0.19        -0.12        0.53*       0.31
Coherence           -0.06         0.76*      -0.03        0.09
Coordination         0.11        -0.08        0.05        0.77*
Words/T-unit         0.01         0.19        0.94*      -0.04
Words                0.26         0.76*       0.06       -0.04
Handwriting         -0.35         0.21       -0.05        0.58*

* p < .05.
Table 11. Rotated factor pattern (varimax rotation) for five-factor
solution.

Variable            Factor 1    Factor 2    Factor 3    Factor 4    Factor 5
Holistic            -0.31         0.77*       0.20       -0.17        0.00
Agreement            0.48*       -0.13       -0.24        0.36        0.07
Punctuation          0.74*       -0.04        0.15        0.00        0.24
Spelling             0.72*       -0.08       -0.02        0.05       -0.15
Capitalization       0.66*        0.11        0.03       -0.11       -0.18
Nominal             -0.09        -0.04        0.11        0.72*      -0.13
Adjectival           0.03         0.08        0.95*      -0.01        0.00
Adverbial            0.08         0.06        0.13        0.77*       0.13
Coherence           -0.08         0.81*      -0.02        0.03        0.05
Coordination         0.11        -0.10       -0.04        0.14        0.77*
Words/T-unit         0.05         0.11        0.89*       0.36        0.02
Words                0.25         0.78*       0.08        0.04       -0.06
Handwriting         -0.31         0.13        0.07       -0.18        0.66*

* p < .05.
REFERENCES
Allen, M.J., & Yen, W.M. (1979). Introduction to measurement
theory. Monterey, CA: Brooks/Cole Publishing Company.
Benton, S.L., & Kiewra, K.A. (1986). Measuring the
organizational aspects of writing ability. Journal of
Educational Measurement, 23, 377-386.
Bertrand, C.V. (1983, November). Factors in holistic ratings of
children's writing. Paper presented at the Annual Meeting of the
National Reading Conference, Austin, TX (ERIC Document
Reproduction Service No. ED 240 605).
Cattell, R.B. (1966). The scree test for the number of factors.
Multivariate Behavioral Research, 1, 245-276.
Cattell, R.B., & Jaspers, J.A. (1967). A general plasmode
(No. 30-10-5-2) for factor analytic exercises and research.
Multivariate Behavioral Research Monographs, 67 (Serial No. 3).
Chapman, C.W., Fyans, L.J., Jr., & Kerins, C.T. (1984). Writing
assessment in Illinois. Educational Measurement: Issues and
Practice, 3, 24-26.
Charney, D. (1984). The validity of using holistic scoring to
evaluate writing: A critical overview. Research in the Teaching
of English, 18, 65-81.
Chase, C.I. (1986). Essay test scoring: Interaction of relevant
variables. Journal of Educational Measurement, 23, 33-41.
Chou, F.H., Kirkland, J.S., & Smith, L.R. (1982). Variables in
college composition. Augusta, GA: Augusta College (ERIC Document
Reproduction Service No. ED 224 017).
Cooper, C. (1977). Holistic evaluation of writing. In C. Cooper &
L. Odell (Eds.), Evaluating writing (pp. 3-31). Urbana, IL: National
Council of Teachers of English.
Crocker, L. (1987). Assessment of writing skills through essay
tests. In D. Bray & M.J. Belcher (Eds.), Issues in student
assessment (pp. 56-64). (New Directions for Community Colleges, No.
59). San Francisco, CA: Jossey-Bass.
Crocker, L., & Algina, J. (1986). Introduction to classical and
modern test theory. New York, NY: Holt, Rinehart and Winston.
Diederich, P.B., French, J.W., & Carlton, S.T. (1961). Factors in
judgments of writing ability (Research Bulletin RB-61-15).
Princeton, NJ: Educational Testing Service.
Florida Department of Education. (1980). Procedures for conducting
holistic scoring for the essay portion of the College-Level
Academic Skills Test. Tallahassee, FL: College-Level Academic
Skills Project.
Florida Department of Education. (1984). CLAST test administration
plan, 1984-85. Tallahassee, FL: College-Level Academic Skills
Project.
Florida Department of Education. (1986a). CLAST technical report,
1984-85. Tallahassee, FL: College-Level Academic Skills Project.
Florida Department of Education. (1986b). Student achievement of
college-level communication and computation skills in Florida:
1985-86. Tallahassee, FL: College-Level Academic Skills Project.
Florida Department of Education, 3 Fla. Admin. Code 6A-10.31 (1987).
The Florida School Code, Fla. Stat. 229.551(3)(k) (1987).
Freedman, S.W. (1979). How characteristics of student essays
influence teachers' evaluations. Journal of Educational
Psychology, 71, 328-338.
Freedman, S.W. (1981). Influences on evaluators of expository
essays: Beyond the text. Research in the Teaching of English,
15, 245-255.
Gorsuch, R.L. (1974). Factor analysis. Philadelphia: W.B. Saunders.
Grobe, C. (1981). Syntactic maturity, mechanics, and vocabulary
as predictors of quality ratings. Research in the Teaching of
English, 15, 75-85.
Homburg, T.J. (1984). Holistic evaluation of ESL compositions:
Can it be validated objectively? TESOL Quarterly, 18, 87-107.
Hughes, D.C., Keeling, B., & Tuck, B.F. (1983). Effects of
achievement expectations and handwriting quality on scoring
essays. Journal of Educational Measurement, 20, 66-70.
Hunt, K. (1965). Grammatical structures written at three grade
levels (Research Report No. 3). Champaign, IL: National Council
of Teachers of English.
Lloyd-Jones, R. (1977). Primary trait scoring. In C. Cooper & L.
Odell (Eds.), Evaluating writing (pp. 33-66). Urbana, IL: National
Council of Teachers of English.
McColly, W. (1970). What does educational research say about the
judging of writing ability? Journal of Educational Research, 64,
147-156.
McCulley, G.A. (1985). Writing quality, coherence, and
cohesion. Research in the Teaching of English, 19, 269-282.
Mellon, J.C. (1969). Transformational sentence-combining: A method
for enhancing the development of syntactic fluency in English
composition (Research Report No. 10). Urbana, IL: National
Council of Teachers of English.
Moss, P.A., Cole, N.S., & Khampalikit, C. (1982). A comparison of
procedures to assess written language skills at grades 4, 7 and
10. Journal of Educational Measurement, 19, 37-47.
Mullis, I.V.S., & Mellon, J.C. (1980). Guidelines for describing
three aspects of writing: Syntax, cohesion, and mechanics.
Denver, CO: National Assessment of Educational Progress.
National Assessment of Educational Progress (1980). Writing
achievement, 1969-79: Results from the third national writing
assessment. Denver, CO: National Assessment of Educational
Progress.
Neilson, L., & Piche, G.L. (1981). The influence of headed nominal
complexity and lexical choice on teachers' evaluation of
writing. Research in the Teaching of English, 15, 65-74.
Nold, E.W., & Freedman, S.W. (1977). An analysis of readers'
responses to essays. Research in the Teaching of English, 11,
164-174.
Odell, L., & Cooper, C. (1980). Procedures for evaluating writing:
Assumptions and needed research. College English, 42, 35-43.
Quellmalz, E.S., Capell, F.J., & Chou, C. (1982). Effects of
discourse and response mode on the measurement of writing
competence. Journal of Educational Measurement, 19, 241-258.
Sarle, W.S. (1985). Factor. In SAS Institute Inc. SAS user's
guide: Statistics, version 5 edition. Cary, NC: SAS Institute
Inc.
Stewart, M.F., & Grobe, C.H. (1979). Syntactic maturity, mechanics
of writing, and teachers' quality ratings. Research in the
Teaching of English, 13, 207-215.
Stewart, M.F., & Leaman, H.L. (1983). Teachers' writing assessments
across the high school curriculum. Research in the Teaching of
English, 17, 113-125.
Thomas, D., & Donlan, D. (1980, March). Correlations between
holistic and qualitative methods of evaluating student writing,
grades 4-12. Paper presented at the combined Annual Meeting of
the Conference on English Education and the Secondary School
Conference, Omaha, NE (ERIC Document Reproduction Service No. ED
211 976).
Veal, L.R., & Hudson, S.A. (1983). Direct and indirect measures for
large-scale evaluation of writing. Research in the Teaching of
English, 17, 290-296.
BIOGRAPHICAL SKETCH
Gregory K. West was born in Morgantown, West Virginia, on May 26,
1949. He completed a B.A. in English literature at Ohio State
University in 1971, an M.A. in classical Greek at Ohio State
University in 1975, an M.A. in linguistics at Ohio University in
1976, and a J.D. at the University of Florida in 1983. Prior to
receiving the J.D., he was an instructor in the English department of
Ohio State University from 1976 through 1979, teaching composition,
grammar, and technical writing. After completing the J.D., he was
appointed judicial law clerk to the Honorable Howell W. Melton,
United States District Judge, Middle District of Florida, serving
from 1983 through 1985. Since 1985 he has practiced law in the area
of tax-exempt governmental finance and is currently associated with
the firm of Mahoney Adams Milam Surface & Grimsley, P.A.,
Jacksonville, Florida. He has published articles on the technical
writing of both native and non-native speakers of English in TESOL
Quarterly, The Journal of Technical English, and elsewhere.
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.
James Algina, Chairperson
Professor of Foundations of
Education
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.
Linda Crocker
Professor of Foundations of
Education
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the
degree of Doctor of Philosophy.
Michael Y. Nunnery
Professor of Educational Leadership
This dissertation was submitted to the Graduate Faculty of the
College of Education and to the Graduate School, and was accepted as
partial fulfillment of the requirements for the degree of Doctor of
Philosophy.
December 1988
Chairperson, Foundations of
Education
Dean, College of Education
Dean, Graduate School
THE CONSTRUCT VALIDITY OF THE HOLISTIC WRITING SCORE: AN ANALYSIS OF THE ESSAY SUBTEST OF THE COLLEGE-LEVEL ACADEMIC SKILLS TEST BY GREGORY K. WEST A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1988
COPYRIGHT 1988 by GREGORY K. WEST
ACKNOWLEDGMENTS
I wish to thank Dr. James Algina for his guidance and cooperation as chairperson of my committee. I also would like to express my appreciation to the other members of my committee, Dr. Linda Crocker and Dr. Michael Y. Nunnery, for their direction and editorial comments. My sincere appreciation and gratitude are expressed to my wife, Dr. Susan S. Hill, for her assistance in the data extraction and her loving encouragement, and to Chad and Michael for their special encouragement and love. I would also like to express my appreciation and gratitude to my parents, Earl and Marjorie West, not only for their love and encouragement but also for the hand-sewn quilt (queen-size) which parents traditionally bestow upon an offspring after completion of a doctoral dissertation.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
CHAPTERS
I INTRODUCTION
    Purpose
    Limitations
    Justification
    Definitions
II REVIEW OF RELATED LITERATURE
    Factor-Analytic Research
    Non-Factor-Analytic Research
    Comparison of Factor-Analytic and Regression Methods of Construct Validation
III PROCEDURES
    Selection of Subjects
    Selection of Atomistic Writing Subskills
    Holistic Scoring
    Atomistic Writing Subskill Scoring
IV ANALYSIS OF THE DATA
    Normality Assumption
    Correlational Validity
    Factorial Validity
V DISCUSSION
    Research Implications
    Implications for Further Research
APPENDIX: ROTATED FACTOR PATTERNS FOR FOUR- AND FIVE-FACTOR SOLUTIONS
REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
THE CONSTRUCT VALIDITY OF THE HOLISTIC WRITING SCORE: AN ANALYSIS OF THE ESSAY SUBTEST OF THE COLLEGE-LEVEL ACADEMIC SKILLS TEST
By Gregory K. West
December 1988
Chairperson: James Algina
Major Department: Foundations of Education
The purpose of this study was to investigate the construct validity of holistic scores on large-scale writing assessments using as the vehicle for the study the essay subtest of the College-Level Academic Skills Test (CLAST). The construct validity issue focused on the extent to which holistic writing scores and atomistic skill scores measured the same underlying writing trait(s). In this study, 104 CLAST essays were drawn by random sampling from a frame of 196 essays provided by the College-Level Academic Skills Program (CLASP) staff from a population of 12,256 CLAST essays administered in March 1985. Each of the 104 CLAST essays in this sample was holistically scored by CLASP staff in accordance with the requirements of Florida law. For each of the 104 CLAST essays, 12
atomistic writing subskill scores were derived: agreement errors, punctuation errors, spelling errors, capitalization errors, nominals, adjectivals, adverbials, paragraph coherence, coordination, words per T-unit, total number of words, and handwriting quality (Atomistic Writing Subskills). The holistic scores and 12 Atomistic Writing Subskill scores were subjected to the principal-axis method of common factor analysis. A three-factor solution was computed as a result of the application of the scree test and an examination of the conceptual meaningfulness of the competing four- and five-factor solutions. The holistic score, paragraph coherence, and total number of words loaded positively on the first factor. The holistic score and handwriting quality loaded negatively on the second factor; agreement errors, punctuation errors, spelling errors, and capitalization errors loaded positively on the second factor. Nominals, adjectivals, adverbials, and words per T-unit loaded positively on the third factor. The factor structure suggests that the writing construct measured by holistic scoring encompasses two distinct constructs: one related to paragraph coherence and to the total number of words, and one related to the absence of mechanical errors and to handwriting quality. The factor structure further suggests that there is a separate writing construct unrelated to holistic scoring which is composed of syntactic constructions and words per T-unit.
CHAPTER I INTRODUCTION Purpose The construct validity of holistic scoring on large-scale writing assessments was investigated using the essay subtest of the College-Level Academic Skills Test (CLAST) as the vehicle for the study. The construct validity issue focused on the extent to which holistic scores and atomistic skill scores measured the same underlying writing trait(s). In this study, factor analysis was the primary method used to investigate the construct validity of holistic scores. Specifically, the construct validity of holistic scoring was investigated by factor analyzing the correlations among (a) the holistically scored CLAST essay subtest scores, (b) grammatical and spelling errors, (c) syntactic constructions, (d) coherence and coordination, (e) length, and (f) handwriting quality. Limitations This study was limited to a sample of 104 individuals who were administered the March 1985 CLAST. The sample of 104 examinees was drawn from a frame of 196 examinees which the College Level Academic Skills Program (CLASP) staff had selected from the population of 12,256 examinees administered the March 1985 CLAST. No institutional or demographic information was released by the CLASP staff for the
frame of 196 examinees. In addition, the CLASP staff provided the frame of 196 examinees by physically retrieving a set of March 1985 CLAST essays that had been physically returned together from the examination center and that had been bound together in the CLASP storage area. Accordingly, the generalizability of this study to the general population of all examinees who were administered the March 1985 CLAST is uncertain, in that the CLASP staff may not have drawn a random sample from the population and in that it was not possible to test whether the frame was representative of, inter alia, gender, geographical distribution, institutional affiliation, or cultural background of the population of individuals administered the March 1985 CLAST. Furthermore, on the March 1985 CLAST essay subtest, examinees chose to write an essay on one of two possible topics. The examinees in the sample of 104 individuals in this study all wrote on the same expository topic on the March 1985 CLAST essay subtest. This study was, therefore, limited to one of the two topics on the March 1985 CLAST essay subtest. The CLASP staff does not release the topics used in CLAST essay subtests, and the CLASP staff conditioned the release of the data in the instant study upon the non-disclosure of the two alternate topics in the March 1985 CLAST essay subtest. To score the essay subtest, the CLASP staff also used a method of holistic rating called "general impression marking" in which the scorer fits a writing sample into ordered categories on the basis of the overall impression created by the essay. The general impression marking procedure was developed and used by the Educational Testing
Service and allows for "wide-open" topics on the assumption that different discourse modes do not affect the holistic scores. There is, however, a body of literature (Cooper, 1977; Lloyd-Jones, 1977; Moss, Cole, & Khampalikit, 1982; Odell & Cooper, 1980; Quellmalz, Capell, & Chou, 1982) suggesting that different discourse modes and topic selection are correlated with holistic scores. Because this study involved March 1985 CLAST subtest scores dealing with a single expository topic, the relationship of holistic writing scores to discourse mode or topic was not investigated. In addition, Freedman (1979) suggested that content and organization of the writing were significantly correlated with the holistic score. Veal and Hudson (1983) also found that content was significantly correlated with the holistic score. In addition, Benton and Kiewra (1986) determined that organizational ability, as measured by objective organizational tests, was significantly correlated with the holistic score. Because of the complexity and time involved in eliminating confounding variables, the quality of the content or organization of the overall CLAST essays was not assessed. Finally, the universe of all possible writing subskills which may be correlated with holistic writing scores was not investigated. The existence of such other writing skills could have significantly affected the results of this study.
Justification

The construct validity of holistic scoring was investigated, using as a vehicle for the investigation a test of critical importance to the pedagogical communities in Florida. The specific impetus for this study was derived from the paucity of factorial validity studies of holistic scoring in the research literature. At the time of this writing, in only two reported factorial validity studies have the holistic score and atomistic writing subskill scores been submitted to factor analysis, and in neither study has the range of atomistic writing subskills investigated in the instant study been explored.

Freedman (1981) found by means of factor analysis that one trait of compositions was measured by holistic scores and analytic rating scale scores composed of voice, development, organization, sentence structure, word choice, and usage. In another factor-analytic study, Chapman, Fyans, and Kerins (1984) reported that the following five measures of functional writing loaded on a single factor: (a) focus, (b) support, (c) organization, (d) mechanics, and (e) overall.

There has been, however, a significant amount of non-factor-analytic research literature dealing with the relationship between writing subskill scores and holistic scores (Charney, 1984; Crocker, 1987). Generally, the results of these non-factor-analytic studies have indicated that, "in spite of [scorer] training, [holistic scores] are strongly influenced by . . . characteristics of the writing samples" (Charney, 1984, p. 75).
The researchers of syntactic complexity did not conclusively indicate whether syntactic complexity was significantly correlated with the holistic score. Stewart and Leaman (1983) found that syntactic complexity was a poor predictor of holistic scores. In contrast, Homburg (1984) found that, at least for non-native speakers of English, syntactic complexity in the form of subordination and relativization measures was a significant predictor of holistic scores.

The research results were also inconclusive as to whether paragraph coherence or sentence level coordination was significantly correlated with holistic scores. McCulley (1985) suggested that general coherence was predictive of a primary-trait measure of persuasive writing quality; Homburg (1984) found that coordination was predictive of holistic scores, at least for non-native speakers of English.

The length of the writing sample (Homburg, 1984; Stewart & Grobe, 1979) and the absence of mechanical errors (Grobe, 1981; Homburg, 1984; Stewart & Grobe, 1979; Veal & Hudson, 1983) uniformly were found to be predictive of holistic scores. The appearance of the writing sample also was found to correlate with the holistic score, with papers written in poor handwriting tending to receive lower holistic scores (Chase, 1986; Hughes, Keeling, & Tuck, 1983; McColly, 1970).

Researchers determined primarily by means of regression analysis and analysis of variance that the holistic score might be correlated
with mechanical errors, syntactic complexity, coherence and coordination, length of the essay, and handwriting quality; however, only two researchers have, at the time of this writing, attempted to establish the factorial validity of a holistic writing construct by factor analyzing the holistic score and atomistic writing subskill scores. In neither reported factor-analytic study was the range of atomistic writing subskills contained in the instant study investigated. The results of the instant factor-analytic study may be useful in expanding and clarifying our understanding of the construct validity of holistic scores.

Definitions

The following terms are defined as used in this study.

Atomistic Writing Subskills include agreement errors, punctuation errors, spelling errors, capitalization errors, nominals, adjectivals, adverbials, paragraph coherence, inter-T-unit coordination, words per T-unit, total number of words in the essay, and handwriting quality.

College-Level Academic Skills Program (CLASP) is a faculty cooperative established in order to "advise the [Florida] Department of Education to ensure continuing faculty contributions to decisions concerning skills to be expected of college students, the ways in which the skills are tested on the CLAST, and the utilization of CLAST test results" (Florida Department of Education [FDOE], 1984, p. 21).
College-Level Academic Skills Test (CLAST) is "a test developed by the [Florida] Department of Education pursuant to Section 229.551(3)(k) [of The Florida School Code (1987)] to measure student achievement of the skills listed in Rule 6A-10.31 [of the Florida Administrative Code (1987)]" (FDOE, 1984, p. 21).

College-level academic skills are "the communication and computational skills adopted by the [Florida] State Board of Education in Rule 6A-10.31 [of the Florida Administrative Code (1987)]" (FDOE, 1984, p. 21).

Holistic scoring is the rating of papers on a scale of 1 through 4 based on an overall impression of each essay. CLAST essay subtest scores are the sum of holistic scores awarded by two readers and, therefore, are in the range 2 through 8.

Mechanics refers to the basic conventions of writing, including agreement, punctuation, spelling, and capitalization.

Syntax includes the "ways in which words are put together to form phrases, clauses, and sentences [and] involves breaking each composition into its 'T-units' and examining the ways in which writers embed information in T-units and join T-units together" (National Assessment of Educational Progress [NAEP], 1980, p. 10).

A T-unit is "a main clause with all its attendant modifying words, phrases, and dependent clauses" (NAEP, 1980, p. 10).
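
The definition of the CLAST essay subtest score given above can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated in the definition (two readers, each rating on a scale of 1 through 4, summed); the function name `essay_subtest_score` and the constants are hypothetical names introduced here and are not part of any CLASP procedure.

```python
# Per-reader holistic scale, as stated in the definition of holistic scoring.
HOLISTIC_MIN, HOLISTIC_MAX = 1, 4


def essay_subtest_score(reader_a: int, reader_b: int) -> int:
    """Sum two readers' holistic ratings (hypothetical helper for illustration)."""
    for rating in (reader_a, reader_b):
        if not HOLISTIC_MIN <= rating <= HOLISTIC_MAX:
            raise ValueError(f"holistic rating out of range: {rating}")
    return reader_a + reader_b


# Enumerate every pair of ratings to confirm the stated range of 2 through 8.
possible_scores = sorted(
    {
        essay_subtest_score(a, b)
        for a in range(HOLISTIC_MIN, HOLISTIC_MAX + 1)
        for b in range(HOLISTIC_MIN, HOLISTIC_MAX + 1)
    }
)
print(possible_scores)
```

Enumerating all sixteen rating pairs shows that every integer from 2 through 8 is attainable, which is consistent with the range given in the definition.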
CHAPTER II
REVIEW OF RELATED LITERATURE

A significant body of research exists concerning the relationship between holistic scores and various writing subskill scores, including the relationship between the holistic score and grammatical errors, syntactic complexity, paragraph coherence, coordination, length of syntactic structures, length of the essay, and appearance of the handwriting. Only two researchers, however, have reported the results of a factor analysis of the holistic score and atomistic writing subskill scores. The literature has, instead, primarily been based either on regression analysis, as a means of differentiating significant versus non-significant predictors of the holistic score, or on ANOVA techniques to assess the differing responses of holistic graders to experimental and control groups.

Factor-Analytic Research

Researchers have reported the results of factor analyzing a variety of writing qualities (Diederich, French, & Carlton, 1961; McColly, 1970; Quellmalz et al., 1982); at the time of this writing, however, in only two reported factor-analytic studies have both the holistic score and atomistic writing subskill scores been submitted to factor analysis in order to investigate the construct validity of holistic scoring. Freedman's (1981) study involved the holistic
evaluation by four readers of 64 essays written by students of four colleges and the evaluation by two scorers of the 64 essays on an analytic scale. The analytic scale included scores for voice, development, organization, sentence structure, word choice, and usage. Freedman used factor analysis to explore five variables: (a) the holistic score (HSCORE); (b) the analytic scale score (ASCORE) composed of the sum of scores on voice, development, organization, sentence structure, word choice, and usage; (c) the content/style score (CONTENT/STYLE) composed of the sum of the voice, development, and organization scores less the sum of the sentence structure, word choice, and usage scores; (d) the voice/content score (VOICE/CONTENT) composed of two times the voice score less the sum of the development and organization scores; and (e) the usage/style score (USAGE/STYLE) composed of the sum of the sentence structure and word choice scores less two times the usage score. Using a principal component factor analysis with varimax rotation, Freedman determined that the five contrast scores represented two discrete, independent qualities of the papers. HSCORE and ASCORE loaded on one factor, and CONTENT/STYLE and USAGE/STYLE loaded on a second factor. Freedman concluded that the factor structure suggested that holistic and analytic scales measured a single writing trait.

In another factor-analytic study, Chapman et al. (1984) determined that five measures of functional writing, one of which was an overall measure, loaded on a single factor. The data were collected within the context of the 1983 Illinois Inventory of
Educational Progress (IIEP) writing assessment of fourth-, eighth-, and eleventh-grade students from 120 schools. Essay raters within the context of the IIEP assessment evaluated essays on the following five areas of functional writing: (a) focus, (b) support, (c) organization, (d) mechanics, and (e) overall. All five measures of functional writing were rated on a scale ranging from 1 through 6. Upon factor analyzing the correlation matrix among the five areas of functional writing, the researchers found factor loadings ranging from 0.70 to 0.90 on a single factor. They concluded that the results supported "the aggregation [of the five areas of functional writing] into one writing ability score" (Chapman et al., 1984, p. 25).

Non-Factor-Analytic Research

Non-factor-analytic research has been based primarily on either regression analysis or ANOVA techniques. In the non-factor-analytic studies, researchers have explored the relationship between holistic scores and five writing subskill scores: (a) mechanical errors, (b) syntactic complexity, (c) coherence and coordination, (d) length, and (e) handwriting quality.

Mechanical Errors

Without exception, in both the correlational and experimental research, a significant inverse relationship between the number of mechanical and spelling errors and the holistic scores has been
found. Stewart and Grobe (1979) performed stepwise regression analysis on eight independent variables relating to mechanics of writing on different students at grades 5, 8, and 11. They determined that at all three grade levels there was a positive and significant correlation between holistic quality ratings and the absence of spelling errors.

Using 30 freshman writing essays in the argument mode which were holistically scored by senior high school teachers teaching in the areas of (a) business, (b) English and social studies, and (c) mathematics and science, Stewart and Leaman (1983) performed stepwise regression analysis with eight independent variables, finding that both the number of spelling errors and the number of punctuation errors were significant predictors of the holistic quality ratings awarded by the teachers in each of the three curricular areas.

Chou, Kirkland, and Smith (1982) investigated the holistic scores given by raters to compositions written for the University System of Georgia's Regents' Testing Program. Chou et al. examined (a) total number of sentences, (b) average sentence length, (c) shortest sentence, (d) longest sentence, (e) total words, (f) grammatical errors, (g) punctuation errors, (h) T-units, (i) words per T-unit, (j) misspelled words, (k) crossouts, (l) vagueness, and (m) words in outline. In a sample of 60 essay compositions, they determined that a significant negative correlation existed between the holistic score and the number of grammatical errors, the number of punctuation errors, and the number of misspelled words.
Freedman (1979) performed an experimental study in which she determined that the mechanics score proved to be a significant predictor of holistic judgments, although less predictive than scores for content and organization of the essay. Freedman rewrote essays of moderate quality to be either stronger or weaker in the four categories of content, organization, sentence structure, and mechanics. The mechanics rewriting involved misuse of commas, quotation marks, possessives, capitalization, underlining, and spelling. Using analysis of variance, Freedman determined that the mechanics score was a significant predictor of the holistic score and that the interaction between mechanics and organization was significant, such that the mechanics score was statistically significant only in writings with high organization scores.

The number of mechanical errors has, without exception, been found to be significantly correlated with holistic scores in both correlational (Bertrand, 1983; Chou et al., 1982; Grobe, 1981; Homburg, 1984; Stewart & Grobe, 1979; Stewart & Leaman, 1983; Veal & Hudson, 1983) and experimental (Freedman, 1979) studies.

Syntactic Complexity

Although the weight of the research literature suggests that syntactic complexity is not significantly correlated with the holistic score (Freedman, 1979; Neilson & Piche, 1981; Stewart & Grobe, 1979; Stewart & Leaman, 1983), there is, nevertheless, some research to support the existence of a significant correlation between the holistic score and syntactic complexity (Homburg, 1984; Nold & Freedman, 1977).
Neilson and Piche (1981) manipulated a syntactic variable, headed nominal complexity, and a semantic variable, lexical choice, in stimulus passages to determine their effect on scorers' judgments of writing quality. Eighty inservice and preservice high school English teachers rated the passages, allowing each of the four versions of the passage to be rated 20 times. Using the ANOVA technique to test the differences among the ratings achieved on the four treatment versions of the passage, Neilson and Piche determined that there was no significant relationship between syntactic complexity (limited to headed nominal complexity) and quality of writing.

Homburg (1984), however, determined that the number of dependent clauses per composition was a significant predictor of holistic scores. Homburg derived his data from 30 compositions randomly selected from the composition portion of the Michigan Test of English Language Proficiency administered during the period from August 1979 to March 1980. The writers were, therefore, non-native speakers of English who were applying to universities in the United States. Homburg performed a one-way ANOVA and a discriminant analysis in order to determine that the number of dependent clauses per composition was a significant predictor of holistic scores.

Coherence and Coordination

The corpus of research literature concerning the relationship between coherence and holistic scores seems to indicate that both coherence and coordination are significantly related to holistic scores. McCulley (1985) used regression analysis on a sample of 120
persuasive papers written by 17-year-olds during the 1978-1979 National Assessment of Educational Progress, correcting for number of T-units in each writing passage. The results of the study indicated that general coherence was a significant predictor of a primary-trait measure of persuasive writing quality and that the lexical cohesive features, use of synonym, hyponym, and collocation, were also significant predictors of the primary-trait measure. Homburg (1984), in his study of 30 writing samples by non-native English speakers, applied discriminant analysis and found that the number of coordinating conjunctions per composition was a significant predictor of holistic scores.

Length

As with syntactic complexity, the research literature regarding number of words per sentence, number of words per T-unit, number of sentences in the entire essay, and number of words in the entire essay is inconclusive as to whether such attributes of length are predictive of holistic scoring. Nold and Freedman (1977) and Stewart and Grobe (1979), both by means of stepwise regression analysis, determined that number of words in an essay was significantly predictive of holistic scores. In correlational studies, Thomas and Donlan (1980) and Bertrand (1983) also determined that number of words in an essay was predictive of the holistic writing score. Chou et al. (1982) determined that there was a significant relationship between holistic scores and the number of sentences in an essay.
A number of researchers (Grobe, 1981; Nold & Freedman, 1977; Stewart & Grobe, 1979; Stewart & Leaman, 1983) indicated that syntactic complexity, measured by the number of words per T-unit, was not significantly correlated with holistic scores. Homburg (1984), by means of discriminant analysis, determined, however, that number of words in a sentence was significantly predictive of holistic scores for non-native speakers of English.

Handwriting Quality

Researchers (Chase, 1986; Hughes et al., 1983; McColly, 1970) have uniformly suggested that handwriting quality is a significant predictor of the holistic score. Hughes et al. (1983) determined by means of ANOVA that there was a significant difference between holistic scores awarded to essays in experimental and control groups defined in terms of handwriting quality and further found that handwriting quality did not interact with score or achievement expectations.

Chase (1986) used an ANOVA technique to analyze the data collected by having each of 80 readers evaluate a single contrived student essay. Each of the readers was also given a student record card which identified the student as either a high or low achiever, black or white, and male or female. Chase found that there were complex interactions of expectations, handwriting, and sex within race. The results indicated that quality of handwriting was significantly related to the holistic score, but that the relationship between handwriting quality and the holistic score interacted with expectations of sex within race.
Comparison of Factor-Analytic and Regression Methods of Construct Validation

In the instant study, factor analysis was selected as the method for examining the construct validity of holistic scores. Unlike regression analysis, factor analysis permits the researcher to examine multiple writing traits measured by the holistic writing score, number of grammatical errors, syntactic complexity, paragraph cohesion, coordination, length of syntactic structures, length of the essay, and physical appearance of the handwriting. Factor analysis, therefore, enables the researcher to identify a reduced number of underlying constructs which describe multiple variables involved in the evaluation of writing. Regression analysis, however, permits the distillation of a number of independent variables into only a single dependent variable or trait, such as the holistic score. Regression analysis would, therefore, only identify which of the independent variables was related to the holistic score and would not differentiate between various sub-relationships among the independent variables and the holistic score.

Although the regression analysis studies on holistic scoring cited in the section entitled "Non-Factor-Analytic Research" in Chapter II have functioned to identify a range of possible variables which may be related to holistic scores, factor analysis is the more appropriate method to investigate construct validation because factor analysis permits the researcher to determine whether the holistic
score and the various writing subskills previously isolated by regression analysis are empirically identified as measuring a common factor. Factor analysis permits the researcher to meaningfully explain the relationships between holistic scoring and the writing subskill scores in terms of a few conceptually meaningful, relatively independent factors.
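
The idea of several measures loading on a common factor can be illustrated with a minimal sketch. The correlation matrix below is hypothetical and is not data from this study or from any study cited here; the function name `first_principal_factor` is likewise an assumption made for illustration. The sketch uses power iteration to extract only the first principal component of a correlation matrix, which is a simplification of the exploratory factor analysis described above (no rotation, no multiple factors).

```python
import math

# Hypothetical correlation matrix for three measures (NOT data from this study):
# every pair of measures is assumed to correlate 0.5.
HYPOTHETICAL_CORR = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]


def first_principal_factor(corr, iterations=100):
    """Return (eigenvalue, loadings) for the largest principal component.

    Uses power iteration; the uniform starting vector is adequate when the
    correlations are predominantly positive, as assumed in this sketch.
    """
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(vector[i] * product[i] for i in range(n))
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [v * math.sqrt(eigenvalue) for v in vector]
    return eigenvalue, loadings


eigenvalue, loadings = first_principal_factor(HYPOTHETICAL_CORR)
proportion_of_variance = eigenvalue / len(HYPOTHETICAL_CORR)
print(eigenvalue, loadings, proportion_of_variance)
```

In this contrived case all three measures load equally on the single component, which is the pattern a researcher would read as the measures reflecting one underlying trait. A regression of one measure on the others would not, by itself, reveal that shared structure.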
CHAPTER III
PROCEDURES

The purpose of this study was to investigate the construct validity of holistic scores using an exploratory factor analysis of the holistically-scored CLAST essay subtest score (Holistic) and the 12 Atomistic Writing Subskill scores. This chapter contains information on the research design, selection of subjects, selection of atomistic writing subskills, and procedures for data-extraction used in this study.

Selection of Subjects

The subjects in the sample included a total of 104 individuals who participated in the March 1985 CLAST administration. The CLASP staff selected 196 individuals from a population of 12,256 individuals who chose to write on the same essay topic and who were administered the March 1985 CLAST. The CLASP staff provided a sample of 196 examinees by physically retrieving a set of March 1985 CLAST essays that had been returned together from the examination center and were bound together in the CLASP storage area. From this packet of 196 essays, a sample of approximately 100 individuals was randomly selected to produce a sample size sufficiently large to produce reliable factor analysis results. Gorsuch (1974) suggested that the sample for factor analysis should provide an "absolute minimum ratio [of] five individuals to every
variable, but not less than 100 individuals for any analysis" (p. 296). In this study the sample size of 104 individuals with 13 variables provided a ratio of 8 individuals to each variable, which ratio was greater than the minimum suggested by Gorsuch. In addition, time constraints involved in extracting the writing subskill scores created a practical limit to the total number of individuals sampled. Each essay required in excess of 3 hours of grading time.

For this sample of 104 individuals, the frequency distribution of the CLAST essay subtest scores is set forth in Table 1. The descriptive statistics, including the mean, standard deviation, and range for each of the four CLAST subtests of the 104 individuals sampled from the March 1985 CLAST administration, are presented in Table 2. The descriptive statistics, including the mean, standard deviation, and median for each of the four CLAST subtests of the total population of 12,256 individuals who took the March 1985 CLAST, are presented in Table 3.

Selection of Atomistic Writing Subskills

The 12 Atomistic Writing Subskills selected for inclusion in this study fall in five broad categories: (a) mechanics, (b) syntax, (c) coherence and coordination, (d) length, and (e) handwriting quality. The mechanics, syntax, coherence, and length scores were included in this study because of their inclusion in assessments by the staff of the National Assessment of Educational Progress and because of their
Table 1. Distribution by essay subtest scores of 104 essays drawn from frame of 196 essays selected by the CLASP staff from the population of 12,256 essays from the March 1985 CLAST administration.

Essay Subtest Score      n     Percentage
        2                3        2.9%
        3                1        1.0
        4               23       22.1
        5               27       26.0
        6               28       26.9
        7               14       13.5
        8                8        7.7
      Total:           104      100.0%
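
As a rough arithmetic check, the percentages in Table 1 and the sample-size ratio discussed under "Selection of Subjects" can be recomputed from the reported counts. The sketch below uses only figures stated in the text (the seven frequency counts, 104 individuals, 13 variables, and the Gorsuch minimums of 5 individuals per variable and 100 individuals overall); the variable names are illustrative assumptions.

```python
# Frequency counts by essay subtest score, as reported in Table 1.
counts = {2: 3, 3: 1, 4: 23, 5: 27, 6: 28, 7: 14, 8: 8}

total = sum(counts.values())
percentages = {score: round(100 * n / total, 1) for score, n in counts.items()}

# Sample-size check against the minimums attributed to Gorsuch (1974) in the text.
n_individuals = total
n_variables = 13  # the Holistic score plus the 12 Atomistic Writing Subskill scores
ratio = n_individuals / n_variables
meets_minimum = ratio >= 5 and n_individuals >= 100

print(total, percentages, ratio, meets_minimum)
```

Because each percentage is rounded to one decimal place, the rounded values sum to slightly more than 100 even though the counts sum to exactly 104.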
Table 2. Descriptive statistics for four CLAST subtests for 104 individuals sampled from the March 1985 CLAST administration.
inclusion in research studies of holistic scoring cited in the sections entitled "Factor-Analytic Research" and "Non-Factor-Analytic Research" in Chapter II. The writing assessments by the staff of the National Assessment of Educational Progress, however, included six syntax scores. These were a nominal clause score, a nominal phrase score, an adjectival clause score, an adjectival phrase score, an adverbial clause score, and an adverbial phrase score. In the instant study, in order to maximize the ratio of individuals to variables (Gorsuch, 1974) for the factor analysis, the nominal clause and nominal phrase scores were combined into a single nominal score; the adjectival clause and adjectival phrase scores were combined into a single adjectival score; and the adverbial clause and adverbial phrase scores were combined into a single adverbial score.

The coordination score was included in the instant study because Homburg (1984) suggested that coordination was predictive of holistic scores, at least for non-native speakers of English. The measure of handwriting quality was also included because of the body of research literature indicating that essays written in poor handwriting received lower holistic scores (Chase, 1986; Hughes et al., 1983; McColly, 1970).

As set forth in the section entitled "Limitations" in Chapter I, no writing subskill score was included for discourse mode or topic, although there was a body of research suggesting that discourse modes and topic selection were correlated with the holistic score (Cooper, 1977; Lloyd-Jones, 1977; Moss, Cole, & Khampalikit, 1982; Odell &
Cooper, 1980; Quellmalz et al., 1982). This study involved a single expository topic as administered by the CLASP staff at the March 1985 administration of the CLAST; it was, accordingly, not possible to obtain different discourse mode and topic selection scores for inclusion in this study.

Also as set forth in the section entitled "Limitations" in Chapter I, neither the content nor organization of the writing was investigated, although Freedman (1979) suggested that content and organization were significantly correlated with holistic scoring and although Benton and Kiewra (1986) suggested that organizational ability was significantly correlated with holistic scoring. The main reason that this study did not include measures of the quality of the content or organization of the overall essay was the complication of confounding with various other writing subskill scores. In order to assess the overall content and organization of the writings, either an experimental study would need to be done to control for confounding, or each essay would need to be "corrected" for all of the subskill writing scores which were related to holistic scoring. In a non-experimental study such as this, extraction of content and organization scores would involve retyping each essay in its entirety to correct for possible handwriting confounding, making all mechanical error corrections, and rewriting each essay to standardize the occurrence of syntactic structures. Because of the problems associated with neutralizing the other potentially confounding writing subskills, the variables in this study did not include measures of overall content and organization.
In summary, the variables examined in this study included four mechanics scores, three scores of syntactic complexity, one paragraph coherence score, and two length scores, all of which scores were based on writing subskills scores developed by the staff of the National Assessment of Educational Progress. Because of research findings in this area, the variables also included two other writing subskill scores not developed by the staff of the National Assessment of Educational Progress, namely coordination and handwriting quality.

Holistic Scoring

Each essay from the sample of 104 individuals administered the March 1985 CLAST was holistically scored by the CLASP staff in accordance with CLASP holistic scoring procedures (FDOE, 1980). The procedures employed by the CLASP staff to holistically score the March 1985 CLAST essays are those propounded by the Florida Department of Education (1986a). The CLAST scoring scale reflects the four levels of performance described below:

Score 1: Writer includes very little, if any, specific relevant supporting detail but, instead, uses generalizations for support. Thesis statement and organization are vague and/or weak. Underdeveloped ineffective paragraphs do not support the thesis. Sentences lack variety, usually consisting of a series of subject-verb and, occasionally, complement constructions. Transitions and coherence devices are not discernible. Syntactical, mechanical, and usage errors occur frequently.
Score 2: Writer employs an inadequate amount of specific detail relating to the subject. Thesis statement and organization are unambiguous. Paragraphs generally follow the organizational plan, and they are usually sufficiently unified and developed. Sentence variety is minimal and constructions lack sophistication. Some transitions are used and parts are related to each other in a fairly orderly manner. Some errors occur in syntax, mechanics, and usage.

Score 3: Writer presents a considerable variety of relevant and specific detail in support of the subject. The thesis statement expresses the writer's purpose. Reasonably well-developed, unified paragraphs document the thesis. A variety of sentence patterns occurs, and sentence construction indicates that the writer has facility in the use of language. Effective transitions are accompanied by sentences constructed with orderly relationship between word groups. Syntactical, mechanical, and usage errors are minor.

Score 4: Writer uses an abundance of specific, relevant details, including concrete examples, that clearly support generalizations. The thesis statement effectively reflects the writer's purpose. Body paragraphs carefully follow the organizational plan stated in the introduction and are fully developed and tightly controlled. A wide variety of sentence constructions is used. Appropriate transitional words and phrases and effective coherence techniques make the prose distinctive. Virtually no errors in syntax, mechanics, and usage occur. (FDOE, 1986a, pp. 36-37).

Atomistic Writing Subskill Scoring

Each essay from the sample of 104 individuals administered the March 1985 CLAST was evaluated to generate 12 scores in the following five broad categories: (a) mechanics, (b) syntax, (c) coherence and
coordination, (d) length, and (e) handwriting. The broad category of mechanics included scores for the number of agreement errors (Agreement), punctuation errors (Punctuation), spelling errors (Spelling), and capitalization errors (Capitalization). The broad category of syntax included scores on the number of nominal clauses and phrases (Nominal), the number of adjectival clauses and phrases (Adjectival), and the number of adverbial clauses and phrases (Adverbial). The broad category of coherence and coordination included scores for paragraph coherence (Coherence) and inter-T-unit coordination (Coordination). The broad category of length included scores for the number of words per T-unit (Words/T-unit) and the total number of words in the essay (Words). The broad category of handwriting included the score for handwriting quality (Handwriting).

Each of the 104 CLAST essays in the sample was, therefore, evaluated to produce scores for the following 12 Atomistic Writing Subskills: (a) Agreement, (b) Punctuation, (c) Spelling, (d) Capitalization, (e) Nominal, (f) Adjectival, (g) Adverbial, (h) Coherence, (i) Coordination, (j) Words/T-unit, (k) Words, and (l) Handwriting. A description of each of the 12 Atomistic Writing Subskills follows. With the exception of Coordination and Handwriting, the method of scoring the Atomistic Writing Subskills was derived from the procedures set forth in the National Assessment of Educational Progress scoring guidelines (NAEP, 1980).

Mechanics

The Mechanics scores were derived by determining the number of mechanical errors in each CLAST essay. For each CLAST essay, one
scorer initially counted the number of mechanical errors and a second scorer checked the first scorer's count to generate an accurate count of the number of mechanical errors. Prior to scoring, each scorer reviewed the rules set forth in the National Assessment of Educational Progress scoring guide for syntax, cohesion, and mechanics (Mullis & Mellon, 1980). One score was produced for each of the four mechanics scores for each CLAST essay.

The guidelines for categorizing mechanics errors as set forth in the National Assessment of Educational Progress scoring guide (NAEP, 1980) were used to derive the four mechanics scores. Agreement errors were deemed in this study to be mistakes in subject/verb agreement, mistakes in pronoun/antecedent agreement, misuse of a subject/object pronoun, and errors in verb tense. Punctuation errors included all errors of commission and errors of omission relating to commas, dashes, quotation marks, semi-colons, apostrophes, and end marks, using the most informal usage rules. Spelling errors included misspellings, errors in word divisions at line endings, errors of writing two words as one or one word as two, extraneous plurals, and letter groupings not constituting a legitimate word. Capitalization errors were deemed to be errors in the capitalization of the first word in a sentence, the failure to capitalize a proper noun or adjective within a sentence, and the failure to capitalize the pronoun "I."
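
The variable scheme described in this chapter can be summarized as a small data structure. The sketch below groups the 12 Atomistic Writing Subskills into the five broad categories named in the text and shows how six clause and phrase counts of the kind used by the National Assessment of Educational Progress collapse into the three syntax scores. The dictionary keys, the function name `combine_syntax_counts`, and the example counts are hypothetical and introduced only for illustration; they are not scores from any essay in the sample.

```python
# The five broad categories and their subskill scores, as named in the text.
SUBSKILLS_BY_CATEGORY = {
    "mechanics": ["Agreement", "Punctuation", "Spelling", "Capitalization"],
    "syntax": ["Nominal", "Adjectival", "Adverbial"],
    "coherence and coordination": ["Coherence", "Coordination"],
    "length": ["Words/T-unit", "Words"],
    "handwriting": ["Handwriting"],
}


def combine_syntax_counts(clause_phrase_counts):
    """Collapse six clause/phrase counts into three combined syntax scores."""
    return {
        "Nominal": clause_phrase_counts["nominal_clause"]
        + clause_phrase_counts["nominal_phrase"],
        "Adjectival": clause_phrase_counts["adjectival_clause"]
        + clause_phrase_counts["adjectival_phrase"],
        "Adverbial": clause_phrase_counts["adverbial_clause"]
        + clause_phrase_counts["adverbial_phrase"],
    }


n_subskills = sum(len(names) for names in SUBSKILLS_BY_CATEGORY.values())
n_variables = n_subskills + 1  # the 12 subskill scores plus the Holistic score

# Hypothetical counts for a single essay (illustrative values only).
example_counts = {
    "nominal_clause": 4,
    "nominal_phrase": 6,
    "adjectival_clause": 3,
    "adjectival_phrase": 2,
    "adverbial_clause": 5,
    "adverbial_phrase": 7,
}
combined = combine_syntax_counts(example_counts)
print(n_subskills, n_variables, combined)
```

Combining the clause and phrase counts reduces the syntax category from six variables to three, which is what keeps the total at 13 variables for the factor analysis.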
Syntax

The following descriptions of each syntactic structure counted in this study were derived from the National Assessment of Educational Progress guide for scoring syntax, cohesion, and mechanics (Mullis & Mellon, 1980). For each essay, one scorer initially counted the number of syntactic structures and a second scorer checked the first scorer's count to generate an accurate count of each syntactic structure. One score was recorded for each syntactic structure for each essay. Examples of the syntactic structures extracted from the essays in this study follow.

Nominal. The Nominal score consists of both nominal clauses and nominal phrases. Nominal clauses occur in two forms: (a) that-nominal clauses and (b) question-nominal clauses. Both that-nominal and question-nominal clauses occur in a variety of nominal positions, including subject of the sentence, object of the verb, object of a preposition, subject complement, and appositive. The following are examples of that-nominal and question-nominal clauses from the CLAST essays in this study:

"We see by this point that fads are not only non-verbal but also verbal." (that-nominal as object of a verb)

"Every couple of years, there seems to be a shift in what young people are doing for fun." (question-nominal as object of a preposition)

"To be 'in' it suddenly matters what brand of jeans one is wearing or how one greets another." (two question-nominals as subjects of the verb)

"The point is that the young are unconformed." (that-nominal as subject complement)
"The fact that teenagers watch t.v. a great deal, shows a basis of where they learn up-to-date fads." (that-nominal as subject of the verb)

"In a teenagers mind, these fads will determine if they are 'in' or 'out' of the crowd." (question-nominal as object of the verb)

Nominal phrases are differentiated from nominal clauses by the inclusion of uninflected verbs. Nominal phrases occur, however, in the same nominal positions as do nominal clauses. The two types of nominal phrases are the infinitive and the gerund. Examples of nominal phrases from the CLAST essays in this study follow:

"Keeping up with current trends is assumed to be the only road to popularity and thus offers some security to the very insecure teenager." (gerund as subject of the verb)

"People consider fads to be a motivational factor because they make them feel alive and fresh." (infinitive as object of the verb)

"They must learn to develop their minds and personalities." (infinitive as object of the verb)

"Being able to relate to something on one's own level is of the utmost importance." (gerund as subject of the verb)

"This ability to be seen and read by all lends itself perfectly to the spreading of fads across the nation." (gerund as object of a preposition)

Adjectival. The Adjectival score consists of both adjectival clauses and adjectival phrases. Adjectival clauses are sometimes referred to as relative clauses. Three types of adjectival clauses were scored: restrictive relatives, non-restrictive relatives, and adverbials of time, place, or manner. The following are examples of relative clauses from the essays in this study:
"He is constantly testing the boundaries of acceptable behavior and discovering those actions which he enjoys." (restrictive relative)

"When many people think about fads they suddenly think of the way people dress at a certain point in time." (adverbial clause of time)

"They are not caught up with society and the restraints it imposes." (restrictive relative)

Modifying phrases are relative clauses reduced by the deletion of relative pronouns, subjects, and, in many cases, verbs.

"Teenagers growing up want to make decisions of their own." (participial phrase)

"Fads are popular characteristics adopted by members of a given society." (participial phrase)

"Most teenagers are given allowances or have a part-time job and being provided with the essentials by their parents, are able to afford the luxury of fads." (non-restrictive participial phrase)

"Then there's Roxy and Tom, with their dog collars on." (prepositional phrase)

"For this reason they will try out many fads until they manage to find a style unique to themselves." (reduced relative clause)

"By being caught up, the teenager also becomes popular and excepted." (adverbial phrase of time, place, or manner)

Adverbial. The Adverbial score consists of adverbial clauses and adverbial phrases. Adverbial clauses include all adverbial clauses other than clauses of time, place, and manner. Adverbial clauses include, therefore, adverbials of cause, purpose, condition, or concession.
"This difference is very obvious if a person visits a local shopping mall." (adverbial clause of condition)

"If a teen is not socially accepted by his friends, he is cast aside like a dirty rag." (adverbial clause of condition)

"Fads are especially attractive to teenagers because teenagers are young." (adverbial clause of cause)

"Young people buy these 'fads' so they can express hourly to their society what they are feeling." (adverbial clause of purpose)

"Though the fads may change, the reasons never will." (adverbial clause of concession)

"Since they are too young to obtain a substantial role, they accept one--fads." (adverbial clause of cause)

Adverbial phrases are reduced adverbial clauses and also include adverbials of cause, purpose, condition, or concession.

"They may explore different ideas or values to discover who they really are." (adverbial phrase of purpose)

"In conclusion the fact shows that fads can be harmless if taken as a fad." (adverbial phrase of condition)

"Many people believe that teenagers follow fads to make a statement." (adverbial phrase of purpose)

"To make up for this, teenagers hold their feelings inside them and don't always express how they feel towards certain things." (adverbial phrase of purpose)

Coherence and Coordination

Two separate measures were used to score the way that individuals create a unity within the essay: Coherence and Coordination. A
description of the scoring procedures for Coherence and Coordination follows.

Coherence. A Coherence score was produced for each of the 104 CLAST essays in the sample. On each of the 104 CLAST essays in the sample, the first full paragraph (other than the introductory paragraph of the essay) which contained three or more complete sentences was selected for scoring. Each selected paragraph was typed onto a blank sheet of paper with a randomly-selected identifying number in order to eliminate any possible handwriting quality effect. The 104 sheets of paper, each with a typed paragraph, were randomly ordered and given to two scorers, each of whom had participated in CLASP holistic scoring sessions on at least six occasions prior to scoring the paragraph coherence samples in this study.

The two scorers first reviewed the scoring criteria for each category, using sample CLAST essays for each scored category. A number of practice essays were scored by both scorers, and discrepancies were reviewed. Each scorer evaluated each CLAST essay independently. The scorers periodically checked the consistency of the scoring. The inter-rater reliability was 0.67. The average of the two scores, one from each scorer, constituted the Coherence score for each CLAST essay.

The following are descriptions of the four reference points from the National Assessment of Educational Progress scoring guide for syntax, cohesion, and mechanics (Mullis & Mellon, 1980):
1 = Little or no evidence of cohesion. Basically, clauses and sentences are not connected beyond pairings.

2 = Attempts at cohesion. There is evidence of gathering details but little or no evidence that these details are meaningfully ordered. In other words, very little seems lost if the details were rearranged.

3 = Cohesion. Details are both gathered and ordered. Cohesion is achieved in the ways illustrated briefly in the definition above [that is, by lexical cohesion, conjunction, reference, and substitution, and by syntactic repetition]. Cohesion does not necessarily lead to coherence, to the successful binding of parts so that the sense of the whole discourse is greater than the sense of the parts. In pieces of writing that are cohesive rather than coherent, there are large sections of details which cohere but these sections stand apart as sections.

4 = Coherence. While there may be a sense of sections within the piece of writing, the sheer number and variety of cohesion strategies bind the details and sections into a wholeness. This sense of wholeness can be achieved by a saturation of syntactic repetition throughout the piece . . . and/or by closure which retrospectively orders the entire piece and/or by general statements which organize the whole piece. (p. 26)

Coordination. Coordination was considered to be the conjunction of two T-units by a coordinating conjunction (i.e., "and," "or," "nor," "but," "for," or "yet") or by a conjunctive adverb (e.g., "however," "therefore"). Each essay was scored by one scorer and checked by a second scorer to obtain an accurate count of the Coordination score. One Coordination score was recorded for each CLAST essay in this study.
Length

Each essay was scored by one scorer and checked by a second scorer to obtain an accurate count of the number of T-units (T-unit) and number of words (Words). A T-unit was defined as "one main clause with all the subordinate clauses attached to it" (Hunt, 1965, p. 20). Contracted words such as "don't" were scored as two words. The average number of words per T-unit in an essay (Words/T-unit) was determined by dividing the total number of words by the total number of T-units in the essay. The score for Words was the total number of words in the essay. The mean for Words in the sample of 104 CLAST essays was 369.4 (SD = 105.6) words, and the mean for T-units in the sample of 104 CLAST essays was 24.8 (SD = 7.9) T-units. The mean for Words/T-unit in the sample of 104 CLAST essays was 15.2 (SD = 2.8) words.

Handwriting

A Handwriting score was produced for each of the 104 CLAST essays in the sample. For each of the 104 CLAST essays in the sample, the second through the fifth line of the first 7-line paragraph was photocopied. Lines with no text on the line, including blank lines and cross-outs of entire lines, were not counted as lines. Each such photocopied 4-line passage from each of the 104 CLAST essays in the sample was placed on a blank sheet of paper with a randomly-selected identifying number. Two identical sets of the 104 sheets of paper with the 4-line handwriting samples were produced, and each set was randomly ordered. One set was given to each of two scorers, each of
whom had participated in CLASP holistic scoring sessions on at least five occasions prior to scoring the handwriting samples in this study. Each of the scorers rated each handwriting sample on a scale from 1 through 5, with 1 being "most legible" and 5 being "least legible." The inter-rater reliability determined by the Pearson Product-Moment correlation was 0.56. The two scores were averaged to produce a single Handwriting score for each essay.

Correction of Atomistic Writing Scores for Essay Length

The descriptive statistics for the 12 raw Atomistic Writing Subskill scores are set forth in Table 4. To correct for the varying lengths of the essays in the sample, each of the raw Mechanics, Syntax, and Coordination scores was converted, following Mellon's (1969) guidelines, into a ratio of mechanical errors, syntactic constructions, or coordinations per 100 T-units. Because the scoring procedures for Coherence and Handwriting controlled for length of the essay, neither score was so converted. The descriptive statistics for the 12 Atomistic Writing Subskill scores, with mechanical errors, syntactic constructions, and coordinations converted into ratios per 100 T-units, are set forth in Table 5.
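The length correction described above can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and are not taken from the study's data; the sketch only assumes that the conversion is a simple rate, that is, the raw count divided by the essay's number of T-units and multiplied by 100.

```python
def per_100_t_units(raw_count: int, t_units: int) -> float:
    """Convert a raw count (errors, constructions, or coordinations)
    into a rate per 100 T-units. Hypothetical helper, not from the source."""
    if t_units <= 0:
        raise ValueError("an essay must contain at least one T-unit")
    return 100.0 * raw_count / t_units


# Hypothetical essay: 6 spelling errors across 24 T-units.
spelling_rate = per_100_t_units(6, 24)
print(spelling_rate)  # 25.0
```

Expressing each count as a rate puts a short essay and a long essay on the same footing, which is the stated purpose of the correction.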
Table 4. Descriptive statistics for raw Atomistic Writing Subskill scores from essays of 104 individuals sampled from the March 1985 CLAST administration.
Table 5. Descriptive statistics for Atomistic Writing Subskill scores from essays of 104 individuals sampled from the March 1985 CLAST administration.
CHAPTER IV
ANALYSIS OF THE DATA

In order to investigate the construct validity of holistic scores, the correlation matrix for the holistically-scored essay subtest scores of the March 1985 CLAST and the 12 Atomistic Writing Subskill scores was first generated. Second, this correlation matrix was submitted to factor analysis in order to investigate the factorial validity of the holistic writing score construct.

Normality Assumption

The descriptive statistics for the Atomistic Writing Subskill scores set forth in Table 5 suggest extreme positive skewness in the variables Agreement, Punctuation, Spelling, Capitalization, Adjectival, and Adverbial. Accordingly, scatterplots were generated and skewness was calculated for each of the suspect variables. The scatterplots and skewness suggested that the scores for Agreement, Punctuation, Spelling, Capitalization, Adjectival, and Adverbial sufficiently violated the normality assumption to warrant corrective measures. Although factor analysis is robust to violations of normality (Gorsuch, 1974), the skewed variables were submitted to a log transformation to make the skewed distributions approximately
normal. The set of 12 Atomistic Writing Subskill scores, with the variables Agreement, Punctuation, Spelling, Capitalization, Adjectival, and Adverbial normalized by the log transformation and with the variables Nominal, Coherence, Coordination, Words/T-unit, Words, and Handwriting not submitted to the log transformation, provided the data for the correlational validity and the factorial validity analyses.

Correlational Validity

As set forth in Table 6, there appear to be three general sets of correlations between Holistic and the 12 Atomistic Writing Subskill scores. First, Holistic was negatively correlated with the Mechanics scores, and the Mechanics scores generally appeared to be positively correlated with each other. Second, the Words/T-unit score was positively correlated with the Nominal, Adjectival, and Adverbial scores. Third, Holistic was positively correlated with the Coherence and Words scores.

The inverse correlations between the Holistic score and the Mechanics scores and the positive correlations among the Mechanics scores suggested that the quality of writing measured by holistic scoring included the extent to which the writing was free from mechanical errors and the extent to which the writer wrote legibly. The positive correlations among the Words/T-unit, Nominal, Adjectival, and Adverbial scores suggested that essays containing longer thought-units had more density of nominal, adjectival, and
Table 6. Pearson Product-Moment correlation matrix for essay subtest scores of March 1985 CLAST and 12 Atomistic Writing Subskill scores.

Variables                 HOL    AGR    PUN    SPE    CAP    NOM
Holistic (HOL)           1.00
Agreement (AGR)         -0.30   1.00
Punctuation (PUN)       -0.20   0.19   1.00
Spelling (SPE)          -0.24   0.31   0.38   1.00
Capitalization (CAP)    -0.16   0.14   0.32   0.28   1.00
Nominal (NOM)           -0.12   0.00  -0.02   0.05   0.00   1.00
Adjectival (ADJ)         0.23  -0.10   0.07   0.01  -0.02   0.08
Adverbial (ADV)         -0.09   0.21   0.15   0.04   0.00   0.25
Coherence (COH)          0.51  -0.09  -0.06  -0.12   0.02  -0.02
Coordination (COO)      -0.11   0.07   0.14  -0.02  -0.07   0.08
Words/T-unit (W-T)       0.17  -0.02   0.14   0.01   0.02   0.28
Words (WDS)              0.47  -0.05   0.09   0.10   0.19   0.03
Handwriting (HAN)        0.17  -0.14  -0.10  -0.23  -0.16  -0.09
Table 6 (continued).
adverbial constructions. The positive correlations among the Holistic, Coherence, and Words scores suggested that the quality of writing measured by holistic scoring was related to the writer's ability to organize and unify a paragraph and to the length of the essay. The failure of Holistic to correlate with any of the Syntax scores, other than with Adjectival, or with Words/T-unit suggested that the construct which the holistic writing score measured was not strongly related to grammatical complexity or T-unit length.

Factorial Validity

Factor analysis was conducted on the correlation matrix set forth in Table 6 in order to investigate further the construct validity of the holistic scores. By submitting Holistic and the 12 Atomistic Writing Subskill scores to a factor analysis, the 13 writing measures were reduced to a smaller number of factors or writing constructs.

Extraction of Non-Trivial Factors

The criteria employed for determining the number of factors to be retained for rotation were (a) the application of Cattell's scree test (1966, p. 206), (b) careful examination of the size of loadings on the principal-axis factor matrix, and (c) an examination of the conceptual meaningfulness of the three-, four-, and five-factor solutions. Collectively, the results of these efforts suggested that three salient factors accounted for most of the common variance in the data.
The scree test is a procedure whereby eigenvalues are plotted from largest to smallest in order to determine the number of non-trivial factors. In Table 7, the eigenvalues and corresponding percentages of common variance for each of the 13 factors are set forth. To determine the number of non-trivial factors, a straight line is drawn on a scree plot, and the point where the factors increase above the straight line on the plot yields the number of non-trivial factors (Gorsuch, 1974). Cattell (1966) originally suggested that the first factor on the straight line should also be included in the number of non-trivial factors in order to ensure that a sufficient number of factors were extracted; Cattell and Jaspers (1967) subsequently suggested, however, that the number of non-trivial factors should not include the first factor on the straight line. The fact that a total of three factors lie above the straight line in Figure 1 suggested that there were three non-trivial factors in this study.

As set forth in the Appendix, the fact that the four- and five-factor solutions produced less meaningful factor patterns was further evidence for a three-factor solution. In the four-factor solution, as set forth in Table 10 in the Appendix, an additional fourth factor with significant positive loadings on only Coordination and Handwriting was produced. The five-factor solution, as set forth in Table 11 in the Appendix, resulted in factor two in the three-factor solution being divided into two separate factors. The additional factor in each of the four- and five-factor solutions was
Table 7. Eigenvalues and corresponding proportion of common variance.

Factor    Eigenvalue    Proportion of Common Variance
   1         2.39                0.18
   2         2.21                0.17
   3         1.61                0.12
   4         1.16                0.09
   5         1.09                0.08
   6         0.92                0.07
   7         0.81                0.06
   8         0.70                0.05
   9         0.67                0.05
  10         0.55                0.04
  11         0.46                0.04
  12         0.33                0.03
  13         0.10                0.01
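The proportions in Table 7 can be checked with a short sketch. It assumes only that each proportion is the eigenvalue divided by the sum of all 13 eigenvalues; the variable names are illustrative and do not come from the source.

```python
# Eigenvalues as printed in Table 7.
eigenvalues = [2.39, 2.21, 1.61, 1.16, 1.09, 0.92, 0.81,
               0.70, 0.67, 0.55, 0.46, 0.33, 0.10]

total = sum(eigenvalues)

# Proportion of variance for each factor: eigenvalue / sum of eigenvalues.
proportions = [ev / total for ev in eigenvalues]

# Share of variance carried by the three retained factors.
cumulative_first_three = sum(proportions[:3])

print(round(total, 2))                          # 13.0
print([round(p, 3) for p in proportions[:3]])   # [0.184, 0.17, 0.124]
print(round(100 * cumulative_first_three, 1))   # 47.8
```

The eigenvalues sum to 13.00, the number of variables analyzed, and the first three proportions sum to the 47.8% cited in the text for the three-factor solution.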
Figure 1. Scree plot of eigenvalues.
not conceptually meaningful and was not supported by the research literature, further suggesting that three factors should be retained for rotation.

Three-Factor Solution

To determine empirically the factorial validity of the holistic writing construct, Holistic and the 12 Atomistic Writing Subskill scores were subjected to the principal-axis method of common factor analysis by means of the SAS computer procedure Factor (Sarle, 1985). Three factors were then subjected to an oblique promax rotation and an orthogonal varimax rotation. Examination of the resulting intercorrelation matrix revealed that the factor correlations were low (r's < .20), indicating that the results of an orthogonal solution could be interpreted meaningfully. The final communality estimate and the uniqueness for each of the 13 variables submitted to factor analysis are set forth in Table 8.

The rotated factor pattern for the three-factor solution is displayed in Table 9, with the specific items that had a loading equal to or greater than the absolute value of 0.40 on any of the three factors indicated. Gorsuch (1974) suggested that, for a loading to be significant at the p < .05 level in a factor analysis with a sample of 100 individuals, only variables with loadings equal to or greater than the absolute value of 0.40 should be assigned to a factor for interpretation.

Factor one had four writing variables with positive factor loadings greater than 0.40: Agreement, Punctuation, Spelling, and
Capitalization. Factor one also had one writing variable, Handwriting, with a negative factor loading less than -0.40. Holistic had a factor loading of -0.37, which closely approached significance at the p < .05 level; Holistic was, therefore, included in this study in factor one. Factor two had four writing variables with positive factor loadings greater than 0.40: Nominal, Adjectival, Adverbial, and Words/T-unit. Factor three had three variables with factor loadings greater than 0.40: Holistic, Coherence, and Words. Only one variable, Coordination, did not load significantly on any factor. Factor one was labeled Holistic/Mechanics/Handwriting; factor two was labeled Syntax/T-unit; and factor three was labeled Holistic/Coherence/Words.

As set forth in Table 7, the total variance associated with the three factors was 47.8%. The factor identified as Holistic/Mechanics/Handwriting accounted for 18.4%; the factor identified as Syntax/T-unit accounted for 17.0%; and the factor identified as Holistic/Coherence/Words accounted for 12.4%.
Table 8. Final communality estimates (h²) and uniqueness (u²) for essay subtest scores and Atomistic Writing Subskill scores of 104 individuals sampled from the March 1985 CLAST administration.

Variable           h²      u²
Holistic          0.75    0.25
Agreement         0.34    0.66
Punctuation       0.46    0.54
Spelling          0.55    0.45
Capitalization    0.47    0.53
Nominal           0.28    0.72
Adjectival        0.56    0.44
Adverbial         0.38    0.62
Coherence         0.54    0.46
Coordination      0.08    0.92
Words/T-unit      0.90    0.10
Words             0.62    0.38
Handwriting       0.21    0.79
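The relationship between the two columns of Table 8 can be stated compactly. The notation below is introduced here for illustration and is not the author's: λ_jk denotes the loading of variable j on factor k, and the expression for the communality assumes an orthogonal (varimax) three-factor solution.

```latex
h_j^2 = \sum_{k=1}^{3} \lambda_{jk}^2 , \qquad u_j^2 = 1 - h_j^2

% Holistic, using the two loadings reported in the text (-0.37 and 0.78):
h_{\mathrm{Holistic}}^2
  = (-0.37)^2 + \lambda_{\mathrm{Holistic},2}^2 + (0.78)^2
  = 0.7453 + \lambda_{\mathrm{Holistic},2}^2 ,
\qquad
u_{\mathrm{Holistic}}^2 = 1 - 0.75 = 0.25
```

Under these assumptions, the two reported Holistic loadings account for nearly all of the tabled communality of 0.75, which would be consistent with a near-zero Holistic loading on the second factor, allowing for rounding.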
Table 9. Rotated factor pattern for three-factor solution.
CHAPTER V
DISCUSSION

The results clearly did not support the proposition that holistic scoring measured a unitary writing trait. Instead, the results suggested that the holistic score loaded on two uncorrelated factors: (a) a factor on which the holistic score, paragraph coherence, and number of words in the essay loaded significantly and (b) a factor on which writing mechanics and handwriting quality loaded significantly and on which the holistic score loaded at a level approaching significance. Furthermore, the results suggested that the holistic score was unrelated to the factor on which number of syntactic structures and number of words per T-unit loaded significantly. Finally, the results suggested that coordination was unrelated to any of the three writing constructs.

Research Implications

The significant factor loadings of holistic scoring, paragraph coherence, and total number of words in the essay on the factor identified as Holistic/Coherence/Words are supported by the research literature. McCulley (1985) found that general coherence was a significant predictor of a primary-trait measure of persuasive writing quality. Bertrand (1983), Nold and Freedman (1977), Stewart and Grobe (1979), and Thomas and Donlan (1980) determined that the
number of words in an essay was predictive of holistic scores. Chou et al. (1982) also determined that there was a significant correlation between number of sentences in an essay and holistic scores.

The finding that mechanical errors and handwriting quality loaded significantly and that holistic scores loaded at a level approaching significance on the factor identified as Holistic/Mechanics/Handwriting is also universally suggested by the research literature. The number of mechanical errors was found to be significantly correlated with holistic scores (Bertrand, 1983; Freedman, 1979; Grobe, 1981; Homburg, 1984; Stewart & Grobe, 1979; Stewart & Leaman, 1983; Veal & Hudson, 1983). Likewise, handwriting quality was determined to be significantly correlated with holistic scores (Chase, 1986; Hughes et al., 1983; McColly, 1970).

Furthermore, the finding that syntactic units and words per T-unit, but not holistic scoring, loaded highly on the factor identified as Syntax/T-unit also comports with the weight of the research literature. Freedman (1979), Neilson and Piche (1981), Stewart and Grobe (1979), and Stewart and Leaman (1983) concluded that syntactic complexity was not significantly correlated with the holistic score. The results of the instant study did not, however, substantiate the findings of Homburg (1984) and Nold and Freedman (1977) that there was a significant correlation between syntactic complexity and the holistic score. In that the subjects for Homburg's study were non-native speakers of English and in that the
graders were teachers of non-native speakers of English, Homburg's study suggested a different agenda for holistic scoring on the part of teachers of non-native speakers of English. As to the inclusion of number of words per T-unit in the factor identified as Syntax/T-unit, the number of words per T-unit was found not to be significantly correlated with holistic scores (Grobe, 1981; Nold & Freedman, 1977; Stewart & Grobe, 1979; Stewart & Leaman, 1983).

The finding that Coordination did not load on any of the three factors in this study runs contrary to the conclusion of Homburg (1984) in his study of 30 writing samples by non-native speakers of English. Homburg found that the number of coordinating conjunctions per composition was a significant predictor of holistic scores for non-native English speakers. Homburg's findings again suggested a different agenda for holistic scoring by teachers of non-native speakers of English.

The results of the instant study were, therefore, consistent with reported research literature, at least to the extent that such research literature was based on writings of native speakers of English. Because of the reliance on regression analysis, there is an implication in prior research literature that a unitary writing construct is measured by the holistic score, and that this construct is composed of the subskills that correlate with the holistic score. Furthermore, although they investigated a different set of atomistic writing subskill scores from those included in the instant study, both Freedman (1981) and Chapman et al. (1984) suggested that
the holistic score loaded on a single factor. The results of the instant study, however, clearly suggested that not one but two distinct writing traits are measured by holistic scoring, namely the factor identified as Holistic/Coherence/Words and the factor identified as Holistic/Mechanics/Handwriting.

Implications for Further Research

This factorial validity study suggests that holistic scoring loads highly on two distinct writing constructs. Factor analysis is, however, but one method available to investigate construct validity (Allen & Yen, 1979; Crocker & Algina, 1986). The factor analysis should be replicated to confirm the factorial validity of the holistic score. In addition, the relationship of the holistic score to the three writing traits should be investigated by using other methods of construct validation, including multitrait-multimethod matrix analysis, experimental studies, and comparisons of scores of defined groups.

Furthermore, different discourse modes and topic selection should be investigated by replicating the factorial validity study and by applying other methods of establishing construct validity. It may be that the factors in this study identified as Holistic/Coherence/Words and Holistic/Mechanics/Handwriting are not stable over different discourse modes and topics (Cooper, 1977; Lloyd-Jones, 1977; Moss et al., 1982; Odell & Cooper, 1980; Quellmalz et al., 1982).
Further studies of construct validity should also be done to determine whether the results of this study on college freshmen and sophomores are stable across different age groups. Additionally, in that Benton and Kiewra (1986) and Freedman (1979) suggested that overall content and overall organizational ability were significantly correlated with holistic scoring, researchers should attempt to incorporate overall content and overall organization into future construct validity studies.

Finally, in that Holistic appeared to be more strongly related to the factor identified as Holistic/Coherence/Words (Holistic factor loading = 0.78) than to the factor identified as Holistic/Mechanics/Handwriting (Holistic factor loading = -0.37), future researchers should determine whether the different levels of factor loadings reported in the instant study are the result of greater variation in the former than in the latter variables or whether the CLASP scorers in fact paid more attention to the former than to the latter variables.
APPENDIX
ROTATED FACTOR PATTERNS FOR FOUR- AND FIVE-FACTOR SOLUTIONS
Table 10. Rotated factor pattern (varimax rotation) for four-factor solution.
Table 11. Rotated factor pattern (varimax rotation) for five-factor solution.
REFERENCES

Allen, M.J., & Yen, W.M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole Publishing Company.

Benton, S.L., & Kiewra, K.A. (1986). Measuring the organizational aspects of writing ability. Journal of Educational Measurement, 23, 377-386.

Bertrand, C.V. (1983, November). Factors in holistic ratings of children's writing. Paper presented at the Annual Meeting of the National Reading Conference, Austin, TX (ERIC Document Reproduction Service No. ED 240 605).

Cattell, R.B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.

Cattell, R.B., & Jaspers, J.A. (1967). A general plasmode (No. 30-10-5-2) for factor analytic exercises and research. Multivariate Behavioral Research Monographs, 67 (Serial No. 3).

Chapman, C.W., Fyans, L.J., Jr., & Kerins, C.T. (1984). Writing assessment in Illinois. Educational Measurement: Issues and Practice, 3, 24-26.

Charney, D. (1984). The validity of using holistic scoring to evaluate writing: A critical overview. Research in the Teaching of English, 18, 65-81.

Chase, C.I. (1986). Essay test scoring: Interaction of relevant variables. Journal of Educational Measurement, 23, 33-41.

Chou, F.H., Kirkland, J.S., & Smith, L.R. (1982). Variables in college composition. Augusta, GA: Augusta College (ERIC Document Reproduction Service No. ED 224 017).

Cooper, C. (1977). Holistic evaluation of writing. In C. Cooper & L. Odell (Eds.), Evaluating writing (pp. 3-31). Urbana, IL: National Council of Teachers of English.

Crocker, L. (1987). Assessment of writing skills through essay tests. In D. Bray & M.J. Belcher (Eds.), Issues in student assessment (pp. 56-64) (New Directions for Community Colleges, No. 59). San Francisco, CA: Jossey-Bass.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York, NY: Holt, Rinehart and Winston.

Diederich, P.B., French, J.W., & Carlton, S.T. (1961). Factors in judgments of writing ability (Research Bulletin RB-61-15). Princeton, NJ: Educational Testing Service.

Florida Department of Education. (1980). Procedures for conducting holistic scoring for the essay portion of the College-Level Academic Skills Test. Tallahassee, FL: College-Level Academic Skills Project.

Florida Department of Education. (1984). CLAST test administration plan, 1984-85. Tallahassee, FL: College-Level Academic Skills Project.

Florida Department of Education. (1986a). CLAST technical report, 1984-85. Tallahassee, FL: College-Level Academic Skills Project.

Florida Department of Education. (1986b). Student achievement of college-level communication and computation skills in Florida: 1985-86. Tallahassee, FL: College-Level Academic Skills Project.

Florida Department of Education, 3 Fla. Admin. Code 6A-10.31 (1987).

The Florida School Code, Fla. Stat. § 229.551(3)(k) (1987).

Freedman, S.W. (1979). How characteristics of student essays influence teachers' evaluations. Journal of Educational Psychology, 71, 328-338.

Freedman, S.W. (1981). Influences on evaluators of expository essays: Beyond the text. Research in the Teaching of English, 15, 245-255.

Gorsuch, R.L. (1974). Factor analysis. Philadelphia: W.B. Saunders.

Grobe, C. (1981). Syntactic maturity, mechanics, and vocabulary as predictors of quality ratings. Research in the Teaching of English, 15, 75-85.

Homburg, T.J. (1984). Holistic evaluation of ESL compositions: Can it be validated objectively? TESOL Quarterly, 18, 87-107.

Hughes, D.C., Keeling, B., & Tuck, B.F. (1983). Effects of achievement expectations and handwriting quality on scoring essays. Journal of Educational Measurement, 20, 66-70.
Hunt, K. (1965). Grammatical structures written at three grade levels (Research Report No. 3). Champaign, IL: National Council of Teachers of English.
Lloyd-Jones, R. (1977). Primary trait scoring. In C. Cooper & L. Odell (Eds.), Evaluating writing (pp. 33-66). Urbana, IL: National Council of Teachers of English.
McColly, W. (1970). What does educational research say about the judging of writing ability? Journal of Educational Research, 64, 147-156.
McCulley, G.A. (1985). Writing quality, coherence, and cohesion. Research in the Teaching of English, 19, 269-282.
Mellon, J.C. (1969). Transformational sentence-combining: A method for enhancing the development of syntactic fluency in English composition (Research Report No. 10). Urbana, IL: National Council of Teachers of English.
Moss, P.A., Cole, N.S., & Khampalikit, C. (1982). A comparison of procedures to assess written language skills at grades 4, 7 and 10. Journal of Educational Measurement, 19, 37-47.
Mullis, I.V.S., & Mellon, J.C. (1980). Guidelines for describing three aspects of writing: Syntax, cohesion, and mechanics. Denver, CO: National Assessment of Educational Progress.
National Assessment of Educational Progress. (1980). Writing achievement, 1969-79: Results from the third national writing assessment. Denver, CO: National Assessment of Educational Progress.
Neilson, L., & Piche, G.L. (1981). The influence of headed nominal complexity and lexical choice on teachers' evaluation of writing. Research in the Teaching of English, 15, 65-74.
Nold, E.W., & Freedman, S.W. (1977). An analysis of readers' responses to essays. Research in the Teaching of English, 11, 164-174.
Odell, L., & Cooper, C. (1980). Procedures for evaluating writing: Assumptions and needed research. College English, 42, 35-43.
Quellmalz, E.S., Capell, F.J., & Chou, C. (1982). Effects of discourse and response mode on the measurement of writing competence. Journal of Educational Measurement, 19, 241-258.
Sarle, W.S. (1985). Factor. In SAS Institute Inc., SAS® user's guide: Statistics, version 5 edition. Cary, NC: SAS Institute Inc.
Stewart, M.F., & Grobe, C.H. (1979). Syntactic maturity, mechanics of writing, and teachers' quality ratings. Research in the Teaching of English, 13, 207-215.
Stewart, M.F., & Leaman, H.L. (1983). Teachers' writing assessments across the high school curriculum. Research in the Teaching of English, 17, 113-125.
Thomas, D., & Donlan, D. (1980, March). Correlations between holistic and qualitative methods of evaluating student writing, grades 4-12. Paper presented at the combined Annual Meeting of the Conference on English Education and the Secondary School Conference, Omaha, NE (ERIC Document Reproduction Service No. ED 211 976).
Veal, L.R., & Hudson, S.A. (1983). Direct and indirect measures for large-scale evaluation of writing. Research in the Teaching of English, 17, 290-296.
BIOGRAPHICAL SKETCH

Gregory K. West was born in Morgantown, West Virginia, on May 26, 1949. He completed a B.A. in English literature at Ohio State University in 1971, an M.A. in classical Greek at Ohio State University in 1975, an M.A. in linguistics at Ohio University in 1976, and a J.D. at the University of Florida in 1983. Prior to receiving the J.D., he was an instructor in the English department of Ohio State University from 1976 through 1979, teaching composition, grammar, and technical writing. After completing the J.D., he was appointed judicial law clerk to the Honorable Howell W. Melton, United States District Judge, Middle District of Florida, serving from 1983 through 1985. Since 1985 he has practiced law in the area of tax-exempt governmental finance and is currently associated with the firm of Mahoney Adams Milam Surface & Grimsley, P.A., Jacksonville, Florida. He has published articles on the technical writing of both native and non-native speakers of English in TESOL Quarterly, The Journal of Technical English, and elsewhere.
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

James Algina, Chairperson
Professor of Foundations of Education

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Linda Crocker
Professor of Foundations of Education

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Michael Y. Nunnery
Professor of Educational Leadership

This dissertation was submitted to the Graduate Faculty of the College of Education and to the Graduate School, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December 1988

Chairperson, Foundations of Education

Dean, College of Education

Dean, Graduate School
|
|