Title: Determining instructional reading level
Permanent Link: http://ufdc.ufl.edu/UF00098081/00001
 Material Information
Title: Determining instructional reading level an investigation of the relationship among standard cloze tests, multiple choice cloze tests and the informal reading inventory
Physical Description: ix, 151 leaves : ; 28 cm.
Language: English
Creator: Homan, Susan Lubet
Publication Date: 1978
Copyright Date: 1978
 Subjects
Subject: Reading ability -- Testing   ( lcsh )
Cloze procedure   ( lcsh )
Miscue analysis   ( lcsh )
Curriculum and Instruction thesis Ph. D
Dissertations, Academic -- Curriculum and Instruction -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis--University of Florida.
Bibliography: Bibliography: leaves 145-149.
General Note: Typescript.
General Note: Vita.
Statement of Responsibility: by Susan Lubet Homan.
 Record Information
Bibliographic ID: UF00098081
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000095753
oclc - 06349607
notis - AAL1185

Full Text










DETERMINING INSTRUCTIONAL READING LEVEL:
AN INVESTIGATION OF THE RELATIONSHIP AMONG
STANDARD CLOZE TESTS, MULTIPLE CHOICE CLOZE TESTS
AND THE INFORMAL READING INVENTORY








By

Susan Lubet Homan










A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY








UNIVERSITY OF FLORIDA

1978













ACKNOWLEDGEMENTS


I wish to express my sincere gratitude to all members

of my committee.

Dr. Powell, chairman of the advisory committee, has

been helpful in several ways. His support and confidence in

my abilities have been of tremendous help. His guidance

through the various stages of writing the dissertation has

been invaluable. More importantly, his example and attitudes

as a professional will have a lifelong influence on the

quality of my professional life.

My special thanks also to Dr. Linda Crocker for her

assistance and advice. It was especially appreciated that

she was always willing to help me though often surrounded by

other pressing responsibilities.

Both Dr. William Ware and Dr. Lawrence Smith contin-

ually offered advice and support. Dr. Ware was of special

help with the statistical analysis.

My thanks go to Dr. Lewis who offered his assistance

and willingly served on my committee though his commitments

were already overwhelming.

Many thanks to Dr. Maria Llabre whose help with the

data analysis was lifesaving, and whose special friendship

was invaluable.










Also appreciated was the help of Dr. Jeri Benson. Her

willingness to help with the finer details of the data analysis,

regardless of other commitments, was especially helpful.

To Mary Garcia, I wish to express thanks for typing

all of the early drafts, and for her continual encouragement.

I owe a great deal to Richard, my husband, for his

compassionate understanding, his willingness to help, and

his never faltering faith in my ability.

To my parents, I owe an exceptional debt of thanks for

providing an atmosphere that encouraged my continuing edu-

cation, and for their continuous and unswerving support and

confidence in all my life decisions.













TABLE OF CONTENTS

PAGE

ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . iii

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . vii

CHAPTER I INTRODUCTION . . . . . . . . . . . . . . . . . 1

    Statement of the Problem . . . . . . . . . . . . . . 5
    Objectives and Research Questions . . . . . . . . . . 6
    Definition of Terms . . . . . . . . . . . . . . . . . 9
    Limitations of the Study . . . . . . . . . . . . . . 12

CHAPTER II REVIEW OF RELATED LITERATURE . . . . . . . . . 14

    History and Development . . . . . . . . . . . . . . . 14
    Methodological Considerations . . . . . . . . . . . . 16
    Reliability . . . . . . . . . . . . . . . . . . . . . 21
    Criteria . . . . . . . . . . . . . . . . . . . . . . 22
    Measurement of Reading Comprehension . . . . . . . . 24
    Summary of Research . . . . . . . . . . . . . . . . . 29

CHAPTER III PROCEDURES AND METHODOLOGY . . . . . . . . . 31

    Sample . . . . . . . . . . . . . . . . . . . . . . . 31
    Instruments Used . . . . . . . . . . . . . . . . . . 32
    Instrument Development: Original Passages . . . . . . 33
    Other Instruments . . . . . . . . . . . . . . . . . . 37
    Administration . . . . . . . . . . . . . . . . . . . 41
    Method of Analysis . . . . . . . . . . . . . . . . . 41

CHAPTER IV RESULTS, DISCUSSION AND THEORETICAL
    CONSIDERATIONS . . . . . . . . . . . . . . . . . . . 47

    Results and Discussion . . . . . . . . . . . . . . . 47
    Theoretical Considerations . . . . . . . . . . . . . 85

CHAPTER V SUMMARY AND CONCLUSIONS . . . . . . . . . . . . 89

APPENDIX A DIRECTIONS FOR THE CLOZE TESTING . . . . . . . 96

APPENDIX B STANDARD CLOZE PASSAGES . . . . . . . . . . . 99

APPENDIX C HOMAN MULTIPLE CHOICE CLOZE PASSAGES . . . . . 107

APPENDIX D KIDDER MULTIPLE CHOICE CLOZE PASSAGES . . . . 120

APPENDIX E INFORMAL READING INVENTORY . . . . . . . . . . 129

LIST OF REFERENCES . . . . . . . . . . . . . . . . . . . 145

BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . 150














Abstract of Dissertation
Presented to the Graduate Council
of the University of Florida
in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy



DETERMINING INSTRUCTIONAL READING LEVEL:
AN INVESTIGATION OF THE RELATIONSHIP AMONG
STANDARD CLOZE TESTS, MULTIPLE CHOICE CLOZE TESTS
AND THE INFORMAL READING INVENTORY

By

Susan Lubet Homan

December 1978

Chairman: William R. Powell
Major Department: Curriculum and Instruction



The relationship of a new form of cloze test, multiple

choice cloze (MCC), to standard cloze and the Informal

Reading Inventory (IRI) was explored in this study. The

intent was to provide new information that would assist the

classroom teacher in determining the instructional reading

level of all students as easily and accurately as possible.

A secondary purpose of the study involved a comparison

of MCC tests. The readability levels of the MCC passages were

determined by traditional methods and by a new system of

readability determination, Rasch calibration.









A third aspect of this study focused on placement

decisions based on the two types of scoring criteria used

with the IRI to determine instructional reading level.

Second, fourth, and sixth graders were participants in

the study. Similar results at all three grade levels suggest

generalizability of results for elementary and intermediate

grade levels.

Correlations between standard cloze and MCC were low

(ranging from r = .27 to .80), considering the same students

were given the same stories in standard cloze and MCC forms.

These results raise some question as to whether both forms

are measuring the same type of reading comprehension.

High positive correlations were found between MCC

passages with readability levels determined by traditional

formulas and MCC passages with readability levels based on

Rasch calibration, indicating that these two methods of

readability determination yield similar results.

A significant difference existed in placement of stu-

dents by the Powell and Betts IRI criteria. These differences

in placement indicate that classroom teachers should carefully

choose the IRI scoring criteria they will use based on a

conviction of accuracy of placement.

Three major implications for future research and practice

were derived from this study. There is some evidence that MCC

does not measure the same type of reading comprehension as

measured by standard cloze or the IRI. The information on











this issue is inconclusive and further study is indicated

before more specific conclusions can be reached. The rela-

tionship between standard cloze and the IRI also appears

tenuous. The scoring criteria used for the IRI can make very

significant differences in terms of accurate placement.
















CHAPTER I
INTRODUCTION


Every fall, at the start of the school year, teachers

begin the process of assessing the knowledge and abilities

of their new students. As part of this process the teacher

seeks to determine the level at which each student can

successfully read and assimilate information.

It is of vital importance that each student receive

reading instruction at the proper level. That is, the books

and materials they use in class should be at a readability

level that is neither too easy nor too hard for each indi-

vidual student. This specific level, at which materials are

challenging for the student without being frustrating, is

called the instructional reading level. In order for each

student to make optimum developmental reading progress, all

materials and books should be at his/her instructional

reading level (Dunkeld, 1970).

Classroom teachers continually face the problem of

identifying the instructional reading level for every child

in the class. The responses to this problem have been varied.

The commonly used ways to determine instructional level are

teacher judgement, standardized reading tests, Informal

Reading Inventories (IRI), and cloze tests (Oliver, 1970).








Teacher judgement involves the placement of students in

reading books based purely on the teacher's own subjectively

conceived ideas. Teacher judgement has the weaknesses inherent

in any measurement based totally on one person's opinion. It

is very subjective and often inaccurate. Research indicates

(Millsap, 1962) that teachers are unaware of frustration
reading level among pupils in basal readers 30 percent of the
time. Students are placed in basal readers at their instruc-

tional level only 70 percent of the time when teacher judgement

is used. Although teacher judgement could be improved and

stabilized with proper training (Millsap, 1962), this is rarely,

if ever, done.
Standardized reading tests are group performance instru-

ments. Patty (1969) concluded that the use of standardized

grade equivalent scores is not a valid basis for determining
instructional level. Standardized test norms appear to over-

estimate pupil instructional levels by at least one year

(Killgallon, 1942; Patty, 1969; Sipay, 1961; Williams, 1963).
Teachers who place students in basal readers on the basis of
standardized test scores usually place them at their frustration

reading level (Millsap, 1962).
The IRI is considered to be an accurate and often used

method of determining instructional level. However, the IRI

must be individually administered and is, therefore, time con-

suming for the classroom teacher. It takes approximately twenty
minutes to properly administer an IRI to a student.








The standard cloze test also provides an accurate measure

of instructional level. Cloze procedure as developed by Wilson

Taylor (1953) was designed as an instrument for measuring the

effectiveness of communication. Taylor described a functional

unit of cloze measurement as a successful attempt to reproduce

accurately a deleted part of a passage. The decision concerning

the word to be reproduced was made from the remaining context

of the passage. In effect, the reader must comprehend well

enough to predict the word that is missing. The reader contin-

uously draws from context clues to predict the nature of the

language immediately ahead (Porter, 1976).

A cloze test is developed by taking a passage of approx-

imately 250 words, leaving the first sentence intact, and then

deleting every nth word. It can be administered to the entire

class at one time and, therefore, is comparatively less time

consuming than the IRI.
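The construction procedure described above can be sketched in a few lines of code. This is a minimal illustration, not one of the study's instruments: the sample passage, the deletion interval of five, and the blank format are all assumptions chosen for the example.

```python
def make_cloze_test(passage, n=5):
    """Build a standard cloze test: keep the first sentence intact,
    then replace every nth word of the remainder with a numbered blank.
    Returns the mutilated passage and the answer key."""
    first, sep, rest = passage.partition(". ")
    words = rest.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])          # record the deleted word
        words[i] = "____({})".format(len(answers))
    return first + sep + " ".join(words), answers

# Illustrative passage (not a passage from this study).
passage = ("The horse lives on a farm. It eats hay and oats every day "
           "and sleeps in a warm barn when the weather turns cold.")
test, key = make_cloze_test(passage, n=5)
print(key)  # ['oats', 'in', 'the']
```

Scoring the completed test then amounts to comparing each student response against the answer key, which is where the exact-word scoring rules discussed later come into play.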

The cloze test, however, has not gained wide acceptance

by classroom teachers. This is due to several factors. The

cloze test is often very frustrating for the students. Also,

all the blank spaces in a standard cloze passage are sometimes

anxiety provoking. Students who miss 50 percent of the items

feel they have failed or done poorly. However, according to

all existing cloze criteria, 50 percent correct indicates the

passage is at that student's instructional reading level

(Alexander, 1968; Bormuth, 1967; Bormuth, 1968; Rankin &

Culhane, 1969).









Teachers also object to standard cloze tests because of

the scoring procedure. Only exact word replications are

accepted as correct. Synonyms are rejected. Many teachers

feel that a synonym still indicates comprehension and should

be scored as a correct answer.

Standard cloze tests must also be hand scored which

many teachers find cumbersome and time consuming. Some criti-

cism of the standard cloze procedure centers around the argu-

ment that more than comprehension is necessary to complete a

cloze unit. Even though a passage may be basically understood,

a student might not be able to actually produce, or write in,

the missing words. Especially in the case of students who

speak in a dialect, or for whom English is a second language,

standard cloze test scores might easily underestimate their

true instructional level. It might be hard for those students

to make an exact word replication even if they fully comprehend

the passage (Porter, 1976).

Many of the aforementioned problems could be eliminated

by a multiple choice cloze (MCC) test. Multiple choice cloze

tests are based on the principles of standard cloze. The

important difference between the two forms is that instead of

leaving every nth word blank, three to five possible answers

are given for each cloze unit. The student's task is to iden-

tify the correct word, rather than produce it. Since this

involves circling or checking the correct word, rather than

producing the right answer, it is much less frustrating to the

student. The multiple choice format would also allow for much









simpler scoring with either machine scoring or the use of an

overlay.

It currently takes teachers approximately three weeks

to place students accurately in reading books. The student's

enthusiasm for learning during the first three weeks of school

is probably at its highest point. A more efficient method of

determining instructional level would allow both teachers and

students to capitalize on this eagerness to learn. Using MCC

tests students could be placed in reading groups by the end

of the first week of class, if not sooner.


Statement of the Problem


The primary purpose of this study was to investigate the

relationship among standard cloze, MCC, and the IRI. If a

close relationship among all three variables had existed, the

second intention was to use standard cloze as a base to set

up a viable criterion for MCC test scores in regard to instruc-

tional reading level. Multiple choice cloze tests could then

be used by classroom teachers to determine the instructional

reading levels of their students.

A third purpose was to investigate the validity of MCC

tests as a measure of instructional reading level. The IRI

was used as a base in determining this aspect of the validity

of MCC tests.

Another consideration was a comparison of the IRI

instructional level scores as scored by the Powell (1969,

1978) and Betts (1950) scoring criteria.








An additional variable was readability. Passages graded

first through seventh grade level by traditional readability

formulas were compared to a recent innovation in readability,

Rasch calibration. Two forms of MCC tests were compared; the

readability of one set was determined by traditional formulas

(Dale-Chall, Harris-Jacobson, and Spache), and the other set's

readability levels were Rasch calibrated.


Objectives and Research Questions

Listed below are the objectives of this study. Immedi-

ately following each objective are the specific questions that

will be studied.

The first three questions are preliminary to the empir-

ical validation of the instruments. They offer empirical

evidence of certain underlying assumptions; namely, that MCC

tests are different from standard cloze tests, that there is

a difference in passage readability at each level, and that

there is an interaction between different test forms and

different levels.

Objective I
To investigate the similarity between standard cloze

and MCC tests for the same passages, with the same students.

Question IA
Will there be differences between the mean scores for

standard cloze and MCC passages for the same group of students?









Question IB

Will there be differences among the mean scores on the

three levels of difficulty (readability levels) tested for

each group?

Question IC

Will there be an interaction between the form used

(standard cloze or MCC) and the level of difficulty (reada-

bility level) of the passage?

Objective II

To examine the strength of the relationship between

standard cloze and MCC passages for the purpose of determining

whether standard cloze scores can be used to predict MCC

scores.

Question II

Will a correlation of .70 or more exist between standard

cloze and MCC? The minimal criterion value of .70 was chosen

for practical significance because this would indicate the

shared variance of the two measures was close to 50 percent.
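The shared-variance reasoning behind the .70 criterion is the standard coefficient-of-determination calculation, shown here as a one-line sketch:

```python
# Squaring a correlation coefficient gives the proportion of
# variance the two measures share (the coefficient of determination).
r = 0.70
shared_variance = r ** 2
print(round(shared_variance, 2))  # 0.49, i.e., close to 50 percent
```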

Objective III

To investigate the relationship among standard cloze

scores, MCC scores, and the IRI.

Question IIIA

Will there be a relationship between standard cloze

scores and IRI scores?

Question IIIB

Will there be a relationship between MCC scores and IRI

scores?









Question IIIC

Will the correlations be greater between MCC scores and

IRI instructional level scores or between standard cloze

scores and IRI instructional level scores?

Objective IV

To determine the reliabilities of the standard cloze,

MCC, and Kidder cloze passages.

Question IV

Will multiple choice cloze passage reliabilities be

the same as standard cloze passage reliabilities at the same

level?

Objective V

To study the relationship between two forms of MCC tests,

Homan MCC and Kidder MCC.

Question V

Will there be a relationship between total Homan MCC

scores and total Kidder cloze scores?

Objective VI

To investigate the relationship between IRI's instruc-

tional level scores as scored by the Powell criteria and Betts

criteria.

Question VIA

Will there be a relationship between IRI's instructional

level scores as scored by the Powell and Betts criteria?

Question VIB

Will there be a difference between instructional level

means as scored by the Powell criteria and Betts criteria?









Question VIC

When the Powell and Betts criteria do not place students

at the same instructional level, will a significantly greater

proportion of students be placed at a higher instructional

reading level by the Powell criteria than the Betts criteria?

Question VID

Will students be placed at the same instructional level

by the Powell and Betts IRI criteria at least 75 percent of

the time?

Definition of Terms


Betts Criteria

The Betts criteria are the traditional standard set up

by Emmett Betts (1950) for determining instructional reading

level from an IRI. The Betts instructional level criteria are:

Book Level        Word Pronunciation        Comprehension

All               95% to 98%                75% to 89%

Frustration Level

Frustration level is the reading level at which the

individual is "thwarted or baffled by the language (i.e.,

vocabulary, structure, sentence length) of the materials"

(Betts, 1950, p. 152).

Independent Level

Independent level is the highest reading level at which

the individual can read with full understanding and freedom

from mechanical difficulties (Betts, 1950).









Informal Reading Inventory (IRI)

An IRI is a series of graded passages with comprehension

questions for each passage. It is used to assess a student's

level of reading (instructional, frustration, independent).

Instructional Level

Instructional level is the highest reading level at

which systematic instruction can be initiated (Betts, 1950).

It is the reading level at which the student is challenged by

the material without being frustrated.

Multiple Choice Cloze (MCC)

Multiple choice cloze tests are constructed as standard

cloze tests and based on the same underlying principles.

Every nth word is deleted from a passage of approximately 250

words. Each blank space or item contains the correct answer

plus two to four distractors.

Multiple choice cloze can be easily scored by machine or

by hand using an overlay. The MCC test retains the best points

of standard cloze tests while reducing student anxiety and

enhancing ease of scoring. Evidence also seems to indicate it

is a more valid measure of reading comprehension than standard

cloze tests (O'Reilly & Streeter, 1977; Porter, 1976).
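The overlay or machine scoring described above reduces to counting exact matches between the options a student chose and an answer key. The sketch below is illustrative only: the four-item test, the option letters, and the function name are assumptions, not materials from this study.

```python
def score_mcc(responses, key):
    """Score a multiple choice cloze test against an answer key,
    the hand-overlay or machine equivalent: count matching options
    and express the result as a percentage."""
    correct = sum(1 for r, k in zip(responses, key) if r == k)
    return 100.0 * correct / len(key)

# Hypothetical 4-item MCC test; each entry is the option a student circled.
answer_key = ["b", "a", "c", "b"]
student = ["b", "a", "d", "b"]
print(score_mcc(student, answer_key))  # 75.0
```

Because the student only selects among given options, no judgment about synonyms or spelling is ever required, which is exactly the objectivity advantage the text attributes to MCC scoring.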

Powell Criteria

The Powell criteria are the differentiated criteria set

up by William Powell (1969, 1978) for determining instruc-

tional reading level from an IRI:









Book Level        Word Pronunciation        Comprehension

PP-2              87% to 94%                55% to 80%

3-5               92% to 96%                60% to 85%

6+                94% to 97%                65% to 90%
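Powell's differentiated criteria lend themselves to a simple band lookup. The sketch below applies the ranges programmatically; the dictionary structure and function name are illustrative assumptions, not part of the study's procedures.

```python
# Powell's differentiated instructional-level criteria, keyed by
# book-level band: (word pronunciation range, comprehension range).
POWELL = {
    "PP-2": ((87, 94), (55, 80)),
    "3-5":  ((92, 96), (60, 85)),
    "6+":   ((94, 97), (65, 90)),
}

def at_instructional_level(band, word_pron, comprehension):
    """Return True when both percentage scores fall inside Powell's
    instructional ranges for the given book-level band."""
    (wp_lo, wp_hi), (c_lo, c_hi) = POWELL[band]
    return wp_lo <= word_pron <= wp_hi and c_lo <= comprehension <= c_hi

print(at_instructional_level("3-5", 94, 70))  # True
print(at_instructional_level("6+", 90, 70))   # False: word pronunciation too low
```

Note how the acceptable word-pronunciation range loosens for the earliest books and tightens toward grade six and beyond, which is what distinguishes these criteria from the single Betts standard above.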

Semantically Correct Word

A semantically correct word is similar in meaning to the

deleted word in a functional cloze unit. It would make sense

in context if substituted for the deleted word.

Example. The horse ____ over the tree. Correct

answer: JUMPED; semantically correct answer: FELL.

Standard Cloze Test

A cloze passage is any passage which has every nth word

deleted. No deletions are made in the first one or two sen-

tences to allow the theme of the passage to be established.

The reader's task is to fill in the blanks with an exact repli-

cation of the missing word. Cloze tests are distinguished

from completion tests by the fact that cloze test deletions

are made using a set of mechanically objective and prespecified

rules, while the deletions in completion tests may be made

using subjective concepts.

Syntactically Correct Word

A syntactically correct word is the appropriate part of

speech (noun, verb, function, modifier) needed to complete a

functional cloze unit. It is the same part of speech as the

correct word, but does not necessarily make sense if substi-

tuted for the correct word.











Example. The ____ jumped over the tree. Correct

answer: HORSE; syntactically correct answer: CHAIR.


Limitations of the Study


There are five points to be considered as possible

limitations of this study. The sample used was not randomly

selected from the total population. Therefore, generalizing

from these results should be done with caution.

The second graders were tested in only two sessions.

Due to the age, attention span, and slow rate of speed at

which second graders work, many of them tired and/or gave up

before they had completed all the passages. It took many

second graders an hour to complete the passages when the

standard cloze and Kidder cloze passages were administered

together.

Three limitations involved the IRI. The comprehension

questions for the IRI were limited in number, only five per

passage. With only five questions, a student who missed one

question would automatically be at instructional level. The

student might actually be able to handle the material independ-

ently, but due to carelessness, or misunderstanding, one

question was missed. Five questions allow for little differ-

entiation between independent, instructional, and frustration

reading levels.

The students were able to reread the IRI to locate

specific information. Several of the comprehension questions

directed the student to "look in the passage for," or "find"







a specific word. This allowed for some rereading of the pas-

sage which might affect the results. However, the rereading
was only to locate specific words or information.
In addition, IRI's were given to only thirty to thirty-
three students at each grade level. Giving every student an

IRI would have presented more complete data for analysis.

However, due to the time involved in administering IRI's to
548 students, it was not feasible for this study.














CHAPTER II
REVIEW OF RELATED LITERATURE


History and Development


Wilson Taylor (1953) is credited with the development

of cloze procedure. He introduced it as ". . . a new psycho-

logical tool for measuring the effectiveness of communication"

(Taylor, 1953, p. 123). He recognized and tested its useful-

ness as a new approach to readability. Cloze procedure was

seen as ". . . a method of intercepting a message from a

'transmitter' (writer or speaker), mutilating its language

patterns by deleting parts, and so administering it to receivers

(readers and listeners) that their attempts to make the patterns

whole again potentially yield a considerable number of cloze

units" (Taylor, 1953, p. 126).

An actual cloze unit was considered any single occurrence

of a successful attempt to reproduce accurately a deleted part

of a passage or message. The method used to supply the missing

part was determined from the remaining context (Taylor, 1953).

The term "cloze" comes from the Gestalt concept of

closure (Taylor, 1957). This concept relates to the tendency

to "see" a not quite complete circle as a whole circle by

mentally completing the picture. The same concept, Taylor

assumed, held true of people trying to complete a mutilated









sentence by filling in the words that made the finished pat-

tern of language symbols fit the apparent meaning.

Ohnmacht, Weaver and Kohler (1970), in an attempt to

factor analyze cloze procedure, found only a moderate correla-

tion between cloze factors and perceptual closure factors. The

actual relationship between cloze procedure and the Gestalt

concept of closure is still open to question.
Another aspect of cloze procedure investigated by Taylor

(1953) was its use for determining readability. In a corre-
lational study using the Dale-Chall (Dale & Chall, 1948) and

Flesch (Flesch, 1948) readability formulas, cloze procedure
was shown to be an effective and reliable method of contrasting

the readability levels of various passages.

Taylor (1957) assumed that readability and comprehensi-

bility were essentially synonymous terms. He saw the readabil-

ity level as being the same as the understandability of a

passage. This would seem to be true if readability formulas
are accurate. Generally, the lower the readability level of

a passage, the easier it is to understand.
Taylor (1953) specified that cloze procedure does not
deal directly with specific meanings of words. The total con-

text and the redundancy of our language are actually involved.

Cloze procedure counts the instances of language usage corres-

pondence, and not the meanings themselves. This relates to a
measure of the likeness between a writer's patterns and the

patterns the reader is anticipating while reading. Agreement
between the writer's patterns and the reader's anticipated









patterns creates a language usage correspondence. Taylor

(1953) credited ideas from the total language context concept,

Osgood's dispositional mechanisms, and statistical random

sampling as contributing to the development of cloze procedure.
In simplistic terms, cloze procedure involves leaving

every nth word of a passage blank. The student then fills in
those blanks using the context to determine the missing word.

Cloze procedure has been and continues to be studied.

Taylor (1953) saw its potential as an approach to readability.
Believing that readability and comprehensibility were similar,

if not the same, his original studies explored heavily in

that area. Taylor developed the form of standard cloze most

often used today.

Methodological Considerations


Any passage may be used as the basis for a cloze test.

In his original study, Taylor (1953) experimented with two

types of deletions: random and every nth word. Random dele-
tions were chosen by a random number generator in accordance

with the belief that if enough words were struck out at random,

the blanks would come to represent proportionately all kinds

of words to the extent that they occur in the passage. The

second and more convenient method was every nth word deletions.

Several studies have supported the indication that every

fifth word deletion discriminates best in passages above third

grade level (Rankin, 1959; Rankin, 1965; Taylor, 1953), unless

the passage is extremely technical in which case every tenth









word deletions are recommended (Cranney, 1962; Taylor, 1953).

In a separate study, MacGinitie (1961) found that additional

context beyond five words did not help in the restoration of

missing words. He also concluded that omitting every third

word made restoration difficult.

Salzinger, Portnoy and Feldman (1962) concluded that

having six words on either side of the cloze blank did not

produce more correct guesses than leaving four words on either

side of the cloze item. They felt students were either unable

to, or simply did not, make use of context of more than five

words on either side of the cloze blank.

Fillenbaum, Jones and Rappoport (1964) did a study using

deletions every two words up to deletions every six words.

They found the greatest differences in performance between the

passages with deletions every two words and deletions every

three words. They pointed out that with deletions every two

words the students were sometimes able to guess form, but

rarely capable of replacing the words correctly.

The length of Taylor's (1953) original passages was

approximately 175 words, allowing for thirty-five deletions.

Taylor concluded that scores tended to stabilize after the

first twenty to forty words of an any-word type of deletion

passage. In a later study, Taylor (1956) recommended fifty

deletions as a suitable length for cloze tests. No statistical

support was given for the fifty deletion suggestion.

Rufener (1972) used shorter passage lengths in deference

to the young age of her subjects. The number of deletions









ranged from eighteen for the second grade level passage to

twenty for the sixth grade level passage. Rufener used

cumulative scores based on percentages to ascertain the sta-

bility and reliability of cloze scores. She found stable

scores on some passages within the range of ten to twenty

deletions, or fifty to one hundred words in the passage. She

concluded that stability of scores was more a function of the

individual cloze passage than the length of the passage.

A final question concerning cloze test construction

involves rational versus mechanical deletion of words. Mechan-

ical, also called any word deletion, refers to every nth word

deletion regardless of the grammatical form of the word.

Rational deletion involves deleting only words that have a

special grammatical function (Rankin, 1959). These words may

be only nouns, verbs, adjectives, and adverbs, as is the case

for lexical cloze. Lexical cloze involves the meaning of indi-

vidual words. This differs from what standard cloze is assumed

to actually measure, structural meaning. Structural meaning

is signaled by a system of morphological and syntactical clues

apart from words as vocabulary units.

Taylor (1953) indicated that if only important words

such as nouns, verbs, adverbs, and adjectives were deleted it

might not reflect the passage meaning accurately. Two passages

of equal length might contain a very different number of

meaning conveying words, possibly as much as double the amount

in the passage containing the fewest nouns, verbs, adverbs,









and adjectives. The effect of this on the subjects should

be included in the results of the cloze test.

Rankin (1965) discovered that mechanical deletion pro-

duced a sizeable number of nondiscriminating items which might

lower the reliability of the cloze test. However, the number

of items (deletions) in the passage could be increased,

thereby increasing the reliability of the passage. Rankin

also indicated that any word deletion cloze tests correlated

more highly with the criterion tests of prereading knowledge,

recall, and aptitude. Due to this he considered mechanical

deletions superior.

Greene (1965) compared mechanical and rational deletions.

The rational deletions produced higher reliabilities and item

discrimination. However, Greene noted that the difference in

time necessary to construct the rational cloze tests often

outweighed the advantage of a slight gain in reliability,

especially since there were no significant differences between

the mean scores of the students.

Bloomer (1966) and Louthan (1965) both investigated

various forms of rational deletions. Bloomer (1966) concluded

that "marked comprehension losses occur where the structure of

the prose and its meaning are broken by deletions of the basic

meaning carriers of the language, i.e., nouns, verbs, and

modifiers" (p. 66). His results suggest that the part of

speech deleted does have an effect on the ability of the

individual to complete cloze procedure tests.









Louthan (1965) indicated that systematic deletions of

structural function words (nouns, verbs, adjectives) would

produce a loss in reading comprehension, regardless of the

number of deletions. In a similar study, Rankin (1959)

reported that mechanical deletions correlated significantly

higher with reading comprehension sections of the Diagnostic

Reading Test than rational deletions.

In scoring cloze tests, points may be awarded for the exact

word only, or synonyms may be accepted. Taylor (1953, 1956) recom-

mended giving credit for exact word answers only since this

retains the objectivity of cloze procedure. Accepting synonyms

has been suggested as beneficial when using cloze procedure as

a teaching technique (Jongsma, 1971). However, if synonyms

are counted as correct, the subjectivity of the scorer becomes

an issue. Also, Cronbach, as paraphrased by Taylor (1953)

states that cloze tests scored objectively satisfy the assump-

tions for true scores and can be considered as such.

The areas covered by methodological considerations all

involve deletions. The literature reviewed indicates a strong

preference for using every fifth word deletion. The number of

deletions has been known to range from eighteen to over fifty

and still maintain relatively high and stable reliabilities.

The issue of mechanical versus rational deletion has

proponents on both sides. Most of the researchers favor

mechanical deletion even though that method often lowers

passage reliabilities. There is almost total agreement on

scoring by exact answer only. Once the subjectivity of the









scorer becomes a factor, the cloze scores could no longer be

considered true scores. Many of these issues are still under

investigation (O'Reilly & Streeter, 1977).

Reliability


Taylor's (1953) original study produced a cloze test

internal consistency measure of .56 using Kendall's W. Rankin

(1965) pointed out that mechanical selection of words might

produce a sizeable number of nondiscriminating items which

lower reliability. However, other studies of cloze procedure

have yielded very high reliabilities. In a validity study

involving cloze procedure, Bormuth (1969) used split-half

reliabilities and produced highly satisfactory reliability

coefficients ranging from .92 to .94. In a study involving

mechanical and rational deletion of cloze procedure, Greene

(1965) found K-R #21 reliabilities of .76 and .90. Cranney

(1973), in a study using cloze in both standard and MCC tests,

found reliabilities using the K-R #20 formula which ranged

from .83 to .93.

Using MCC tests, O'Reilly and Streeter (1977)

found very high estimates of internal consistency. The K-R #20

reliabilities ranged from .91 to .97 with a median reliability

of .96.
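For readers unfamiliar with the statistic, the K-R #20 coefficient cited in these studies can be computed directly from dichotomous (right/wrong) item data. The following sketch uses invented scores for illustration only, not data from any study cited:

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item data.
    item_scores: one list per student, one 0/1 entry per item."""
    n = len(item_scores)           # number of students
    k = len(item_scores[0])        # number of items
    # sum of p*q (proportion passing times proportion failing) per item
    pq_sum = 0.0
    for i in range(k):
        p = sum(s[i] for s in item_scores) / n
        pq_sum += p * (1 - p)
    totals = [sum(s) for s in item_scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n   # population variance
    return (k / (k - 1)) * (1 - pq_sum / var)
```

When items rise and fall together across students the coefficient approaches 1.0; when item responses are unrelated to total scores it approaches zero.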

The cloze reliability data available strongly indicated

that standard cloze and MCC tests were reliable instruments.

However, no overt comparisons were made between standard and

MCC reliabilities.









Criteria


Bormuth (1967) recognized that a frame of reference by

which to interpret cloze scores was lacking. In a landmark

study, Bormuth correlated cloze test scores with multiple
choice scores to produce workable criteria. In this study

Bormuth used one hundred fourth and fifth graders. The stand-

ard cloze test had every fifth word deleted and had fifty

items. By regressing cloze scores on multiple choice scores,

a 38 percent correct standard cloze score was equated with a

75 percent correct multiple choice score. The 38 percent was
the lower limit for instructional reading level. A standard

cloze score of 50 percent was comparable to 90 percent correct

on the multiple choice test. The standard error of the

estimate, however, was six percentage points.
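The regression step Bormuth used can be illustrated with a small least-squares sketch. The paired percentage scores below are invented for illustration; they are not Bormuth's data, and the resulting criterion is purely hypothetical:

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Hypothetical paired percentage scores (illustrative only):
cloze = [20, 30, 38, 45, 50, 60]
mc    = [50, 65, 75, 82, 90, 96]

a, b = fit_line(cloze, mc)
# Cloze score equated with 75 percent correct on the multiple choice test:
criterion = (75 - a) / b
```

Regressing one score on the other and inverting the fitted line is what yields equated cut points such as Bormuth's 38 percent cloze for 75 percent multiple choice.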
In a later study, Bormuth (1968) correlated cloze scores

with word recognition and comprehension. The Gray Oral Reading

Test was used. The cloze tests contained fifty items per test

and had every fifth word deleted. A total of 120 fourth, fifth,

and sixth graders comprised the sample, forty at each grade
level. A matching procedure was used to determine comparable

standard cloze and comprehension scores. The results indicated

cloze scores of 44 percent were comparable to comprehension

scores of 75 percent. The upper range equated cloze scores

of 57 percent to comprehension scores of 95 percent. The
cloze scores comparable to word recognition scores of 95

percent and 98 percent were 33 percent and 54 percent,









respectively. It should be noted that both of these studies

are based on relatively small samples, especially considering

the generalizability of these scoring criteria.

Using a sample of 105 fifth graders, Rankin and Culhane

(1969) replicated Bormuth's study and obtained similar results.

Fifty item cloze tests with every fifth word deleted were

administered to the students along with multiple choice tests

from the same articles. Regression equations were set up to

predict multiple choice percentage scores. The criteria that

evolved equated 41 percent correct on a cloze test to 75 per-

cent correct on the multiple choice test. Also, 61 percent

correct on cloze tests was comparable to 90 percent on the

multiple choice tests. While Rankin and Culhane recommend

that teachers interpret cloze scores by the criteria from this

study "with some degree of confidence" (Rankin & Culhane, 1969,

p. 197), it should be noted that this study involved only one

grade level.

In a separate study, Alexander (1968) set up criteria

for cloze scores based on instructional level IRI scores.

Alexander used data from 365 students in his sample from

grades four, five, and six. The number of cloze items ranged

from forty-five to ninety-two. Every fifth word was deleted.

Alexander's criteria had an instructional level range of 47

percent to 60 percent based on the IRI's instructional level

scores. The IRI instructional level scores were determined

using the Powell criteria (Powell, 1969).









All four of these studies are based on fourth only,

fifth only, or fourth, fifth and sixth grade students. There

are possible differences among these grade levels that dis-

courage direct comparison of these cloze criteria.
These standards for determining cloze instructional level

are all based on varying base criteria. However, the 75 per-

cent to 90 percent correct range seems constant, even though what

the 75 percent to 90 percent measures changes with each study,

covering the range of multiple choice tests, oral reading

tests, comprehension questioning, and the IRI.

Measurement of Reading Comprehension

Conflicting evidence exists in the literature concerning

the validity of standard cloze tests as a measure of compre-

hension. Bormuth (1969) in a factor analytic study of cloze

tests found that one factor accounted for 77 percent of the

variation in the correlation matrix. Using a principal com-

ponents analysis, Bormuth analyzed correlations among nine
cloze tests and seven multiple choice comprehension tests.

He concluded ". . . cloze tests made by deleting every fifth word

measure skills closely related or identical to those measured

by conventional multiple choice reading comprehension tests"
(Bormuth, 1969, p. 365).
Porter (1976) stated that "the fact that the ability

to predict what lies ahead depends on the ability fully to

comprehend the language being processed at any given moment





provides the justification for cloze procedure as a test of

comprehension" (p. 152).

In another factor analytic study, Horton (1974-75) used

cloze tests from science and social studies, paragraph reading

tests, and twelve tests designed to measure structure of intel-

lect. He performed several analyses and concluded that the

factors appeared to be invariant from one analysis to the

others. The study attempted to establish both construct and

concurrent validity for cloze tests. The construct was

defined as the ability to deal with the linguistic structure

of the language as related to the ability to deal with rela-

tionships among words and ideas. Horton suggested the vari-

ance shared among cloze tests and reading comprehension tests,

reading gain tests, and verbal intelligence tests are probably

a measure of the student's ability to deal with relationships

among words and ideas. The correlations established concur-

rent validity.

Weaver and Kingston (1963) also factor analyzed cloze

tests. College level students participated in the study. The

tests used were the Davis Reading Test, several subtests of

the Modern Language Aptitude Test, the STEP Listening Test,

the Ohio State Psychological Examination, and eight cloze

tests. Weaver and Kingston concluded that cloze procedure was

only moderately related to verbal comprehension. Much

specific variance was unexplained by any of the factors.









Bormuth (1969) questioned the interpretation of Weaver

and Kingston's (1963) data for four different reasons. The

subjects were college students which produced a less varied

ability distribution than most elementary school level studies.

The correlations on which they based their calculations were

different from those obtained from other investigators. The

standardized tests used had unusual factor loadings, and the

cloze tests had inconsistencies in loading patterns among

themselves.

Carroll (1972), in a discussion of comprehension and

various tests of comprehension, stated, "Cloze technique in

its usual form is too crude to permit measuring the degree to

which the individual comprehends particular lexical or gram-

matical cues, or possesses a knowledge of specified linguistic

rules" (p. 19). He suggests there is no clear evidence that

cloze scores measure the ability to comprehend major ideas or

concepts that run through a discourse.

The evidence cited by Carroll as leading to this conclu-

sion is the Weaver and Kingston factorial study (1963) pre-

viously discussed, a study by Coleman and Miller (1968), and

Rankin's (1958) study. The Coleman and Miller (1968) study,

as depicted by Carroll (1972), attempted to measure knowledge

gain from a cloze passage. A distinction between knowledge

gain and comprehension or understanding is arguable. Rankin

(1959) concluded scores based on the deletions of nouns and

verbs seem to measure something different from cloze scores









based on deletion of function words. This conclusion has no

direct implication for the ability of a mechanically

deleted cloze passage to measure reading comprehension.

Therefore, Carroll (1972) does not seem to have strong or

clear supportive evidence for his statements.

In another study using college students (Anderson, 1974),

different conclusions were reached. Anderson found cloze

tests were valid measures of comprehension. Anderson also

noted that there were more correct responses on the cloze

tests when an MCC format was used.

Porter (1976) expressed his view that standard cloze

tests do not measure what they claim to measure. Although

comprehension of the passage is necessary to the successful

completion of a standard cloze passage, that alone is not

enough. A student being administered a standard cloze test

must produce language to fill in each cloze space. Porter

stated "to be valid, a test should measure what it is intended

to measure, and language production is not comprehension,

neither are the two necessarily concomitant" (p. 152). A

student may understand or comprehend a cloze passage and still

be unable to predict the correct word for a cloze item. This

is especially relevant for students who speak in a dialect,

or for whom English is a second language.

Porter (1976) proposed devising a method of testing

comprehension with the advantages of standard cloze procedure,

but without necessitating language production. The solution

he suggested was an MCC test. Although MCC would take a little









longer to construct due to the added distractors, the added

test validity would be worth the additional time and effort.

Also, the process of selecting alternatives offers the possi-

bility of more control and flexibility than standard cloze

procedure possessed. This control is exemplified by being

able to choose distractors which vary according to the "depth

of linguistic attainment and fineness of stylistic discrimi-

nation of the student" (Porter, 1976, p. 159).

An additional virtue of MCC is that it can be constructed

so that an overlay can be easily used for scoring, or it can

be used with answer sheets and machine scored. O'Reilly and

Streeter (1977) also point out that an MCC test has greater

face validity as a measure of comprehension.

Guthrie (1973) correlated MCC tests with the Gates

MacGinitie comprehension and vocabulary test to measure the

validity of MCC as a test of reading comprehension. The MCC

test correlated .85 with the vocabulary section of the Gates

MacGinitie and .82 with the comprehension section. The

subjects in this study were thirty-six elementary aged normal

and disabled readers.

Cranney (1973) demonstrated that an MCC test, when corre-

lated with the Cooperative Reading Test, had a validity

coefficient that showed 25 percent shared variance with the

comprehension sections of the Cooperative Reading Test.

A recent study by O'Reilly and Streeter (1977) stated

the belief that multiple choice technique would retain many









of the good points of standard cloze procedure, such as

absence of questions, and objective item construction, while

improving its applicability as a measure of reading compre-

hension. Also, due to the format of MCC, students would

suffer less from test anxiety which should increase the

reliability of the tests. The excessive difficulty and

ambiguity of the standard cloze test would be greatly reduced.

O'Reilly and Streeter (1977) conducted a factor analytic

study of MCC tests using tests based on Bormuth's wh-item

format, the California Achievement Test, a Test-Wiseness Test,

and the Short Form Test of Academic Aptitude. They concluded

that MCC tests were a measure of reading comprehension that

was essentially independent of IQ.

The main factor of the MCC tests appeared to be one of

literal comprehension. However, there was evidence of two

other factors possibly representing other forms of

comprehension.


Summary of Research


Cloze procedure is supported by a substantial body of research

as a tool for measuring some form of reading comprehension.

A dialogue continues concerning the actual type and amount of

comprehension it measures.

Many researchers have attempted to establish criteria

by which to determine instructional reading level from cloze

tests. All of the research in that area and in most other

areas of cloze research has been limited to fourth grade








students and above. The field is open for investigating

primary and intermediate grade children to see if results

are generalizable across grade levels.

The newest area of research in cloze procedure appears

to be in the area of MCC. The relationship between this new

form of cloze and standard cloze is yet to be studied.

Information on the relationship of standard cloze scores

and the IRI is also very limited. Exploration of the rela-

tionships and interrelationships in the aforementioned areas

of MCC, standard cloze, and the IRI might well be of great

potential value to the classroom teacher. Through research

in these areas, a way to determine instructional reading

level with ease, accuracy, and reliability might yet be

forthcoming.














CHAPTER III
PROCEDURES AND METHODOLOGY


Sample


The subjects were 557 students selected from a larger

school population of second, fourth, and sixth grade pupils.

Three schools from the Pinellas County School System partici-

pated in the study: Dunedin Elementary School, Lealman Ele-

mentary School, and Mt. Vernon Elementary School. The total

second, fourth, and sixth grade population of Pinellas County

is 18,729.

These three schools were selected because they housed

first through sixth grades and the principals and faculties

were willing to help with the study. The schools were repre-

sentative of three different socioeconomic levels. This

determination was made according to data on free and reduced

lunches. Students qualify for free and reduced lunches

on the basis of family income. Therefore,

classification on this basis is justified.

Lealman Elementary School had 78 percent of its students

on the free or reduced school lunch program. This was con-

sidered the low socioeconomic school. Mt. Vernon Elementary

School had 47 percent of its students on the free or reduced

school lunch program. Mt. Vernon was considered the middle









socioeconomic school. Dunedin Elementary School had 19 per-

cent of its students participating in the free or reduced lunch

program, and was considered the high socioeconomic school.

The students were not assigned to classes on the basis

of reading ability. All students in the class were tested,

including those involved in special programs such as Special

Learning Disabilities and Emotionally Handicapped. Absentees

were not given makeup tests. Students with incomplete test

data were eliminated from the study.


Instruments Used


Each student was tested on nine passages. The investi-

gator wrote original passages ranging in readability levels

from 1.5 to approximately 7.5. Three of these passages were

presented to the student as standard cloze passages. The

second grade students received first, second, and third grade

level passages. Fourth grade students were required to com-

plete passages for grade levels three, four, and five. Sixth

grade students were tested with fifth, sixth, and seventh

grade level passages.

The students were also tested with the same three pas-

sages or stories in an MCC format (Homan Cloze).

Additional MCC passages (Kidder Cloze) written for the

New York School System and Rasch calibrated for readability

were also given to each student. Three of these passages

(also covering three grade levels) were completed by each

student. This was done for two purposes: it served as a









check on the readability level of the original passages; and

it checked on the possibility of an effect on the Homan Cloze

tests caused by the student's previously taking a standard

cloze test on the same story.

Thirty-six students were randomly selected at the second

grade level, thirty-four at the fourth grade level, and thirty-

six at the sixth grade level, to be administered IRI's. This

served as a base measure of instructional reading level. It

was used to establish the validity of the Homan Cloze and to

revalidate standard cloze as a measure of instructional

reading level.


Instrument Development: Original Passages


Purpose of Instrument

Stories of approximately 135 words were written at first

through seventh grade reading levels. These passages were

made into standard cloze and MCC tests to be used to determine

the instructional reading levels of second, fourth, and sixth

graders. Both standard cloze and MCC are considered measures

of reading comprehension which is a direct reflection of

reading level. If a student cannot comprehend material suf-

ficiently at a certain level, he/she cannot effectively work

and learn at that level of instruction. The original cloze

passages, both standard and multiple choice (Homan Cloze)

were intended to provide a measure of reading comprehension

that could be used to determine instructional reading level.

These passages are included in Appendices B and C.





Source of Items

Every fifth word was deleted from the cloze passages.

Taylor (1953, 1956) and Rankin (1965) both have supported fifth

word deletion. MacGinitie (1961) found that additional context

beyond five words did not help in restoration. In a later

study, Salzinger, Portnoy and Feldman (1962) also determined

that having six words on either side of the cloze blank did not

produce more correct guesses than having four words on either

side of the blank. They concluded that subjects either cannot

or do not utilize context of more than five words on either

side of a blank. An additional conclusion was that words pre-

dicted for every fifth blank are independent of each other.
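The mechanical every-fifth-word deletion described above is simple to express procedurally. The following sketch is illustrative only; the function name and blank format are not drawn from the study's materials:

```python
import re

def make_cloze(passage, n=5, blank="________"):
    """Mechanical (every nth word) deletion: replace every nth word
    with a blank and return the mutilated text plus the answer key."""
    words = passage.split()
    key = []
    for i in range(n - 1, len(words), n):
        # strip trailing punctuation so the key holds the bare word
        key.append(re.sub(r"\W", "", words[i]))
        words[i] = blank
    return " ".join(words), key
```

A standard cloze test scored by exact replacement simply compares each restored word against the corresponding entry in the key.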

The following list contains the number of words, number

of deletions, and readability levels of the original standard

cloze passages. The Dale-Chall readabilities are approximate

grade levels derived from Dale-Chall Formula scores (Dale &

Chall, 1948).

               Words   Deletions   Readability   Source

1st grade       100        18          1.5        Spache, 1977

2nd grade       138        23          2.5        Spache, 1977

3rd grade       126        23          3.5        Spache, 1977

4th grade       147        23          4.8        Harris-Jacobson

5th grade       126        23          5.5        Dale-Chall (converted
                                                  from Formula Score)

6th grade       133        23          6.5        Dale-Chall (converted
                                                  from Formula Score)

7th grade       124        21          7.5        Dale-Chall (converted
                                                  from Formula Score)









The following is a list of the number of words, deletions,

and readability levels of the Kidder Cloze passages:

               Words   Deletions   Readability    Source

1st grade       113        18       1.4 to 1.7    Rasch calibrated and
                                                  Spache Formula for
2nd grade        88        13          2.5        first, second, and
                                                  third
3rd grade        66        10          3.5

4th grade        65        10          4.66       Rasch calibrated and
                                                  Dale-Chall Formula
5th grade        61        10          5.42       Score for fourth,
                                                  fifth, sixth, and
6th grade        74        11          5.84       seventh

7th grade        70        12          6.38

The MCC tests at all levels had three selections for each

deleted word. One choice was the correct answer. One of the

selections was syntactically correct but semantically incorrect.

The third choice was neither semantically nor syntactically

correct.

Four parts of speech were identified for syntactic agree-

ment or nonagreement (Guthrie, 1973). These categories were

noun, verb, modifier, and function. The noun group included

nouns and pronouns. The verb group included verbs and auxil-

iary verbs. The modifier group included adjectives and adverbs,

and the function group included prepositions, articles, and

conjunctions.

The syntactically correct distractors for the first,

second, and third grade passages were selected using systematic

sampling from the Dale-Chall List of 769 words, and from the

Dale-Chall List of 3000 words for the fourth, fifth, sixth,









and seventh grade passages (Dale & Chall, 1948). In each case

the number of items in the cloze test was divided into the

number of words on the list. The word list was then divided

into equal groups using the derived quotient. If the first

word of the group was not the needed part of speech, the next

word on the list that was the correct part of speech was used.

By using the words on the Dale-Chall lists the readability

level of the distractors was controlled.

The choice that was neither syntactically nor semanti-

cally correct was selected from the passage using a random

number table. The position of the correct answer was randomly

assigned.
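The systematic sampling procedure described above can be sketched as follows. The word list, part-of-speech tags, and function name below are hypothetical stand-ins for the Dale-Chall lists and the four syntactic categories used in the study:

```python
def pick_distractors(word_list, pos_tags, needed_pos):
    """Systematic sampling of one distractor per cloze item.
    word_list and pos_tags are parallel lists; needed_pos gives the
    required part of speech for each item, in order."""
    # divide the list into equal groups, one group per item
    step = len(word_list) // len(needed_pos)
    distractors = []
    for k, pos in enumerate(needed_pos):
        i = k * step
        # if the first word of the group is not the needed part of
        # speech, advance to the next word on the list that is
        while pos_tags[i] != pos:
            i += 1
        distractors.append(word_list[i])
    return distractors
```

Because every distractor comes from a graded word list, the readability of the distractors stays controlled, as the text notes.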

Scoring
In the standard cloze passages only the exact word was

scored as correct. Synonyms were not allowed. Spelling

mistakes were permitted as long as the word was still recog-

nizable. By scoring only the exact word as correct, the

objectivity of the scoring procedure was greatly enhanced.

The MCC test items were scored as right if the correct

word was circled. If the correct answer and an incorrect

answer were circled on the same item, it was considered a

wrong answer.
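The MCC scoring rule just described can be stated compactly. In this sketch each response is represented as the set of choices the student circled; the representation and function name are illustrative, not taken from the study:

```python
def score_mcc(responses, key):
    """Score an MCC test. An item counts as right only when exactly
    the keyed choice was circled; circling the correct answer along
    with an incorrect one counts as wrong."""
    right = 0
    for circled, answer in zip(responses, key):
        if circled == {answer}:   # exactly one choice, the keyed one
            right += 1
    return right
```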

Item Analysis

An item analysis was performed using biserial correlations

between individual item scores and total scores for the passage.

A biserial correlation is used for computing the correlation

between one continuous variable and one artificially dichotomized









variable. The item scores were considered an artificial

dichotomy since other words, or answers, could be inserted

to make sense in the passage and so are "correct" in some

sense.

No items were deleted based on this item analysis.

However, this information may be of value for future research.
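For reference, a biserial correlation of the kind used in this item analysis can be computed as in the sketch below. The data in the test are invented; the study's actual computations are not reproduced here:

```python
from statistics import NormalDist, pstdev, mean

def biserial(item, totals):
    """Biserial correlation between one dichotomous item (0/1) and
    total scores, treating the item as an artificially dichotomized
    continuous variable."""
    nd = NormalDist()
    p = mean(item)                    # proportion passing the item
    y = nd.pdf(nd.inv_cdf(p))         # normal ordinate at the cut point
    m1 = mean(t for t, s in zip(totals, item) if s == 1)
    m0 = mean(t for t, s in zip(totals, item) if s == 0)
    # note: unlike point-biserial, the biserial can exceed 1.0
    return (m1 - m0) / pstdev(totals) * p * (1 - p) / y
```

An item unrelated to total score yields a coefficient near zero, which is the nondiscrimination the text's item analysis was designed to detect.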

Reliability

Internal consistency estimates were computed as meas-

ures of reliability on all passages, standard and MCC.

Validity
Standard cloze has been shown to be a valid measure of

reading comprehension (Bormuth, 1969; Rankin, 1959). Criter-

ion data were collected by administering Informal Reading

Inventories to 107 randomly selected students. Multiple

choice cloze and standard cloze scores were correlated with

IRI instructional levels to determine their validities.


Other Instruments


Some MCC passages (Kidder Cloze) taken from the Test

Development Notebook developed by the Bureau of School and

Cultural Research, Division of Research, New York State Edu-

cation Department were also administered to each student.

The readability level of these passages had been determined

in two ways. A conventional method, the Spache Readability

Formula (Spache, 1953) was used for the first, second, and

third grade passages, and the Dale-Chall Readability Formula

(Lale & Chall, 1948) was used for the fourth through seventh





grade passages. Also, these passages, along with a series of

MCC passages ranging from first through twelfth grade level,

had their readability level determined by a Rasch calibrated

scale. The Rasch-based scale is considered to be a "person-

free" estimate of passage difficulties (Kidder, 1977). That

is, calibrations derived from the Rasch model are not sample

dependent. For further information on the Rasch model, see

Hambleton and Cook (1977), or Wright and Panchapakesan (1969).
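For readers unfamiliar with the model, the Rasch model expresses the probability of a correct response as a logistic function of the difference between person ability and item difficulty; because ability and difficulty enter only through that difference, difficulty calibrations are independent of the particular sample tested. A minimal sketch:

```python
from math import exp

def rasch_p(theta, b):
    """Rasch model: probability that a person of ability theta
    answers an item of difficulty b correctly."""
    return 1 / (1 + exp(-(theta - b)))
```

When ability equals difficulty the probability is exactly one half, and it rises toward 1 as ability exceeds difficulty.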

These multiple choice passages served two purposes.

They provided a check on the readability level of the original

passages and were used to determine whether any learning took

place from the standard cloze passages which were always

administered first. Each student took three standard cloze

tests and three Homan Cloze (MCC) tests on the same stories.

They also took three MCC tests (Kidder Cloze) on different

stories.

If learning had occurred from the standard cloze tests,

the students would be expected to score higher on Homan MCC

tests than Kidder MCC tests. The standard cloze tests and

the Homan MCC tests were identical in content, differing only

in form. Therefore, learning from the standard cloze test

would very likely be reflected in a high Homan MCC test score.

Source of Items

The Kidder Cloze passages were taken as complete stories.

The investigator then deleted every fifth word using

the same procedure used for the original MCC passages. The









same procedure was also followed for selecting the two

distractors.

Reliability

Some of the Kidder Cloze passages had only ten items.

Rufener (1972) achieved stable scores after only ten to

twenty cloze deletions. Using cumulative percentage of suc-

cessful scores, most passages indicated stability of scores

after ten to fifteen deletions.

Using internal consistency as a measure of reliability,

the Kidder Cloze passages had respectable reliabilities,

ranging from .65 to .90 (see Tables 14, 15, 16). Based on

the high reliabilities of all Kidder Cloze passages and the

data from the Rufener (1972) study, it seems reasonable to

conclude that even ten cloze items can yield reliable results.

The three passage scores for each student were combined for

correlation with Homan MCC. The combined passage reliabilities

were higher than the individual passage reliabilities.

Informal Reading Inventory (IRI)

The IRI is a measure of a student's instructional

reading level. Using graded passages, a measurement is

obtained of comprehension and word recognition scores.

The IRI used in this study was developed by Alexander

(1968). It consists of a series of graded stories, each fol-

lowed by five comprehension questions (see Appendix E). The

comprehension questions included items at the literal,

inferential, vocabulary, and evaluative comprehension levels.









Alexander (1968) developed cloze criteria for determining

instructional reading level using this IRI.

The Powell criteria for determining instructional level

from IRI scores were used. However, the scores were also eval-

uated using the Betts criteria. Differences when using each

set of criteria were examined.

Thirty-six second graders, thirty-four fourth graders,

and thirty-six sixth graders were chosen randomly and admin-

istered an IRI. Informal Reading Inventory scores were used

as the criterion to validate MCC as a means of determining

instructional reading level.

Procedure

Each class at all levels had two testing sessions of

approximately thirty minutes each. The first session always

involved three standard cloze passages. In half of the

classes the Kidder MCC passages were also administered at the

first session.

The second testing session, which was always at least

one week after the first session, involved the administration

of the Homan MCC test. Half of the classes were administered

the Kidder MCC at the second testing session. The IRI's were

given individually after both testing sessions were completed

to the reduced sample (n=107).

Second grade students were administered standard cloze,

Homan MCC, and Kidder MCC passages at the first, second, and

third grade readability levels. Fourth grade students were

administered standard cloze, Homan MCC, and Kidder MCC passages









at the third, fourth, and fifth grade readability levels.

Sixth grade students were administered standard cloze, Homan

MCC, and Kidder MCC passages at the fifth, sixth, and seventh

grade readability levels.

Administration


All tests were administered by the investigator to

insure uniformity of instructions. The instructions were

given orally. A standard cloze item was demonstrated on the

chalkboard. The example for second grade was: THE DOG RAN

TO ____ DOOR. HE WANTED TO ____ IN. The fourth

and sixth grade example was: THE SWING WENT BACK

____ FORTH.

There was no time limit on the tests. The students

were permitted to ask their teacher or the investigator to

spell words, but words could not be pronounced for them.

(For full directions, see Appendix A.)


Method of Anal~ysis


All data were punched onto IBM cards for subsequent

analysis. The Statistical Package for the Social Sciences

(Nie, Hull, Jenkins, Steinbrenner & Brent, 1970) was used to

facilitate computation. The Biomedical Computer Programs,

P-Series (Dixon & Brown, 1977) was used for the analysis of

variance computation. The research hypotheses follow,

stated in null form. After each hypothesis is a brief expla-

nation of the statistical analysis utilized.









Hypothesis IA
There will be no differences between the mean scores

for standard cloze and MCC passages for the same group of

students.

Hypothesis IB
There will be no differences among the mean scores on

the three levels of difficulty tested for each group.

Hypothesis IC
There will be no interaction between the form used

(standard cloze or M1CC) and the level of difficulty (reada-

bility level) of the passage.

Analysis of variance was applied to compare the standard

cloze scores and Homan MCC scores for the same students using

a randomized block factorial design with the subjects serving

as blocks. The students' scores on the three standard cloze

tests were compared to their scores on MCC tests.

Differences due to form (standard cloze and MCC), and

level of difficulty were tested. The degree of interaction

between form and level was examined.
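As a rough illustration of this design, the randomized block computation for the Test Form main effect can be sketched as follows. The scores are synthetic (invented for illustration, not the study's data); the F ratio for Test Form is formed against the Test Form by Student term, as in the abbreviated summary tables that follow.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subj, n_forms, n_levels = 30, 2, 3

# scores[s, f, l]: synthetic data; MCC (f = 1) runs about 8 points higher
scores = rng.normal(8.0, 2.0, (n_subj, n_forms, n_levels))
scores[:, 1, :] += 8.0

grand = scores.mean()
form_means = scores.mean(axis=(0, 2))   # marginal mean per test form
subj_means = scores.mean(axis=(1, 2))   # marginal mean per student (block)
cell = scores.mean(axis=2)              # student x form cell means

# Sums of squares for the randomized block factorial (subjects as blocks)
ss_form = n_subj * n_levels * ((form_means - grand) ** 2).sum()
ss_form_by_subj = n_levels * ((cell - subj_means[:, None]
                               - form_means[None, :] + grand) ** 2).sum()

ms_form = ss_form / (n_forms - 1)
ms_error = ss_form_by_subj / ((n_forms - 1) * (n_subj - 1))
f_form = ms_form / ms_error             # Test Form F ratio
print(round(f_form, 2))
```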

These hypotheses were tested at a significance level of

a=.05. Since the assumption of symmetry for the variance-

covariance matrix was not tested, a conservative F test,

the Geisser-Greenhouse test (Kirk, 1968), was used to test

significance for both main effects and the interactions at

the second, fourth, and sixth grade level analyses.
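The effect of the Geisser-Greenhouse correction is to evaluate the obtained F against lower-bound-adjusted degrees of freedom, multiplying both df by epsilon = 1/(k-1) so that (k-1, (k-1)(n-1)) becomes (1, n-1). A brief sketch, using the grade-two values k = 3 and n = 181 (the code itself is illustrative only):

```python
from scipy import stats

k, n = 3, 181                          # difficulty levels; grade-two sample size
df1, df2 = k - 1, (k - 1) * (n - 1)    # conventional repeated-measures df
eps = 1.0 / (k - 1)                    # lower-bound (conservative) epsilon
gg_df1, gg_df2 = df1 * eps, df2 * eps  # Geisser-Greenhouse df: (1, n - 1)

crit = stats.f.ppf(0.95, df1, df2)           # usual .05 critical value
crit_gg = stats.f.ppf(0.95, gg_df1, gg_df2)  # conservative critical value
print(round(crit, 2), round(crit_gg, 2))
```

The conservative critical value is larger, so an F significant under it is significant under any sphericity assumption.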





Hypothesis II
A correlation of .70 or more exists between standard

cloze and MCC.

The strength of the relationship between standard cloze

scores and MCC scores was assessed using the Pearson Product-

Moment correlation. A preset correlational value of .70 was

used to determine the practical significance of the relationship.
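The computation behind this criterion can be sketched as follows; the paired scores below are invented for illustration only:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum())

PRACTICAL = 0.70  # preset value for practical significance

# hypothetical paired scores for the same students
sc  = [10, 6, 4, 12, 8, 7, 9, 5, 11, 6]         # standard cloze
mcc = [16, 15, 13, 20, 17, 18, 19, 14, 21, 15]  # multiple choice cloze

r = pearson_r(sc, mcc)
print(round(r, 2), r >= PRACTICAL)
```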

Hypothesis IIIA

There will be no relationship between standard cloze

scores and IRI scores.

Hypothesis IIIB

There will be no relationship between MCC scores and

IRI scores.

The Pearson Product-Moment correlation was computed to

assess the relationship between both standard cloze and MCC

to the IRI.

These hypotheses were tested at a significance level of

a=.01. The a=.01 level of significance was chosen because

the results of these tests may determine if the instruments

involved will be used for placement. When considering place-

ment of students for instructional reading level, accuracy

is extremely important. The significance level of a=.01 was

used to guard against a Type I error. All hypotheses that

relate to possible placement decisions were tested at a

significance level of a=.01.





Hypothesis IIIC
The correlations between MCC scores and IRI instruc-

tional level scores will be the same as the correlations

between standard cloze and IRI instructional level scores.

The difference between the Pearson Product-Moment

correlation coefficients of standard cloze and MCC to the IRI

was tested for significance using the following formula:


t = (r12 - r13) * sqrt[(N-3)(1 + r23)] /
    sqrt[2(1 - r12^2 - r13^2 - r23^2 + 2(r12)(r13)(r23))]

where,

r12 is the correlation coefficient of the IRI to MCC.

r13 is the correlation coefficient of the IRI to SC.

r23 is the correlation coefficient of SC to MCC.

d.f. = N-3.

This formula (Guilford & Fruchter, 1973) is used to test for

significance between correlation coefficients from the same

sample.
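A direct implementation of this test can be sketched as below. The values plugged in follow the sixth grade, fifth-level comparison reported later (Table 14); the SC-to-MCC correlation of .50 is an assumed placeholder, since that value is not given explicitly:

```python
import math

def dependent_r_t(r12, r13, r23, n):
    """t for the difference between two correlations sharing a variable
    (Hotelling, 1940; Guilford & Fruchter, 1973). df = n - 3."""
    num = (r12 - r13) * math.sqrt((n - 3) * (1 + r23))
    den = math.sqrt(2 * (1 - r12**2 - r13**2 - r23**2
                         + 2 * r12 * r13 * r23))
    return num / den

# r23 = .50 is an assumed placeholder value for the SC-MCC correlation
t = dependent_r_t(r12=-0.08, r13=0.51, r23=0.50, n=31)
print(round(t, 2))
```

With df = 28, a |t| this large would indicate a significant difference between the two dependent correlations.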

Hypothesis IV
Multiple choice cloze passage reliabilities will be the

same as standard cloze passage reliabilities at the same

grade level.
The internal consistencies of all instruments were

estimated using the Gitap Program (Baker, 1970). Hoyt relia-

bility coefficients are computed by this program. Item dif-

ficulties and biserial correlations were also provided by

this program.
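Hoyt's coefficient is obtained from a two-way persons-by-items analysis of variance, as 1 minus the ratio of the residual mean square to the persons mean square (it is algebraically equivalent to Cronbach's alpha). A minimal sketch with invented 0/1 item data:

```python
import numpy as np

def hoyt_reliability(scores):
    """Hoyt's ANOVA-based internal consistency for a persons x items
    score matrix; algebraically equivalent to Cronbach's alpha."""
    scores = np.asarray(scores, float)
    n, k = scores.shape
    grand = scores.mean()
    ss_persons = k * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_items = n * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ms_persons = ss_persons / (n - 1)
    ms_resid = (ss_total - ss_persons - ss_items) / ((n - 1) * (k - 1))
    return (ms_persons - ms_resid) / ms_persons

# hypothetical 0/1 item responses for five students on four items
x = [[1, 1, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0],
     [1, 1, 1, 1],
     [0, 0, 0, 0]]
print(round(hoyt_reliability(x), 2))
```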









Hypothesis V

There will be no relationship between total Homan MCC

scores and total Kidder Cloze scores. Total cloze scores

are the students' combined scores on all three Homan Cloze

or Kidder Cloze passages.

The relationship between Homan MCC scores and Kidder

MCC scores was assessed by the Pearson Product-Moment corre-

lation method. This hypothesis was tested at a significance

level of a=.01.

Hypothesis VIA

There will be no relationship between IRI's instruc-

tional level scores as scored by the Powell and Betts criteria.

The relationship between instructional level scores

from the IRI as scored by the Powell criteria and the Betts

criteria was assessed by computing the Pearson Product-Moment

correlation. The hypothesis was tested at a significance

level of a=.01.

Hypothesis VIB
There will be no difference between instructional level

means as scored by the Powell and Betts criteria.

The difference between instructional level means as

determined by the Powell and Betts criteria was tested by a

t-test for comparison of means. The hypothesis was tested

at a significance level of a=.01.

Hypothesis VIC
When the Powell and Betts criteria do not place students

at the same instructional level, a greater proportion of








students will be placed at a higher instructional reading

level by the Powell criteria than the Betts criteria.

Hypothesis VID
Students will be placed at the same instructional level

by the Powell and Betts IRI criteria at least 75 percent of
the time.

The number of times the Powell and Betts criteria would place

students at the same instructional level was calculated by

hand and reported in percentages. A Chi Square test was used

to test the significance of the differences in proportions

of students placed at the higher instructional level by the

Powell and Betts criteria, respectively.
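The proportion test can be sketched as a one-degree-of-freedom Chi Square against an even split; the disagreement counts below are invented for illustration:

```python
from scipy.stats import chisquare

# hypothetical counts among students the two criteria placed differently
powell_higher, betts_higher = 22, 6

# H0: disagreements split 50/50 between the two criteria
chi2, p = chisquare([powell_higher, betts_higher])
print(round(chi2, 2), round(p, 4))
```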





CHAPTER IV
RESULTS, DISCUSSION AND
THEORETICAL CONSIDERATIONS


Results and Discussions


The major purpose of this study was to investigate the

relationship between standard cloze and MCC tests. Several

hypotheses were tested in this regard.

A secondary purpose of the study involved a comparison

of readability levels as determined by traditional methods

and a new system of readability determination, Rasch cali-

bration (Kidder, 1977). A third aspect of this study involved

the IRI and the two types of scoring criteria used with the

IRI to determine instructional reading levels.

The results for the testing of each hypothesis will be

presented, followed by a discussion of those results. The

first hypothesis had the following components:

Hypothesis IA There will be no differences between

the mean scores for standard cloze and MCC passages for the

same group of students.

Hypothesis IB There will be no differences among the

mean scores on the three levels of difficulty tested for each

group.









Hypothesis IC There will be no interaction between

the form used (standard cloze or MCC) and the level of diffi-

culty (readability level) of the passages.

Mean scores and standard deviations by grade and test

are shown in Table 1.

A randomized block factorial design was used to test

the three parts of the first hypothesis. Each student was

tested on the same six measures. The factors involved in the

analysis were test form and difficulty level. Test form

consisted of two types of tests, standard cloze and MCC.

Passage difficulty consisted of three consecutive grade levels.

Separate analyses were conducted for grades two, four, and

six. The analysis of variance (ANOVA) tables for these anal-

yses are presented in Tables 2, 3, and 4, respectively.

A significant interaction between test form and passage

difficulty was present in grades two, four, and six. Hypo-

thesis IC was rejected. Because the interaction was signifi-

cant, Simple Main Effects were calculated to determine more

accurately where the significant differences were present.

Tables 5, 6, and 7 show the Simple Main Effects summary

tables for grades two, four, and six, respectively.

The tests for Simple Main Effects at all three grades

indicated all effects studied were significant. These

findings actually mean that for the level one passage, or the

lowest readability level given to each group (first grade

level for the second grade group, third grade level for the

fourth grade group, and fifth grade level for the sixth grade





















Table 1

Means on Standard Cloze and MCC for
Grades Two, Four, and Six

Grade    Test   Passage   Passage   Passage     N
                Level 1   Level 2   Level 3

Grade 2  SC       10.6      6.7       4.0      181
Grade 2  MCC      15.8     17.1      13.3      181

Grade 4  SC        9.5      7.9       6.4      168
Grade 4  MCC      20.3     19.8      17.6      168

Grade 6  SC       12.4      8.7       7.3      196
Grade 6  MCC      21.4     21.0      18.1      196





Table 2

Analysis of Variance Abbreviated Summary1
Table for Grade Two

Source                       df         SS         MS          F

Test Form                     1   18712.32   18712.32   1049.99*

Passage
  Difficulty Level            2    4019.80    2009.90    107.30*

Test Form by
  Difficulty Level            2    1297.49     648.74     32.16*

Test Form by
  Student                   180    3207.86      17.82

Difficulty Level
  by Student                360    6743.10      18.73

Test Form by
  Difficulty Level
  by Student                360    7262.51      20.17

*p < .05

1The abbreviated summary table contains only the factors
involved in the testing of Hypotheses IA, IB, and IC.










































Table 3

Analysis of Variance Abbreviated Summary1
Table for Grade Four

Source                       df         SS         MS          F

Test Form                     1   32276.84   32276.84   2118.31*

Passage
  Difficulty Level            2    1472.06     736.03    104.28*

Test Form by
  Difficulty Level            2      52.71      26.36      4.29*

Test Form by
  Student                   167    2544.59      15.24

Difficulty Level
  by Student                334    2357.51       7.06

Test Form by
  Difficulty Level
  by Student                334    2051.25       6.14

*p < .05

1The abbreviated summary table contains only the factors
involved in the testing of Hypotheses IA, IB, and IC.










































Table 4

Analysis of Variance Abbreviated Summary1
Table for Grade Six

Source                       df         SS         MS          F

Test Form                     1   33685.16   33685.16   4060.82*

Passage
  Difficulty Level            2    3545.19    1772.59    357.71*

Test Form by
  Difficulty Level            2     572.24     286.12     58.76*

Test Form by
  Student                   195    1617.56       8.29

Difficulty Level
  by Student                390    1932.60       4.96

Test Form by
  Difficulty Level
  by Student                390    1902.38       4.88

*p < .05

1The abbreviated summary table contains only the factors
involved in the testing of Hypotheses IA, IB, and IC.





















Table 5

Simple Main Effects for Grade Two

Source               df         SS         MS         F

Form at Level 1       1    2519.41    2519.41    129.93*

Form at Level 2       1    9722.03    9722.03    501.39*

Form at Level 3       1    7768.86    7768.86    400.66*

Level at SC           2    3944.00    1972.00    101.39*

Level at MCC          2    1373.38     686.69     35.31*

*p < .05





















Table 6

Simple Main Effects for Grade Four

Source               df         SS         MS          F

Form at Level 3       1    9793.44    9793.44    1066.82*

Form at Level 4       1   11916.67   11916.67    1298.11*

Form at Level 5       1   10620.00   10620.00    1156.86*

Level at SC           2     836.91     418.46      63.40*

Level at MCC          2     687.94     343.97      52.11*

*p < .05





















Table 7

Simple Main Effects for Grade Six

Source               df         SS         MS          F

Form at Level 5       1    7848.25    7848.25    1303.70*

Form at Level 6       1   14976.85   14976.85    2487.85*

Form at Level 7       1   11432.88   11432.88    1899.15*

Level at SC           2    2811.99    1406.00     285.78*

Level at MCC          2    1305.62     652.81     132.69*

*p < .05








group), significant differences were observed between student

scores on standard cloze tests and MCC tests. The same is

also true at the other two grade levels at which the students

were tested. The form of the test, standard cloze or MCC,

made a significant difference for all three grades at all

readability levels examined. Thus, Hypothesis IA was rejected.

For all three samples (second, fourth, and sixth graders)

the Simple Main Effects test also indicated significant dif-

ferences among mean scores on the three passage difficulty

levels. Therefore, Hypothesis IB was rejected. Since three

passage difficulty levels were tested on each test form, a

further test was necessary to determine exactly which level

scores were significantly different within each form.

Tukey's test for differences in means was used to make

this determination (Hays, 1973). The differences in means

were tested for significance at the .05 level. Tukey's test

results are presented in Tables 8, 9, and 10.
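Tukey's HSD is the studentized range critical value times the standard error of a mean, HSD = q(alpha, k, df) * sqrt(MS error / n); any pair of means differing by more than HSD is significant. A sketch with illustrative inputs (the pooled MS error value below is an assumption, chosen so the result lands near the HSD.05 = 1.09 reported in Table 8):

```python
import math
from scipy.stats import studentized_range

def tukey_hsd(ms_error, n_per_mean, k, df_error, alpha=0.05):
    """Honestly significant difference for k means."""
    q = studentized_range.ppf(1 - alpha, k, df_error)
    return q * math.sqrt(ms_error / n_per_mean)

# assumed pooled MS error; k = 3 difficulty levels, n = 181 (grade two)
hsd = tukey_hsd(ms_error=19.45, n_per_mean=181, k=3, df_error=360)

means = {1: 10.56, 2: 6.71, 3: 3.99}  # grade-two SC means (Table 8)
print(round(hsd, 2), abs(means[1] - means[2]) > hsd)
```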

For the second grade, on the standard cloze test and

the MCC test, all of the mean scores were significantly

different from each other. In other words, the first level

mean scores were significantly different from the second and

third level scores, and the second level mean scores were

significantly different from the third level mean scores.

For the fourth grade, the standard cloze mean scores

were all significantly different on passages of different

difficulty. The MCC mean scores on passages of different








Table 8

Tukey's Test for Differences in
Means for Standard Cloze Scores
and MCC Scores for Grade Two

Standard Cloze             Mean    vs. Level 2   vs. Level 3

Difficulty Level 1        10.56       3.85*         6.57*
Difficulty Level 2         6.71                     2.72*
Difficulty Level 3         3.99

HSD.05 = 3.31 * sqrt(MS ERROR1 / n) = 1.09

Multiple Choice Cloze      Mean    vs. Level 1   vs. Level 3

Difficulty Level 2        17.07       1.24*         3.82*
Difficulty Level 1        15.83                     2.58*
Difficulty Level 3        13.25

HSD.05 = 1.09

1MS ERROR = pooled error terms for difficulty level and
test form by difficulty level.








Table 9

Tukey's Test for Differences in
Means for Standard Cloze Scores
and MCC Scores for Grade Four

Standard Cloze             Mean    vs. Level 4   vs. Level 5

Difficulty Level 3         9.54       1.67*         3.16*
Difficulty Level 4         7.87                     1.49*
Difficulty Level 5         6.38

HSD.05 = 3.31 * sqrt(MS ERROR1 / n) = .656

Multiple Choice Cloze      Mean    vs. Level 4   vs. Level 5

Difficulty Level 3        20.33        .55          2.70*
Difficulty Level 4        19.78                     2.15*
Difficulty Level 5        17.63

HSD.05 = .656

1MS ERROR = pooled error terms for difficulty level and
test form by difficulty level.








Table 10

Tukey's Test for Differences in
Means for Standard Cloze Scores
and MCC Scores for Grade Six

Standard Cloze             Mean    vs. Level 6   vs. Level 7

Difficulty Level 5        12.44       3.77*         5.17*
Difficulty Level 6         8.67                     1.40*
Difficulty Level 7         7.27

HSD.05 = 3.31 * sqrt(MS ERROR1 / n) = .524

Multiple Choice Cloze      Mean    vs. Level 6   vs. Level 7

Difficulty Level 5        21.39        .36          3.32*
Difficulty Level 6        21.03                     2.96*
Difficulty Level 7        18.07

HSD.05 = .524

1MS ERROR = pooled error terms for difficulty level and
test form by difficulty level.









difficulty were significantly different between levels three

and five, and levels four and five.

The results for the sixth grade group for the standard

cloze test indicated significant differences between the mean

scores for all passage levels.

In the MCC form situation, there were two significant

differences between mean scores. The fifth grade level pas-

sage mean score was significantly different from the seventh

grade level mean scores, and the sixth grade level mean

scores were significantly different from the seventh grade

level mean scores.

In review, significant differences due to test form

were observed at all three grade levels. Significant differ-

ences between passages did exist for each test form between

most grade levels.

Hypothesis II stated that a correlation of .70 or more

exists between standard cloze and MCC. A correlation of .70

or more would be necessary to indicate a practical significance

of the relationship of these two variables.

Tables 11, 12, and 13 depict the correlation coefficients

for cloze tests at grades two, four, and six, respectively.

Hypothesis II predicted a strong relationship (.70 or

more) would exist between standard cloze and MCC. Hypothesis

II was not supported at grade four or grade six. For second

grade, second level, the correlation was .80. It was expected

that the scores on the identical story given to the same stu-

dents would correlate highly, despite the differences in forms.












Table 11


Second Grade Correlations of Cloze Tests

                                                     Total   Total   Total
SCT1    SCT2    SCT3    MCCT1   MCCT2   MCCT3          SC     MCC     KC

sc71 1.q .115 .17 .45 .43 .48 .59 541 .54
5022 1, .8 5 4 95 75 .62

SCT3 1. I.5 74 .2 .8-7 .62 '43
McCC1 -73 .3$ .62 .59
MCCT2~, .51 .82 .92 .7'1

MCCT3 0 .4 .78 .71

SC i1.0 .7 -63

Total10

Total 0
KG


Underline indicates practical significance (.70).












Table 12

Fourth Grade Correlations of Cloze Tests

                                                     Total   Total   Total
SCT1    SCT2    SCT3    MCCT1   MCCT2   MCCT3          SC     MCC     KC
son1 rM .75 .662 62 .60 .61 .87 .68 7
~ ha .Is9 .64 ,461( .6 94 .71 .67

SCT3 ls 1.51 .53. .5 .90 .59 .52
;4CTI ~ .77 > .65 -87 .711
IcT32 0o .76 .65 .93 .72

Yici3 ;ol .67 .91 .71

Total10



Total10
KC


Underline indicates practical significance (.70).













Table 13


Sixth Grade Correlations of Cloze Test Scores

                                                     Total   Total   Total
SCT1    SCT2    SCT3    MCCT1   MCCT2   MCCT3          SC     MCC     KC

SCTI 1.0 .65 .6;2) .46 .566 .56 1 89 .63 .62

SCT2 .591 1 39 .'17 .42 .86 .51 .51

SCT3 na I .35 .53 .50" .84 .58 .51

MCCTI 1 .:. .47 .76 .60

MCCT2 1 .6 .60 .90 73

MCCT3 1. a' .60 .811 .63

Total
SC 1.0 .67 .64

Total
MCC 1.0 (

Total
KC 1.0


Underline indicates practical significance (.70).








It should be noted, however, that correlations between total scores

over three passages were .78, .73, and .67 for grades two,

four, and six, respectively.
The fact that the correlations for the same stories

were so low, in some instances indicating a shared variance

of less than 8 percent, suggests that standard cloze and MCC

may not measure the same thing. A shared variance of less

than 8 percent indicates that less than 8 percent of the

variance in scores of MCC can be explained by variance in

standard cloze scores.
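Shared variance is simply the squared correlation, which a one-line check makes concrete (the .28 is illustrative):

```python
# Shared variance is the squared correlation; a same-passage
# correlation of .28 (illustrative) leaves the two tests sharing
# under 8 percent of their variance.
r = 0.28
shared_pct = 100 * r ** 2
print(round(shared_pct, 1))
```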

To further investigate the relationship between MCC

test scores and standard cloze test scores the correlational

information was examined using the multitrait-multimethod

matrix suggested by Campbell and Fiske (1959).

The MCC and standard cloze score correlations for the

same passage level were considered to be correlations between

measures of the same trait using different methods. These

correlations are called the convergent validity coefficients.

Campbell and Fiske reason that such correlations should be

greater than the correlations between measures of different
traits with different methods (i.e., MCC and standard cloze

score correlations for different passage levels).

The second grade correlations for the diagonal (same

passage-different method) ranged from .46 to .56. If the two
methods (MCC and standard cloze) measured the same construct,

these correlations should exceed the correlations located

within the dotted line triangles. Similarly, diagonal





correlations should at least equal the values contained within

the solid triangles (correlations between different passages

using same method). These ranged from .44 to .69. The

different method-different passage correlations ranged from

.39 to .56.

The ranges of the correlations in Table 13 were similar

enough to question whether standard cloze and MCC measure the

same or independent constructs. The second grade correlations

were so close in range that the question of whether standard

cloze and MCC both measure reading comprehension cannot be

proven or disproven.

The fourth grade test score correlations for different

method-same passage ranged from .56 to .64. The same method-

different passage correlations ranged from .62 to .79. The

different method-different passage correlations ranged from

.51 to .64. At this grade level also, the range of correla-

tions was too similar to determine independence or strong

relationship of measured constructs.

The sixth grade test score correlations for different

method-same passage ranged from .46 to .56. The same method-

different passage correlations ranged from .44 to .67. The

different method-different passage correlations ranged from .35

to .56. In this case, also, the ranges of the correlations

were too close to conclude whether standard cloze and MCC were

measuring like or different constructs. The similarities of

the range of correlations also preclude any strong statements

reflecting that standard cloze and MCC do or do not both





measure reading comprehension, or the same type of reading

comprehension. Additional study in this area might help to

clarify this situation.

Guthrie (1974) suggested the use of Maze Technique (a

MCC form developed in the same way as Homan MCC) to determine

reading comprehension and whether a book is at a suitable instruc-

tional level for a student. Standard cloze has long been

regarded as a measure of reading comprehension (Bormuth, 1969;

Horton, 1975; Rankin, 1959). Some question has been raised,

as evidenced by the current study, concerning whether or not

MCC and standard cloze tests measure the same thing. Perhaps

Guthrie's contention that Maze or MCC measure reading compre-

hension should be reevaluated and investigated further.

Informal Reading Inventory scores correlated more highly

with standard cloze scores than MCC scores (Table 14) at the

sixth grade level, again indicating that MCC or Maze may not

be measuring reading comprehension.

Guthrie (1974) suggested that the proportion of items

correct represents the student's percentage of comprehension

on any given Maze passage. In light of the present study,

this seems unlikely. Additional study is called for to deter-

mine what MCC is measuring. If it is measuring reading com-

prehension, is it a different type of comprehension than that

measured by standard cloze tests and the IRI?

The fourth grade correlations between standard cloze

and MCC scores were the highest of the three grade levels.

However, even these scores fell short of the .70 value of








practical significance. Most of the correlations at all

three grades indicated approximately 28 percent to 30 percent

shared variance.

The value of .70 was chosen because it would indicate

49 percent shared variance; that is, almost 50 percent of the

variance in MCC scores could be explained by standard cloze

scores. If they had less variance in common, it would not

seem prudent to set up a new criterion for determining instruc-

tional level with MCC from standard cloze scores.

The total score correlations were much higher than

those of the individual test scores. The total scores were

obtained by using combined total scores across the three

levels tested at each grade. For example, at the second

grade level the total standard cloze score was the combined

score on passages one, two, and three.

These total correlations were above .70 in both second

and fourth grades. The correlation neared practical signifi-

cance in sixth grade with a .67 correlation between standard

cloze and MICC scores. This suggests some value in giving

three passages to each student. The time involved in the

testing procedure would be tripled. By extending the testing

time, much of the value of standard cloze and MCC would be

lost for the classroom teacher.

Null Hypothesis III was in three parts. In the first

two parts it stated that there would be no relationship

between standard cloze scores and IRI scores or between IRI

and MCC scores.








Table 14 contains the IRI, standard cloze, and MCC

correlations for the second, fourth, and sixth grades. The

Powell differentiated criteria (Powell, 1978) were used to

score the IRI's.

Hypothesis IIIA stated there would be no relationship

between standard cloze scores and IRI scores. Hypothesis

IIIB stated the same thing for MCC scores and IRI scores.

However, in many instances, that was not the case.

In the second grade data the IRI correlated signifi-

cantly with the standard cloze level one passage and the MCC

third level passage. The other correlations were not signifi-

cant at the a=.01 level.

None of the fourth grade tests, standard cloze or MCC,

correlated significantly with the IRI scores. In the sixth

grade the fifth level MCC did not correlate significantly

with the IRI scores. The seventh level MCC scores were very

close to a significant correlation. All other sixth grade

tests did correlate significantly with the IRI scores (a=.01).

Hypotheses IIIA and IIIB were rejected for grades two

and six, but these null hypotheses were not rejected for grade

four.

The third part of Hypothesis III stated that correlations

between MCC scores and IRI instructional level scores would

be the same as correlations between standard cloze and IRI

instructional level scores. The significance of the differ-

ences in correlations was determined by using the formula for

comparison of correlation coefficients from the same sample

(Guilford & Fruchter, 1973; Hotelling, 1940).






















Table 14

Correlations of Informal Reading Inventory
Instructional Level Scores with Standard
Cloze Scores and Multiple Choice Cloze
Scores for Grades Two, Four, and Six

Grade Two     N   IRI   SCT1  SCT2  SCT3  MCCT1  MCCT2  MCCT3

IRI          33   1.0   .81   .32   .07    .16    .12    .75

Grade Four    N   IRI   SCT3  SCT4  SCT5  MCCT3  MCCT4  MCCT5

IRI          30   1.0   .25   .33   .32    .27    .21    .17

Grade Six     N   IRI   SCT5  SCT6  SCT7  MCCT5  MCCT6  MCCT7

IRI          31   1.0   .51   .52   .43   -.08    .55    .38

Underline indicates p < .01.









t = (r12 - r13) * sqrt[(N-3)(1 + r23)] /
    sqrt[2(1 - r12^2 - r13^2 - r23^2 + 2(r12)(r13)(r23))]

where,

r12 is the correlation between IRI and MCC.

r13 is the correlation between IRI and standard cloze.

r23 is the correlation between MCC and standard cloze.

d.f. is N-3.

The results of this test for significance were mixed.

In the second grade standard cloze score correlations with

the IRI were significantly different from MCC correlations

with the IRI at all three passages (a=.01). However, in pas-

sage levels one and two, standard cloze correlations were

higher, while in passage level three, the MCC correlation was

higher. This was possibly due to the extremely low scores

many second graders received on the standard cloze third grade

level passage. It was very frustrating for many of them.

Hypothesis IIIC was not accepted for all three passages

of the second grade test. It should be noted, however, that

standard close correlations were higher at levels one and

two.

The fourth grade passages showed no significant differ-

ences (a=.01) at any passage level. Therefore, Hypothesis

IIIC was accepted for the fourth grade level passages. There

was no significant difference between standard cloze correla-

tions to IRI scores and MCC correlations to the IRI.








The sixth grade results were mixed. On the fifth

grade level passage the standard cloze correlation with IRI

was significantly greater than the MCC correlation to the

IRI. However, on the other two passages (level six and seven)

the correlations were not significantly different (a=.01).

Hypothesis IIIC was not accepted for level five of the sixth

grade passages. It was accepted for levels six and seven
where no significant differences were indicated between corre-

lations of MCC and IRI scores and standard cloze and IRI

scores.

Hypothesis IV states that MCC passage reliabilities

will be the same as standard cloze passage reliabilities at

the same grade level. Tables 15, 16, and 17 list the passage

reliabilities, the range of item difficulties, biserial corre-

lations, and standard error of measurement for the second,

fourth, and sixth grade tests, respectively.
These results indicated Hypothesis IV was not accepted.

At all levels for all three grades the MCC test reliabilities

were higher than the standard cloze test reliabilities at the

same passage level. All passages at all levels showed high

reliabilities. However, the sixth grade passages, in some

instances, had lower reliabilities than the second and fourth

grade passages.
The reliabilities were a measure of internal consistency.

The high reliabilities indicated that a large number of items
on a test were measuring the same thing.













Table 15

Passage Reliabilities, Standard Errors of
Measurement, Item Difficulties, and
Biserial Correlations for all Cloze
Tests for the Second Grade

                               Range of        Range of
                               Item            Biserial
         Reliability    SE     Difficulty      Correlations

SCT1        .87        1.61    .19 - .83       .46 - 1.0a
SCT2        .89        1.56    .00 - .65       .00 -  .96
SCT3        .89        1.30    .00 - .38       .00 -  .99
HMCCT1      .93        1.04    .71 - .92       .32 - 1.0a
HMCCT2      .94        1.61    .57 - .87       .65 - 1.0a
HMCCT3      .91        1.87    .28 - .69       .50 -  .98
KCT1        .90        1.50    .54 - .86       .55 - 1.0a
KCT2        .89        1.25    .57 - .81       .61 - 1.0a
KCT3        .80        1.21    .42 - .75       .56 -  .92
TKC         .95        2.43    .42 - .86       .47 -  .98

aValue larger than 1.0 possibly due to a violation of the
assumption of a normal distribution.












Table 16

Passage Reliabilities, Standard Errors of
Measurement, Item Difficulties, and
Biserial Correlations for all Cloze
Tests for the Fourth Grade

                               Range of        Range of
                               Item            Biserial
         Reliability    SE     Difficulty      Correlations

SCT3        .83        1.69    .00 - .84       .00 -  .95
SCT4        .85        1.65    .00 - .73       .00 -  .95
SCT5        .89        1.58    .00 - .66       .00 -  .97
HMCCT3      .91        1.22    .70 - .95       .63 - 1.0a
HMCCT4      .92        1.32    .65 - .95       .68 - 1.0a
HMCCT5      .92        1.58    .49 - .87       .38 - 1.0a
KCT3        .79         .80    .73 - .95       .80 - 1.0a
KCT4        .78         .99    .63 - .91       .59 - 1.0a
KCT5        .79         .90    .58 - .94       .66 - 1.0a
TKC         .91        1.63    .58 - .95       .45 - 1.0a

aValue larger than 1.0 due possibly to a violation of the
assumption of a normal distribution.













Table 17

Passage Reliabilities, Standard Errors of
Measurement, Item Difficulties, and
Biserial Correlations for all Cloze
Tests for the Sixth Grade

                               Range of        Range of
                               Item            Biserial
         Reliability    SE     Difficulty      Correlations

SCT5        .75        1.80    .00 - .93       .00 -  .91
SCT6        .70        1.80    .01 - .94       .06 -  .84
SCT7        .69        1.65    .00 - .82       .00 -  .77
HMCCT5      .88        1.00    .68 - .99       .65 - 1.0a
HMCCT6      .92        1.06    .83 - .96       .62 - 1.0a
HMCCT7      .84        1.28    .40 - .96       .30 - 1.0a
KCT5        .67         .54    .86 - .99       .86 - 1.0a
KCT6        .82         .49    .87 - .99       .96 - 1.0a
KCT7        .65         .98    .21 - .98       .35 - 1.0a
TKC         .85        1.28    .21 - .99       .21 - 1.0a

aValue larger than 1.0 due possibly to a violation of the
assumption of a normal distribution.





In all three grades, for all passages, the Homan MCC

passages had higher reliabilities than the standard cloze

passages. Hypothesis IV was not supported. Homan MCC pas-

sage reliabilities were higher than standard cloze passage
reliabilities.

The range of item difficulties in many instances

(Tables 15, 16, and 17) was very broad. In some cases, the

MCC item difficulties were very high. The sixth grade M4CC

item difficulties for passages five and six ranged from .86

to .99. This indicates that all the items were too easy and

that almost all students were able to get almost all items

correct. When many students get very high scores it is very

difficult to differentiate between students' abilities.

The biserial correlations between the individual item

scores and the total test scores were varied. They tended

to cover the total range of possible correlations.

Hypothesis V in the null form stated that there was no

relationship between total Homan MCC scores and total Kidder

MCC scores. Pearson Product-Moment correlations were used to

determine the strength of the relationship. Table 18 presents

the results of these correlations for second, fourth, and

sixth grades.

Hypothesis V was rejected for all grade levels. A

significant relationship (a=.01) did exist between total Homan
MCC scores and total Kidder MCC scores.

Total scores were used in this correlation due to the

shortness of some of the Kidder passages. In many instances















Table 18

Correlations of Homan Multiple Choice Cloze
Total Scores and Kidder Multiple Choice Cloze
Total Scores for the Second, Fourth,
and Sixth Grades

             Grade 2  Grade 2  Grade 4  Grade 4  Grade 6  Grade 6
             Total    Total    Total    Total    Total    Total
             HMCC     KMCC     HMCC     KMCC     HMCC     KMCC

Grade 2
Total HMCC    1.0      .82

Grade 2
Total KMCC             1.0

Grade 4
Total HMCC                      1.0      .80

Grade 4
Total KMCC                               1.0

Grade 6
Total HMCC                                        1.0      .7

Grade 6
Total KMCC                                                 1.0

Underline indicates a = .01.







a test consisted of only ten items. Also, the reliabilities

for the Kidder cloze passages individually (Tables 15, 16,

and 17) were lower than the total Kidder cloze passage reli-

abilities. However, in no instance did an individual test

have a reliability lower than .65.
The Kidder cloze passages had been Rasch calibrated to

determine readability level (Kidder, 1977). High correlations

between Kidder MCC and Homan MCC passages indicate a strong

relationship between readability levels as determined by

traditional formulas (Dale-Chall, 1948; Harris-Jacobson in

Harris & Sipay, 1975; Spache, 1974) and the new method of
Rasch calibration.

The significant high correlations between Homan MCC

and Kidder MCC also suggest that no learning took place for

the students from the first testing of standard cloze pas-

sages to the second testing with Homan MCC. If the students
had experienced information gain from the standard cloze

tests, it would have been evidenced by low correlations

between Kidder MCC and Homan MCC, since half of the sample

were given the Kidder MCC passages at the first sitting, and
the other half were administered the Kidder M~CC passages at

the second sitting.

Hypothesis VIA stated that there would be no relation-

ship between IRI's instructional level scores as scored by
the Powell and Betts criteria (α=.01). Hypothesis VIB stated

there would be no difference between instructional level means

as scored by the Powell criteria and Betts criteria (α=.01).









Hypothesis VIC stated that when the Powell and Betts criteria

did not place students at the same instructional level, a

greater proportion of students would be placed at a higher

instructional reading level by the Powell criteria than the

Betts criteria (α=.01). Hypothesis VID stated that students

would be placed at the same instructional level by the Powell

and Betts IRI criteria at least 75 percent of the time.

Table 19 shows the correlations between IRI's scored by

the Powell criteria and IRI's scored by the Betts criteria at

the second, fourth, and sixth grade levels.

Hypothesis VIA was rejected at all three grade levels.

There was a significant (α=.01) relationship between IRI's

scored by the Powell criteria and IRI's scored by the Betts

criteria.

Hypothesis VIB involved the differences between instruc-

tional level mean scores when an IRI was scored by the Powell

criteria and Betts criteria. Table 20 depicts the means and

standard deviations of the Powell criteria instructional level

means and the Betts criteria instructional level means.

At all grade levels the Powell criteria instructional

level means were higher than the Betts criteria instructional

level means. The differences between the means, when tested

for significance with a t-test for means, were all significant

at the .01 and the .001 levels. In terms of placement of stu-

dents, this means an important difference exists depending on

whether the Powell or Betts criteria are used to score the IRI.
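Because each student was scored by both criteria, the difference between instructional level means is tested with a t-test for correlated (paired) means. The sketch below uses hypothetical paired placements for a handful of students; the study's raw placements are not reproduced here.

```python
from scipy.stats import ttest_rel

# Hypothetical instructional-level placements for eight students,
# each scored once by the Powell criteria and once by the Betts
# criteria; illustrative values, not the study's data.
powell = [3, 4, 2, 5, 3, 4, 2, 3]
betts = [2, 3, 2, 4, 2, 3, 1, 2]

# A positive t with small p indicates the Powell criteria place
# students at significantly higher levels on average.
t, p = ttest_rel(powell, betts)
```

A paired test is appropriate here because the two scores come from the same students, so the difference scores, not the two independent distributions, carry the comparison.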





Table 19


Correlations of the Informal Reading Inventory's
Instructional Level Scores as Scored by the
Powell Criteria and Betts Criteria for
the Second, Fourth, and Sixth Grades


             Grade 2  Grade 2  Grade 4  Grade 4  Grade 6  Grade 6
             Powell   Betts    Powell   Betts    Powell   Betts
             IRI      IRI      IRI      IRI      IRI      IRI

Grade 2
Powell IRI     1.0      .80

Grade 2
Betts IRI               1.0

Grade 4
Powell IRI                       1.0

Grade 4
Betts IRI                                 1.0

Grade 6
Powell IRI                                         1.0

Grade 6
Betts IRI                                                  1.0


Underline indicates α = .001.













Table 20


Means, Standard Deviations, and Standard
Errors of Instructional Level Scores as
Determined by the Powell Criteria and
Betts Criteria for the Second,
Fourth, and Sixth Grades



             N      Mean      SD       SE

Grade 2
Powell IRI   32    3.0313    1.656    0.293

Grade 2
Betts IRI    32    2.1875    1.712    0.303

Grade 4
Powell IRI   30    4.533     1.224    0.224

Grade 4
Betts IRI    30    3.867     1.479    0.270

Grade 6
Powell IRI   31    5.419     1.177    0.211

Grade 6
Betts IRI    31    4.903     1.535    0.276









Hypothesis VIB was rejected at all levels. A signifi-

cant difference in instructional level means did exist between

IRI scores obtained by the Powell criteria and those obtained

by the Betts criteria.

While investigating these differences, additional infor-

mation concerning instructional level means became apparent.

A comparison of instructional level means using the Powell

and Betts criteria was made at grade levels one to seven.

Table 21 shows the instructional level standard cloze

percentages as determined from IRI's scored by the Powell

and Betts criteria.

Fewer students were at each instructional level using

the Betts criteria. More students tended not to fit in at

any one level. Due to the nature of the cloze passages used,

second grade students would need to have scored at the first,

second, or third grade instructional level on the IRI to have

their scores included in the instructional level means.

Fourth graders had to be instructional at the third, fourth,

or fifth grade level, and sixth graders had to be instruc-

tional at the fifth, sixth, or seventh grade levels to be

included.

The range of standard cloze instructional level scores

using the Powell criteria was 13 percent to 45 percent. The

13 percent for second grade was exceptionally low. This per-

centage was probably distorted because five of the thirteen

students received scores of zero on the standard cloze tests.

Many of them did not even attempt to complete any of the














Table 21


Instructional Level Cloze Score Percentages
as Determined by the Powell Criteria
and Betts Criteria


Grade    Cloze % Using       Cloze % Using
         Powell Criteria     Betts Criteria

  1                               57%
  2          13%                  36%
  3          37%                  37%
  4          36%                  39%
  5          45%                  48%
  6          38%                  38%
  7          41%                  41%

Total                             59





items. Ignoring that percentage, the range of instructional

level scores using the Powell criteria was 36 to 45 percent.

The range using the Betts criteria was 36 to 57 percent.

Using either set of IRI scoring criteria as the base, this

existing criteria for standard cloze instructional level

scores may be too high. The original Bormuth (1967) criteria

of 38 to 50 percent comes closest to matching the percentages

derived from the present study. The other criteria, Bormuth

(1968) of 44 to 57 percent, Rankin and Culhane (1969) from

41 to 61 percent, and Alexander (1968) from 47 to 60 percent,

all seem too high. Perhaps criteria differentiated by grade

level are needed.

The closeness in range of Powell and Betts cloze

instructional scores did not reflect the very real differ-

ences present in placing students at their instructional

reading level based on the Powell or Betts criteria.

Hypothesis VIC involved the placement of students at

instructional level by the Powell and Betts criteria. The

hypothesis stated that when the Powell and Betts criteria

do not place students at the same instructional level, a

significantly greater proportion of students will be placed

at a higher instructional reading level by the Powell criteria

than the Betts criteria (α=.01).

A Chi Square test was done to test this hypothesis. It

was expected that when the Powell and Betts criteria placed

students at different levels the Powell criteria would place

students at a higher instructional level 50 percent of the

time.





Hypothesis VIC was accepted. The Chi Square test

results indicated that the number of times the Powell criteria

placed a student at a higher instructional level than the

Betts criteria was significant. In fact, in every case where

the Powell and Betts criteria did not place students at the

same instructional level, the Powell criteria placed students

at a higher level.
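Under the null hypothesis, higher placement by either criterion would occur half the time, so the Chi Square test compares the observed split against an even split. The counts below are illustrative approximations (the study reports only percentages), not the exact tabulated frequencies.

```python
from scipy.stats import chisquare

# Illustrative counts for students whose placements differed:
# all were placed higher by the Powell criteria, none by Betts.
# The total is approximate; the study reports percentages only.
observed = [43, 0]    # [higher by Powell, higher by Betts]

# chisquare defaults to a uniform expectation (21.5 per cell),
# matching the 50/50 split expected under the null hypothesis.
chi2, p = chisquare(observed)
```

With every discrepant case falling on the Powell side, the statistic equals the total count and the result is significant well beyond α = .01.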

Hypothesis VID stated that the Powell and Betts criteria

would place students at the same instructional level at least

75 percent of the time. Using percentages, it was determined

that the Powell and Betts criteria placed students at the

same instructional level only 51 percent of the time. Forty-

nine percent of the time the Powell and Betts criteria placed

students at different instructional levels.

Of that 49 percent, the Powell criteria always placed

students at a higher instructional level. Fifty-five percent

of the time the Powell criteria placed the students one grade

level above the Betts criteria placement. Thirty-eight per-

cent of the time the Powell criteria placed the student two

grade levels above the Betts criteria placement. However, in

five cases (28 percent) the two levels' differences involved

a jump from frustration level according to the Betts criteria,

to the second grade instructional level according to the

Powell criteria. In the remaining 7 percent (three cases)

the Powell criteria placed the student three grade levels

above the Betts criteria.
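The agreement rate and the distribution of discrepancies can be computed directly from paired placements. The sketch below uses hypothetical placements for ten students; the study's actual placements are not reproduced here.

```python
from collections import Counter

# Hypothetical paired placements for ten students
# (illustrative values, not the study's data).
powell = [3, 3, 4, 5, 2, 4, 6, 3, 5, 4]
betts = [3, 2, 4, 3, 2, 3, 6, 3, 4, 4]

pairs = list(zip(powell, betts))

# Proportion of students placed at the same instructional level.
agreement = sum(p == b for p, b in pairs) / len(pairs)

# Distribution of discrepancies, in grade levels, when they differ;
# positive gaps mean the Powell placement was higher.
gaps = Counter(p - b for p, b in pairs if p != b)
```

For this illustrative set, agreement is 60 percent and every discrepancy favors the Powell criteria, paralleling the pattern reported for Hypothesis VID.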









The highest instructional level score attained by a

student was used for analysis in all parts of Hypothesis VI.

Hypothesis VID was rejected. The Powell and Betts criteria

did not place students at the same instructional level 75

percent of the time.

The important and statistically significant differences

in placement based on scoring criteria indicate that classroom

teachers should choose the IRI scoring criteria they will use

with care, based on a conviction about accuracy of placement.


Theoretical Considerations


One of the original intents of this study was to set up

a criteria for Homan MCC to be used by the classroom teacher

for determining instructional reading level. This was to be

accomplished by using a regression equation based on Homan

MCC scores regressed on standard cloze scores for the purpose

of setting up an equation that would predict Homan MCC scores.

If the predicted Homan MCC scores correlated highly with the

actual Homan MCC scores, criteria for Homan MCC instructional

level scores could be set up based on how the same students

performed on standard cloze tests.

It was assumed that the correlations between standard

cloze and MCC would be high, especially since the same stu-

dents were given the same passages, only in different form.

However, the correlations between standard cloze and

MCC scores were much lower than anticipated. The accuracy

and practicality of using standard cloze to predict Homan MCC









scores became an issue. Since only the second grade,

second level correlation scores reached the desired level of

practical significance, it was impractical to set up the new
criteria for Homan MCC. The criteria would not have been

accurate for classroom use.

The fourth grade correlations represented the highest

overall correlations between standard cloze and MCC. Also,

the fourth grade distributions were all normal. For this

reason, the fourth grade scores were used to determine how

accurate or practical a prediction of scores could be made

from standard cloze to Homan MCC.

The fourth grade sample was randomly divided into halves.

One half was used to derive the regression weights; the other,

to cross validate the regression weights. This was done with

three separate regressions, one for each passage level. The

regression equations are listed in Table 22.

The predicted Homan MCC scores based on the regression

equation were then correlated with the actual Homan MCC

scores of the second half of the sample. The correlations

ranged from .61 to .69. Using the highest correlation, .69,

only 47.6 percent shared variance existed between standard
cloze scores and Homan MCC scores.
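The split-half cross-validation procedure described above can be sketched as follows: derive regression weights on one random half of the sample, predict MCC scores for the other half, and correlate predicted with actual scores; the shared variance is the squared correlation (e.g., .69² ≈ 47.6 percent). The data below are synthetic, with slope and intercept only roughly echoing the magnitudes in Table 22.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Synthetic standard cloze (x) and Homan MCC (y) scores;
# illustrative values, not the study's data.
x = rng.normal(20, 5, 60)
y = 0.6 * x + 14 + rng.normal(0, 3, 60)

# Derive regression weights on one random half of the sample...
idx = rng.permutation(60)
fit, holdout = idx[:30], idx[30:]
b, c = np.polyfit(x[fit], y[fit], 1)   # slope and constant

# ...then cross-validate by predicting the other half's scores
# and correlating predictions with the actual values.
predicted = b * x[holdout] + c
r, _ = pearsonr(predicted, y[holdout])

# Shared variance is the squared cross-validation correlation.
shared_variance = r ** 2
```

When the shared variance falls well below 50 percent, as here and in the study, too much of the MCC score remains unexplained for the prediction equation to place students reliably.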

The purpose of the prediction equation was to set up the

criteria for determining instructional reading level. When

dealing with the placement of students in the correct instruc-

tional level, and thereby the correct reader, having more than 50














Table 22


Constant and b Values Used
in the Prediction Equation



Grade Level b Constant

3 0.3837 16.1227

4 0.6950 14.4465

5 0.6091 13.9831







percent of the variance of MCC not explained by standard

cloze leaves too much room for placement error.

These results reinforce the conclusion that standard cloze

scores should not be used to develop criteria for using Homan MCC

scores in the classroom to determine instructional reading

level.















CHAPTER V
SUMMARY AND CONCLUSIONS


Standard cloze and the IRI are both used to determine

instructional reading level. This study was designed to

explore the relationship of a new form of cloze test, MCC, to

standard cloze and the IRI. The intent has been to obtain

new information that would assist the classroom teacher in

determining the instructional reading level of all students

as easily and accurately as possible.

Six main objectives were included in the investigation.

1. Comparison of standard cloze and MCC tests for the

same passages, with the same students.

2. Examination of the strength of the relationship

between standard cloze and MCC passages for the purpose of

determining if standard cloze scores can be used to predict

MCC scores.

3. Investigation of the relationship among standard

cloze scores, MCC scores, and the IRI.

4. Determination of the reliabilities of the standard cloze,

MCC, and Kidder MCC passages.

5. Examination of the relationship between two forms

of MCC tests, Homan MCC and Kidder MCC.





6. Investigation of the relationship between IRI's

instructional level scores as scored by the Powell criteria

and Betts criteria.

Three schools, representing high, medium, and low socio-

economic groups, were involved in the study. All second,

fourth, and sixth graders at all three schools participated

in the study. A total of 548 students were included in the

final analysis of data.

Each student was administered three standard cloze

tests. One test was at the student's grade level, one was a

grade level above the student's grade level, and the third

was a grade level below the student's grade level.

Multiple choice cloze tests were made from the same pas-

sages used for the standard cloze tests. On a second testing

occasion, each student was administered three MCC tests

covering the same grade levels as the standard cloze tests.

Additional Kidder MCC tests were administered. These

tests were also at three different grade levels. One-half of

the students were administered the Kidder MCC test with the

standard cloze testing, and the other half were administered

the Kidder MCC test with the Homan MCC test. Individual IRI's

were administered to at least thirty students at each grade

level.

Six hypotheses were tested in the study. Some hypotheses

had several sub-hypotheses connected to them.

1. Hypothesis IA stated that there would be no differ-

ences between the mean scores for standard cloze and MCC








passages for the same group of students. Mean scores, how-

ever, were not equal, and the differences were shown to be

statistically significant. The form of cloze test used did

affect the score of the students. On the average, students

scored lower on standard cloze tests than on MCC tests.

Hypothesis IB stated that there would be no differences

among the mean scores on the three levels of difficulty
tested for each group. For the second grade all passage

scores were significantly different from each other. At the

fourth grade level all standard cloze mean scores were signif-

icantly different from each other. However, for MCC scores

the fifth grade passage mean scores were significantly differ-

ent from both the third and fourth grade level mean scores.

The third and fourth level mean scores were not significantly

different from each other. At the sixth grade level all

standard cloze mean scores were significantly different from

each other. The sixth grade MCC scores indicated the seventh

level passage scores were significantly different from the

fifth and sixth level passage scores. As would be expected,

student scores decreased as the readability levels of the

passages increased. The MCC passages did not always show this
difference in scores, possibly because of the very high

scores many students received on all three levels of MCC

passages.

Hypothesis IC stated there would be no interaction

between the form used and the level of passage difficulty. A





significant interaction was, in fact, evident in the data.

2. It was hypothesized that a correlation of .70 or

more exists between standard cloze and MCC. Surprisingly,

this was not the case. The correlations between standard

cloze and MCC on the same passages were much lower than

anticipated. Thus, it seems reasonable to question whether

the two test forms actually measure the same type of reading

comprehension.

3. It was hypothesized in IIIA that there would be no

relationship between standard cloze scores and IRI scores.

Hypothesis IIIB stated there would be no relationship between

MCC and IRI scores. These correlations were very low in some

instances and only significant on occasion. The sixth grade

correlations were most often significant. This raises ques-

tions as to whether the IRI and standard cloze are both meas-

uring the same thing in attempting to determine instructional

level.

Hypothesis IIIC stated the correlations between MCC

scores and IRI instructional level scores will be the same as

the correlations between standard cloze scores and IRI instruc-

tional level scores. The results of analysis were mixed.

For the second grade sample significant differences

were apparent between correlations of IRI scores and MCC

scores and standard cloze and IRI scores. At two passage

levels the standard cloze and IR.I score correlations were

higher.



