Title: Frequency, celeration, and variability of academic performance as predictors of learning
Permanent Link: http://ufdc.ufl.edu/UF00098073/00001
 Material Information
Title: Frequency, celeration, and variability of academic performance as predictors of learning
Physical Description: vi, 130 leaves : ; 28 cm.
Language: English
Creator: Trifiletti, John J ( John Junior ), 1947-
Copyright Date: 1980
 Subjects
Subject: Learning, Psychology of   ( lcsh )
Prediction of scholastic success   ( lcsh )
Special education   ( lcsh )
Genre: bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )
 Notes
Statement of Responsibility: by John J. Trifiletti.
Thesis: Thesis (Ph. D.)--University of Florida, 1980.
Bibliography: Includes bibliographical references (leaves 100-107).
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00098073
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000100342
oclc - 07382206
notis - AAL5803

Full Text













FREQUENCY, CELERATION, AND VARIABILITY OF
ACADEMIC PERFORMANCE AS PREDICTORS OF LEARNING












By

John J. Trifiletti


A DISSERTATION PRESENTED TO
THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY



UNIVERSITY OF FLORIDA


1980





















To Diane, my wife and best friend,

and to the expression of our love,

John Cristopher, III.













ACKNOWLEDGMENTS


I wish to express my sincere appreciation to all

my committee members for their support, encouragement,

and for their constructive suggestions, with special

thanks to Dr. William Wolking for his encouragement,

guidance, and intellectual stimulation throughout my

program of studies. Deserving of special acknowledgment

is Dr. Cecil Mercer for advice, support, and writing

skills.

I would also like to extend my appreciation to

Dr. Thom Hodgson for his guidance in the systems

engineering area and programming skills.

Finally, I wish to acknowledge my wife, Diane,

who has given me the emotional and intellectual

support I've needed throughout my program.
















TABLE OF CONTENTS


ACKNOWLEDGMENTS

ABSTRACT

CHAPTER I   INTRODUCTION
    Importance of the Study
    Rationale
    Statement of the Problem

CHAPTER II  REVIEW OF RELATED LITERATURE
    Early Identification Research
        Prediction-Performance Comparison Matrix
        Single Instruments as Predictors
        Multiple Instrument Batteries as Predictors
        Teacher Ratings as Predictors
        Areas of Assessment for Prediction
    Frequency Measurement of Academic Performance
        Frequency versus Percentage Statements
        Considerations for Measurement of Frequency
            Record Floor
            Record Ceiling
            Performance Ceiling
    Studies of Frequency Performance Standards
        Math Frequency Standards
        Reading Frequency Standards
        Spelling Frequency Standards
        Writing Frequency Standards
    A Frequency Model of Learning
    Predictive Studies Using Frequency Measurement
        Predictive Studies

CHAPTER III METHOD
    Subjects
    Equipment
    Setting
    Procedure
        Training Procedures
        Initial Precision Assessment
        Instruction
    Experimental Design
    Data Collection
    Data Analysis

CHAPTER IV  RESULTS
    Reliability
    Analyses of Prediction Results

CHAPTER V   DISCUSSION
    Findings
    Interpretation of the Findings
    Problems and Limitations of the Study
    Practical Implications
    Suggestions for Further Research

REFERENCES

APPENDICES
    A. DEFINITIONS
    B. DESCRIPTION OF SPARK II INSTRUMENT
    C. FORTRAN PROGRAM FOR COMPUTERIZED CHARTS AND SUMMARY STATISTICS
    D. SAMPLE FREQUENCY CHART
    E. SYLLABUS OF TEACHER TRAINING SESSION

BIOGRAPHICAL SKETCH









Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of
the Requirements for the Degree of
Doctor of Philosophy



FREQUENCY, CELERATION, AND VARIABILITY OF
ACADEMIC PERFORMANCE AS PREDICTORS OF LEARNING

By

John J. Trifiletti

December, 1980


Chairman: William D. Wolking
Major Department: Special Education

This study examined the predictive ability of

frequency, celeration, and variability measures of

academic performance during assessment for predicting

celerations during instruction. Data from 360 separate

teacher-learner-task triads in reading, spelling,

writing, and mathematics were evaluated. Learners

were assessed over a five day period, then received

twenty days of instruction during which daily timed

samples of their academic performance were recorded.

The amount of variance in celerations explained

by baseline predictor variables was generally small.

Moderate amounts of variance were explained by

predictor variables for phonics sounds tasks and see

to say tasks.













CHAPTER I

INTRODUCTION


There are multiple dimensions to the academic

performance of learners in the classroom. Accuracy of

performance, expressed in terms of percentage statements,

has traditionally been used as the primary measure of

academic performance. Recently, other dimensions of

academic performance have received attention (Haughton,

1971a; White, 1972a; White & Haring, 1976). These include

frequency, the speed of performance; celeration, the

rate of change in performance over days; and variability,

the variation in frequency about a trend line.

The problem of interest to this study is the use of

frequency, celeration, and variability measures of

performance during assessment to predict subsequent

academic performance during instruction.


Importance of the Study

Schools are designed to change the behavior of child-

ren. Ideally, learners progress from little or no knowledge

or skill to a desired level of academic performance. This

change occurs as a direct or indirect result of instruction.

The general purpose of assessment is to determine whether

a learner's performance is changing appropriately.

Assessment for instruction or teaching refers to the process










of obtaining information about a student's instructional

of obtaining information about a student's instructional

needs (Wiederholt, Hammill, & Brown, 1978). To facilitate

instructional programming the assessment must provide

information in two areas. First, it must help the teacher

select what to teach the individual student. Second, it

must help the teacher determine how to teach the student

for optimal progress (Mercer & Mercer, 1980).

Of equal importance is the use of assessment to

predict and prevent failure. Such prediction depends

upon the sensitivity of educational measurement and

the examination of multiple dimensions of performance.

This study is aimed at determination of the utility

of frequency, celeration, and variability of academic

performance during assessment for predicting subsequent

performance during instruction. As such, it builds and

extends upon previous research which has explored the use

of multiple dimensions of academic performance to predict

subsequent learning (White, 1972a).

Rationale

Traditionally, educational assessment has been of

limited success and has progressed little beyond

assumptions and concepts from the 19th century. The ideas

of Galton, Simon, Binet, Terman, and others with respect

to global variables and group measurement techniques

remain relatively unchanged except for the development of

new statistical procedures for the evaluation of group

performance data. Statistical evaluation has not









compensated for the inaccurate and incomplete measurement

of academic performance fostered by reliance on accuracy

as a single dimension of academic performance.

Traditional educational measurement is based upon

norm-referenced comparison. An individual's performance

is converted to a norm-referenced variable such as

grade level, age level, or percentile. Sensitivity is thus

limited by the dimensions and units used in measuring.

An additional problem with traditional educational

measurement is the reliance on percentage accuracy

statements to describe a single dimension of performance.

Percentage statements are subject to a number of measurement

problems (Haughton, 1969; White & Haring, 1976). These

include an arbitrary ceiling effect and inadequate descrip-

tion of performance. When percentage accuracy statements

alone are used to describe academic performance, changes

beyond 100 percent cannot be reflected by the measurement.

Yet there are many ways in which learners can increase

performance beyond 100 percent accuracy. For instance, the

latency (time to onset) of the performance can decrease, or

the speed of performance can increase. Furthermore,

accuracy statements inadequately describe academic

performance when there are multiple ways to obtain the same

unit of measurement. For instance, a student who correctly

answers three out of four questions during a class period









might be compared with a student who correctly answers

30 out of 40 questions. The latter student's performance

far exceeds the former, yet using percentage accuracy

statements alone, both students will be equated at

75 percent correct.

Lindsley (1971) has developed alternative measurement

procedures within an instructional system of precision

teaching. In this model, frequency measurement is

employed to describe multiple dimensions of academic

performance. Frequency measures have been found to be

more sensitive for description of individual behavior

than evaluation using percentage criteria (Pennypacker,

1972). With frequency measurement, both speed and

accuracy dimensions of performance are considered.

As such, frequency is a better measure of the fluency

of performance.

It is believed that high frequencies of performance

facilitate acquisition and retention of academic behavior

(Starlin, 1971; Haughton, 1971a). The rationale for high

rates of accurate responding during instruction is that

they insure against practicing errors and rapid loss of

skills due to inadequate learning. Key differences between

precision teaching and traditional educational measurement

include emphasis on frequency as a standard measurement unit,

emphasis on multiple dimensions of performance, the use of









criterion-referenced as opposed to norm-referenced

comparison, and frequent direct measurement (see Table 1).

Lindsley's (1971) precision teaching begins with

measurement of the frequency of performance prior to

instruction. This baseline measurement serves as a

reference point for comparison with subsequent measure-

ments during instruction. Following baseline measurement,

specific skills are identified as targets for instruction,

and instruction begins. During the course of instruction,

daily measurements of the performance of each skill are

obtained and graphically displayed on standard behavior

charts. Integral to precision teaching is a system of

instructional decision-making and optimization based on

observation of the daily measurements.

White and Haring (1976) have recently expanded upon

the baseline aspects of precision teaching with their

procedures for precision assessment. Precision assessment

provides for examination of a wide range of skills in the

learner's repertoire through the use of mixed skill probes.

A mixed skill probe is similar to a traditional achievement

test in that the items from many skills are presented. The

difference is that frequency of performance is measured,

and the assessment does not end at this point. Next,

single skill probes are used to obtain speed and accuracy

measures of skills in need of instruction. A single skill

probe differs from a mixed skill probe in that it contains










Table 1

A Comparison of Traditional
Measurement and Precision Teaching

  Traditional                          Precision Teaching
  Psycho-Educational                   Behavior-Analytic
  Measurement                          Measurement
  -----------------------------        -----------------------------
  Norm-referenced                      Criterion-referenced

  Measurement before and after         Daily or frequent
  instruction, infrequent              measurement

  Measurement direct or indirect       Measurement always direct

  Individuals are compared to          Individuals are compared
  a group                              with themselves

  Considers accuracy dimension         Considers multiple
  of academic performance              dimensions of performance

  No standard unit of                  Frequency used as standard
  measurement                          unit of measurement









items from only one skill domain. Additionally, single

skill probes contain many items representing a single

skill, and thus constitute a better sample of the

movement than the mixed skill probe or an achievement

test. A third type of probe, the tool skill probe,

is used to assess skills such as saying digits, writing

digits, saying letters, writing letters, and saying

letter sounds. These skills are considered prerequisite

to more advanced skills which build upon them (see Table 2).

An additional dimension of academic performance is

celeration, the rate of change in frequency across days.

Celeration has been used as a criterion measure to

evaluate performance during instruction. White and

Haring (1976) recommend a number of possible procedures

for determining acceptable performance during instruction,

one of which is the standard celeration. The standard

celeration procedure is based on research by Liberty (1975).

In a working paper from the University of Washington, Liberty

analyzed several hundred programs dealing with all types of

skills and children of all ages. Liberty observed that,

of the children whose programs showed some progress, about

53 percent accelerated at a rate of ×1.25 (25 percent

improvement in frequency per week). About 66 percent of

the children achieved a ÷1.25 for deceleration targets

(25 percent improvement in frequency per week). In the

absence of other criteria, White and Haring recommend









minimum celeration values of ×1.25 for acceleration

targets and ÷1.25 for deceleration targets.

In summary, the sensitivity of measurement

provided by precision teaching and precision assessment

is seen by many educators as an improvement over

traditional educational measurement. The use of

multiple measures of performance, criterion-referenced

evaluation, and continuous daily measurement holds

great promise for increased instructional control and

subsequent increased academic performance. The use of

frequency and multiple dimensions of academic performance

to predict learning is a relatively new area for research.

This study explores the use of frequency and multiple

dimensions of initial academic performance to predict

performance during instruction.


Statement of the Problem

The problem of interest to this study is the use

of frequency, celeration, and variability dimensions of

performance during baseline assessment to predict

academic performance during instruction. The specific

questions this study will address are:

(1) Can frequency, celeration, and variability of

baseline performance on tool skill probes be used to

predict the rate of learning during instruction?

(2) Can frequency, celeration, and variability of

baseline performance on single skill probes be used to









Table 2

The Relationship of Tool Skills to
Complex Academic Skills

  Prerequisite Tool Skills      Complex Academic Skills
  ------------------------      ------------------------------------
  Write Digits 0-9              Write digits for addition
                                Write digits for subtraction
                                Write digits for multiplication
                                Write digits for division
                                Write digits for fractions
                                Write digits to record time

  Write Letters A-Z             Write letters for basic sight words
                                Write letters for texted words
                                Write letters for name
                                Write letters for spelling words








predict the rate of learning during instruction?

(3) Can frequency, celeration, and variability

of baseline performance on mixed skill probes be used

to predict the rate of learning during instruction?

(4) Is there a differential predictive relationship

by academic task stimuli from baseline performance to

performance during instruction? The different academic

task stimuli of interest include letters, digits, phonics

sounds, sight words, texted words, number problems,

phonics words, and spelling words.

(5) Is there a differential predictive relationship

by input and output modality of learners from baseline

performance to performance during instruction? The

different input and output modalities of interest include

see to say, see to write, see to do, hear to say, hear to

write, hear to do, think to say, think to write, think to do.

For each of the above questions, the rate of learning

will be measured by celeration. Celeration is defined as

the percent of change in frequency per week.
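
As a minimal illustration of this definition, weekly celeration can be
estimated by fitting a least-squares line to the base-10 logarithms of the
daily frequencies and converting the slope to a multiplicative change per
seven days. The Python sketch below makes that assumption; the function
name and sample data are illustrative only and do not reproduce the FORTRAN
charting program described in Appendix C. Variability, in these terms, is
the scatter of the daily frequencies about the fitted trend line.

    import math

    def celeration_per_week(days, freqs):
        # Fit a least-squares line to log10(frequency) against day number
        # and return 10 ** (7 * slope): the multiplicative change in
        # frequency per seven calendar days (1.25 means a x1.25 weekly
        # acceleration). Assumes every frequency is above the record floor.
        logs = [math.log10(f) for f in freqs]
        n = len(days)
        mean_x = sum(days) / n
        mean_y = sum(logs) / n
        slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
                 / sum((x - mean_x) ** 2 for x in days))
        return 10 ** (7 * slope)

    # Example: frequencies growing from 20 per minute at a x1.25 weekly rate.
    days = list(range(1, 15))
    freqs = [20 * 1.25 ** ((d - 1) / 7) for d in days]
    print(round(celeration_per_week(days, freqs), 2))   # prints 1.25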

The answers to these questions increase our knowledge

of how academic skills are learned. This study also

extends our knowledge of measurement procedures. History

makes clear that the development of accurate observation

and measurement procedures inevitably leads to an explosion

of technology and knowledge. The telescope in astronomy,

the microscope in biology, x-rays and blood counts in

medicine, the Geiger counter in physics, Nielsen ratings









in television advertising, and the cumulative recorder

in psychology are illustrative. Following each of these

technological achievements in measurement was a rapid

upgrading of the discipline involved.

The consequence of refined and sensitive educational

measurement may be a technology of education in which the

effects of a given instructional procedure on a given

individual are predictable and replicable. The questions

under investigation in this study are a small step toward

the development of such a technology of educational

measurement.













CHAPTER II

REVIEW OF RELATED LITERATURE


The review of literature is presented in three

parts. The related area of early identification

research is reviewed in the first section. The focus

here is on early identification studies concerned with

the prediction of learning failure.

The second section contains the theoretical and

research basis for the use of frequency as a measurement

datum for academic performance. Consideration was given

to major theoretical landmarks in the literature, as

well as experimental works. Studies of frequency

performance criteria for academic behaviors in reading,

writing, computational math, and spelling are reviewed

in this section.

The final section of the literature review contains

predictive studies using frequency measurement. These

studies are similar in nature to the present research

and are the foundation from which this research extends.

In order to survey the related literature, an ERIC
document search was made using key words "frequency,"
"rate," and "learning." In addition, the following
journals were searched from their initial publications
to date, with the exception of Review of Educational
Research, which was searched from 1960 to present:

1. Academic Therapy
2. Exceptional Children
3. Educational Technology
4. Journal of Applied Behavior Analysis
5. Journal of Applied Psychology
6. Journal of Educational Measurement
7. Journal of Educational Psychology
8. Journal of Educational Research
9. Journal of Experimental Analysis of Behavior
10. Journal of Experimental Psychology
11. Journal of Learning Disabilities
12. Journal of Measurement and Evaluation
13. Journal of Personalized Instruction
14. Journal of Research and Development in Education
15. Journal of School Health
16. Journal of School Psychology

In order to locate important theoretical discussion,
the reference lists of selected journal articles were
searched for relevant texts. Personal communication with
experimenters added to the literature review.









Early Identification Research


There is currently a great interest in screening

infants and preschool children to predict which ones

are at risk for experiencing difficulty with subsequent

learning. This interest is based on the assumption

that treatment initiated prior to schooling will

alleviate school-related problems. The studies of

Kirk (1958) and Skeels (1966) and the reviews by

Tjossem (1976) and Mercer, Algozzine, and Trifiletti

(1979) have formed a basis for continued research in

early identification and prediction.

Proponents of prediction procedures to identify

preschoolers at risk of school problems suggest several

advantages of early identification. First, they believe

early identification efforts are more likely to be

successful due to the belief that the behavior of young

children is more susceptible to change than that of older

children (Hayden, 1974; Stimbert, 1971). Secondly, early

identification enables preventive interventions during

optimal developmental periods when personality charac-

teristics are forming (Hayden, 1974; Stimbert, 1971).

Finally, early identification enables earlier family

adjustments and acceptance, providing additional support

for intervention strategies (Hayden, 1974). Disadvantages

of early identification are primarily concerned with the

effect of misdiagnosis and labeling (Keogh & Smith, 1970).









Prediction-Performance Comparison Matrix

Mercer, Algozzine, and Trifiletti (1979) have

presented a model for interpreting the predictive

utility of standardized assessment instruments and

batteries in which both vertical and horizontal

percentages are analyzed. Integral to this model is

construction of a prediction-performance comparison

matrix which allows observation of predictions and

outcomes (see Figure 1). Levels of prediction in terms

of poor performance or good performance are compared

with poor or good levels of criterion performance.

Quadrant A of Figure 1 represents those students who

performed poorly and who were predicted to perform

poorly. Quadrant B refers to those students who

performed well on the criterion measure, but were

predicted to perform poorly. These are referred to

as false positives. Quadrant C refers to students who

were predicted to perform well, but in fact performed

poorly. These are referred to as false negatives.

Quadrant D refers to students predicted to perform

well, who in fact did perform well.

There are a number of ways of comparing prediction

with performance on the matrix. Mercer (1975) reported

that most prediction studies utilize the horizontal

method of comparison. In the horizontal method observed

values in each quadrant are compared with prediction














Figure 1

Prediction-Performance Comparison Matrix

                                  PERFORMANCE
                            Poor                  Good

  PREDICTION   Poor    Predicted Poor        Predicted Poor
                       Performed Poor        Performed Good
                       (valid positive)      (false positive)
                              A                     B

               Good    Predicted Good        Predicted Good
                       Performed Poor        Performed Good
                       (false negative)      (valid negative)
                              C                     D









levels. Figure 2 is an example of a prediction-

performance comparison from a hypothetical prediction

study. Percentage of correct and incorrect outcomes

can be obtained by applying the horizontal analysis

method. In this case, 80 of 100 children predicted to

perform poorly actually did so (80 percent), while

20 of 100 children predicted to perform poorly did

well (20 percent). Similarly, 270 of 300 children

predicted to perform well did so (90 percent), while

30 of 300 predicted to perform well actually performed

poorly (10 percent). The overall hit rate describes

the number of children who were correctly identified.

It can be figured by adding the poor predicted poor

in quadrant A with the good predicted good in quadrant

D and dividing by the sum of all quadrants. The

overall hit rate for the example in Figure 2 is 87.5

percent.
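
The same arithmetic can be written out compactly. The Python sketch below
is illustrative only; it reproduces the hypothetical counts behind Figure 2
(quadrants A = 80, B = 20, C = 30, D = 270) and returns the horizontal
percentages, the vertical percentages, and the overall hit rate.

    def matrix_percentages(a, b, c, d):
        # a: predicted poor, performed poor (valid positive)
        # b: predicted poor, performed good (false positive)
        # c: predicted good, performed poor (false negative)
        # d: predicted good, performed good (valid negative)
        horizontal = {                       # row-wise: by prediction
            "valid positive": 100 * a / (a + b),
            "false positive": 100 * b / (a + b),
            "false negative": 100 * c / (c + d),
            "valid negative": 100 * d / (c + d),
        }
        vertical = {                         # column-wise: by performance
            "valid positive": 100 * a / (a + c),
            "false negative": 100 * c / (a + c),
            "false positive": 100 * b / (b + d),
            "valid negative": 100 * d / (b + d),
        }
        overall_hit = 100 * (a + d) / (a + b + c + d)
        return horizontal, vertical, overall_hit

    h, v, hit = matrix_percentages(80, 20, 30, 270)
    print(round(hit, 1))                     # 87.5
    print(round(h["valid positive"]))        # 80
    print(round(v["valid positive"], 1))     # 72.7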

Although the horizontal method serves as a way

of organizing and evaluating prediction-performance

information, it neglects consideration of the

relationship between the observed values within the

quadrants and the actual performance levels. This

can be accomplished by vertically analyzing the matrix.

Figure 2 includes vertically computed percentages.

For example, 72.7 percent (80 of 110) of the poorly

performing students were predicted poor, and 6.9













Figure 2

Numerical Example of Prediction-
Performance Comparison Matrix

                                  PERFORMANCE
                            Poor          Good

  PREDICTION   Poor          80            20        100 Predicted Poor

               Good          30           270        300 Predicted Good

                       110 Performed  290 Performed
                           Poorly          Well









percent (20 of 290) of the good performing students

were predicted to do poorly. Percentages generated

by the vertical method will obviously differ from

those figured by the horizontal method.

Gallagher and Bradley (1972) advocated using

the horizontal method for establishing false positives

and the vertical method for computing false negatives.

They also favored evaluating the overall hit rate and

visually inspecting the entire matrix.


Single Instruments as Predictors

The majority of prediction studies use single

instruments as predictive measures. General readiness

tests (Ferinden & Jacobson, 1970; Lessler & Bridges, 1973),

intelligence tests (Lessler & Bridges, 1973), language

tests (Lyons & Bangs, 1972), perceptual-motor tests (Keogh

& Smith, 1970), and general physical factors such as

unusual birth history (Galante, Flye, & Stephens, 1972)

have been used as single predictors. An analysis of

these studies using horizontal and vertical percentages

is presented in Table 3.

As illustrated in Table 3, the Metropolitan Reading

Readiness Test (MRRT) yielded favorable percentages in

both short-term and long-term analyses. In comparing

this instrument to the Lee-Clark Readiness Test (LC) and

the California Test of Mental Maturity (CTMM), Lessler

and Bridges (1973) concluded that the MRRT is the best











Table 3

An Analysis of Single Instruments as Predictors Using
Horizontal (H) and Vertical (V) Percentages

[The table reports sample, prediction instrument, length of
prediction, performance test, and horizontal and vertical
percentages of valid positives, false positives, false negatives,
valid negatives, and overall hit rates for the single-instrument
prediction studies of Ferinden and Jacobson (1970), Keogh and
Smith (1970), Galante, Flye, and Stephens (1972), Lyons and
Bangs (1972), and Lessler and Bridges (1973). The individual
entries are not legible in this transcription.]

Note. From "Early Identification: An Analysis of the Research"
by C. D. Mercer, B. Algozzine, and J. Trifiletti, Learning
Disability Quarterly, 1979, 2, 12-24.

a. The median overall hit rate is 73.









predictor of the three. Badian (1976) reported a

median correlation of .58 between the MRRT and subsequent

reading achievement and concluded that group-administered

readiness tests yield better prediction results than

group-administered or individually administered

intelligence tests. In evaluating Badian's conclusion,

it is important to remember that the relationship

between predictive and criterion instruments is

crucial to the outcome of predictive studies, i.e., the

nature of the items on each test will greatly influence

the relationships between them. In support of this

qualification, Dykstra (1967) noted that letter naming

is one of the best predictors of reading achievement.

Analysis of studies using language tests as single

predictors indicates that educational intervention can

greatly influence achievement. Lyons and Bangs (1972)

used the Language and Learning Assessment for Training

Test (LLAT) to predict reading and mathematics achievement

with and without intervention. The overall hit rate of

the LLAT declined when children received intervention.

In other words, the predictive outcomes were influenced

by subsequent educational programming.

In a study using the Bender Visual-Motor Gestalt

Test (Bender) as a predictive instrument, Keogh and

Smith (1970) obtained good false negative percentages

but found a large percentage of false positives. Keogh









and Smith (1961) and Ferinden and Jacobson (1970)

suggested that a good score on the Bender is usually

followed by satisfactory achievement; however, low

scores tend not to be predictive.

While only one study of the predictive nature

of physical factors contained enough information to

obtain a horizontal and vertical analysis, many

investigations have been reported within this area.

Physical anomalies (Waldrop & Goering, 1971; Waldrop,

Pedersen, & Bell, 1968), developmental history (Denhoff,

Hainsworth, & Hainsworth, 1972; Hoffman, 1971; Pasamanick,

Rogers, & Lilienfeld, 1965; Wilborn & Smith, 1974), and

dental enamel defects (Cohen & Diner, 1970) have been

studied in attempts to establish their utility as

predictive measures. Mercer and Trifiletti (1977) have

examined the studies involving the predictive nature of

physical factors.


Multiple-Instrument Batteries as Predictors

A well-known multiple-instrument prediction study

was conducted by de Hirsch, Jansky, and Langford (1966).

A sample of 53 middle-class kindergarten children was used.

The battery consisted of 37 variables which were used to

predict reading performance. The authors concluded

that while background information is not useful as a

predictor, chronological age is a significant predictor.









Also, it was reported that predictions for girls are

more accurate than those for boys. An analysis of the

horizontal and vertical percentages obtained for the

de Hirsch et al. study and other multiple-battery

prediction studies is presented in Table 4. An overall

hit rate of 91 percent is indicated as are good percentages

in other categories, with the exception of false positives.

Gallagher and Bradley (1972) are critical of this study

and indicate that the results were obtained by applying

the index to the same group on whom it was developed.

Cross validation has not been as successful.

Feshbach, Adelman, and Fuller (1974) used the

de Hirsch Index to predict reading achievement in second

grade. As indicated in Table 4, the overall hit rate

was 73 percent and the number of false positives was

high. Likewise, Eaves, Kendall, and Crichton (1972)

administered the de Hirsch Predictive Index in a prediction

study and obtained positive results for prediction of

concurrent medical diagnosis of Minimal Brain Damage (MBD).

Eaves, Kendall, and Crichton (1974) used the de Hirsch

Modified Predictive Index (MPI) to predict teacher-

recommended grade placement. Their results were similar

to those of Feshbach et al. The overall hit rate was 76

percent and a high percentage of false positives was

found. The de Hirsch indices do not seem to have strong

empirical support.












Table 4

An Analysis of Multiple-Instrument Batteries as Predictors
Using Horizontal (H) and Vertical (V) Percentages

[The table reports sample, prediction instrument, length of
prediction, performance test, and horizontal and vertical
percentages of valid positives, false positives, false negatives,
valid negatives, and overall hit rates for the multiple-instrument
battery studies of de Hirsch, Jansky, and Langford (1966), Book
(1974), Eaves, Kendall, and Crichton (1972, 1974), Feshbach,
Adelman, and Fuller (1974), Satz and Friel (1974, 1976), Satz,
Friel, and Rudegeair (1976), and the follow-up studies reported in
Satz, Taylor, Friel, and Fletcher (1977). The individual entries
are not legible in this transcription.]

Note. From "Early Identification: An Analysis of the Research"
by C. D. Mercer, B. Algozzine, and J. Trifiletti, Learning
Disability Quarterly, 1979, 2, 12-24.

a. The median overall hit rate is 79.









Satz, Taylor, Friel, and Fletcher (1977) utilized

linear discriminant function analyses to arrive at an

optimal predictor score for a set of 22 variables. The

criterion variable was teacher ratings of classroom

reading level. In a series of studies (see Table 4),

the Satz battery was longitudinally evaluated with a

group of 473 boys. Some of the major findings of

these studies include the following:

1. The overall hit rate for the Satz battery was

considered adequate as was the valid negative rate.

2. False negatives (i.e., children in need of treatment

but not identified) tended to be overrepresented.

3. False positives (i.e., those identified but not in

need of treatment) tended to be overrepresented.

4. The battery seemed adequate in identifying low-risk

children, but problems were apparent with regard to

selecting those children needing intervention.


Teacher Ratings as Predictors

By asking teachers to identify children needing

extra intervention, a fairly simple identification

procedure is utilized. Haring and Ridgway (1967)

analyzed the results of screening for 1200 kindergarten

children. They noted that teacher perceptions are

accurate predictors of future school-related problems.

Similar results are reported by Benger (1968), Ferinden









and Jacobson (1970), and Cowgill, Friedland, and

Shapiro (1973).

Teachers' ratings tend to be most effective in

identifying those children in need of intervention

and those not likely to need special programming.

An analysis of errors within a study by Keogh and

Smith (1970) indicates that only false positive

mistakes were made. This would result in children

being placed in programs when they really did not

need them. The magnitude of such errors would depend

upon the negative effects of labeling as balanced by

the positive effects of educational programming.


Areas of Assessment for Prediction

Language, intelligence, motor, social-emotional,

and preacademic are the primary areas which have been

included in early identification assessment. To date,

the preacademic area appears to identify high risk

learners more accurately than any of the others (Badian,

1976; Keogh & Becker, 1973; Magliocca, Rinaldi, Crew, &

Kunzelmann, 1977). These investigators suggest that

areas of assessment have direct relevance to criterion-

performance measures. These skills may include

recognition of letters, letter sounds, numbers, shapes,

colors, body parts, and basic concepts.

In summary, with full consideration for cost, time,

and effort involved in early identification and prediction,









teacher perception seems to offer more advantages than

batteries and single instruments. Numerous investiga-

tors (Badian, 1976; Benger, 1968; Glazzard, 1977;

Haring & Ridgway, 1967; Keogh & Becker, 1973; and

Kottmeyer, 1947) report that teacher perceptions are

good predictors of school problems, especially if

teachers are provided checklists which include items

that are related to academic learning. Only two

studies (Feshbach, Adelman, & Fuller, 1974; Keogh &

Smith, 1970) were located which provided full matrix

data. Their overall hit rates were impressive, i.e.,

90 percent and 77 percent, respectively.









Frequency Measurement of Academic Performance

The choice of an appropriate datum to describe

academic performance has been given careful and

deliberate attention by many experts in the applied

sciences of behavior change (Haughton, 1971a; Johnston

& Pennypacker, 1980; Koenig, 1972; Lindsley, 1971;

Lovitt, 1968; Skinner, 1953; and White & Haring, 1976).

Skinner (1953) in a landmark theoretical

discussion of the importance of frequency as a datum

discussed its advantages for a technology of special

teaching. The major considerations are outlined below:

1. Frequency of a response is an orderly datum. The

curves which represent its relation to many types of

independent variables are encouragingly simple and smooth.

2. The results of frequency measurement are easily

reproduced. It is seldom necessary to use groups of

subjects and associated statistical control to demonstrate

results. This method permits a direct view of behavioral

processes, whereas previously behavioral processes have

been inferred.

3. Concepts and laws which are emerging from studies of

frequency have an immediate reference to the behavior of

the individual.

4. Frequency of response provides a continuous record

of many basic processes. A learning curve can be









followed across many days of instruction and the condition

of the response at every moment is apparent in the record.

5. Frequency of response lends itself well to automatic

data recording and collection.

6. Frequency of response is a valuable datum because

it is a physical referent for the concept of probability.

As such it is a simple and direct datum which will

generally serve as a more deliberate description of

behavior than will inferences and hypothetical constructs

such as "learned," "mastered," "skilled," and "knowledgeable."

Such language becomes more useful when one can say that

a learner can solve two-digit addition problems at a

frequency of 50 correct digits per minute with two or

less error digits per minute.

Skinner's central point is that the element which is

used to describe academic performance must be a function

of the behavior of the learner.

White and Haring (1976) have described academic

performance in terms of movements. The smallest change

in learning that can be measured is an increase or

decrease of one whole movement. A movement is the

equivalent of a discrete observable response. Movements

have both physical and temporal features. Prominent

among the physical properties are topography, force, and

locus. Topography is the muscular or skeletal "shape"

of the behavior or behavior sequence. For example,









a movement may be either written or oral, involving

different muscles. A hand may be raised or lowered,

involving a different sequence of the same muscles.

Force is the magnitude of a movement. An individual

may whisper or shout. A child may push another

child playfully or shove with enough force to

physically damage another. Locus is the direction or

target of a movement. A child may talk to a peer or

to the teacher. Answers may be written in the proper

positions or not.

In addition to the physical properties of topography,

force, and locus, a movement also possesses temporal

dimensions. These include duration, latency, and

frequency. Duration is the amount of time a movement

lasts. A child may take a minute or several minutes

to respond to a question. A child may take a long time

to get dressed in the morning, or may dress quickly.

In these examples, the physical elements of the

movement may remain essentially the same, but they

take more or less time to complete. Latency is an

important temporal dimension of behavior. It is the time

between an event in the environment, usually an

instruction, and the onset of the movement. A child may

take a long time to begin to answer a question. He

may take a long time to begin dressing in the morning.

Frequency is the number of times a movement occurs

during an observation period. By convention, frequency








for academic skills is expressed in terms of one minute

of observation (Lindsley, 1971).

The importance of frequency can be examined through

an example of reading behavior in which one student reads

at a frequency of 100 words per minute, while another

student reads at 10 words per minute. They each form the

words in the same way (i.e., the physical properties of

the behavior are the same). They each begin reading at

the same time (i.e., they demonstrate similar latency).

Both students read in the same amount of time (i.e.,

their duration is the same). Interestingly, these

students read differently. One student is much more

fluent with the skill than the other, as evidenced

by reading more words in the same amount of time.

Traditionally, experiments in learning have been

concerned with changes in the character or topography of

behavior (Bijou, 1972). The student learns how to do

something new, acquiring new behavior. But the conditions

which produce the topography of new behavior may continue

to have an effect when the topography dimension no longer

changes appreciably. After behavior has been acquired,

further reinforcement maintains it as part of the current

repertoire of the individual (Ferster & Skinner, 1957).


Frequency versus Percentage Statements

Many of the studies reviewed contrasted frequency

measurement of academic performance with percentage

statements (Haughton, 1969; White, 1972a; White & Haring,








1976). Accuracy percentage is defined as the number of

correct responses divided by the number of items presented

or attempted. Traditionally, educators have emphasized

accuracy in the initial development of academic skills.

It is becoming increasingly apparent that frequency

of academic performance is at least as important as

accuracy, and perhaps even more important when advanced

material is presented (Gaasholt, 1970; Haughton, 1971a;

Starlin, 1971). Frequency seems to be a better

indicator of the individual's ability to maintain,

generalize, and apply academic skills outside the

classroom (Thomas, 1972; White & Haring, 1976).

White and Haring (1976) have identified three

difficulties that can arise with use of percentage

statements. First, there must be adequate time to attempt

each item. If performance is based on all the items

presented, when the student does not have adequate time

to attempt all of them, then the student is penalized for

being slow. If the percentage is based on only the

attempted items, the student learns to attempt only those

items he is sure of. Secondly, percentage statements do

not address the actual number of movements the student

has made. For example, a basketball game in which six

out of ten baskets are made yields a percentage of

60 percent. The same student in another game could make

three out of four baskets for a superior percentage









of 75 percent. But realistically, fewer baskets were

made on a per-game basis. The final and most

serious deficiency of percentage statements is that they

can only measure learning up to the point where the

student stops making errors. In fact, learning can improve

beyond this point. Accuracy is only one way in which

academic performance can improve. An upper limit on

the value which a measurement can take is called a

record ceiling. Record ceilings represent artificial

limits on the measurement strategy. They ignore the

fact that one or more aspects of a student's performance

can be further developed.

In general, professionals tend to use percentage

statements when they want to describe an individual's

accuracy, and they use frequency when they wish to

describe an individual's fluency or speed. It is

important to realize that there is no need to support

a dual system of educational measurement. Frequency

statements can be converted to percentage statements

through a simple calculation:

PERCENT CORRECT = FREQUENCY CORRECT /
  (FREQUENCY CORRECT + FREQUENCY ERRORS) × 100
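
A minimal sketch of that conversion, with illustrative numbers only:

    def percent_correct(freq_correct, freq_errors):
        # Convert correct and error frequencies (counts per minute) into a
        # percentage-accuracy statement.
        total = freq_correct + freq_errors
        return 100 * freq_correct / total if total else 0.0

    # A learner writing 45 correct and 5 error digits per minute:
    print(percent_correct(45, 5))   # 90.0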


Considerations for Measurement of Frequency

The primary concern of measurement is that the data

accurately represent the specific properties of the

behavior which is being measured (Johnston & Pennypacker,









1980). Three problems can be identified which influence

the measurement of frequency of academic performance.

These are the record floor, record ceiling, and

performance ceiling.


Record Floor

A record floor is the lowest possible non-zero

value that can be calculated or recorded in any

given measurement situation (White, 1972a). In many

situations, the record floor is determined by the accuracy

of the timing device. For instance, if timings are made

with a stopwatch with a second hand that jumps in 0.25

second intervals, events which take less than 0.25

seconds will "fall through the floor" and will be

recorded as "zero." Because data points falling below

the record floor are disjointed from the remainder of

the data by the distance from the record floor to zero,

they will tend to pull any estimates of progress down

toward the record floor. For this reason, data which

falls below the record floor should be avoided. As a

rule of thumb, White (1972a) suggests that prediction

based on data where more than 10 percent of the points

fall below the record floor will generally not be

accurate. In fact, data collection procedures should

be designed to place the majority of data at least ten

times above the record floor. For example, a stopwatch









with a 0.25 second capacity should not be used for

movements which are likely to be less than 2.5

seconds in duration.

Record floors are easily calculated for forms of

data which constitute a ratio (e.g., frequency = count/

time; percentage = count correct/total count). In cases

such as these, a "1" is placed in the numerator of the

ratio and the result is calculated (e.g., frequency record

floor = 1/time; percentage record floor = 1/total count).

One can lower the record floor and increase the

sensitivity of measurement in many cases by increasing

the value of the denominator. In the case of frequency,

the observation time should be increased. There will

be situations in which modification of the data collection

procedure is impossible or not practical. It may then be

necessary to alter the unit of behavior. For example, if

"answers" to math problems are being counted, perhaps

counting "digits written" will increase the magnitude of

the data points sufficiently above the record floor. The

increase in sensitivity gained by selection of smaller and

smaller units of behavior must be weighed against the

increasing difficulty of counting and recording such movements.
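
These rules of thumb can be checked mechanically. The sketch below is
illustrative only; it computes the frequency record floor for a timing of
a given length and applies White's (1972a) suggestions that no more than
10 percent of the data points should fall below the floor and that the
majority should lie at least ten times above it.

    def frequency_record_floor(observation_minutes):
        # Lowest non-zero frequency recordable: one count in the whole timing.
        return 1.0 / observation_minutes

    def floor_checks(freqs, observation_minutes):
        floor = frequency_record_floor(observation_minutes)
        below = sum(1 for f in freqs if f < floor)          # zeros fall here
        majority_10x = sum(1 for f in freqs if f >= 10 * floor) > len(freqs) / 2
        return {
            "record floor": floor,
            "percent below floor": 100 * below / len(freqs),
            "majority at least 10x floor": majority_10x,
        }

    # A one-minute timing has a record floor of 1 per minute; lengthening the
    # timing to two minutes lowers the floor to 0.5 per minute.
    print(floor_checks([0, 4, 6, 9, 12, 15], observation_minutes=1))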


Record Ceiling

There is a limit to how high data values can go.

Unlike the record floor, the record ceiling is usually









not a function of the timing device or measurement

instrument per se. It is usually a function of the

measurement situation or the conditions of measurement.

If the behavior being counted is answers to math facts,

and there are exactly 50 problems on a sheet for a

one minute timing, then the highest possible recordable

frequency is 50 answers per minute. Rates of behavior

will be restricted by any artificial record ceiling

above which the rate is undefined (White & Haring, 1976).

There are several ways in which record ceilings

appear to affect academic performance in addition to the

simple limits they impose on rates. In many cases, a

learner's frequency will begin to slow down as the record

ceiling is approached, and before it is actually

reached. In other cases, there may be a "jump" to the

record ceiling as it is approached. Many learners

decelerate their frequency after reaching a record

ceiling. Because a record ceiling can affect academic

performance in unpredictable ways, it is advisable to

adjust the measurement condition so that the ceiling

is roughly ten times greater than the highest expected

value of the data.
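
A corresponding sketch for the record ceiling, again illustrative only:
the ceiling imposed by a probe sheet is the number of items divided by the
length of the timing, and the design rule above asks that this ceiling be
roughly ten times the highest frequency expected.

    def record_ceiling(items_on_probe, timing_minutes=1):
        # Highest recordable frequency: every item completed within the timing.
        return items_on_probe / timing_minutes

    def probe_is_large_enough(items_on_probe, highest_expected, timing_minutes=1):
        return record_ceiling(items_on_probe, timing_minutes) >= 10 * highest_expected

    # A 50-problem sheet used for a one-minute timing caps the data at 50 per
    # minute; for a learner expected to reach 40 per minute it is far too small.
    print(record_ceiling(50))              # 50.0
    print(probe_is_large_enough(50, 40))   # False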


Performance Ceiling

In addition to record ceilings and floors imposed

by data collection procedures, there is undoubtedly some

physiological limit to any human behavior (White & Haring,









1976). Performance ceilings can sometimes be estimated

by measurement of the learner's ability to perform a

"tool" movement which is prerequisite or a physical

component of the task behavior. For example, the tool

movement for writing answers to math facts is writing

digits 0 through 9. By measuring the learner's

frequency of writing digits 0 through 9, one can estimate

the maximum possible frequency for solving math facts.

In theory, if the learner could solve the math facts

instantaneously, then the frequency for math facts

would have an upper limit or performance ceiling

imposed by the time necessary to write the digits.

In practice, tool movements as estimators of

performance ceilings seem to have a fairly stable

relationship to task behaviors. White (1972b)

reports findings from a study with 18 elementary

children in which almost all children wrote digits 1.6

times faster than they were able to solve math facts.

White and Haring (1976) suggest targeting tool

movements for instruction in situations where the

frequency of the tool movement is less than one-half

to two-thirds of the desired task performance. The

rationale for increasing the tool rate is that the

performance ceiling will be proportionally increased,

thereby removing the possibility of a physiological

limit to the frequency of the instructional movement.
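
As an illustrative sketch only, the estimate described above reduces to a
single division, here using the 1.6 ratio White (1972b) reported between
writing digits and writing answers to math facts; the function name and
figures are hypothetical.

    def estimated_task_ceiling(tool_rate, ratio=1.6):
        # Estimate the performance ceiling of a task frequency from the
        # frequency of its prerequisite tool movement.
        return tool_rate / ratio

    # A learner who writes digits at 80 per minute would be expected to top
    # out near 50 answers per minute on math facts (80 / 1.6 = 50).
    print(estimated_task_ceiling(80))   # 50.0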









In summary, the recent application of operant

technology to learning has introduced frequency as an

important measure of academic performance. Lindsley

(1971), Starlin (1971) and other learning technologists

have used dual measures of frequency correct and

frequency errors to describe academic performance.

The advantages of using frequency units include increased

sensitivity of measurement for small changes in learning,

the potential for more direct observation of learning,

and the possibility of a universal scale of measurement

for both within and between learner performance. Similar

measures in the physical sciences have had great success

and have contributed to the advancement of knowledge

through more precise control of phenomena. A popular

example is kilometers per hour as a measure of velocity.

The choice of frequency as the preferred unit for

measuring academic behavior is not without problems.

The sensitivity and utility of frequency measurement is

subject to problems of measurement design including

record floor, record ceiling, and performance ceiling.

However, these problems are not insolvable and may be

viewed as elements of instructional design. They do not

outweigh the advantages of frequency units over

percentage statements with respect to describing academic

performance.









Studies of Frequency Performance Standards

Considerable attention has been recently directed

to the performance of learners in the basic skill areas

of reading, writing, spelling, and math computation.

These skills are seen as important prerequisites for

later, more complex academic learning in content areas

such as social studies and science. Since basic skills

are of importance, various attempts have been made to

specify performance criteria for them. Despite the

critical role that mastery of these skills is assumed

to have on learning outcomes, percentage accuracy

statements continue to be chosen for performance standards

despite little or no evidence that optimal learning

will result (Block, 1974).

White and Haring (1976) define criterion performance

as the minimum level of performance which facilitates

learning in the next step of a sequential task hierarchy

and/or is required for maintenance in, or improvement of

the environment of the learner. In practice, performance

standards are derived from empirical observation of learners

who are considered proficient at the skill.

In reading programs, the assumption is usually made

that the frequency of reading words will of necessity be

very slow in early stages (Jenkinson, 1973). However,

Speer and Lamb (1976) have demonstrated a strong relationship

between high frequencies of first grade students' visual









processing of letters, and subsequent reading achievement

as measured by the Gates-MacGinitie Reading Test. The

implication is that if frequency of academic performance

is important in terms of learning outcomes, educators may

be setting the stage for failure by exclusive attention

to percentage accuracy statements as the criterion for

academic performance. Studies of academic performance

standards for math, reading, spelling, and writing will

now be presented.


Math Frequency Standards

Haughton (1971a) has reported a strong relationship

between the frequency of writing digits and math

computation (r = .9 to .99). Learners were instructed

to write digits one through ten for one minute. It was

observed that children who wrote digits at frequencies

of 20 to 30 digits per minute or less were also poor

performers in math computation. Haughton also reported

that learners performing at frequencies of 30 to 40

digits per minute on basic math facts were able to

accelerate while progressing to more complex tasks. Those

learners performing below 30 digits per minute decelerated

their frequencies as they progressed to more complex tasks.

This finding has been replicated in Marie Gaasholt's

(1970) research. Gaasholt found the frequency of 80 digits

per minute when writing digits one through ten, and the

frequency of 40 to 50 digits per minute on basic math facts









to be appropriate performance criteria for movements in

addition, subtraction, multiplication, and division.

Tomaras (1974b) reported normative frequency data

for tracing digits sampled from seven first grade class-

rooms in the Tacoma Public Schools S.S.T. Learning

Disability Project. Three distinct stimulus sheets were

used with digits one through four, five through seven,

and eight through zero. Median frequencies from the seven

classrooms were 29, 31, 23, 37, 37, and 43 digits per minute.

In a separate study, Tomaras (1974a) reported the

results of using a timed measurement format for mathematics

instruction in the first grade. A performance criterion of

30 digits per minute was set for fluency using probes

from the Tacoma Public Schools S.S.T. Learning Disability

Project and addition and subtraction sheets adapted from

the Addison-Wesley Math Series. The control group

followed the traditional classroom format of the Addison-

Wesley workbook supplemented by other teacher worksheets.

No frequency performance standards were required in the

control group. The results indicate that the mean grade

equivalent score measured by the California Achievement

Test in Mathematics for the experimental group was grade

1.6. This can be compared to a mean grade equivalent score

of 1.3 for the control group. Findings were significant

at the .10 level. The timed measurement format also

affected the spread on the grade equivalent scale,









placing more children higher on the scale than at

the bottom of the scale. Tomaras reports a median writing

digits frequency of 32 digits per minute based on screening

thousands of children in first grade classrooms. The

frequencies varied from 0 to 60 digits per minute.

Wolking and Schwartz (1973) gathered normative

frequency data on a number of academic skills across

grades. Data from low achievers (LO) and high achievers

(HI) are presented in Table 5. The frequency of high

achievers for writing digits to addition problems varied

from an average 12 correct digits per minute with 1 error

per minute in first grade to an average of 86 correct

digits per minute with 0 errors per minute in sixth grade.

Low achievers varied from an average of 1 correct digit

per minute with 4 errors per minute in first grade to

54 correct digits per minute with 0 errors in sixth grade.

Thomas Lovitt has supervised a large number of

precision teaching projects at the Experimental Education

Unit at the University of Washington. Lovitt (1976)

reports that 50 digits per minute is considered an adequate

frequency standard for most mathematics skills.


Reading Frequency Standards

A large number of research efforts have been directed

toward identifying the frequency of oral reading necessary

to generate proficient readers. Eric Haughton (1971a)








Table 5

Rate of Growth Toward Adult Proficiency:
Differences Between High and Low Achievement Children, Grades 1-6

(Table 5 reports correct [C] and error [E] frequencies per minute for
high-achieving and low-achieving children in grades one through six and
for adults. Rows cover number skills, reading skills, and writing skills,
including writing numbers, saying number names, writing answers to
addition problems, saying words, saying letter names and letter sounds,
reading regular and irregular words, writing letters, and writing letters
to dictated regular and irregular words.)









found that children reading above 100 words per minute

in third and fourth grade did not decelerate their

performances when the reading curriculum became more

advanced. It was concluded that a minimum rate of

100 words per minute is a useful performance standard

for oral reading.

Johnson (1971) found that 90 percent of students

reading below 50 correct words per minute had

relatively high error rates (between 2 and 20

errors per minute). Only 30 to 40 percent of students

reading between 50 and 100 correct words per minute

had errors in oral reading. Of the students reading

above 100 words per minute, only 10 percent made

errors.

Starlin (1970) reported that children whose frequency

of oral reading was 5 to 10 words per minute had severe

difficulty with reading and had not mastered prerequisite

reading skills such as saying sounds. Haughton (1971a)

reported that some of these children needed speech

acceleration because they talked too slowly. The

conclusion was that oral reading ability is a function

of the frequency of certain prerequisite skills such as

saying sounds, saying phonetically irregular words

(also called basic sight words), and speech production.









Camp (1973) studied the relationship between

frequency of reading texted words and long term

retention in 46 children with severe reading disability.

The majority of the children had learning curves

qualitatively similar to normal children. Rank order

correlations between reading frequencies and three

measures of retention ranged between r = .54 and r = .94.

The data suggests that individual differences in frequency

may account for a large share of individual differences

in retention.

Thomas (1972) studied instruction in reading rate

acceleration and the effects upon comprehension. Three

experimental classes were randomly selected in each of

grades two, four, and six in Montana schools. Standardized

pretests and posttests of comprehension were administered

to nine classes including 407 learners. The experimental

classes were trained to increase frequencies of reading

texted words over a six week period at the beginning of

the school year. Results of the study indicate that the

learners in second grade made significant gains in reading

comprehension. The first grade experimental group made

significant gains in both reading rate and comprehension.

When the groups were equated on the basis of intelligence

quotients, the differences maintained. The differences

did not decline when measured again at the end of the


school year.









Alper, Nowlin, Lemoine, Perine, and Bettencourt

(1974) reported that 100 words per minute with two or

less errors per minute is considered mastery for oral

reading. The authors reported several other academic

performance criteria in current use (see Table 6).

These frequencies may be contrasted with Lovitt's

(1976) recommendations of 100 words per minute for oral

reading and 65 per minute frequencies for see to say

word parts. Suggested performance standards from the

Great Falls Montana Precision Teaching Project are a

more recent guide (see Table 7). These standards

suggest frequencies of 200+ for oral reading of texted

words and 60 to 80 sounds per minute for isolated

phonics sounds.

Summarizing the studies in reading, two salient

factors emerge: (1) there is a frequency standard for

oral reading above which children make few errors and

are able to progress to more difficult reading without

decelerating, and (2) there appear to be important

prerequisite skills to oral reading. By developing

each prerequisite skill to an appropriate proficiency

level before introducing new materials, acquisition

moves smoothly and rapidly (Haughton, 1969). Holding

a skill at proficiency level is simplified because all

the precursors have been thoroughly learned.









Table 6

Tentative Mastery Levels in Reading

                                              Rate Correct      Rate Incorrect

Sounds:
  Consonants                                  80/min.           1-2/min.
  Vowels                                      80/min.           1-2/min.
Alphabet Names                                80/min.           1-2/min.
Phonetically Predictable
  3, 4, and 5 Letter Words                    80/min.           1-2/min.
Dolch Sight Words                             60-80/min.        1-2/min.
Reading in Books at All Grade Levels          100-120/min.      1-2/min.









Spelling Frequency Standards

Starlin (1971) presented academic standards for

spelling performance. For learners in kindergarten

through second grade, a frequency of 30 to 50 correct

letters spelled per minute with 2 or less errors per

minute was considered adequate. For third grade

through adult levels, 50 to 70 correct letters spelled

per minute with 2 or less errors per minute was

considered adequate. These standards can be contrasted

with suggested performance standards from the Great

Falls Montana Precision Teaching Project. The Project

recommends 80 to 100 letters per minute for hear to

write dictated spelling words, and 15 to 25 words per

minute for hear to write dictated words.

The Starlin (1971) study was the only experimental

study which could be identified on the subject of spelling

frequency standards. Researchers have been reluctant

to study this area because the frequencies seemed to

be dependent on the rate of presentation of the stimulus

words. Recently, new techniques have been used whereby

the stimulus words are presented at a much faster

rate than the child can spell. In the typical application,

a stimulus word is presented every 3 seconds. Some words

are not spelled, but a stimulus word is always present

shortly after completion of a previous word, regardless

of how long it takes to spell the previous word.









Writing Frequency Standards

Kunzelman (1970) and Haughton (1971a) have demonstrated

that children and adults have a higher frequency of writing

letters in words than writing letters in isolation.

Wolking and Schwartz (1973) report normative data on

writing random letters. High achievers averaged from 25

letters per minute with no errors per minute in first

grade, to 88 letters per minute with no errors per minute

in sixth grade. Low achievers averaged from 11 letters

per minute with no errors per minute in first grade to 72

letters per minute with no errors per minute in sixth

grade (see Table 5). This data should be evaluated in

a developmental sense rather than as performance standards

per se. Wolking (1980) recommends the use of 120 letters

per minute with 2 or less errors per minute as a general

performance standard for writing skills. Lovitt (1976)

has used 125 symbols per minute as the desired frequency

standard, but cautions that research has not been

conducted which could justify that value.

Suggested performance standards from the Great Falls

Montana Precision Teaching Project (1979) are presented

in Table 7. It can be observed that the performance

criteria listed are generally higher than other

recommendations. Performance standards change periodically.

The trend is toward higher frequencies.









Table 7

Suggested Performance Standards
Great Falls Montana Precision Teaching Project

Skill                                                    Standard

Reading
  See/Say Isolated Sounds                                60-80 sounds/min.
  See/Say Phonetic Words                                 60-80 words/min.
  Think/Say Alphabet (forwards)                          400+ letters/min.
  See/Say Letter Names                                   60-100 letters/min.
  See/Say Sight Words                                    80-100 words/min.
  See/Say Words in Context (oral read)                   200+ words/min.
  See/Think Words in Context (silent read)               400+ words/min.
  Think/Say Ideas or Facts                               15-30 ideas/min.

Handwriting
  See/Write Slashes                                      200-400 slashes/min.
  See/Write Circles                                      100-150 circles/min.
  Think/Write Alphabet                                   80-100 letters/min.
  See/Write Letters (count 3 for each
    letter: slant, form, and ending)                     75 correct/min.
  See/Write Cursive Letters Connected
    (count 3 for each letter)                            125 correct/min.

Spelling
  Hear/Write Dictated Words                              80-100 letters/min.
  Hear/Write Dictated Words                              15-25 words/min.

Mathematics
  See/Write Numbers Random                               100-120 digits/min.
  Think/Write Numbers (0-9) Serial                       120-160 digits/min.
  See/Say Numbers                                        80-100 numbers/min.
  Think/Say Numbers in Sequence (count-bys)              150-200+ numbers/min.
  See/Write Math Facts                                   70-90 digits/min.









A Frequency Model of Learning

White and Haring (1976) define fluency in relative

terms. In their opinion, the question of fluency should

be judged in terms of what will make the skill useful

for the child. Reading at a frequency of 50 words per

minute may well serve the needs of a second grade child,

but will not suffice for the college freshman. White

and Haring have proposed a five-stage learning model

through which a child acquires each discrete academic skill.

In their model, the learner can move through stages of

acquisition, fluency building, maintenance, application,

and adaptation. Identification of the first three stages

of the model is based on the frequency of correct and error

performance. Closely related skills can be in several

different stages of the model, dependent upon the

frequencies of performance for each skill. In the White

and Haring learning model, differential instruction is

programmed depending on the learning stage which in turn

is based on the frequencies of academic performance.

Although there is little research to support the

White and Haring learning model, it has stimulated efforts

to identify academic performance standards to use as goals

in the fluency-building stage. The model is primarily of

practical value in that the stages of learning have specific

and concrete teaching procedures. The White and Haring

work has organized and directed the efforts of many learning

technologists.









In summary, the majority of studies included for

examination of frequency measurement criteria are an

outgrowth of the precision teaching model developed

by 0. R. Lindsley (1971). Central to this model is the

use of frequency measurement. The studies of frequency

performance criteria for academic skills have followed

two schools of thought. One belief is that the criteria

should be set at the median frequency for a group of

similar peers. The followers of this belief have

identified normative frequencies for various grade levels

and types of learners. The problem with this type of

data is that populations operating under a precision

teaching model will have much higher normative frequency

values than populations operating in a traditional

instructional framework.

Other researchers have attempted to identify critical

frequencies above which subsequent learning will be

facilitated and below which learning is hindered. The

problem with this approach is that it is dependent on an

identified sequenced curriculum. Furthermore, once the

critical frequency is identified, a higher frequency must

be specified as the desired criterion frequency in order

to avoid the problems associated with the performance

ceiling.

Despite the many problems of identifying frequency

performance criteria, it remains a fertile area for








research. The objective is to identify frequency

performance standards which will produce maintaining or

rising initial frequencies and celerations as new

curriculum material is presented. The central issue

appears to be whether critical frequencies can

be identified which will facilitate academic performance

for basic skills in reading, spelling, writing, and math

computation.









Predictive Studies Using Frequency Measurement

Prediction using any method is a tenuous endeavor.

In essence, one is implying that the conditions which

produced the prediction data will remain in effect and

produce similar progress in the future (White, 1972a).

In reality, many new variables can be introduced at any

time, or the existing values can diminish or be enhanced

(i.e., through boredom, maturation, etc.). The problem

is further complicated by the complex role of the

teacher in arranging the instructional environment to

maximize learning. Almost by definition, conditions

will not remain the same in a learning situation.

Implied in the use of data for instructional

decision-making under the precision teaching model is

prediction of academic performance. One predicts, on

the basis of available academic performance information,

whether or not the learner will meet established performance

standards within an acceptable time limit. If the

prediction indicates failure to attain the standard, then

the instructional procedures are changed. If the pre-

diction indicates attainment of the standard, then a change

should not be necessary. Thus prediction of academic

performance serves a primary role in the timing and

selection of instructional methods and procedures.
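
As a hedged illustration of this decision logic, the following sketch
(with hypothetical values, not taken from any study cited here) projects
whether a learner's current celeration would reach a performance standard
within an allotted number of weeks:

    # Illustrative sketch (hypothetical values): projecting whether a learner's
    # current celeration will reach a performance standard within the time limit,
    # the kind of check that drives the decision to change instruction.
    import math

    current_frequency = 14.0     # corrects per minute today
    weekly_celeration = 1.25     # x1.25 per week, read from the chart
    standard = 50.0              # performance aim, corrects per minute
    weeks_allowed = 4

    weeks_needed = math.log(standard / current_frequency) / math.log(weekly_celeration)

    if weeks_needed > weeks_allowed:
        print(f"Projected {weeks_needed:.1f} weeks to aim: change the instruction.")
    else:
        print(f"Projected {weeks_needed:.1f} weeks to aim: keep the current program.")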

In predicting academic performance, one is concerned

with the trend of the frequency values. There are a









number of methods for quantifying the trend, among

which the most popular are celeration and slope.

The use of data from academic performance to

predict subsequent performance requires resolution of

two basic considerations: the type of data collected,

and the amount of data needed for prediction. White

(1972a) maintains that the type of data used for prediction

should have the capacity to vary over a wide range of

values. In addition, the data should remain comparable

from day to day. White identifies three types of data

which meet these criteria: counts of the number of times

a behavior occurs where the time in which the counts are

taken remains constant from day to day; frequency, in which

the behavior count for each day is divided by the time over

which the counts were collected; and time or temporal values

such as latency, the time necessary to begin responding once

an instruction is given. Frequency is the most commonly

employed of these acceptable data types.
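
For example, counts taken over different timing lengths remain comparable
once they are converted to frequencies; a minimal sketch with hypothetical
counts and timings follows:

    # Illustrative sketch (not from the original study): converting daily
    # counts of correct movements into frequencies (movements per minute).
    observations = [
        {"day": 1, "correct": 18, "minutes": 1.0},   # one-minute timing
        {"day": 2, "correct": 35, "minutes": 2.0},   # two-minute timing
        {"day": 3, "correct": 22, "minutes": 1.0},
    ]

    for obs in observations:
        frequency = obs["correct"] / obs["minutes"]  # count divided by observation time
        print(f'Day {obs["day"]}: {frequency:.1f} correct per minute')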

The ability to accurately predict subsequent academic

performance depends on the quantity and quality of data

used for prediction. Theoretically, the more data used

for prediction, the better the chances for accurate prediction.

White (1972b) analyzed the predictive utility of the

median slope for predicting academic performance within

instructional phases (i.e., within periods of instruction

in which the conditions of instruction were held constant).









The criterion for success in prediction was the deviation

of data values from the median slope predicted line of

progress. White found that 9 and 11 data points had

optimal predictive ability over a wide range of days

into prediction. Interestingly, a reduction of

predictive utility occurred after 13 predictive data

points. This was attributed to reduction in the

number of projects to sample which contained many

data points. In general, the prediction had greater

accuracy as the number of data points for prediction

increased.


Predictive Studies

White (1972a) analyzed the results of 116 classroom

precision teaching projects. The projects included tasks

such as writing answers to addition and subtraction

problems, saying digits, and reading sight words. Four

different methods were compared for using initial

frequency data to predict subsequent academic performances

of single individuals. All four methods were slope

calculations. The methods were compared using 3 initial

data points, then gradually increasing the number of

days into prediction until 11 or 12 data points were

included in the calculations. The slope methods used

included the least-squares regression solution, median

slope method, corrected slope method, and split-middle

technique.
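
The general idea behind the median-based trend lines can be sketched as
follows. This is an illustrative rendering with hypothetical data, fitted
in logarithmic units as on a semilogarithmic chart; it is not a
reproduction of the exact computational rules compared by White:

    # Illustrative sketch of a quarter-intersect style celeration line,
    # computed on log10 frequencies.  Each half of the record contributes its
    # median day and median (log) frequency, and the line through those two
    # points gives the weekly celeration.
    import math
    from statistics import median

    days  = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical calendar days
    freqs = [10, 12, 11, 15, 16, 14, 19, 22]  # hypothetical corrects per minute

    half = len(days) // 2
    x1 = median(days[:half])
    y1 = median(math.log10(f) for f in freqs[:half])
    x2 = median(days[half:])
    y2 = median(math.log10(f) for f in freqs[half:])

    slope_per_day = (y2 - y1) / (x2 - x1)      # log10 units per day
    celeration = 10 ** (slope_per_day * 7)     # ratio of change per week

    print(f"Weekly celeration: x{celeration:.2f}")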









The findings indicate that median estimates of

trend, and in particular the median slope method, were

more consistently accurate than the least-squares

regression solution. The probability of acceptable

prediction over one week or more into the future was

not within reasonable limits until at least seven

data points were used for prediction. The split-middle

technique performed well enough to be considered an

alternative to the median slope method for use by

teachers and other educational practitioners.

Critical analysis of the White study reveals that

there was much variety in which method was better in

individual cases. Over a large number of cases, the

median estimates predicted better than other methods.

Another problem with the study was the method of ranking

used to judge the "closeness" of prediction. Exact

quantification of the accuracy of prediction was not

reported.

A significant finding not mentioned in the conclusions

of the White (1972a) study was the influence of variability

of the frequency values. The median slope method was

drastically affected by trends in deviations or variance

in the data, to the exclusion of trends in frequency. A

correction factor was applied to the median slope calcula-

tions to minimize the effects of variability in the data.









Consequently, the utility of variability for predicting

future academic performance of single individuals was

not explored.

In a landmark dissertation study, Koenig (1972)

studied the utility of least squares straight line

projections on semilogarithmic charts to predict future

frequencies. In this research, the first 10 to 14

frequencies of each phase of instruction were used to pre-

dict to the next 10 to 14 frequencies within phase. The

percentage of future frequencies contained in the envelope

formed by bounce lines was used to assess the accuracy of

the projection technique. A total of 14,452 phases across

a variety of academic and behavioral tasks were analyzed.

The major findings of this study were:

1. Variability about straight line celerations and quarter-

intersect celeration lines was found to be relatively constant

in proportion and symmetrical above and below the line.

2. The least-squares celeration line was slightly better

at bisecting future data than the quarter-intersect method.

Both methods performed adequately.

3. The bounce envelope projection technique did not

perform well for predicting future frequencies. The

least-squares projection envelope contained 70 percent

or more of the projected frequencies only 42 percent of

the time. About 21 percent of the time, 90 percent or

more of the frequencies were contained.









4. Projecting only to the next quarter containing five

to seven of the frequencies improved the projection.

The overall conclusion from the Koenig study was

that least-squares and quarter-intersect straight lines

usually represent human frequencies accurately. It

should be noted that straight line methods represented

the data well despite the fact that about one-third of

the human frequency phases were changing more than 10

percent weekly. The other two-thirds of the phases

changed less than 10 percent weekly.
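
A simplified sketch of the bounce-envelope projection idea described
above, using hypothetical frequencies and a least-squares celeration line
fitted in logarithmic units, is given below; Koenig's exact construction
rules are not reproduced:

    # Illustrative sketch: fit a least-squares celeration line to the first
    # ten frequencies (log10 space), bound the observed deviations with
    # parallel "bounce" lines, project the envelope forward, and count how
    # many later frequencies fall inside it.
    import numpy as np

    days  = np.arange(1, 21)
    freqs = np.array([8, 10, 9, 12, 11, 14, 13, 15, 17, 16,
                      18, 17, 21, 20, 24, 22, 26, 25, 29, 31], dtype=float)

    fit_days, fit_logs = days[:10], np.log10(freqs[:10])
    slope, intercept = np.polyfit(fit_days, fit_logs, 1)    # least-squares line

    residuals = fit_logs - (slope * fit_days + intercept)
    upper, lower = residuals.max(), residuals.min()         # envelope bounds

    future_days, future_logs = days[10:], np.log10(freqs[10:])
    predicted = slope * future_days + intercept
    inside = (future_logs >= predicted + lower) & (future_logs <= predicted + upper)

    print(f"Weekly celeration of fitted line: x{10 ** (slope * 7):.2f}")
    print(f"Future frequencies inside envelope: {inside.mean() * 100:.0f}%")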

Koenig (1980) analyzing data from over 8,000

children reports that prediction within phases of

precision teaching episodes requires at least six data

points for math and spelling, and eight data points for

reading to attain significant correlation values for

prediction. The dependent measure was the frequency of

correct movements. Data which contains large amounts

of variability require additional data points to obtain

significant prediction correlations. Koenig recommends

exclusion of the first data point, and at least ten

daily observations for prediction studies using frequency

correct measures. The data in Table 8 summarize and

organize Koenig's prediction results for math, spelling,

and reading movements. It can be observed that the t

values for correlation become acceptably small and








Table 8

The Predictive Strength of Learning Indices(a)
Based on Nine Performances (Excluding First)
Compared to Learning Indices Based on Fewer Performances

Learning Index        Learning Index Based on Days 2 through 10(b)
Based on Days              MATH                 SPELLING                READING
                      mean  comparison      mean  comparison      mean  comparison
                      r     t      p        r     t      p        r     t      p

2 through 3           .18   11.9   .0005    .09   10.3   .0005    .40    8.7   .0005
2 through 4           .51    7.4   .0005    .44    5.9   .0005    .37    7.3   .0005
2 through 5           .69    5.4   .0005    .19    5.1   .0005    .34    5.7   .0005
2 through 6           .77    1.7   .08      .68    2.4   .02      .80    5.7   .0005
2 through 7           .85    1.7   .08      .97    1.9   .06      .88    2.6   .01
2 through 8           .93    1.9   .06      .99    1.4   .17      .94    0.2   .85 *
2 through 9           .98    2.2   .03      .99    1.4   .17      .97   -1.2   .25

Note. Personal communication with Carl Koenig, 1980.

(a) Data represent over 8,000 children's movements in math, spelling, and reading.
(b) 681 third graders with all 10 performances reported.
*   Indicates points at which error probability passes through p = .05. For math and
    spelling this occurs on days 2 through 6; for reading, days 2 through 8.









nonsignificant at six days for math and spelling (t = 1.7

and 2.4, respectively), and eight days for reading (t = .2).

Pace (1980) explored the use of celeration and

frequency for predicting children referred for special

education services. Using the International Management

System Learning Screening Procedure (Koenig & Kunzelmann,

1980), an overall hit rate of 64 percent was achieved

across grades one through six. The sum of celeration

ranks in the lower quartile was used for prediction.

This procedure yielded a 25 percent referral rate

composed of a high percentage of false positives and

false negatives.

The same screening procedure using frequency

rather than celeration increased the overall hit rate

from 83 percent to 100 percent across grades. These

values compare favorably with the best of the early

identification studies using traditional measurement

techniques. The significant finding of this study was

that celeration ranks lessened the accuracy of prediction,

while frequency ranks demonstrated superior predictive

utility. This must be viewed as a tentative finding due

to limitations in the design of the study which did not

include mentally retarded children and an overall three

percent minority population in the school district under

investigation. It should be noted that under the Learning

Screening Procedure, the content of the screening items

is taken from skills in the existing curriculum.









Stiles (1973) studied the utility of using

frequency measures from individual learners to predict

their scores on the Short Form Tests of Academic Aptitude/

Comprehension Tests of Basic Skills (SFTAA/CTBS) of the

California Test Bureau. Data from 348 third grade

students' performances on the SST stimulus sheets were

used for prediction. One minute samples from each day over

a ten day period were collected on such skills as

writing random digits, addition problems, subtraction

problems, hear to write letters, writing random letters,

saying sounds, and saying words. The celerations and mid-

point frequencies were used to predict SFTAA/CTBS performance.

Results of the Stiles study indicate that 36 percent

of the variation or differences in language intelligence

scores can be accounted for by differences in rates of

performance on the SST skills. The variables which

explained the greatest amount of the variance (i.e., the

variables which entered the regression formula first) were

the frequency of saying words and writing random numbers.

The addition of other variables to the equation increased

the variance accounted for by 6 to 12 percent.

Hanby (1975) described the use of frequency of oral

language performance to identify children with language

deficiencies. A non-tested preschool class was used to

select a picture from the Ginn Pre-Reading Picture









Series which had the highest number of verbal responses.

In the testing procedure, children were asked to tell

what they saw happening in the picture. The frequency

of word responses during a 15- or 30-second timing each

day was recorded. A class summary of frequencies was

plotted on polar logarithmic charts and language

deficient students were identified by a criterion of

one-half of the class median frequency value. An

alternative criterion discussed was the median frequency

by age in months.
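
A minimal sketch of this screening criterion, using hypothetical class
data rather than values from the Hanby study, is given below:

    # Illustrative sketch (hypothetical data): flagging learners whose oral
    # language frequency falls below one-half of the class median frequency.
    from statistics import median

    words_per_minute = {"A": 42, "B": 55, "C": 18, "D": 60, "E": 24, "F": 48}

    criterion = median(words_per_minute.values()) / 2
    flagged = [child for child, wpm in words_per_minute.items() if wpm < criterion]

    print(f"Criterion: {criterion:.1f} words per minute")
    print("Referred for language assessment:", flagged)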

An advantage of using the frequency procedure described

in the Hanby study is that the class median celeration

value could be used as a criterion for instruction.

The class median celeration value reported was x1.3

(30 percent growth in frequency values per week).

In summary, predictive studies using frequency

measurement can be classified into two groups: those

which attempted to predict academic performance within

phases of instruction (White, 1972b; Koenig, 1972; Koenig,

1980); and those which attempted to predict to some

external criteria (Pace, 1980; Stiles, 1973; Hanby, 1975).

The latter studies are similar to the early identification

studies reviewed earlier in that the emphasis is on

prediction of children at risk for academic failure.

They differed in that frequency measurement was the dependent

measure used for prediction.









All of the investigations in this section were

concerned with frequency correct measures, while a few

studies also considered celeration and variability

for prediction (White, 1972a; Koenig, 1972; Pace, 1980;

Hanby, 1975).

Important findings from the frequency measurement

studies include the number of data points into prediction

(Koenig, 1980); the relationship of variability to both

celeration and frequency correct measurements (White,

1972a; Koenig, 1980); and the utility of frequency correct

measures over celeration measures for prediction of

problem learners (Stiles, 1973; Pace, 1980).

None of the frequency measurement prediction studies

attempted to combine frequency, celeration, and variability

measures for prediction of subsequent academic performance.

Additionally, no studies attempted to predict across

phases from baseline to instruction.

The present study extends and builds upon the previous

research findings to include multiple dimensions of

frequency, celeration, and variability measures for

prediction of academic performance. It also breaks new

ground in attempting to predict from baseline to

instructional phases.













CHAPTER III

METHOD


Sixty learners were assessed over a five-day period

by fourteen teachers using the Sequential Precision

Assessment Resource Kit II (Trifiletti, Rainey, &

Trifiletti, 1979). This assessment procedure was

followed by twenty days of academic instruction on the

skills identified as deficient during assessment. During

instruction the teachers used precision teaching methods

to maximize academic performance. Following instruction

the data from the study were analyzed to determine the

predictive utility of frequency, celeration, and variability

dimensions of academic performance. The content of instruc-

tional tasks was coded for further analyses of relationships

between frequency, celeration, and variability of academic

performance during assessment and instruction.


Subjects

The subjects were students enrolled in the Summer

Learning Disabilities Program sponsored by the Department

of Special Education of the University of Florida. They

ranged in age from seven years two months to fifteen years

five months.

In May, 1979, a packet of information was mailed to

teachers of exceptional student education in Alachua County,










Florida. The teachers were asked for referrals of

exceptional learners who might benefit from summer

instruction. Children were selected from the referrals

on a first-come, first-served basis for a total of

sixty children. The entire population of sixty children,

fourteen teachers, and two teacher-managers of the Summer

Learning Disabilities Program were used for the study.

The students in the study were exceptional students

from public and private schools in Alachua County, Florida.

The majority of the students were children with mild to

moderate specific learning disabilities.

The teachers employed in the study were enrolled

as a practicum teaching experience for partial fulfillment

of the requirements for the Master of Education in Special

Education. The teacher-managers elected the experience

as part of the requirements for the Doctor of Philosophy

in Special Education.


Equipment

Apparatus

The assessment instrument used for the study was

the Sequential Precision Assessment Resource Kit II

(Trifiletti et al., 1979). This instrument is based

upon White and Haring (1976) procedures for precision

assessment. A description of the Sequential Precision

Assessment Resource Kit II (SPARK II) instrument is

provided in Appendix B. Probes from the SPARK II were









used to gather data from which frequency, celeration

and variability measurements were derived.

Tool skill probes, single skill probes, and mixed

skill probes from the SPARK II were administered. Teachers

scored the performance of learners on the probes and this

information was recorded on assessment summary sheets.

A computer program was used to generate weekly

frequency charts for each teacher-learner-task (triad)

combination. The charts displayed both frequency and

accuracy of performance on each skill, and were cumulative

from the onset of instruction. A listing of the computer

program written in Fortran language is provided in Appendix

C. Appendix D contains a sample frequency chart and

directions for interpretation.


Setting

The study was conducted at the P.K. Yonge Laboratory

School of the College of Education, University of Florida.

Assessment and instruction of learners was carried out in

ordinary elementary classrooms during a special summer

program. The dates of the assessment and instruction

phases of the study were from June 25, 1979, to August 7, 1979.


Procedure

The study consisted of three phases. The first was

an orientation and training period for the teachers and

managers of the study. The second phase was a five-day









assessment period during which each learner was

administered probes from the SPARK II. The final

phase consisted of 20 teaching days of individualized

instruction for each learner. The instruction employed

precision teaching methodology in order to maximize

academic performance. Each of these phases of the

study will now be described in detail.


Training Procedures

All of the teachers and teacher-managers attended

a college course and an intensive workshop on precision

teaching procedures prior to onset of the study. These

procedures were reviewed during a two-hour training session.

A syllabus of the training session is included in Appendix E.

The teachers were trained in interpretation of the

computer charts during individual sessions with their

teacher-manager.


Initial Precision Assessment

During the assessment phase of the study, the

teachers administered three types of probes from the

SPARK II; tool skill probes, mixed skill probes, and

single skill probes. The mixed skill probes were used

to identify possible deficient skills in reading,

mathematics, and language arts. Mixed skill probes were

administered to each learner on the first and second days

of assessment. Teachers then examined the results of









the mixed skill probes to select single skill probes

for further assessment. The frequency, celeration,

and variability derived from learner's performance

on mixed skill probes were used to predict subsequent

academic performance.

The single skill probes administered to each

learner were selected on the basis of information

gained from the mixed skill probes. Single skill

probes were administered on days three, four, and

five of the initial assessment period. Data from the

single skill probes were used to select skills for

instruction. The frequency, celeration, and varia-

bility measures derived from the single skill probes

were used to predict subsequent academic performance.

Tool skill probes were used to assess performance

on important prerequisite skills such as saying letters,

writing letters, saying digits, and writing digits. The

tool skill probes were repeatedly administered to each

learner on each day of initial assessment.

Frequency, celeration, and variability measures of

each learner's performance on the tool skills were

computed. These data were later used to predict

subsequent academic performance.

The order of administration of probes for initial

assessment is presented in Table 9. Upon completion of

the initial assessment, teachers selected six to ten









Table 9

Administration Schedule of
Probes During Initial Assessment


Day 1 Day 2 Day 3 Day 4 Day 5


Tool Tool Tool Tool Tool
Skill Skill Skill Skill Skill
Probes Probes Probes Probes Probes



Mixed Mixed Single Single Single
Skill Skill Skill Skill Skill
Probes Probes Probes Probes Probes









skills for instruction with each learner. Selection was

based upon teacher's clinical judgement and the information

obtained during assessment.

Instruction

The third part of the study involved instruction

using precision teaching methods. Teachers administered

daily timed probes from the SPARK II and other sources

for instructional targets. Performance data were charted

daily by each teacher. These charts were collected

each weekend and the data were keypunched by the

experimenter to produce weekly computerized charts of

each learner's performance. The computerized charts

flagged celeration values which were less than 25 percent

weekly improvement. Teachers were required to change

their instruction for these skills in order to optimize

learning.

Experimental Design

The experimental design for this experiment follows

a predictive model. The predictor variables were median

frequency correct, weekly celeration, and variability

of performance on academic skills during assessment.

This information was obtained by measuring performance on

mixed skill probes, single skill probes, and tool skill

probes from the SPARK II. The criterion variables were

weekly celeration during the first instructional phase and

second instructional phase.









Frequency was defined as the median value of the

correct movements per minute of observation. Since the

computer generated charts used an equal interval scale,

celeration was defined as the ratio of two points seven

days apart on a least squares regression line plotted

through the frequency correct points. Variability of

academic performance was defined as the sum of squares

of error about a regression line plotted through the

frequency correct points.
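
These three definitions can be illustrated with a short sketch over a
hypothetical series of daily frequency-correct values. The sketch
illustrates the definitions only; it is not the Fortran program used in
the study:

    # Illustrative sketch of the three measures as defined above, assuming a
    # hypothetical series of daily frequency-correct values on an equal-interval
    # scale.
    import numpy as np

    days  = np.array([1, 2, 3, 4, 5], dtype=float)
    freqs = np.array([12, 15, 13, 18, 20], dtype=float)   # correct movements/minute

    median_frequency = float(np.median(freqs))

    slope, intercept = np.polyfit(days, freqs, 1)          # least-squares line
    fitted = slope * days + intercept

    # Celeration: ratio of two points on the fitted line seven days apart.
    celeration = (slope * (days[0] + 7) + intercept) / fitted[0]

    # Variability: sum of squared deviations of the data about the fitted line.
    variability = float(np.sum((freqs - fitted) ** 2))

    print(median_frequency, round(celeration, 2), round(variability, 2))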

Data Collection

Data describing the baseline initial assessment were

collected and recorded by teachers on summary forms

provided by the experimenter. Data from daily instruction

were collected by teachers and recorded on ratio charts.

Data from initial baseline assessment and instruction were

keypunched by the experimenter. This effort produced

the computerized charts as well as formed the data for

statistical analyses.

Data Analysis

The data for this study were statistically analyzed

using multiple regression procedures and the stepwise

solution (Nie, Hull, Jenkins, Steinbrenner, & Brent, 1975).

In this procedure, the computer enters independent variables

into the regression equation in single steps from best to

worst. The independent variable that explains the greatest

amount of variance in the dependent variable will enter








first; the independent variable that explains the greatest

amount of variance in conjunction with the first will

enter second, and so on. In other words, the independent

variable that explains the greatest amount of variance

unexplained by the variables already in the equation enters

the equation at each step. Minimum default values of

F ratio (F = .01) and tolerance (T = .001) were used to

control inclusion of independent variables at each step.
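
The stepwise logic can be sketched as follows with hypothetical data; the
F-ratio and tolerance checks applied by SPSS are omitted from this
illustration:

    # Illustrative sketch of forward stepwise selection: at each step the
    # predictor that most increases R-squared (variance explained) is entered.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 60
    X = rng.normal(size=(n, 3))           # e.g., frequency, celeration, variability
    y = 0.8 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(scale=0.5, size=n)

    def r_squared(cols):
        # R-squared of a least-squares fit using the selected predictor columns.
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

    entered, remaining = [], [0, 1, 2]
    while remaining:
        best = max(remaining, key=lambda c: r_squared(entered + [c]))
        entered.append(best)
        remaining.remove(best)
        print(f"Step {len(entered)}: enter predictor {best}, R2 = {r_squared(entered):.3f}")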

The computer facilities of the Northeast Regional

Data Center (NERDC) and the Statistical Package for the

Social Sciences (SPSS) were employed to obtain the data

analyses. The data were organized into an SPSS data set

on punched cards by the addition of a control card to

each set of teacher-learner-task data. The control card

contained codes which categorized the triad into nine

categories of stimulus content, nine categories of input/

output modality, and three categories of skill. These

categories and their codes are presented in Table 10.

Although not all of the categories are utilized in this

study, they were designed to be of utility for subsequent

programmatic research. Also present on the control card

were codes for the teacher, learner, and number of data

points obtained during assessment. Triads with less than

three data points during assessment or instruction were

omitted from the statistical analyses since variability

values could not be meaningfully computed for them.









Table 10

Categories for Coding of Each
Teacher-Learner-Task Triad


STIMULUS MOVEMENT ASSESSMENT
CONTENT MODALITY PROBE


1. Letters 1. See-say 1. Tool skill probe

2. Numbers 2. See-write 2. Single skill probe

3. Phonics sounds 3. See-do 3. Mixed skill probe

4. Sight words 4. Think-say

5. Texted words 5. Think-write

6. Number problem 6. Think-do

7. Phonics words 7. Hear-say

8. Shapes/symbols 8. Hear-write

9. Spelling words 9. Hear-do













CHAPTER IV

RESULTS


The measurements used for prediction were the median

frequency correct value from baseline assessment phase,

the weekly celeration value from baseline phase, and

the variability value (i.e., the value for the sum of

squared deviations from the line of best fit through

the frequency correct points) of baseline phase. Each

case was composed of a separate teacher-learner-task

triad. The predicted variables were the weekly celeration

value for the first instructional phase (i.e., phase two)

and the weekly celeration value for the second instructional

phase (i.e., phase three). Triads were selected for analyses

if the number of data points in predictor and predicted phases

was three or more.


Reliability

Table 11 provides percentage agreement between the

experimenter and the observer (teacher) for each week of

the study. Agreement information is shown for the

frequency correct values collected daily by the teachers.

A small sample of each teacher's recording during

one day of each week was rescored by the experimenter.









Table 11

Percentage of Interobserver Agreement for Recording
of Predictor Variable across Weeks of the Study

               Baseline                    Instruction
               Week 1       Week 2       Week 3       Week 4       Week 5

Teacher 1        98           100          100          100          100
Teacher 2        99           100          100          100           99
Teacher 3       100           100           97          100          100
Teacher 4        85           100          100          100          100
Teacher 5       100           100           98          100          100
Teacher 6        87            89          100          100          100
Teacher 7       100           100          100          100          100
Teacher 8       100           100          100          100          100
Teacher 9        98           100          100          100           99
Teacher 10      100           100          100          100          100
Teacher 11      100            98          100          100           95
Teacher 12       94           100          100           98          100
Teacher 13      100           100          100          100          100
Teacher 14      100           100           94          100          100









The minimum acceptable total agreement percentage for

recording of frequency correct values was 80 percent.
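
One common convention for computing percentage agreement on frequency
counts is the smaller count divided by the larger count; the study's exact
formula is not restated in this section, so the following sketch with
hypothetical counts is illustrative only:

    # Illustrative sketch (hypothetical counts): percentage agreement between
    # the teacher's recorded count and the experimenter's rescored count,
    # computed as the smaller count divided by the larger.
    pairs = [(34, 34), (28, 30), (41, 40)]   # (teacher count, experimenter count)

    agreements = [min(a, b) / max(a, b) * 100 for a, b in pairs]
    overall = sum(agreements) / len(agreements)

    print([round(a, 1) for a in agreements], round(overall, 1))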

Percentage values were generally lower during

baseline than during instruction. The lower reliability

during baseline can be attributed to the unfamiliarity

of the teachers with the recording procedures used to

collect the data.


Analyses of Prediction Results


The results of prediction are displayed in Table 12,

Table 13, and Table 14. The first column contains the

number of separate teacher-learner-task triads which

produced the correlation values. The second column on

the tables contains the type of skill probe, type of

stimulus content, or type of input/output modality

represented by the triads. Columns 3, 4, 5, and 6 contain

product moment correlation values for frequency correct

measures. Column 7 contains the amount of variance R2 ex-

plained by predictor variables predicting to celeration for

the first instructional phase. Column 8 contains the amount

of variance R2 explained by predictor variables predicting

to celeration for the second instructional phase.

The questions of interest to this study will now

be examined with respect to the statistical results.

(1) Can frequency, celeration, and variability of

baseline performance on tool skill probes be used to

predict the rate of learning during instruction?









Only 2 percent of the variance in phase two celeration was

explained by frequency, celeration, and variability of

tool skill performance during assessment (see Table 12).

The prediction to phase three explained 4 percent of the

variance in celeration. There were 93 triads in the

prediction to phase two celeration, and 46 triads in the

prediction to phase three celeration. It appears that

frequency, celeration, and variability of tool skill

performance are not useful for prediction of celeration

during instruction.

(2) Can frequency, celeration, and variability of

baseline performance on single skill probes be used to

predict the rate of learning during instruction?

Only 1 percent of the variance in phase two celeration was

explained by frequency, celeration, and variability of

single skill performance during assessment (see Table 12).

The prediction to phase three also explained 1 percent of

the variation in celeration. There were 177 triads in

the prediction to phase two celeration, and 66 triads in

the prediction to phase three celeration. Frequency,

celeration, and variability of single skill performance

are not useful for prediction of celeration during

instruction.

(3) Can frequency, celeration, and variability of

baseline performance on mixed skill probes be used to

predict the rate of learning during instruction?








Table 12

Analyses of Triads Categorized by Type of Skill Probe

                          Correlation r, Phase 1 with Phase 2          Amount of Variance R2 Explained
                                                                       by Baseline Predictor Variables(a)
Number    Type of         Mean     Median                              To Celeration    To Celeration
of        Skill           Freq.    Freq.    Celeration   Variability   Phase 2          Phase 3
Triads    Probe

100       Tool Skill      .94*     .91*       -.03         .43*        .02 (N=93)       .04 (N=45)
182       Single Skill    .85*     .81*        .07        -.01         .01 (N=177)      .01 (N=66)
 16       Mixed Skill     .90*     .90*       -.01         .16         .11 (N=16)       ---  (N=2)

(a) Predictor variables were baseline median frequency, baseline celeration,
    and baseline variability.
 *  Correlation value r was significant at the .05 level.









Eleven percent of the variation in phase two

celeration was explained by frequency, celeration, and

variability of performance on mixed skill probes (see

Table 12). There were 16 triads in this prediction.

The first variable to enter the regression equation

was variability, followed by median frequency.

The prediction to the second instructional phase (phase

3) contained only two triads, therefore the R2 value

was not meaningful.

(4) Is there a differential predictive relationship

by academic task stimuli from baseline performance to

performance during instruction? The different task

stimuli include letters, digits, phonics sounds, sight

words, texted words, number problems, phonics words,

and spelling words. These task stimuli will be discussed

individually in the following text.


Tasks with Letters

There were 60 triads in the prediction to phase two,

and 23 triads in the prediction to phase three celeration.

Seventeen percent of the variation in phase two celeration

was explained by frequency, celeration, and variability

of baseline performance on tasks with letters. Six percent

of the variation in phase three celeration was explained by

the prediction. Celeration entered the regression pre-

diction equation first, followed by variability and median

frequency for predicting to phase two.







Table 13

Analyses of Triads Categorized by Type of Stimulus Content

                            Correlation r, Phase 1 with Phase 2          Amount of Variance R2 Explained
                                                                         by Baseline Predictor Variables(a)
Number    Stimulus          Mean     Median                              To Celeration    To Celeration
of        Content           Freq.    Freq.    Celeration   Variability   Phase 2          Phase 3
Triads

 64       Letters           .93*     .90*       .30*         .03         .17 (N=60)       .06 (N=23)
 44       Digits            .98*     .96*      -.28          .75*        .07 (N=41)       .07 (N=24)
 19       Phonics Sounds    .76*     .68        .50          .02         .53 (N=17)       .22 (N=9)
  3       Sight Words       .68      .70       -.98          .72         **               **
  5       Texted Words      .91*     .90*      -.34          .31         **               **
129       Number Problems   .75*     .71*      -.08         -.03         .02 (N=126)      .02 (N=39)
 26       Phonics Words     .54*     .88*       .12         -.06         .08 (N=26)       **
  8       Spelling Words    .91*     ---       -.35          .64         **               **

(a) Predictor variables were baseline median frequency, baseline celeration,
    and baseline variability.
 *  Correlation value r was significant at the .05 level.
**  Insufficient data for computation.









Tasks with Digits

There were 41 triads in the prediction to phase two

celeration, and 24 triads in the prediction to phase three.

Seven percent of the variation in phase two and phase

three celeration could be explained by baseline predictor

variables on tasks with digits (see Table 13).


Tasks with Phonics Sounds

There were 17 triads in the prediction to phase two

celebration, and 9 triads in the prediction to phase three

celebration using frequency, celebration, and variability

of baseline performance as predictors (see Table 13).

Fifty-three percent of the variation in performance during

phase two could be explained by baseline performance.

Twenty-two percent of the variation in performance during

phase three could be explained by baseline performance;

however this figure should be viewed with caution due to

the small number of triads for computation.


Tasks with Sight Words

The R2 values on Table 13 for prediction of performance

with sight words have been omitted since the number of

triads for computation of these values was less than 10.


Tasks with Texted Words

The R2 values on Table 13 for prediction of performance

with texted words have been omitted since the number of

triads for computation of these values was less than 10.









Tasks with Number Problems

There were 126 triads in the prediction of phase

two celeration using frequency, celeration, and variability

of tasks with number problems. Only 2 percent of the phase

two celeration was explained by baseline variables.

Likewise, only 2 percent of the phase three variation in

celeration was due to variation in baseline variables of

frequency, celeration, and variability on number problem

tasks (see Table 13).


Tasks with Phonics Words

There were 26 triads in the prediction of

phase two celeration using frequency, celeration,

and variability of tasks with phonics words. Eight

percent of the variance in phase two celeration could be

explained by performance characteristics during baseline

(see Table 13). The prediction to phase three celeration

had only 8 triads, and thus the R2 value has been

omitted.


Tasks with Spelling Words

There were only 8 triads each in the predictions

of phase two celeration and phase three celeration.

As such, the R2 values have been omitted (see Table 13).

When the number of cases for multiple regression is less

than 10, the R2 values are artificially inflated.









The amount of variance R2 in phase two celeration

explained by baseline predictor variables ranged from

2 percent for number problems to 53 percent for phonics

sounds. Predicting to phase three celerations generally

yielded lower R2 values. Two of the five academic task

stimuli appeared to have a differential predictive

relationship and were higher in amount of variance

explained by predictor variables. In general, the

predictive ability of frequency, celeration, and

variability during assessment was characterized by

low positive values.

(5) Is there a differential predictive relationship

by input and output modality of learners from baseline

academic performance to performance during instruction?

The different input and output modalities of interest

include see to say, see to write, see to do, hear to say,

hear to write, hear to do, think to say, think to write,

think to do.

Since the teachers in this study were free to decide

instructional targets, not all of the input and output

modality combinations were present in the data. Sixty-five

triads were see to say, 133 were see to write, 24 were

think to say, and 74 were think to write. These will

now be discussed individually.









See To Say Tasks

Sixty-two triads were included in the prediction of

celeration during phase two from baseline performance on

see to say tasks. Twenty-six percent of the variation in

phase two celeration could be explained by frequency,

celeration, and variability of baseline performance on see

to say tasks. There were 29 triads in the prediction of

celeration during phase three from baseline performance on

see to say tasks. Only 3 percent of the variation in phase

three celeration could be explained by baseline performance

measures (see Table 14).


See To Write Tasks

See to write tasks included 130 triads for prediction.

Two percent of the variation in phase two celeration was

explained by baseline performance measures. There were 39

triads in the prediction of phase three celeration from

baseline performance on see to write tasks. Again, only 2

percent of the variation was explained (see Table 14).

Think To Say Tasks

There were 20 triads in the prediction of phase two

celeration from baseline performance on think to say tasks.

Five percent of the variation in phase two celeration

was explained by baseline performance measures. There were

12 triads in the prediction of phase three celeration. Nine-

teen percent of the variation in celeration was explained

by baseline predictor variables (see Table 14).







Table 14

Analyses of Triads Categorized by Type of Input/Output Modality

                           Correlation r, Phase 1 with Phase 2          Amount of Variance R2 Explained
                                                                        by Baseline Predictor Variables(a)
Number    Input/           Mean     Median                              To Celeration    To Celeration
of        Output           Freq.    Freq.    Celeration   Variability   Phase 2          Phase 3
Triads    Modality

 65       See/Say          .83*     .79*       .42*          .08        .26 (N=62)       .03 (N=29)
133       See/Write        .76*     .72*      -.09          -.03        .02 (N=130)      .02 (N=39)
 24       Think/Say        .88*     .85*      -.25           .33        .05 (N=20)       .19 (N=12)
 74       Think/Write      .87*     .79*       .07           .20        .16 (N=72)       .09 (N=31)

(a) Predictor variables were baseline median frequency, baseline celeration,
    and baseline variability.
 *  Correlation value r was significant at the .05 level.









Think To Write Tasks

There were 72 triads in the prediction of phase two

celeration from baseline performance on think to write

tasks. Sixteen percent of the variation in phase two

celeration could be explained by baseline predictor

variables. There were 31 triads in the prediction of

phase three celeration from baseline performance on

think to write tasks. Nine percent of the variation

in celeration was explained by frequency, celeration,

and variability of baseline performance on think to

write tasks (see Table 14).

In summary of the analyses of triads categorized

by type of input/output modality, the amount of variance

R2 in phase two celeration explained by baseline predictor

variables ranged from 2 percent for see to write tasks,

to 26 percent for see to say tasks. Predicting to phase

three celerations generally yielded lower predictive

values except for think to say tasks where the R2 value

increased.

It appears that there is a small differential pre-

dictive relationship by input/output modality of the task;

however, the amount of variance R2 explained by predictor

variables is generally small. The predictive ability of

frequency, celeration, and variability of baseline

performance ranged from low positive to moderate positive.
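
As a concrete illustration of how each of these R2 values could be
obtained, the following is a minimal ordinary least-squares sketch. It is
not the study's original analysis program, and the data and variable names
are entirely hypothetical.

    # Illustrative sketch only: hypothetical baseline measures for a few
    # triads of one modality category (median frequency, celeration,
    # variability) and their phase-two celerations.
    import numpy as np

    baseline = np.array([
        [45.0, 1.25, 1.8],
        [30.0, 1.10, 2.1],
        [60.0, 1.40, 1.5],
        [25.0, 1.05, 2.4],
        [50.0, 1.30, 1.7],
    ])
    phase2_celeration = np.array([1.35, 1.15, 1.50, 1.10, 1.28])

    # Ordinary least-squares regression with an intercept term.
    X = np.column_stack([np.ones(len(baseline)), baseline])
    coef, _, _, _ = np.linalg.lstsq(X, phase2_celeration, rcond=None)

    # R^2 = 1 - SS_residual / SS_total: the share of variation in phase-two
    # celeration explained by the three baseline predictors.
    predicted = X @ coef
    ss_res = np.sum((phase2_celeration - predicted) ** 2)
    ss_tot = np.sum((phase2_celeration - phase2_celeration.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    print(f"R^2 = {r_squared:.2f}")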













CHAPTER V

DISCUSSION


There has been a great deal of interest in the

prediction of learning performance. The extent to which

this is possible depends upon the type of data and the

amount of data for prediction (White, 1972a). Previous

studies have examined the degree to which performance can

be predicted within instructional phases (White, 1972a;

Koenig, 1972; Koenig, 1980).

The present study employed operant learning techniques

to further clarify the relationship of variables that may

be useful for prediction of academic performance. Of

interest was the utility of prediction from baseline

assessment to instructional phases. Students' frequencies

of performance on a great number of tasks were measured

during a week of baseline data collection. Each separate

triad was categorized into eight different types of

academic task stimuli, four different types of input/output

modality, and three different types of skill probe.

Following baseline data collection, teachers selected

skills for instruction and delivered instruction over a

five week period. Daily frequency measures were recorded by

teachers. The data were analyzed by computer to produce









frequency, celeration, and variability values for each

phase of instruction.
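
The dissertation's own computer program is not reproduced in the text. A
minimal sketch of one common precision-teaching convention for these
measures, assuming celeration is taken as the weekly multiplicative change
from a least-squares line fit to the logarithms of daily frequencies and
variability as the total "bounce" around that line, might look like this:

    # Hedged sketch of one possible convention, not the study's actual code.
    import numpy as np

    def celeration_and_bounce(days, frequencies):
        """days: calendar day numbers; frequencies: movements per minute."""
        days = np.asarray(days, dtype=float)
        log_freq = np.log10(np.asarray(frequencies, dtype=float))

        # Least-squares line through the log frequencies (slope is per day).
        slope, intercept = np.polyfit(days, log_freq, 1)

        # Celeration: the factor by which frequency multiplies each week.
        celeration = 10 ** (slope * 7)

        # Bounce: multiplicative spread of the data around the line.
        residuals = log_freq - (slope * days + intercept)
        bounce = 10 ** (residuals.max() - residuals.min())
        return celeration, bounce

    # Hypothetical week of daily correct frequencies for one skill probe.
    cel, var = celeration_and_bounce([1, 2, 3, 4, 5], [20, 24, 25, 30, 33])
    print(f"celeration x{cel:.2f} per week, bounce x{var:.2f}")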


Findings

The findings of this study are discussed relative to

the experimental questions. Prediction from baseline

performance to performance under instruction was generally

low when skills were categorized by tool skill, single

skill, or mixed skill probes. The amount of variation in

instructional celeration which could be explained by base-

line performance measures ranged from 1 percent with single

skills to 11 percent with mixed skills. The important

finding is that a student's baseline academic performance

is not a controlling or predicting factor for performance

during instruction.

Prediction categorized by type of stimulus content

produced differential predictive strengths. Unfortunately,

R2 values for sight words, texted words, and spelling

words had to be omitted due to small numbers of triads

available for their calculation. Of the remaining

stimulus categories, strength of prediction (amount of

variation in instructional celerations) ranged from

R2=.02 for number problems to R2=.53 for phonics sounds.

Letters stimuli (R2=.17) predictions were slightly stronger

than digits stimuli (R2=.02) predictions. In general, there

appeared to be a trend for language stimuli to predict better

than math stimuli. This trend is tentative and will require



