The effects on academic performance and study behavior of two testing variables in the Personalized System of Instruction


Material Information

Title:
The effects on academic performance and study behavior of two testing variables in the Personalized System of Instruction
Physical Description:
iv, 66 leaves : ill. ; 28 cm.
Language:
English
Creator:
Barkmeier, David R. (David Raymond), 1948-
Publication Date:
1980
Subjects

Subjects / Keywords:
Individualized instruction (lcsh)
Academic achievement (lcsh)
Study skills (lcsh)
Genre:
bibliography (marcgt)
theses (marcgt)
non-fiction (marcgt)

Notes

Thesis:
Thesis--University of Florida.
Bibliography:
Includes bibliographical references (leaves 50-53).
Statement of Responsibility:
by David R. Barkmeier.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 000098172
notis - AAL3615
oclc - 06689798
System ID:
AA00003463:00001

Full Text











THE EFFECTS ON ACADEMIC PERFORMANCE AND STUDY
BEHAVIOR OF TWO TESTING VARIABLES IN THE
PERSONALIZED SYSTEM OF INSTRUCTION



















By

David R. Barkmeier


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY




UNIVERSITY OF FLORIDA
1980
















TABLE OF CONTENTS


ABSTRACT

CHAPTER I: INTRODUCTION
    Self-pacing
    Mastery Criteria
    Lectures
    Study Objectives
    Proctoring
    Unit Size

CHAPTER II: EXPERIMENT ONE
    Method
    Results
    Discussion

CHAPTER III: EXPERIMENT TWO
    Method
    Results
    Discussion

REFERENCES

APPENDIX A: EXPERIMENT ONE PERCENT CORRECT ON FIRST ATTEMPTS ON EACH UNIT BY STUDENT

APPENDIX B: EXPERIMENT TWO PERCENT CORRECT ON FIRST ATTEMPTS ON EACH UNIT BY STUDENT

APPENDIX C: EXPERIMENT ONE NUMBER OF TESTS TAKEN ON EACH UNIT BY STUDENT

APPENDIX D: EXPERIMENT TWO NUMBER OF TESTS TAKEN ON EACH UNIT BY STUDENT

APPENDIX E: EXPERIMENT ONE PERCENT CORRECT ON BEST ATTEMPT ON EACH UNIT BY STUDENT

APPENDIX F: EXPERIMENT TWO PERCENT CORRECT ON BEST ATTEMPT ON EACH UNIT BY STUDENT

BIOGRAPHICAL SKETCH














Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy




THE EFFECTS ON ACADEMIC PERFORMANCE AND STUDY
BEHAVIOR OF TWO TESTING VARIABLES IN THE
PERSONALIZED SYSTEM OF INSTRUCTION

By

David R. Barkmeier

March 1980

Chairman: Dr. James M. Johnston
Major Department: Psychology

Two studies were performed, each of which varied testing

procedures in courses taught by the Personalized System of

Instruction. In the first study, a counterbalanced, within-

subject reversal design was employed to assess the effects of

opportunity to retest on study behaviors and test performance

of students in a nine-unit introductory psychology course.

Students were allowed either two or five test opportunities

on each unit, alternating every two units between the two

conditions. Mean performance scores and mean total study time

for first attempts on all units completed under the two-

attempt condition were higher than when the five-attempt condi-

tion was in effect. The mean best performance on each unit

was nearly the same for both conditions, although the mean

number of attempts per unit was 0.7 lower under the two-attempt

phases, which suggests that limiting attempts is a means of










saving student time and instructional resources without a

necessary decrement in performance.

In the second study, a counterbalanced, reversal design

was employed to assess the effects of a contingency designed

to improve performance on the initial attempt on unit tests of

students in an introductory psychology course. The contin-

gency consisted of a limit placed on the amount that unit

scores could be improved on test retakes above the initial

attempt score. When this contingency was not in effect, the

unit score was determined by the highest score obtained on any

of the three allowable attempts, regardless of the initial

attempt score. After the first unit, students progressed

through the course by changing from one condition to the other

after every two units. Mean first attempt test performance

and mean reported study time were higher and the number of

attempts taken per unit was lower when this contingency was

in effect. Final unit scores were approximately equal regard-

less of condition. Results are discussed in terms of cost

effectiveness for both students and instructors.














CHAPTER I
INTRODUCTION


The approach to undergraduate instruction labeled the

Personalized System of Instruction (PSI) was developed by

Keller and Sherman in 1963 (Keller, 1968). The components of

this system of instruction are radically different from the

traditional lecture/discussion methods of instruction. The

components of PSI were selected to combine the benefits of

programmed instruction with a high degree of personal inter-

action. Keller (1968) described the essential elements of

PSI as follows:

(1) The go-at-your-own-pace feature, which
permits a student to move through the course
at a speed commensurate with his ability and
other demands upon his time.
(2) The unit-perfection requirement for ad-
vance, which lets the student go ahead to new
material only after demonstrating mastery of
that which preceded.
(3) The use of lectures and demonstrations as
vehicles of motivation, rather than sources of
critical information.
(4) The related stress upon the written word
in teacher-student communication; and, finally,
(5) The use of proctors, which permits re-
peated testing, immediate scoring, almost un-
avoidable tutoring, and a marked enhancement
of the personal-social aspect of the educa-
tional process. [p. 83]

It should be noted that in actual practice, there is a

large degree of variability in how the various components of

PSI are employed. Also, many instructors employ only a few

of the above components in their courses. Due to this









variability in the implementation of PSI, there has been

considerable debate concerning the degree to which course

procedures can deviate from Keller's original description and

still be properly considered a PSI course (Sherman, 1976). It

is not the intention of this discussion to add to this debate,

but rather to acquaint the reader with the fact that few of

the research studies discussed in the following pages conform

exactly to Keller's original description. The term "PSI,"

therefore, will be used in its broadest generic sense.

Early research concerning PSI focused on experimental

comparisons between classes taught using the PSI approach and

classes taught using more traditional teaching techniques.

Sherman (1976) reported that over 150 such comparisons have

been conducted with almost all comparisons demonstrating

favorable results for the PSI method. Robin (1976) reviewed

thirty-nine comparison studies. Of these thirty-nine studies,

thirty reported significant differences in favor of the groups

exposed to all or some of the elements of PSI. Of the nine

studies in which significant differences in favor of PSI were

not found, only one study reported significant differences in

favor of a more traditionally taught class. Robin reported

that the mean difference on final examinations between students

taught by the PSI method and students taught by more tradi-

tional techniques was 9%. Hursh (1976) similarly analyzed ten

studies and found that students taught by PSI methods scored

from 7 to 15% better on final examinations than students taught

by traditional techniques. However, these favorable findings









must be tempered by the fact that almost all of these comparison

studies contained at least one of the following methodological

flaws (Hursh, 1976). The potential confoundings Hursh listed

are differences in instructors, curriculum materials, grading

criteria, testing formats, student selection, and student

expectations. However, the fact that favorable comparisons

have been demonstrated over a large range of institutions,

student abilities, subject matter, and instructors provides

strong evidence that the PSI method or elements of the PSI

method favorably affect students' academic performance. Since

there has been such a plethora of comparison studies with

most of them containing several methodological flaws, the

following review will concern only those studies which most

closely follow the guidelines listed by Hursh.

One of the first investigations of the effectiveness of

PSI was conducted by McMichael and Corey in 1969. This study

compared the scores on a common objective final exam of intro-

ductory psychology students enrolled in a class taught by PSI

methods with three classes of students taught by the lecture

method. The results demonstrated that the students enrolled

in the PSI class performed significantly better on the final

exam than the three classes taught by the lecture method

despite the fact that the final contained some test items that

were previously seen by the students in the three control

classes. Also, students in the experimental class rated the

course significantly better than the students in the control

class on a common course evaluation.









The major methodological flaw in this study was the use

of different instructors for each of the four sections. It

is possible that these significant differences were due to

the particular instructor and not the particular instructional

method used. However, McMichael and Corey point out two

lines of evidence that indicate that it was the method of in-

struction that produced the favorable results. First, they

argued that this potential confounding was mitigated by the

fact that there were no significant differences between final

exam scores for the three control groups despite the fact that

all three had different instructors. Secondly, examination

of the final exam scores from previous semesters indicated

that there were no differences between the four instructors

on this measure when all four used similar methods of instruc-

tion. In conclusion, it appears that these two lines of evi-

dence provide reasonably strong support that it was the PSI

method that significantly improved exam performance.

A second comparison study which controlled for five

of the six potential confoundings mentioned by Hursh (1976) was

performed by Born, Gledhill, and Davis (1972). The experimental

class was taught using the principles enumerated by Keller (1968)

with the exception that restrictions were placed on the degree

to which students could control the pacing of their examination

taking behavior over the term. These restrictions were that

the students were required to finish a certain number of units

before the midterm and the final exam. The results indicated

that students in the lecture class performed significantly








worse than the students taught by the PSI method on both the

midterm and final exam.

Another comparison study which appears to have met most

of Hursh's criteria was performed by Cole, Martin, and Vin-

cent (1975). This study controlled for experience of instruc-

tors, curriculum, student selection, and testing format. The

experimental group in this study was taught by the PSI method,

and the control group was taught by the lecture method. At

the end of the term, the control and experimental groups were

administered a common final. This final constituted one-third

of the course grade for the control group, but did not con-

tribute to the course grade of the experimental group. Despite

the fact that this exam had no bearing on the experimental

subjects' grades, they performed significantly better on this

exam than the control subjects. Also, students in the course

taught by the PSI method reported that they enjoyed the course

more than the students in the control group.

In summary, it appears that the PSI method produces both

improved test performance and better course evaluations.

Despite the various methodological problems that have attended

most of these studies, the evidence is overwhelming that the

PSI method or some aspects of the PSI method are responsible

for this improvement.

Since the comparison studies have been consistently

favorable, many investigators have turned to analyzing the

individual components of the PSI package. These investigations

have focused on determining whether all of the original PSI









components are necessary for the effectiveness of the PSI

method. Also, investigators have been interested in deter-

mining how variations in the parameters of each component

affect academic performance. The following review will analyze

only a few of the large number of studies that concern each of

the PSI components.


Self-pacing


The term "self-pacing" generally refers to an arrangement

in which the students themselves schedule when they take

course examinations rather than the instructor establishing

test times. The term "self-pacing" is a misnomer for at

least two reasons. The first reason is that, though students

allowed to set their own testing times usually have a wide

latitude in setting those times, the range of times available

for testing is limited by practical considerations. Related

to this is the requirement at most colleges that all course

work be completed in a given term. This restriction of course

forces students to schedule their exams so that they finish

all requirements within the specified time period. The

second reason that the term "self-pacing" is a misnomer is

that it implies that the controlling contingencies which dic-

tate when students test lie inside the individual. Obviously,

external environmental factors determine students' pacing of

examinations. These environmental factors

may include students' outside employment schedules, other










course requirements, amount and difficulty of the material in

the PSI course, the television schedule, or other such factors.

The research evidence concerning course arrangements

whereby students are given wide latitude as to the completion

of course examinations reveals that students exhibit behavior

which is best described as a fixed-interval scallop. Students

testing under such course arrangements normally wait until

the end of the term to complete the majority of their assign-

ments. In cases where the incomplete option is available,

many students do not complete the course until after the term

has ended.

Since student procrastination has been a major problem

with course arrangements that allow students a wide degree of

latitude, investigators have examined several methods to de-

crease this procrastination. These methods generally fall into

two major categories. These are the establishment of deadline

dates for the completion of course units or point arrangements

which consequate test pacing according to a pacing standard.

The first study to be reviewed is characteristic of the dead-

line approach to the procrastination problem. This study was

conducted by Miller, Weaver, and Semb (1974). Students during

some portions of the course were required to complete one unit

of material per day and during the other portions of the course

no deadlines were present. The results indicated that the

utilization of deadlines strongly influenced when students

tested. Students completed 1.06 units per day under the









deadline contingency, but completed only 0.27 units per day

when the contingency was absent.

The use of point systems to reduce procrastination has

been extensively investigated. This review will present only

two studies which are representative of this type of approach

(Bitgood and Segrave, 1975; and Semb, Conyers, Spencer, and

Sanchez-Sosa, 1975). Bitgood and Segrave experimentally com-

pared three different point systems. The first point system

was a decreasing point system in which students received more

grade points for completing work early in the term than later

in the term. In the second point system, students received

more points for completing work later in the term. For the

third point arrangement, students received the same number of

points for completing assignments regardless of when the as-

signments were completed.

The results indicated that the various point systems

effectively controlled the pace at which students completed

assignments. Students who were assigned to the group that

received more points for work completed early in the term

completed significantly more units during the first third of

the term than the last third of the term. The group of stu-

dents who were given more points for completing work during

the last third of the course completed significantly more

units during that time. The third group of students who were

assigned to the fixed point contingency completed more units

during the final third of the semester than in the first

third of the semester. Also, students who completed the









course under the early completion contingency finished

significantly more units than the other two groups. Thus it

appears that point contingencies that encourage students to

complete units early in the term not only control when a stu-

dent completes units, but also increase the total amount of

work completed during the term.

Semb, Conyers, Spencer, and Sanchez-Sosa (1975) examined

the effects of four different point arrangements on student

pacing. Students in the first group earned points regardless

of when they completed the units. The second group of students

lost points for each day that they failed to maintain the

experimenter-designed rate of progress. The third group of

students earned points only if they completed units on or

ahead of the experimenter-suggested rate of progress. For the

last group of students, points could be earned in two different

ways. First, as with the first group, points were earned for

each completed assignment regardless of when that assignment

was due. Secondly, these students earned extra points if they

completed assignments ahead of the experimenter-designated

rate of progress.

The results indicated that students in the last three

groups completed their assignments at a relatively even pace

throughout the term, while students in group one completed the

majority of their work during the last three weeks of the

course. Also, significantly more of the students assigned to

group one withdrew from the course. Since there was relatively

little difference in the rate of completion between groups two,








three, and four, Semb et al. recommend that instructors

adopt a pacing contingency that reinforces desired rates of

progress rather than one that punishes undesired rates of

progress.

Glick and Semb (1978) concluded that approximately one-

half of students enrolled in a self-paced course will complete

the course at a satisfactory pace. For the remaining stu-

dents, the establishment of instructor-imposed pacing contin-

gencies seems to effectively control their behavior. Hursh

(1976) concluded that the evidence favoring pacing contingen-

cies was overwhelming. It seems apparent that the establish-

ment of flexible pacing contingencies may be an appropriate

compromise between the desire to allow students as much

freedom as possible to schedule their work and the experimen-

tal findings that approximately half of all students cannot

adequately handle that freedom.


Mastery Criteria


Most instructors who use the PSI method require that

students achieve a minimum criterion on examinations before

they are permitted to proceed to the next unit of material.

The level of this criterion normally ranges from 80 to 100%.

Several studies have been conducted to determine the effects

of various criteria on academic performance (e.g., Johnston

and O'Neill, 1973; Davis, 1975; and Semb, 1974).

Johnston and O'Neill (1973) demonstrated that student

test performance matched the prevalent criterion present at












the time. Three different criteria were used in the study,

and three groups of subjects tested under each of the criteria

for at least part of the course. This result in and of itself

is of course not unexpected. It seems obvious that if stu-

dents are required to meet or exceed a certain mastery cri-

terion before they are allowed to proceed to the next unit of

material, they will meet that criterion or drop out of the

course. However, the interesting finding in this study was

the fact that the one group of students who were not required

to meet a stated criterion failed to match even the low

criterion used with the other groups. This finding is tem-

pered by the fact that Johnston and O'Neill used rate correct

and incorrect as their measure of test performance. It is

possible that students in the group that did not meet the

rate specified by the low criterion used with the other groups

did in fact perform as well as the other groups in terms of

accuracy. Rate of responding is a measure that is infre-

quently used in educational settings, and it is likely that

this was the first occasion that these students had contacted

this measure. Thus, even though the student's performance

was converted into rate measures, the student himself may

have been judging the quality of his performance in terms of

percentage correct.

Kulik, Jaksa, and Kulik (1978) point out that mastery

criteria studies which assess only process variables such as

unit test performance do not reveal how changes in mastery

criteria affect other outcome variables such as final exam











scores. Several investigators have assessed whether the

effects of different mastery criteria for individual units

affect final examination scores. They have generally found

that students who were required to meet a relatively high

criterion perform better on a final than students who were

required to meet a relatively low criterion (Bostow and Blu-

menfeld, 1972; Bostow and O'Connor, 1973; and Semb, 1974).

Kulik et al. (1978) conclude that there is sufficient evidence

that high mastery criteria effectively lead to higher levels of

academic performance. Hursh (1976) in his review of mastery

criterion studies also states that the evidence strongly

supports the function of the mastery criterion in personalized

courses. It appears that instructors would be wise to follow

the suggestion of Johnston and O'Neill (1973). They urged

instructors to begin with a relatively high criterion and to

adopt teaching strategies that would allow that criterion to

be increased.


Lectures


Keller (1968) suggested that lectures should be used to

motivate students and not to impart critical information for

which students would be responsible on a test. However, the

evidence concerning whether lectures do in fact motivate

students has not been favorable. Lloyd, Garlington, Lowry,

Burgess, Euler, and Knowlton (1972) assessed the function of

lectures in a personalized course. Lloyd et al. found that

course attendance dropped to less than 50% if there was no











benefit to the student for attending. The benefits that Lloyd

et al. manipulated were giving points to attend lectures and

presenting information relevant to course tests that was not

available elsewhere.

Keller (1968) reported results similar to those of Lloyd et al. in a

study in which lectures presented in the course were optional

and were to be attended only by those students who had mas-

tered the prerequisite units. Keller reported that approxi-

mately one-half of those students eligible to attend any given

lecture actually attended.

Despite the fact that the evidence concerning lectures

reveals that students do not attend in great numbers unless

attendance is related to their grade, many students complain

about the lack of contact with the instructor. Though the

establishment and maintenance of a Personalized System of

Instruction is extremely time consuming on the part of the

instructor and staff, many students are not aware of this

fact. Instructors should be cognizant of student attitudes

toward personal interaction and adopt strategies that will

increase student-faculty contact time in ways other than

the standard lecture.


Study Objectives


The fourth principle explicated by Keller is that com-

munication between students and instructors should be by the

written word. One of the forms of written communication

that instructors provide in most PSI courses is study objectives.









Study objectives are intended to inform the student of what he

will be responsible for on an examination. Students, therefore,

should be able, with the help of these objectives, to focus

their study time on only that material which the instructor

has designated as important. The type and quality of study

objectives vary widely (Hursh, 1976). Study objectives may be

stated in terms of multiple choice, fill-in-the-blank, or

short essay questions.

Many of the research studies that examined the effective-

ness of study objectives have used tests partially composed

of the exact study questions provided to the student (Semb,

1975; and Semb, Hopkins, and Hursh, 1973). The results of

these two studies indicated that students performed signifi-

cantly better on test items that appeared as study questions

than on test items they had not previously seen. These

findings would seem to be predictable and obvious. They

would be of use only if the instructor's objective was to

teach the answers to a given set of questions.

Research concerning study objectives is lacking in as-

sessing the broader effects of PSI study objectives on such

things as long term retention, student study behavior, and

stimulus control. Kulik, Jaksa, and Kulik (1978) point out

this deficiency in the PSI research in their review of the

literature concerning study objectives. Instructors who are

employing the PSI method should be aware of this lack of

evidence in making decisions about the use of objectives.









Further research is needed to answer the important questions

concerning study objectives and the PSI method.


Proctoring


Proctors in most courses taught by the PSI method serve

at least three separate functions. These are to evaluate

student answers, to provide tutoring, and to provide social

interactions. The evidence concerning the effectiveness of

proctoring on academic performance has been mixed. Farmer,

Lachter, Blaustein, and Cole (1972) demonstrated that stu-

dents who received at least some proctoring took signifi-

cantly fewer unit attempts and scored significantly better on

a final examination than students who received no tutoring at

all. Students who received at least some tutoring were proc-

tored either 25, 50, 75, or 100 percent of the number of times

that they tested. Since there were no significant differences

between students proctored some of the time and students who

were proctored all of the time, it appears that proctoring

can be intermittent and still be effective. Farmer et al.

argue from these results that the social function of proctor-

ing may have acted to increase student motivation, thereby

producing the improvement in performance. However, they pro-

vided no independent assessment of student motivation to see

if in fact it increased as a function of contact with proctors.

Many other studies have not found the beneficial effects

of proctoring reported by Farmer et al. Kulik, Jaksa, and

Kulik (1978) reviewed five studies which assessed the effects










of proctoring on student academic performance. None of these

five studies demonstrated a positive effect on final examina-

tion performance for proctoring. In fact, one study (Hindman,

1974) found that students who had little interaction with

tutors performed significantly better on a final examination

than students who received more tutor interactions.

Barton and Ascione (1978) compared the academic perfor-

mance of two groups of students, one of which received proc-

toring while the other group did not. The nonproctored

group did not receive tutoring, but did receive immediate

feedback on the correctness of their answers. The results

indicated that the nonproctored students outperformed the

proctored students on four different academic performance

measures. These measures included first attempt performance,

number of attempts per unit, number of units completed, and

the number of A's earned. Barton and Ascione state that the

results of their study indicate that proctors do not need to

engage in rapport building, answering initial questions,

praising, or verbal remediation.

If this finding is generalized to other PSI systems, it

would suggest a great savings in the time and money that is

presently expended in training proctors. However, as Barton

and Ascione point out, there are two reasons why the tutoring

aspect of proctoring should be retained. These reasons include

the fact that many investigators have reported that proctors

benefit from engaging in proctoring and the fact that stu-

dents in the present study stated that they preferred to










interact with proctors and receive proctor-provided verbal

feedback.

Kulik, Jaksa, and Kulik (1978) report in their summary

of the literature concerning proctoring that the evidence

does not support the contention that tutoring is beneficial

to the students being tutored. Of the three functions of

tutoring, the only function that has been consistently sup-

ported is the provision of immediate feedback. Of course,

immediate feedback can be machine-provided, eliminating the

need for a person to perform this task. Kulik, Jaksa, and

Kulik (1978) further point out that most of the studies con-

cerning proctoring have contained serious methodological flaws.

However, it is possible that investigations which are properly

designed may demonstrate a beneficial effect of proctoring.


Unit Size


The specification that the amount of material contained

in a given unit be relatively small was not one of the ori-

ginal components described by Keller (1968). However, the

arrangement of the other components of Keller's system makes

relatively small units necessary. The use of small units

offers several advantages. One of the advantages of small

units is that instructors can test students over a larger

percentage of study objectives than with larger units. A

second advantage is that frequent testing allows instructors

to pinpoint weaknesses in material, study questions, tests,

etc., while there is still time to correct deficiencies










during the term. And finally, small units, which normally

contain a smaller number of objectives, allow tutors to

spend more time on each objective that students have trouble

mastering.

The research evidence concerning unit size indicates

that it is a powerful variable. Kulik, Jaksa, and Kulik (1978)

state that small units and frequent quizzes are more effective

in promoting academic achievement than large units and less

frequent quizzing. Hursh (1976), in his review of the re-

search concerning unit size, states that, while most of the

evidence favors small units, the evidence is not completely

convincing.

For example, Born (1975) compared the performance of

three different groups who were assigned three different sizes

of units. The text material was divided into 27 modules, and

the unit size for the groups was either one, two, or three

modules. The results indicated that there were no differences

between groups on final examination performance, total study

time, or withdrawal rates. However, students who were assigned

small units studied more consistently than students who were

assigned larger units. Also, students who tested over small

units did not spend very long periods of time at any one study

session.

O'Neill, Johnston, Walters, and Rasheed (1975) used a

within-subject design to assess the effects of three different

sizes of units on study behavior and academic performance.

The small units contained 30 pages, the medium units 60 pages,










and the large units 90 pages of material. Two groups of

students progressed through different sequences of unit size

to control for sequence effects.

The results indicated that performance on the initial

attempt on each unit was an inverse function of the size of

the unit, that students delayed testing on the unit longer

for larger units, and that the larger the unit, the more

attempts were required to achieve mastery. They also

found, not surprisingly, that students spent more total time

studying for larger units than smaller units. The research

evidence and the instructional advantages afforded by the use

of small units strongly indicate that instructors should keep

the size of units relatively small.


While the preceding review of the literature concerning

the various components of the PSI method has demonstrated

that many investigators have contributed to a great body of

information concerning these components, there still remain

many unanswered questions. The following two studies address

two such questions. The first study examined the effects on

academic performance and study behavior of the number of op-

portunities to test. The second study examined the effects

on academic performance and study behavior of a limited im-

provement contingency placed on the initial test of each unit.













CHAPTER II
EXPERIMENT ONE


One of the major elements of PSI is the employment of a

unit mastery criterion. Students initially failing to demon-

strate mastery of a given unit of material are given further

opportunities to attain mastery. It is possible that an un-

limited or high number of opportunities to achieve mastery may

encourage the student to make minimal preparation for each

attempt, hoping that it will be sufficient to meet the mastery

criterion. If the student's preparation is insufficient, the

major consequence is only the necessity of taking an additional

test. In contrast, if the number of opportunities to test is

limited, then the consequence of underpreparing for a final

attempt might be a lower grade.

One of the effects of the contingencies engendered by

multiple retests may be to prompt a disadvantageously low

amount of study preparation for each test attempt, resulting

in a greater number of attempts than is necessary and some-

times an even greater total study time on each unit than would

have been needed had it been distributed over fewer attempts.

While limiting the number of opportunities to retake unit

quizzes would produce a lower grade for units not mastered

in the requisite number of attempts, the availability of the

consequence does not mean that it will often be encountered;










in fact, it may have highly desirable effects on academic

performance, study time per unit, total study time in the

course, and number of attempts per unit. The present study

investigated this question by examining the effects on

test performance and study time of two different levels of

maximum test attempts per unit.


Method


Subjects

A total of 62 undergraduate students were subjects in

the present study. The subjects were primarily members of

the freshman and sophomore classes. These students were en-

rolled in a course entitled "Introduction to Psychology."

Only data from subjects who completed the course are included

in the analysis.


Course Content

The required text was Introduction to Contemporary

Psychology by Fantino and Reynolds (1975). The text was

divided into nine units consisting of 40 to 65 pages each.

The units were adjusted for roughly equal difficulty based on

performance and study data obtained from students who com-

pleted previous offerings of the course. The course met four

days a week for an academic quarter in a standard lecture

hall, although attendance at lectures was not required.


Testing and Grading Procedures

The students came to a testing room separate from the










lecture room at a time they had previously arranged and were

given a 20-item written fill-in test. After answering the

items, a student moved to an evaluation room where a proctor

scored the test and tutored the student. Students were per-

mitted to take only one test per day per unit, and a minimum

of one unit had to be completed each week. The tests, based

solely on text material, were generated from an item pool of

approximately 80 items per unit which were computer-generated

(with replacement) into random samples of unique tests for

each student. Proctors were instructed to grade strictly

according to an answer sheet generated by the computer. Stu-

dents were allowed to appeal the grading of their tests by

submitting a written appeal form to the instructor

of the course.

Raw scores on tests were transformed into unit points,

and, finally, course grades according to the schedule in

Table 1.
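As an illustration of the test-generation procedure described above, the
following sketch (in Python; the item pool contents and names are hypothetical,
and no such program is part of the original course materials) draws a unique
20-item test from a pool of roughly 80 items per unit, returning items to the
pool so that they may reappear on later attempts:

    import random

    def generate_test(item_pool, n_items=20, rng=None):
        """Draw one test as a random sample of n_items distinct questions from
        the unit's item pool.  Because every new test is drawn from the full
        pool again ("with replacement" across attempts), items may reappear
        on a student's later attempts at the same unit."""
        rng = rng or random.Random()
        return rng.sample(item_pool, n_items)

    # Hypothetical 80-item pool for one unit; the actual item text is not
    # reproduced in this dissertation.
    unit_pool = ["Unit 3, item %d" % i for i in range(1, 81)]

    first_attempt = generate_test(unit_pool)
    retest = generate_test(unit_pool)  # a retake may repeat earlier items
    print(len(set(first_attempt) & set(retest)), "of 20 items repeated on the retest")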


Experimental Procedures

Upon completion of Unit 1, students were randomly as-

signed to one of two groups which determined whether they would

be allowed a maximum of two or five attempts to meet criterion

on different units. Table 2 shows the sequence in which

the two groups progressed through the course. When the two-

attempt condition was in effect, students were allowed two

opportunities to test on each unit. When the five-attempt

condition was in effect, students were allowed five oppor-

tunities to test on each unit. Students alternated every two

units between the two conditions, and Group 2 proceeded through

the course in the opposite order from Group 1. Following the

completion of each unit, students were notified in writing as

to which condition would be in effect for the next unit.


Table 1. Grading criteria

Number correct on a 20-item test (each range earning a fixed number
of unit points): 18-20, 16-17, 14-15, 12-13, 10-11, 08-09, 06-07,
05 or below

Total unit points (each range corresponding to a course grade):
88-90, 84-87, 76-83, 67-75, 66 or below


Table 2. Sequence of experimental conditions: the maximum number of
test attempts (two or five) allowed to Group 1 and Group 2 on Units
2 and 3, 4 and 5, 6 and 7, and 8 and 9. The groups alternated between
the two conditions every two units, with Group 2 in the opposite
order from Group 1.

Students were also required to complete a Study Recording

Form adapted from one developed by Johnston, O'Neill, Walters,

and Rasheed (1975). Students were taught how to use the form

to record their study behaviors at the beginning of the term.

They were instructed to record study information immediately

following each study episode, and they turned in a completed

form each time they were tested. The information recorded

included total study time, time spent engaging in various

study activities, and the distribution of these times be-

tween test dates. The completed Study Recording Forms were

checked for errors and omissions by a graduate student not

otherwise involved with the course, and students were notified

in writing if errors were discovered in the manner in which

they completed the form.
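The information captured by the Study Recording Form can be represented as a
simple record per study episode, as in the following illustrative sketch
(Python; the field names, dates, and minutes shown are hypothetical and are not
taken from the actual forms):

    from dataclasses import dataclass, field
    from datetime import date

    @dataclass
    class StudyEpisode:
        """One entry on the Study Recording Form: a single study episode for a
        unit, with the time spent on each study activity (in minutes)."""
        unit: int
        day: date
        minutes_by_activity: dict = field(default_factory=dict)

        @property
        def total_minutes(self):
            return sum(self.minutes_by_activity.values())

    # Hypothetical entries for one student on Unit 2.
    episodes = [
        StudyEpisode(unit=2, day=date(1979, 10, 3),
                     minutes_by_activity={"reading": 50, "reviewing notes": 20}),
        StudyEpisode(unit=2, day=date(1979, 10, 5),
                     minutes_by_activity={"study questions": 35}),
    ]
    print(sum(e.total_minutes for e in episodes),
          "minutes reported before the first test on Unit 2")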


Results


Since the experimental conditions were not imposed until

Unit 2, data from Unit 1 are excluded from the analysis.

Figure 1 shows the mean percent correct for all students in

each group on initial attempts on Units 2 through 9. These

mean scores were higher on all units when the number of

attempts was restricted to two rather than five, the differences
























[Figure 1. Mean percent correct on first attempts on Units 2 through 9 for Group 1 and Group 2 under the two-attempt and five-attempt conditions (percent correct plotted by unit).]

[Figure 2. Mean reported total study time, in minutes, for the first attempt on each unit for Group 1 and Group 2 under the two conditions (minutes of study time plotted by unit).]

[Figure 3. Mean number of attempts per unit for Group 1 and Group 2 under the two conditions (number of attempts plotted by unit).]



ranging from 5.3% higher on Unit 8 to 12.9% higher on Unit 7.

The mean difference across all units for both groups between

the two conditions was 8.1%. An analysis of performance by

individual subjects was conducted by separately combining

the scores for the four first attempts taken in each of the

two conditions. Each subject's total initial attempt score

in one condition was then compared with his total for the

other condition. This analysis revealed that 47 subjects

performed better and 13 subjects performed worse on initial

attempts when testing under the two-attempt condition. No

differences were found for the remaining two subjects.
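The individual-subject analysis described above amounts to summing each
student's four first-attempt scores under each condition and comparing the two
totals. A minimal sketch of that tally follows; the student identifiers and
scores are hypothetical (the obtained first-attempt scores appear in
Appendix A):

    # Hypothetical first-attempt scores (percent correct) for two students on
    # the four units completed under each condition.
    first_attempts = {
        "S01": {"two_attempt": [85, 90, 80, 75], "five_attempt": [70, 80, 75, 65]},
        "S02": {"two_attempt": [60, 70, 65, 72], "five_attempt": [68, 74, 66, 70]},
    }

    better = worse = tied = 0
    for scores in first_attempts.values():
        diff = sum(scores["two_attempt"]) - sum(scores["five_attempt"])
        if diff > 0:
            better += 1   # higher first-attempt total under the two-attempt condition
        elif diff < 0:
            worse += 1
        else:
            tied += 1
    print(better, worse, tied)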

The means for each group of reported total study time

for the first attempt on each unit are depicted in Figure 2.

These means represent the reported first attempt total study

times for 24 students in Group 1 and 23 students in Group 2.

Nineteen subjects were excluded from study behavior analysis

for one or more of the following reasons: an admission by

the student of at least occasional dishonesty in filling out

the Study Recording Form; failure to turn in two or more

forms; or return of forms that consistently contained errors.

The mean reported total study time was higher for both groups

when the two-attempt limit was in effect than when the five-

attempt limit was in effect. The mean difference in reported

study time across all units for both groups between the two

conditions was 69.9 minutes.

Figure 3 depicts the mean number of attempts-per-unit for

each group. The number of attempts taken under the five-attempt








condition was higher than those taken in the two-attempt

condition for all units. The differences ranged from 0.3

attempts in Unit 5 to 1.1 attempts in Unit 7. The mean

difference between the two conditions across both groups

was 0.7 attempts.

A Friedman two-way analysis of variance by ranks for

correlated samples (Friedman, 1937) was performed on the means

of each of the above three dependent variables: initial test

performance, initial test study time, and number of attempts.

The test revealed significant differences for all three

variables (df = 1, p < .01).
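For reference, the Friedman statistic for k correlated conditions ranked within
each of n blocks (here, the units) is chi-square = [12 / (n k (k + 1))] times
the sum over conditions of the squared rank sums, minus 3 n (k + 1), with k - 1
degrees of freedom. The sketch below computes it from unit-by-unit condition
means; it is offered only as an illustration of the test, and the data values
shown are hypothetical rather than the obtained means:

    import numpy as np
    from scipy.stats import rankdata, chi2

    def friedman_test(data):
        """Friedman two-way analysis of variance by ranks.
        data: an (n blocks x k treatments) array -- here one row per unit and
        one column per experimental condition.  Returns (chi-square, df, p)."""
        data = np.asarray(data, dtype=float)
        n, k = data.shape
        ranks = np.apply_along_axis(rankdata, 1, data)  # rank conditions within each unit
        rank_sums = ranks.sum(axis=0)
        chi_sq = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)
        return chi_sq, k - 1, chi2.sf(chi_sq, k - 1)

    # Hypothetical mean first-attempt scores on Units 2-9 under the
    # two-attempt and five-attempt conditions (columns).
    unit_means = [[83, 75], [80, 71], [78, 70], [84, 79],
                  [82, 73], [85, 72], [81, 76], [79, 74]]
    print(friedman_test(unit_means))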

Finally, the mean of the best attempts on each unit for

tests taken under the two-attempt condition was 83.3% and

under the five-attempt condition, it was 85.8%. Mean total

reported study time for all attempts on each unit was slightly

higher for units completed under the two-attempt condition

on six of the eight units. The mean reported total study

time for all units taken under the two-attempt condition for

both groups was 472.63 minutes, and under the five-attempt

condition, 445.75 minutes.

Discussion

The present study demonstrated that, when limited in the

number of attempts allowed on each unit, students performed

at a higher level on initial tests than when allowed a larger

number of tests on each unit. Performance on the initial

tests on each unit when limited to two attempts was 8%










higher than that of the five-attempt condition. Reported mean

time spent studying for the initial attempt on a unit was also

more than an hour longer when students were limited to two at-

tempts rather than five attempts per unit. This supports the

contention that when students are given a practically unlimited

or very high number of test opportunities, they engage in less

preparation for initial tests than when the number of retests

is more limited. As a result, when allowed five attempts,

students took 45% more tests on each unit than when limited

to two tests per unit. However, the greater amount of study

and higher score on first attempts contributed to essentially

the same final level of performance on all units for students

performing under the two-attempt condition compared with the

performance under the five-attempt condition. These con-

clusions are warranted by the clear and consistent group ef-

fects generated by the within-subject manipulations.

While the mean reported total study time for all tests

on a unit was higher for the two-attempt condition than the

five-attempt condition, this may be attributed to the composi-

tion of retests with replacement employed in the course. As

further attempts were taken on a given unit, the percentage

of previously seen questions on a given test obviously in-

creased. Since students completing units under the five-

attempt condition took many more attempts to complete a given

unit, the test completed on a later attempt was in some sense

an easier test than the initial test. If all tests had been










constructed with new items, it is likely that the five-attempt

condition would have produced greater total study time than

the two-attempt condition.

This study has shown that the number of opportunities

permitted for retesting on each unit is a variable which has

unambiguous and reliable effects on both study behavior and

academic performance. A high or practically unlimited number

of opportunities for testing on a unit embodies contingencies

which have the effect of producing less initial study on unit

material, resulting in poorer initial performance, and a

greater number of tests on each unit than may be necessary.

When this is the case, it exerts costs on students and in-

structional resources. In this particular course, if the two-

tests-per-unit condition had been in effect for the entire

term, the data would predict that approximately 405 fewer tests

would have been taken than if the five-tests-per-unit condition

had been in effect.

Of course, restricting the number of retakes may increase

the number of students who fail to reach a desired mastery

level. Obviously, the number of opportunities for retest

interacts with such variables as the difficulty of materials

and tests, unit size, grading criteria, academic skills of the

students, etc. Instructional technologists should consider

the demonstrated effects of opportunity for retest in relation

to the above variables when designing instructional procedures.














CHAPTER III
EXPERIMENT TWO


The previous study demonstrated that the opportunity for

retest is clearly a major variable affecting student perfor-

mance. Students, when restricted to only one retake, scored

higher on their first test on a unit than when they were

allowed up to four retakes. The problem with restricting

retakes is that it makes the use of a mastery requirement

impossible. However, other course arrangements might also

produce an equivalent improvement in study time and initial

test performance without significantly decreasing the num-

ber of final unit scores that are at a mastery level.

Marholin (1976) assessed the effects of one such possible

arrangement by implementing a minimal grade penalty on the

initial test on a unit. When this minimal grade penalty was

in effect, a student's unit score was determined by a com-

bination of the student's scores on two attempts. The first

attempt contributed 30% and the second attempt contributed

70% toward the final unit score. Student performance on

initial tests was higher on units in which this minimal grade

penalty was in effect, and the total number of tests taken

under the minimal grade penalty condition was lower. However,

the generality of these findings is limited by the testing

arrangement employed in the course in that the second test on









each unit consisted solely of items that the student had not

passed on the first test. Since the second test consisted

solely of questions drawn from the first test, this arrange-

ment would seem to encourage minimal study effort for the

first test on a unit. Therefore, studying for the second

test might be restricted to the learning of material specific

to the answering of missed questions. The minimal grade

penalty should have decreased this strategy, but Marholin

did not measure study behavior in order to see if this was

the case. It appears that testing contingencies such as

Marholin employed would produce different patterns of study

and performance than a testing arrangement in which the cor-

relation between initial tests and subsequent tests of iden-

tical questions was less than one.

The present study examined the effects of a contingency

designed to increase performance on the first attempt at

each unit test. Students were limited in the amount that

they could increase their unit scores on specified units

above their initial test performance in an effort to deter-

mine if such a contingency would increase study time and thus

generate higher initial test scores.


Method


Subjects

A total of 74 undergraduate students enrolled in an in-

troductory psychology class were subjects in this study. The

subjects were primarily members of the freshman and sophomore









classes. Only data from subjects who completed the course

are included in the analysis.


Course Content

The required text was Introduction to Contemporary

Psychology by Fantino and Reynolds (1975). The text was

divided into nine units consisting of 40 to 65 pages each.

The units were adjusted for roughly equal difficulty based on

performance and study data obtained from students who had

completed previous offerings of the course. The course met

four days a week for an academic quarter in a standard lec-

ture hall, although attendance at lectures was not required.


Testing and Grading Procedures

The students came to the testing room, which was in a

different location than the lecture room, at a time they

had previously arranged and were given a 20-item fill-in test.

After answering the items, they moved to an evaluation room

where a proctor scored the test and tutored the student. Stu-

dents were permitted to take only one test per day per unit,

and a minimum of one unit had to be completed each week. The

20-item fill-in-the-blank tests of text material were generated

from an item pool of approximately 80 items per unit, which

were computer-generated (with replacement) into random samples

of unique tests for each individual. Proctors were instructed

to grade strictly according to an answer sheet generated by

the computer. Students were allowed to appeal the grading of

their tests by registering a written form which was submitted










to the instructor of the course. Raw scores on tests were

transformed into unit points, and finally, grades according

to the schedule in Table 3.


Experimental Procedures

Upon completion of Unit 1, students were randomly assigned

to one of two groups. Table 4 shows the sequence in which

the two groups progressed through the course. Under the

limited improvement condition, students were allowed through

retesting to increase their final unit score by only two

points over that obtained on the initial test for that unit.

Therefore, in order to be able to obtain the maximum number

of unit points (10), a student had to achieve a minimum of

eight unit points (70% correct) on the initial test on that

unit. When this contingency was not in effect, the student's

unit score was determined by the highest score on any attempt

regardless of the performance on the initial attempt. This

condition is referred to as the highest score condition in

Table 4.
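The two scoring rules can be stated as a single function of a student's
attempt scores. The sketch below is only an illustration of the contingency
as described above (the function name and example scores are hypothetical, not
part of the course software): under the limited improvement condition the unit
score may exceed the initial-attempt score by at most two unit points, while
under the highest score condition it is simply the best score obtained on any
attempt.

    def unit_score(attempt_scores, limited_improvement):
        """Final unit score (in unit points) computed from a student's one to
        three attempt scores on a unit."""
        best = max(attempt_scores)
        if limited_improvement:
            # Retesting can raise the unit score at most two points above
            # the score earned on the initial attempt.
            return min(best, attempt_scores[0] + 2)
        return best  # highest score condition: best attempt counts

    print(unit_score([7, 10], limited_improvement=True))   # 9: capped at 7 + 2
    print(unit_score([7, 10], limited_improvement=False))  # 10: highest score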

Students were allowed a maximum of three attempts on

each unit regardless of condition. Analysis of student per-

formance in previous offerings of the course indicated that a

maximum of three tests per unit was adequate to achieve a

score of 90% or better if the initial test score was at or

above 70%. Upon completion of each unit, students were

notified in writing as to which condition would be in effect

for the next unit.

















Table 3. Grading criteria

Number correct on a 20-item test (each range earning a fixed number
of unit points): 18-20, 16-17, 14-15, 12-13, 10-11, 08-09, 06-07,
05 or below

Total unit points (each range corresponding to a course grade):
88-90, 84-87, 76-83, 67-75, 66 or below


Table 4. Sequence of experimental conditions

Units        Contingency, Group 1      Contingency, Group 2
2 and 3      Highest Score             Limited Improvement
4 and 5      Limited Improvement       Highest Score
6 and 7      Highest Score             Limited Improvement
8 and 9      Limited Improvement       Highest Score








Students were required to complete a Study Recording Form

adapted from one developed by Johnston, O'Neill, Walters, and

Rasheed (1975). Students were instructed to record their study

behaviors on the form immediately following each study episode,

and they turned in the completed form each time they were

tested. Information recorded included total study time, time

spent engaging in various study activities, and the distribu-

tion of these times between tests. The completed Study Re-

cording Forms were checked for errors and omissions by a

graduate student not otherwise involved with the course. Stu-

dents were notified in writing if errors were discovered on

their completed forms.


Results


Since the experimental conditions were not imposed until

Unit 2, data from Unit 1 are excluded from analysis. Figure 4

depicts the mean percent correct for the first attempt on

Units 2 through 9 for both groups and shows that the initial

test scores were consistently higher when the limited improve-

ment condition was in effect. A Friedman two-way analysis of

variance by ranks for correlated samples (Friedman, 1937) was

performed on the means of the initial test performance. The

test revealed that initial test performance was significantly

better under the limited improvement condition (df = 1, p< .01).

The mean of the differences between the two conditions for

initial attempts on all units was 9.6%. The differences ranged

from 4.0% on Unit 4 to 13.9% on Unit 3. The percentage of


































[Figure 4. Mean percent correct on the first attempt on Units 2 through 9 for Group 1 and Group 2 under the limited improvement and highest score conditions (percent correct plotted by unit).]

[Figure 5. Mean reported total study time, in minutes, for the first test on each unit for Group 1 and Group 2 under the two conditions (minutes of study time plotted by unit).]

[Figure 6. Mean number of attempts per unit for Group 1 and Group 2 under the two conditions (number of attempts plotted by unit).]

initial attempts at or above 70% was 84.8% when the limited

improvement condition was in effect and 62.2% when the highest

score condition was in effect. An analysis of performance by individual

subjects was conducted by separately combining the scores for

the four first attempts taken in each of the two conditions.

Each subject's total initial attempt score in one condition

was then compared with his total for the other condition.

This analysis revealed that 58 subjects performed better and

fourteen subjects performed worse on initial attempts when

testing under the limited improvement condition. No dif-

ferences were found for the remaining two subjects.

Figure 5 presents the mean reported total study time for

the first test on each of the eight units. These means repre-

sent the study time for 28 students in Group 1 and 29 students

in Group 2. Seventeen subjects were excluded from study be-

havior analysis for one or more of the following reasons: an

admission by the student of at least occasional dishonesty in

filling out the Study Recording Forms; failure to turn in two

or more forms; or submitting forms that consistently contained

errors. The mean reported study times were higher for initial

tests taken under the limited improvement condition on all

units except Unit 4. A Friedman two-way analysis of variance

by ranks for correlated samples (Friedman, 1937) on the means

revealed that reported study time for initial tests under the

limited improvement condition was significantly greater than

when this condition was not in effect (df = 1, p< .05). The

mean difference between the two conditions in reported study

time for initial attempts on all units was 79.46 minutes.










The mean number of attempts per unit by each group is

reported in Figure 6. The number of attempts per unit taken

under the limited improvement condition was consistently smaller than the number taken in the absence of that condition. The

mean difference between the two conditions for all units

was 0.3 attempts per unit. The means of the two conditions

were compared using the Friedman two-way analysis of variance

by ranks for correlated samples (Friedman, 1937) test. This

analysis revealed that the total number of attempts per unit

taken under the limited improvement condition was signifi-

cantly fewer than when this condition was not in effect

(df = 1, p< .01).
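
For readers unfamiliar with the statistic used throughout these analyses, the following is a minimal sketch, with hypothetical per-unit means, of how the Friedman chi-square can be computed for two correlated conditions measured across several units; with only two conditions the statistic reduces to a comparison of rank sums.

    # A minimal sketch of the Friedman two-way analysis of variance by ranks
    # (Friedman, 1937) for k = 2 correlated conditions measured over n units.
    # The per-unit means below are hypothetical, not the values obtained here.
    def friedman_chi_square(condition_a, condition_b):
        n, k = len(condition_a), 2
        rank_a = rank_b = 0.0
        for a, b in zip(condition_a, condition_b):
            if a > b:                     # rank 2 to the larger value, rank 1 to the smaller
                rank_a, rank_b = rank_a + 2, rank_b + 1
            elif b > a:
                rank_a, rank_b = rank_a + 1, rank_b + 2
            else:                         # ties share the mean rank
                rank_a, rank_b = rank_a + 1.5, rank_b + 1.5
        # chi-square statistic with df = k - 1 = 1
        return (12.0 / (n * k * (k + 1))) * (rank_a ** 2 + rank_b ** 2) - 3 * n * (k + 1)

    limited_improvement = [78.0, 74.5, 80.1, 76.3]    # hypothetical per-unit means
    highest_score       = [68.2, 70.5, 69.0, 66.4]
    print(friedman_chi_square(limited_improvement, highest_score))   # 4.0 > 3.84, so p < .05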

The mean of the best attempts was 86% correct under the

limited improvement condition and 86.2% when this condition

was not in effect. There was also little difference between

the two conditions in the total amount of study time for all

attempts. The mean total reported study time for all units

taken under the limited improvement condition was 523.6

minutes per student per unit and 501.6 minutes per student per unit under the highest score condition.


Discussion


The present study clearly demonstrated that the limited

improvement contingency produced consistently higher first

test performance. This finding confirms the results reported

by Marholin (1976) and greatly extends their generality. Mean

reported time spent studying for initial tests was also higher












when students completed tests under the limited improvement

contingency than in the absence of that contingency (with the

exception of Unit 4, for which the difference in initial test

performance between the two conditions was also the smallest

of all the units).

The absence of a difference in mean best test scores be-

tween the two conditions is not surprising considering the

powerful effects on performance of a specified high criterion

for final grades (Johnston and O'Neill, 1973). The criterion

for a grade of "A" employed in this study corresponded closely

to a mean percentage of 90% for all units, and it is highly

probable that the effects of the "A" criterion minimized any

difference on final test performance that the two conditions

might have otherwise prompted due to differences in first

attempt study and performance. A second explanation of this

lack of a best score difference is that variation in study

times for first-attempt tests was compensated for by study times

for subsequent tests on each unit (mean total study times for

all units being quite similar for both conditions). Third,

if the test composition procedure had not employed replacement,

a difference between final test scores might have emerged.

Because each additional test on a unit included a large proportion of items already seen, the larger number of attempts

taken when the highest score contingency was in effect resulted

in a greater number of "easier" tests, thereby probably in-

flating the final unit scores in this condition.
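
This third explanation can be made concrete with a small sketch; the pool size and test length below are assumed for illustration only and are not the values used in the course. When two n-item tests are drawn independently from a pool of P items, they share n²/P items on average, so overlap grows quickly as the pool shrinks relative to the test length.

    # Illustration (with assumed numbers) of how item reuse grows as the item
    # pool shrinks relative to the test length: two n-item tests drawn
    # independently from a P-item pool share n*n/P items on average.
    import random

    def expected_overlap(pool_size, test_length):
        return test_length * test_length / pool_size

    def simulated_overlap(pool_size, test_length, trials=10000):
        pool = range(pool_size)
        shared = 0
        for _ in range(trials):
            first = set(random.sample(pool, test_length))
            second = set(random.sample(pool, test_length))
            shared += len(first & second)
        return shared / trials

    for pool_size in (60, 120, 400):      # hypothetical pool sizes for a 20-item test
        print(pool_size,
              round(expected_overlap(pool_size, 20), 2),
              round(simulated_overlap(pool_size, 20), 2))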










Similarly, testing with replacement may have contributed

to the absence of large differences in total study time for

all attempts between the two conditions. By taking more

second and third attempts, students testing in the highest

score condition were preparing for "easier" tests. If the

level of difficulty had been equivalent for all attempts, it is

likely that study time would have had to be increased above

the obtained amount. Therefore, if all attempts were equal

in difficulty, it would be hypothesized that total study time

for a given unit would be lower when a limited improvement

contingency was in effect.

While the mean difference between the number of tests

taken under the limited improvement and unlimited improvement

conditions may seem relatively small (0.3), it does have

considerable practical significance. If this difference were projected to two classes, the students in the class not em-

ploying the limited improvement condition would take approxi-

mately 18% more tests than students in a class employing a

limited improvement condition. Students in our courses fre-

quently state on course evaluations that they feel that they

take too many tests. This feeling is likely to be evident in

courses which use the PSI method of instruction. Adoption of

a limited improvement contingency, which should lower the

number of tests taken, would maintain many of the benefits

of PSI while somewhat decreasing this negative feeling.
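
The 18% projection can be verified with a small amount of arithmetic; the baseline of roughly 1.65 attempts per unit assumed below is an illustrative figure rather than a value reported above.

    \[ \frac{1.65 + 0.3}{1.65} \;=\; \frac{1.95}{1.65} \;\approx\; 1.18 \]

that is, approximately 18% more tests per unit.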

In order for students to be eligible to earn the maximum

number of unit points when testing under the limited improvement










condition, they were required to achieve at least eight unit

points (70%) on the initial test on a unit. In the present

study, less than 15% of the total number of initial tests

taken under this condition fell below 70%. Almost all of

these scores below 70% were made by students who received

a "C" or less in the course. Also, over one-half of these

initial test scores below 70% were made on the last day that

testing was allowed on a unit. Students, by waiting until the

final day available to test on a unit, had therefore for-

feited their opportunities for retests. There were only nine

instances where students scored higher on subsequent retakes

than the two-point limit on improvement would allow. Thus, it appears that an empirically established limited improvement contingency can produce higher initial test scores without artificially limiting final test performance.

Many, if not most, PSI courses allow multiple test op-

portunities for each unit of material. In many PSI offerings,

the number of retakes allowed is either limitless or func-

tionally limitless. Subsequent retakes in courses that offer

large numbers of attempts almost necessarily contain ques-

tions that appeared on earlier tests. This is particularly

true when the ratio of the item pool size to the number of

questions on each test is relatively small. Many students

confronted by such a testing arrangement may adopt a strategy

of attempting to "beat the system." Such a strategy would

involve attempting to learn as many of the specific test items as possible while expending minimal effort










studying. The utilization of a limited improvement contingency

should greatly reduce the degree to which students would employ

such a strategy.

It must be emphasized that any limited improvement con-

tingency should be empirically established. The level of the

limited improvement contingency used in the present study

was established by examining study and test performance records

of students who had completed previous offerings of the course.

The level of such a contingency might vary depending upon

several other components of a course. These components would

include the desired level of mastery, the number of available

retakes, the difficulty of the materials, the type and dif-

ficulty of the test items, the size of the units, and the

educational history of the students, among others.

In Experiment One, approximately 25% of the students did

not perform as predicted on initial attempts. These students

scored higher on initial attempts when they were allowed four

retakes rather than when restricted to one retake. In Ex-

periment Two, approximately 22% of the students also did not

perform as predicted on the same measure. An important ques-

tion to consider is whether these students differ in some

important way from the students who did perform as predicted.

One possible difference is that the groups may differ in

academic ability. In order to test this possibility, means

were calculated for all initial attempts separately for those

who performed according to the experimental hypothesis and

those who did not. This analysis revealed for both experiments









that students who did not perform in the predicted direction

scored higher on initial attempts than those who did. These

differences were approximately 13% in Experiment One and 6%

in Experiment Two.

This finding suggests that students who normally per-

form at a high level may not be affected by offering them

multiple retests or placing a limit on the amount by which they can improve their unit scores. Of course, this finding is

tempered by the fact that no other measures of student

abilities were obtained in these studies and the fact that

many students who also performed at the same high level did

behave on initial unit tests as predicted. Further research

should be conducted which examines students' entering skills

in relation to variables similar to those in the present

studies.

This finding also leads one to question whether the

results of these studies are wholly generalizable to all PSI-

taught classes. In many courses taught by the PSI method,

students usually demonstrate mastery of a particular unit of

material in one or two attempts. Once students learn that

they can demonstrate mastery by studying at a moderate level,

contingencies such as those employed in the present studies

may not be as effective. In the present studies, mean per-

formance on initial attempts rarely exceeded 80% even when

the contingencies which encouraged maximum performance were

in effect. Students also reported that they spent a consi-

derable amount of time studying for these attempts. It is









possible that students enrolled in a class where "A" level

performance is rarely achieved on first attempts even after

considerable study may be more susceptible to the contingencies

employed in the present studies than students enrolled in classes

where "A" level performance is more readily achieved on first

attempts. Therefore, future studies should be directed toward

this possibility by examining various unit difficulty levels

in conjunction with contingencies similar to those employed in

these studies.

A limit on the number of retakes or on the amount by which a score can be improved over a first-attempt score makes the employment of a mastery criterion impossible. Since mastery

criteria have been demonstrated to be a valuable component

in PSI systems, other contingencies may be employed in con-

junction with those investigated in the present studies which

maintain mastery as a course component. For example, instead

of placing a limit on the number of attempts that a student

may take on a given unit, the limit could be placed on the

number of attempts that would count toward the student's grade.

Then if a student failed to demonstrate mastery during those

limited attempts, he would be required to continue testing

until he did. A similar sort of course arrangement could also

be used with a limited improvement contingency. If such a

contingency were established judiciously, most students would

seldom find themselves in the situation of taking tests which

do not count toward their grades. Of course, if a student

was required to take tests which did not count toward his








grade on several occasions, he may develop a negative attitude

toward the course. It is strongly recommended that course

designers take into account student ability levels, unit

difficulty levels, and student performance before establishing

such contingencies.
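
A minimal sketch of the alternative arrangement just described follows; the point values, attempt limit, and mastery criterion are assumed for illustration and are not the rules of the present courses.

    # A minimal sketch (with assumed point values and limits, not the course's
    # actual rules) of the alternative arrangement described above: only the
    # first few attempts on a unit count toward the grade, but the student
    # continues testing until the mastery criterion has been met.
    def unit_outcome(attempt_scores, counted_attempts=2, mastery=90):
        """Return (points counted toward the grade, whether mastery was shown)."""
        counted = attempt_scores[:counted_attempts]       # only these affect the grade
        points = max(counted) if counted else 0
        mastered = any(score >= mastery for score in attempt_scores)
        return points, mastered

    # A student reaching mastery only on a third attempt keeps the grade earned
    # on the first two attempts but still satisfies the mastery requirement.
    print(unit_outcome([75, 85, 95]))    # -> (85, True)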
















REFERENCES


Barton, E. J., and Ascione, F. R. The proctoring component
of personalized instruction: A help or hindrance?
Journal of Personalized Instruction, 1978, 3, 15-22.

Bitgood, S. C., and Segrave, K. A comparison of graduated
and fixed point systems of contingency managed instruc-
tion. In J. Johnston (Ed.), Behavior research and tech-
nology in higher education. Springfield, Ill.: Charles
C. Thomas, 1975.

Born, D. G. Exam performance and study behavior as a function
of study unit size. In J. Johnston (Ed.), Behavior re-
search and technology in higher education. Springfield,
Ill.: Charles C. Thomas, 1975.

Born, D. G., Gledhill, S. N., and Davis, M. L. Examination
performance in lecture-discussion and personalized in-
struction courses. Journal of Applied Behavior Analysis,
1972, 5, 33-44.

Bostow, D. E., and Blumenfeld, G. J. The effect of two test-
retest procedures on the classroom performance of under-
graduate college students. In G. Semb (Ed.), Behavior
analysis and education. Lawrence, Ks.: University of
Kansas Support and Development Center for Follow Through,
Department of Human Development, 1972.

Bostow, D. E., and O'Connor, R. J. A comparison of two
college classroom testing procedures: Required remedia-
tion versus no remediation. Journal of Applied Behavior
Analysis, 1973, 6, 599-607.

Cole, C., Martin, S., and Vincent, J. A. Comparison of two
teaching formats at the college level. In J. Johnston
(Ed.), Behavior research and technology in higher educa-
tion. Springfield, Ill.: Charles C. Thomas, 1975.

Davis, M. L. Mastery test proficiency requirement affects
mastery test performance. In J. Johnston (Ed.), Behavior
research and technology in higher education. Springfield,
Ill.: Charles C. Thomas, 1975.

Fantino, E., and Reynolds, G. S. Introduction to contemporary
psychology. San Francisco: W. H. Freeman and Company,
1975.










Farmer, J., Lachter, G. D., Blaustein, J. J., and Cole, B. K.
The role of proctoring in personalized instruction.
Journal of Applied Behavior Analysis, 1972, 5, 401-405.

Friedman, M. The use of ranks to avoid the assumption of
normality implicit in the analysis of variance. Journal
of the American Statistical Association, 1937, 32, 675-
701.

Glick, D. M., and Semb, G. Effects of pacing contingencies in
personalized instruction: A review of the evidence.
Journal of Personalized Instruction, 1978, 3, 36-42.

Hindman, C. D. Evaluation of three programming techniques in
introductory psychology courses. In R. S. Ruskin and
S. F. Bono (Eds.), Proceedings of the First National
Conference. Washington, D. C.: Center for Personalized
Instruction, 1974.

Hursh, D. E. Personalized systems of instruction: What do
the data indicate? Journal of Personalized Instruction,
1976, 1, 91-105.

Johnson, K. R., and Sulzer-Azaroff, B. The effects of dif-
ferent proctoring systems upon student examination per-
formance and preference. In J. Johnston and G. W. O'Neill
(Eds.), Research and technology in college and university
teaching. Atlanta, Ga.: Georgia State University Urban
Life Center, 1975.

Johnston, J. M., and O'Neill, G. W. The analysis of perfor-
mance criteria defining course grades as a determinant
of college student academic performance. Journal of
Applied Behavior Analysis, 1973, 6, 261-268.

Johnston, J. M., O'Neill, G. W., Walters, W. M., and Rasheed,
J. A. The measurement analysis of college student study
behavior: Tactics for research. In J. Johnston (Ed.),
Behavior research and technology in higher education.
Springfield, Ill.: Charles C. Thomas, 1975.

Johnston, J. M., and Pennypacker, H. S. A behavioral approach
to college teaching. American Psychologist, 1971, 26,
214-244.

Keller, F. S. "Good-bye, teacher . . ." Journal of Applied
Behavior Analysis, 1968, 1, 79-89.

Kulik, J. A., Jaksa, P., and Kulik, C. C. Research on component
features of Keller's personalized system of instruction.
Journal of Personalized Instruction, 1978, 3, 2-14.










Lloyd, K. E., Garlington, W. K., Lowry, D., Burgess, H.,
Euler, H. A., and Knowlton, W. R. A note on some rein-
forcing properties of university lectures. Journal of
Applied Behavior Analysis, 1972, 5, 151-156.

Marholin, D. The effect of a minimal grade penalty on mas-
tery quiz performance in a modified PSI course. Journal
of Personalized Instruction, 1976, 1, 80-85.

McMichael, J. S., and Corey, J. R. Contingency management
in an introductory psychology course produces better
learning. Journal of Applied Behavior Analysis, 1969,
2, 79-83.

Miller, L. K., Weaver, F. H., and Semb, G. A procedure for
maintaining student progress in a personalized univer-
sity course. Journal of Applied Behavior Analysis, 1974,
7, 87-91.

O'Neill, G. W., and Johnston, J. M. An analysis of test item
types as a determinant of student academic performance
and study behavior. Journal of Personalized Instruction,
1976, 1, 123-127.

O'Neill, G. W., Johnston, J. M., Walters, W. M., and Rasheed,
J. A. The effects of quantity of assigned material on
college student academic performance and study behavior.
In J. Johnston (Ed.), Behavior research and technology in
higher education. Springfield, Ill.: Charles C. Thomas,
1975.

Robin, A. L. Behavioral instruction in the college classroom.
Review of Educational Research, 1976, 46, 313-354.

Semb, G. The effects of mastery criteria and assignment length
on college student test performance. Journal of Applied
Behavior Analysis, 1974, 7, 61-69.

Semb, G. An analysis of the effects of hour exams and student-
answered study questions on test performance. In J.
Johnston (Ed.), Behavior research and technology in higher
education. Springfield, Ill.: Charles C. Thomas, 1975.

Semb, G., Conyers, D., Spencer, R., and Sanchez-Sosa, J. J.
An experimental comparison of four pacing contingencies.
In J. Johnston (Ed.), Behavior research and technology
in higher education. Springfield, Ill.: Charles C.
Thomas, 1975.

Semb, G., Hopkins, B. L., and Hursh, D. E. The effects of
study questions and grades on student test performance
in a college class. Journal of Applied Behavior Analysis,
1973, 6, 631-643.








Sherman, J. G. PSI: Current implications. In R. S. Ruskin
(Ed.), An evaluative review of the personalized system
of instruction. Washington, D. C.: Center for Person-
alized Instruction, 1976.














APPENDIX A
EXPERIMENT ONE
PERCENT CORRECT ON FIRST ATTEMPTS ON EACH UNIT BY STUDENT

Group 1


Unit


Student


45
95
80
75
80
50
75
90
45
50
75
100
90
80
45
30
65
90
50
65
65
75
90
55
50
65
80
70
45
45
100
40


75
90
60
85
80
80
85
100
30
55
70
95
80
70
95
80
80
95
70
80
85
95
75
75
75
35
95
55
50
90
85
95


80
90
70
100
75
35
70
90
30
75
80
75
65
60
90
70
75
90
60
80
70
90
80
45
45
15
70
85
35
70
35
40


90
70
70
90
90
60
90
95
40
60
95
95
100
65
90
90
90
70
55
90
85
95
65
80
30
60
100
85
60
95
80
90


45
25
45
70
65
80
55
100
0
40
60
90
85
40
90
55
75
95
15
65
65
90
40
25
45
50
80
80
60
80
45
55


X = 67.34 77.03 66.88 66.88 69.06 78.75 59.69 61.09










APPENDIX A Continued


Group 2

Unit
Student 2 3 4 5 6 7 8 9

1 55 45 45 40 45 50 45 75
2 30 25 40 65 45 45 30 55
3 90 85 80 85 85 70 90 100
4 90 85 95 75 55 95 75 80
5 75 40 95 80 100 80 90 75
6 35 35 45 65 30 85 50 45
7 25 40 75 55 50 35 40 70
8 60 60 90 85 65 80 75 80
9 30 40 65 75 05 10 30 45
10 55 85 65 50 70 75 85 90
11 85 75 95 70 60 80 65 80
12 90 80 90 80 90 90 75 95
13 70 90 60 70 80 80 90 65
14 90 80 60 65 30 25 60 75
15 20 55 35 50 20 45 30 20
16 80 80 90 90 75 70 90 70
17 90 80 100 85 90 80 70 95
18 20 60 40 65 20 35 45 55
19 60 85 90 80 80 95 70 75
20 90 60 90 80 80 70 70 85
21 55 70 50 80 55 50 80 65
22 60 60 80 60 60 60 55 90
23 90 90 95 85 95 85 90 85
24 35 45 45 50 45 15 45 15
25 75 90 90 95 75 80 70 60
26 65 90 90 95 85 95 60 90
27 15 45 80 70 50 40 75 80
28 70 70 65 80 70 90 65 80
29 75 90 90 90 95 100 70 90
30 70 80 75 85 75 65 65 90


X = 61.67 67.17 73.50 73.17 62.67 65.83 65.00 72.50

















APPENDIX B
EXPERIMENT TWO
PERCENT CORRECT ON FIRST ATTEMPTS ON EACH UNIT BY STUDENT

Group 1


Unit
5


Student


35
55
50
70
70
70
90
75
65
40
55
80
80
95
50
65
75
70
60
100
55
85
80
60
80
75
80
40
85
70
35
80
75
50
40
15
75
65


90
80
60
65
90
65
90
85
70
55
95
80
75
75
70
90
95
75
85
95
90
100
80
55
90
80
85
75
90
95
65
95
60
90
70
80
45
70


50
90
70
85
70
65
100
90
95
45
65
85
80
85
85
75
80
95
95
85
35
95
85
80
65
90
80
70
90
85
45
50
85
40
80
70
50
70


X = 65.66 68.16 78.95 76.58 68.82 75.13 75.26 77.76











APPENDIX B Continued

Group 2

Unit
Student 2 3 4 5 6 7 8 9

1 75 100 90 90 75 100 80 95
2 80 65 75 70 80 65 50 80
3 85 75 95 85 70 95 80 80
4 80 90 85 95 90 85 80 65
5 65 85 0 35 70 65 50 45
6 65 75 80 70 85 90 55 50
7 85 75 80 75 65 75 35 65
8 70 80 50 60 80 70 60 70
9 65 75 75 75 55 80 70 45
10 70 90 55 70 75 90 70 90
11 40 45 60 50 60 80 20 35
12 85 90 80 85 90 85 85 80
13 95 70 80 95 95 80 75 75
14 90 85 95 90 85 70 85 80
15 95 90 100 95 85 100 95 95
16 60 80 90 60 80 80 45 40
17 65 75 75 75 85 70 15 65
18 85 80 65 60 75 75 70 65
19 70 95 95 80 85 90 90 85
20 60 65 85 25 75 90 35 55
21 85 75 90 75 90 100 80 85
22 95 100 85 90 80 100 30 35
23 80 75 45 70 70 95 25 85
24 85 85 80 85 70 80 90 75
25 90 85 65 70 80 95 60 75
26 70 95 65 70 75 100 75 70
27 100 95 90 90 75 95 90 75
28 70 65 65 80 90 90 25 50
29 90 90 80 60 80 85 55 95
30 75 80 80 80 75 60 35 60
31 80 95 65 45 80 85 80 70
32 80 75 75 75 85 75 90 50
33 90 75 50 60 85 90 30 80
34 70 90 75 65 90 90 60 75
35 80 95 85 75 95 80 90 85
36 85 95 95 80 75 85 65 70


X = 78.05 82.08 75.00 72.50 79.31 84.44 61.81 69.31













APPENDIX C
EXPERIMENT ONE
NUMBER OF TESTS TAKEN ON EACH UNIT BY STUDENT

Group 1

Unit
Student      2     3     4     5     6     7     8     9


X =       1.72  1.53  2.22  2.09  1.69  1.44  2.25  2.00










APPENDIX C Continued

Group 2

Unit
Student 2 3 4 5 6 7 8 9

1 1 3 2 2 3 2 2 1
2 3 3 2 2 3 4 2 2
3 1 3 2 2 2 3 1 1
4 1 3 1 2 3 1 2 1
5 2 3 1 2 1 2 1 2
6 3 2 2 2 4 2 2 2
7 4 3 2 2 3 4 2 2
8 4 4 1 2 3 4 2 1
9 1 2 1 2 3 2 2 2
10 2 3 2 2 2 2 2 1
11 3 3 1 2 2 4 2 1
12 1 2 1 2 1 1 2 1
13 2 1 2 1 2 2 1 2
14 1 2 2 2 4 2 2 2
15 3 2 2 2 4 4 2 2
16 2 2 1 1 3 2 1 1
17 1 4 1 2 1 2 2 1
18 4 3 2 2 4 4 2 2
19 2 3 1 2 2 1 2 2
20 1 4 1 2 4 4 2 1
21 3 3 2 2 5 4 2 1
22 3 2 2 2 3 4 2 1
23 1 1 1 2 1 2 1 1
24 2 4 2 2 5 4 2 2
25 3 1 1 1 4 2 2 2
26 3 1 1 1 2 1 2 1
27 3 2 1 2 3 2 1 1
28 3 2 2 2 3 1 2 2
29 2 1 1 1 1 1 2 1
30 3 3 2 2 2 4 2 1


X= 2.27 2.50 1.50 1.83 2.77 2.57 1.80 1.43














APPENDIX D
EXPERIMENT TWO
NUMBER OF TESTS TAKEN ON EACH UNIT BY STUDENT

Group 1

Unit
Student      2     3     4     5     6     7     8     9


X =       2.03  1.92  1.61  1.74  1.87  1.82  1.63  1.42











APPENDIX D Continued

Group 2

Unit
Student      2     3     4     5     6     7     8     9


X =       1.64  1.42  1.92  1.86  1.83  1.44  1.89  1.69













APPENDIX E
EXPERIMENT ONE
PERCENT CORRECT ON BEST ATTEMPT ON EACH UNIT BY STUDENT

Group 1


Student


Unit
2 3 4 5 6 7 8 9


60
95
80
80
90
65
90
90
55
65
95
100
90
80
100
75
80
90
50
70
70
95
90
55
60
90
80
85
85
65
100
80


80
90
95
100
80
80
90
100
30
80
70
95
95
70
95
85
90
95
70
90
90
95
90
75
75
90
95
70
95
90
100
95


90
90
100
100
90
95
100
100
30
80
90
95
75
90
90
95
100
90
60
90
90
90
95
75
80
80
90
85
95
100
95
90


60
90
80
90
90
90
95
90
40
80
90
95
95
90
90
90
95
90
85
100
95
90
90
40
80
85
85
80
90
80
90
85


90
100
85
80
90
85
80
90
35
80
95
90
90
65
95
75
85
90
60
90
95
95
90
20
65
85
90
80
80
90
95
90


90
85
70
90
90
75
90
95
65
85
95
95
100
85
90
90
90
100
80
90
85
95
95
80
95
95
100
85
80
95
100
90


80
95
75
90
100
80
90
100
65
70
90
90
90
90
90
95
95
95
60
85
90
90
85
90
45
75
95
80
90
80
95
95


65
80
45
90
85
80
80
75
35
55
90
80
100
60
70
80
90
80
80
90
95
70
90
55
55
95
90
70
75
85
80
90


X = 79.84 85.63 87.34 85.16 82.03 88.91 85.47 76.88










APPENDIX E Continued

Group 2


2 3


55
70
90
90
80
90
95
90
30
90
90
90
90
90
60
95
90
60
90
90
85
90
90
80
90
100
75
90
95
90


Unit
5


80
80
100
95
95
60
90
90
75
90
90
90
90
90
65
100
95
90
95
95
95
85
90
90
90
90
70
90
90
90


60
55
90
95
95
65
90
90
65
90
95
90
90
80
40
90
100
75
90
90
90
80
95
90
90
90
80
90
90
90


X = 83.67 87.83 84.00 86.17 89.83 89.83 77.67 82.33


Student


90
80
85
100
90
75
80
85
75
85
95
90
70
80
70
90
90
80
80
100
80
80
100
90
95
95
85
90
90
90


90
90
90
90
100
85
90
95
70
100
95
90
95
85
65
95
90
90
95
95
90
80
95
85
90
90
90
85
95
100


90
70
90
95
95
85
90
90
75
95
95
90
90
90
60
90
95
90
95
100
90
90
90
90
90
95
90
90
100
100


75
85
100
80
80
65
90
80
80
90
80
95
80
95
65
70
95
65
75
85
65
90
85
80
90
90
80
80
90
90













APPENDIX F
EXPERIMENT TWO
PERCENT CORRECT ON BEST ATTEMPT PER UNIT BY STUDENT

Group 1


Unit


Student


90
55
80
85
90
70
90
90
90
75
90
80
90
95
85
65
95
85
90
100
90
85
80
95
100
95
80
95
90
80
75
95
75
90
65
85
80
90


90
80
100
90
90
90
95
95
90
80
90
95
95
95
90
85
95
95
90
90
90
90
80
90
100
85
85
90
90
90
70
95
90
95
95
90
90
75


90
80
90
80
90
70
90
95
70
85
95
90
75
75
95
90
95
100
90
95
90
100
95
85
90
95
90
95
90
95
90
95
80
90
90
80
75
80


80
80
75
80
95
50
95
90
85
65
90
90
90
75
70
80
95
90
95
90
90
90
80
85
90
90
90
85
100
90
55
95
85
90
90
70
75
70


80
50
90
85
90
80
100
90
95
65
75
95
95
90
95
85
95
85
95
70
95
75
75
85
85
95
95
90
90
90
85
90
80
95
95
55
75
100


85
90
95
85
95
65
100
90
95
85
100
90
95
85
85
90
100
95
95
95
90
95
90
100
95
90
90
70
90
100
70
85
85
95
95
70
80
90


X = 84.61 89.74 88.03 83.68 85.66 89.34 82.50 81.97


65
70
50
65
85
65
85
85
100
70
90
90
90
95
85
80
95
80
75
90
70
90
95
85
90
90
85
65
90
85
75
85
80
90
85
80
70
95










APPENDIX F Continued

Group 2

Unit
Student 2 3 4 5 6 7 8 9

1 90 100 90 90 95 100 80 95
2 95 65 90 90 85 80 90 80
3 85 75 95 85 95 95 95 80
4 90 90 90 95 90 100 75 80
5 65 85 70 85 70 65 50 45
6 65 75 80 70 90 90 55 100
7 85 75 90 85 90 75 70 65
8 85 90 65 80 80 70 60 70
9 65 75 90 95 90 85 70 90
10 90 90 80 90 90 90 85 90
11 65 45 85 75 75 80 70 70
12 95 90 100 85 90 95 95 80
13 95 90 95 95 95 80 100 100
14 90 100 95 90 90 95 85 80
15 95 90 100 95 100 100 95 95
16 75 85 90 75 80 90 65 70
17 65 75 80 100 90 70 80 65
18 95 90 90 90 95 90 70 90
19 70 95 95 80 85 90 90 85
20 80 85 95 90 95 90 80 90
21 95 100 90 95 90 100 80 85
22 95 100 100 90 80 100 95 90
23 85 90 80 90 100 95 80 85
24 95 95 90 90 85 95 90 95
25 90 85 95 70 80 95 60 75
26 95 95 90 95 95 100 90 70
27 100 95 90 90 100 95 90 75
28 85 75 65 80 90 90 60 80
29 90 90 95 75 100 95 90 95
30 85 90 95 90 90 85 85 70
31 80 95 100 100 85 95 90 90
32 80 75 90 90 90 90 90 80
33 90 90 100 80 90 90 95 80
34 90 90 90 85 90 90 85 95
35 95 95 95 90 95 90 90 85
36 100 95 95 80 90 95 100 80


X= 85.56 86.67 89.58 86.94 89.92 89.44 81.94 81.94














BIOGRAPHICAL SKETCH


David Barkmeier was born in South Bend, Indiana, on

October 12, 1948. He received a Bachelor of Arts in psychology

from Indiana University at South Bend in May of 1972. He

earned a Master of Arts in psychology at Western Michigan

University in Kalamazoo, Michigan, in August of 1974, under

the direction of Dr. Jack Michael.

Mr. Barkmeier began his doctoral work at Georgia State

University in Atlanta, Georgia. He continued his doctoral

studies at the University of Florida, earning a Doctor of

Philosophy in psychology in March, 1980, under the direction

of Dr. James Johnston.

Mr. Barkmeier currently resides with his wife in Belmont,

Massachusetts, where he is affiliated with the psychology

department of Northeastern University.













I certify that I have read this study and that in my
opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.





James M. Johnston, Chairman
Associate Professor of Psychology







I certify that I have read this study and that in my
opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.





Henry S. Pennypacker
Professor of Psychology







I certify that I have read this study and that in my
opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.




Edward F. Malagodi
Associate Professor of Psychology













I certify that I have read this study and that in my
opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.





Richard Griggs
Assistant Professor of Psychology







I certify that I have read this study and that in my
opinion it conforms to acceptable standards of scholarly
presentation and is fully adequate, in scope and quality,
as a dissertation for the degree of Doctor of Philosophy.





John Newell
Professor of Foundations of
Education






This dissertation was submitted to the Graduate Faculty
of the Department of Psychology in the College of Liberal Arts
and Sciences and to the Graduate Council, and was accepted as
partial fulfillment of the requirements for the degree of
Doctor of Philosophy.

March, 1980


Dean, Graduate School







































