Citation
An Experimental analysis of clinical judgment

Material Information

Title:
An Experimental analysis of clinical judgment
Creator:
Perez, Francisco Ignacio, 1947-
Publication Date:
Copyright Date:
1972
Language:
English
Physical Description:
ix, 83 leaves. : illus. ; 28 cm.

Subjects

Subjects / Keywords:
Clinical judgment ( jstor )
Clinical psychology ( jstor )
Efficiency metrics ( jstor )
Minnesota Multiphasic Personality Inventory ( jstor )
Philosophical psychology ( jstor )
Production efficiency ( jstor )
Psychological research ( jstor )
Psychology ( jstor )
Recordings ( jstor )
Research methods ( jstor )
Dissertations, Academic -- Psychology -- UF ( lcsh )
Prediction (Psychology) ( lcsh )
Psychology thesis Ph. D ( lcsh )
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis -- University of Florida.
Bibliography:
Bibliography: leaves 79-82.
Additional Physical Form:
Also available on World Wide Web
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
022668672 ( AlephBibNum )
13986211 ( OCLC )
ADA5204 ( NOTIS )


Full Text









AN EXPERIMENTAL ANALYSIS OF CLINICAL JUDGMENT


By

FRANCISCO IGNACIO PEREZ


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA

1972


























COPYRIGHT


By


FRANCISCO IGNACIO PEREZ



1972











ACKNOWLEDGMENTS


Over the past three years I have had the good fortune and honor

of working with Dr. Paul Satz, chairperson of this committee. His

long hours of hard work, his persistence, his patience and continuous

support, but primarily his ideas and inspirations have been invaluable.

As a teacher, colleague and friend his contribution to me has been

great.

Special gratitude is extended to Dr. Henry S. Pennypacker, a

truly great colleague and friend, for his continuous support and

encouragement throughout my training. His ideas and work helped me

see the light.

I would also like to express my sincere appreciation to all the

members of my committee, Dr. William Wolking, Dr. Jacquelin Goldman,

and Dr. Hugh Davis for their help in the organization and critique of

this manuscript.

The cooperation of the six judges was instrumental for the success

and completion of this study. I am greatly indebted to Marcia Keener,

Tom Van Den Abell, Lenay Suarez, Carlos Alvarez, Gerald Reynolds and

Brian Lindner for their many hours of hard work.

Many thanks are also extended to Mike Cruse and Bunny Wardlaw for

their help in the preparation of this manuscript.

To my wife, Ginny, I give a special thanks. Without her inspiration,

suggestions and moral support both this research and I would be

incomplete.











TABLE OF CONTENTS


Acknowledgements                                           iii

List of Tables                                             v

List of Figures                                            vi

Abstract                                                   vii


INTRODUCTION

METHOD

RESULTS

DISCUSSION


APPENDICES

    Appendix A  Instructions

    Appendix B  Daily Correct and Incorrect Frequencies

REFERENCES

Biographical Sketch










LIST OF TABLES


Table                                                      Page


1   Experimental Design Schematic                          16 & 17

2   Accuracy Ratio Celeration                              25

3   Accuracy Ratio Frequency Multiplier                    35

4   Record Floor Celeration                                36

5   Record Floor Frequency Multiplier                      40

6   Frequency Correct and Record Floor Growth Ratios       42

7   Frequency Incorrect and Record Floor Growth Ratios     44

8   Accuracy Ratio Total Bounce                            45












LIST OF FIGURES


Figure                                                     Page


1   Daily Accuracy Ratio for F 1                           26

2   Daily Accuracy Ratio for F 2                           27

3   Daily Accuracy Ratio for P 1                           28

4   Daily Accuracy Ratio for P 2                           29

5   Daily Accuracy Ratio for I 1                           30

6   Daily Accuracy Ratio for I 2                           31

7   Summary Accuracy Ratio Celerations

8   Summary Accuracy Ratio Frequency Multiplier

9   Summary Record Floor Celerations

10  Summary Record Floor Frequency Multiplier









Abstract of Dissertation Presented to the Graduate
Council of the University of Florida in Partial
Fulfillment of the Requirements for the
Degree of Doctor of Philosophy




AN EXPERIMENTAL ANALYSIS OF CLINICAL JUDGMENT




By



Francisco Ignacio Perez


August, 1972


Chairman: Paul Satz, Ph.D.

Major Department: Psychology


The present study provided the first experimental application of

continuous and direct recording of operant methodology to the clinical

judgment process. This novel application attempted to provide initial

answers to four questions: 1) How stable are the daily predictions

made by judges? 2) How does the time involved in making a clinical

judgment influence accuracy? 3) What is the effect of an increase in

available information on clinical prediction? 4) What is the effect

of experience level on clinical judgment?








Six judges, two first year graduate students in Psychology

(F 1; F 2), two Psychology practicum students (P 1; P 2),

and two interns (I 1; I 2) were asked to make daily discrimination

between test protocols belonging to men convicted of first or second

degree murder and those belonging to men convicted of crimes against

property. Levels of information were increased in four phases:

Experiment 1 - MMPI's only; Phase 1 - New set of MMPI's; Phase 2 -

r:rI' r': and ih):.- :, ri! .i ': .in. F i: h Ch: anr iO.. ~ .i pr i -

L:.I l ',r.i ; f :. I : f I': l o. c,:, r ,i.a cn: ,.in .l i.:.,.'_, n ,.:al r l' r. ar.d



iril fr -4uen:,' correct ir.1 Trej.lnc, Incorr. ct of r.he .ail,

dl :cr i rTi n, .n .. r pl-r.' d r.. ,I i :r i.:ri Ju..:- r or, i SL EnJr' [ .h j r

Ch i t. ih. r :ul r. -e e:ci it..j in Tri .f ri.) Ich nejF pha6:

Criirn in th .E- iTierr. fEcr' .c:, h *a'.ll,' ci nc i .l ju..1i rt of : :r.

Jul. =-. The-:,e i riCt: ie li C ::e. in r..rm f cCu',c :,', efficiernc,,

:- [ p, r'O-,,rn jrd J l `[ l b un.r:c .

ThE r:Jsulr.; .JAT,:.n: tr,t i ri r i f di ferert o ffC..:r.:. Irn .-ii

*,.:.:, the eiffe : ,E.r ,:'or:' : e r. ii r ih p l i j u i rch: fr ehr.ple,

th:, c,.erili I:, .-accu,: C l .. l for mT : t judje:. H,:ce er' ior of r.he

finlin,: i e ur i nr. icip r.,: part culirl., tre o.er ail ncgli gilh.

erfe:r. o'f .l.d in. in ir',riatia l r .o the clinir:jl .uJ :,min t prices: The

r :u "lt: al: *lT'm .::r.rite a nu iter of ('-, .:fferC::. Thre:e .er.: I)

Ti dire':ct -.I :,:reTii, : ,i plic.atinr, a.:, rjv and within ju-l e.. of

thr :.tjt i c, r. t he l .1 l pr E.I-icir. i,:rn: r' pl: ir.E:, ; 2) The repli-

c: ltr :.n if 'he li ..if e scntiol li fre irc :. ber..:t -r. icel rar.,on

fr'e ueni iei: rind Cclieratin rec,-ri frloOr Thrc finIinng. pri'.i e c








starting point for future applications of operant methodology to

the study of the clinical judgment process.









INTRODUCTION


The major emphasis in the field of psychological assessment has

recently turned away from construction and validation of tests toward

a fuller consideration of other factors in the assessment situation.

A large number of studies have been carried out focusing upon various

aspects of the complex process of clinical prediction. Some studies

have been concerned with issues of theoretical importance and others

have also dealt with more practical contributions.

Clinical versus statistical prediction. --One of the main areas

of research since Meehl's (1954) influential book has been to compare

clinicians to statistical formulae. Meehl (1954) cited 20 relevant

studies comparing actuarial with clinical methods and found that in all

t.ur .:.r :. th :.:, rr.:r l ; i r rir:..i : C.:r.:i: nd r.O u,1 .:.r : urpi: d

i ir. .:i ; r i r r. ~-:c:ur : .-f pr.i .j;.i r :rite 1:.r. Th i f ir.J.j:i

p l:' iJ J a t:.r.:4 i:': [t r i 1 i . .bi c r :r. ;r r i r 1-i1r : 1: i :

f.,ujr., i .: r. :i r icr.ij31 .:r.I:, :, .,: r : ,r' ,i.. -.:] [t, i in j

]jr-r' , r' ;li.,hl i l Cm.li i:i .-- il th hli I 4 .- I :ir.. i.: i n h. -

i: f.' u il rn. ui. i*.u ia ir. l, r I iI i,:, .i .,1 : t . :, i rV 'l. .r. f

..-i, n a ar rc r.:.:.r..:l'l l [ret I.r r .31 i.i r.:a ,. ri. r. i:l r: rc :ii r.i. nn



In c.. : i it r. h.- *: 'i n ., i .- :u ,c ru r l i % t i. :r:.r. irue ,







(Meehl, 1969). Meehl (1965) recently reviewed the relevant research

literature published since 1954, and concluded again that two-thirds

of the fifty studies showed a statistically significant superiority of

actuarial methods.

Holt (1958) criticized Meehl's conclusions on the basis that

several stages are involved in clinical prediction and that Meehl's

comparison focused only on one of these. He argued that the cross-

validation possible with actuarial techniques is not possible with

clinical techniques, and that, therefore, the two methods are not

logically comparable. Holt (1958, 1970) states that the evidence in

favor of actuarial methods may be a function of the experimental

design, which puts the clinician at a disadvantage, rather than the

actual superiority of statistical prediction. An additional problem is

that the clinician has seldom been given the opportunity to incorporate

the actuarial information in formulating his final decision.

Available information. --The information available to the clini-

cian often has been based on non-quantitative data such as interview

material, case history data, and projective tests. Goldberg (1968) has

shown that in such situations, the clinician has been inferior to the

actuarial methods and his judgmental accuracy has decreased both with

increased levels of test information and clinical experience. Shagoury

and Satz (1969) examined the effects of levels of quantitative infor-

mation on judgmental accuracy in a clinical statistical decision-making

task (brain damage versus normal protocols). Judges were provided with

increasing increments of statistical information. The results showed

that judgmental accuracy increased substantially with increased levels

of information. The increase in judgmental accuracy was also shown to







vary with different strategies which the judges utilized during the

experiment.

Moxley and Satz (1970) asked judges to make postdictive judgments

on the length of stay in psychotherapy (short or long) for a sample of

mental health service clients. Judgments were made under four con-

ditions in which tests and statistical information increased incre-

mentally at each level. It was found that accuracy increased over

levels of information. Moxley (1970) states that if "the clinician

is able to incorporate the statistical information he may equal or

surpass the accuracy of actuarial methods."

EffEcr of -:1..i ,:-. r n.inr, -- .:-.;.rit :ru.li : f':..u:. in or. t -

t i l rt : i ri.J : l ini. ,l :.re tior n r .e i: .-:l r .ii [h Er.: *,e : [ orn

Io e peErie '.i. Th e i.J-.rce in Il .,ic ri r : urfr i : iin.i, En t i

e pc'ri, rE: ,-: r:r :. : pr-.r di.-i.:.n ,.:.jri.., '. .-c ; :. In r,i r.: i.-.u

f j Lh .- yu liEr i.i thi:r, .:...n ri .u( ,: '*:* rh. a: iliE., r..:. udlj pc pl.:

lift 1. 4''.(.) r p.: rr:...r [r :. r:.:r .. .i nt C. c ni l tr irninr. :ucr,

prh,: : i : :ienr il: :. l .- er i t p i.:.l)i E, [ !re mor aCur it

ju'.idj i :. f po .:pl, tri n r; : n i: i 'l | ,: l..l i : .: '..r .:1 irni. l > i:,'I -j

:tu-,3 rE:. 'I.,lIl.i.er'. I',1'?') I r:.*.,' l [ Er ir ir. .j .;: irl n.; i r.; ,i, r. no r.

;upr l ri, r [1 '.i.lua r.e Lu,jn:r'. .:r : Lr.Et, n rl. in I:r' -.jl l r. r t i, n

,llr '.ii; fr.:.ni I.,-n j r- l:t it j:.r:,[ .:.,' ,i:. *jr .in Tir an.l e i ii E., l i l. )

t.rou,,lh [Ch: ii .ei3'.ur' upC E[.:. .J tE Lr,'i [I-.', re:. iCi- fI.urt:rr j.l.11l i'l n-

I~ L i r u :. I 't fouin n o r i .1]fc,:r ne in iccur :, t.rt:-on

cliniCi.: ,r.: ird or.her ju,) .: ;, t .: r.:! ';. rr -qu .ll, i :Lritiut,.l In

fi'nrlJir'i t.l .:I,- i r: n ei t h r upn ri.j, r i ,- ir:. f .r i i r ,: c l c i l i

found. rn.. :i r] fica.in li ferei n.: in .-ccuri.:, ii rijir-, :"i -: ph. i-

:C a[ .d, .ild o io:phl iC: ,j iu ;,:' In i 7[ic, u,:in. l .udqrinE: t..f







psychosis or neurosis from MMPI data, Goldberg (1965) found that staff

judges and trainees achieved the same accuracy on the average, that

the four best and two worst judges were trainees, and that there was

wide variability on each diagnostician's performance over different

samples. In a more recent study, Perez and Satz (1971) found that

there were slight differences in accuracy between graduate students

in clinical psychology and clinicians in predicting length of stay in

therapy from MMPI profiles. There was a higher overall hit rate for

graduate students than clinicians. Taft (1955) advances the explana-

tion that trained professionals, somewhere in their training, acquire

a "set" which interferes with making unbiased objective decisions.

The process of clinical thinking: cognitive models. --Among the

researchers who have attempted to study and describe the workings of

the clinician's "mind" there seem to be two camps. There are those

who hold that clinical thinking is a mystical, intuitive, and thus

inexplicable process. There are others who theorize that the clini-

cian's thoughts are orderly, logical, definable, and even, in some

cases, deserving of mathematical models.

Among the former is Luft (1950) who has described the process of

clinical judgment as a sifting, screening, and synthesizing of case

materials in "some intangible way." Another is Mann (1956) who in his

review of Meehl's (1954) book emphasized the complexity of human

decision-making and the importance of considering all the factors

involved in judgments.

Perhaps one of the earliest major efforts to demonstrate that the

diagnostic process is capable of truly rigorous investigation was

Hoffman's (1960) study in which he reduced the diagnostic process to







mathematical models. He proposed both linear and configural models,

and suggested that a fruitful approach to research would involve

focusing upon the individual as the unit of research, and studying

his behavior as it relates to each of these mathematical models.

In one analysis of components of clinical inference Hammond, Hursch,

and Todd (1964) applied a multiple regression technique to Brunswik's

lens model. The statistical components derived were used to examine

several types of previous studies. Judges were found generally to

combine cues linearly, but the authors argued that the simple rating

tasks studied are conducive to a linear response system. They

suggested that, for more complex judgments, the lens model of analysis

can prove useful in studying human cognitive processes.

Studying the way in which the clinician processes data, Wiggins

and Hoffman (1968) devised three statistical models, and compared them

with clinical judgments of psychosis and neurosis from MMPI profiles.

A I r ,.j *' ir ,:iu ,lrir .;, jr.l i : mi. IT,,:,,_|;-I i. r,-i' u ;-.l F..,il r.: irjl c i [.,

ihat In.: "in, n,,:,l.;i t:t J c :.:,r,t;.L I : julj,: t, ui.JrI ml, : rr.,,i .

r j I_ ] i . ,T,, .: I i rTh. i'. no: r" i c 9, rt i t I .1 f:n '; ,. ', tI, e t .:

cr tnfi ur'i uor iirn r : I, il r ,..irunir of linici'l i iiini ,r b ti t n

a. Ic ,r,, ,icc:urC ., .

In n iffi t :. :t lu th.: C::.nCii [ it [Oh.? i.,:..,, : .f c i li n.r.C l

]u.Jri.rn t, : ,p. I i I.') u il j t ,: I H 'm n ', nu .1 i ,: i. '.

r'jLi r ,r arid 3r, 3ri, :, : i ili.. :. r : t ri : I l Z. .

C l.ni.:A] jud,, r..-nr : frO, i ll i rI' i Irefi.- ,Ju3,1.; : d rn dl di ,[te r.:.u;

* ..:i:ir' r. h, r h i.it ,, : r :l:pi i :il. 1 i mehl r ir ,:, r



irt,L. c''-Tii,:.,n it f r,, p c:,,:.-, i ,: _' e r.,?n, rnt upon [he tEpl cf







analysis used to study it. Multiple regression analysis suggested that

the judgment process is simple, and that judges used predictor variables

in a linear additive way. The validity coefficient analysis suggested

that judgments are made in a complex, configural way, which agreed with

the judges' subjective impressions of the way in which they utilized

the data.

Feedback and clinical prediction. --Given the discouraging in-

formation that clinical judgments are often inaccurate, a number of

researchers have directed their attention to the use of feedback as a

device to improve accuracy. Feedback always involves giving the subject

some information about his performance in successive trials. Bilodeau

and Bilodeau (1961) say, "Studies of feedback or knowledge of results

(KR) show it to be the strongest, most important variable controlling

performance and learning." Ammons (1956), in surveying the effects of

knowledge of performance, concluded that KR almost universally results

in more rapid learning and a higher level of performance.

There have been many ideas advanced about the role of feedback in

terms of how it may influence work, learning, and performance (e.g.,

Ammons, 1956). Sechrest, Gallimore, and Hersh (1967) present two

hypotheses: a) that feedback operates by providing information by

means of which the subject can adjust his implicit hypotheses; or b)

that feedback serves a motivational function by convincing and reminding

the subject that the task is one in which improvement is expected

and possible. According to Underwood (1966), the most dramatic

effects of KR can be shown for tasks in which the precision of the

response is initially very poor, and for which the subject can give

himself at best minimal feedback.







Sechrest, Gallimore, and Hersh (1967) devised three experiments to

provide feedback on predictive accuracy in the expectation that feed-

back could be used to improve performance. Their study stemmed from a

recommendation by Holt (1958) that clinicians should have training

which makes it possible for them to validate themselves as predictors

in much the same way as tests are cross-validated. The "clinicians"

studied were undergraduate students, and the prediction task involved

interpretation of short sentence completion protocols. They found

evidence in all three experiments for the superior performance of those

subjects who received feedback, but the bulk of the evidence suggested

that the feedback effect was attributable to enhancement of motivation

of the subjects, rather than to specific informational value.

Rotter (1967) points out, however, that the Sechrest, Gallimore,

and Hersh study had a number of design limitations. These are: a) the

subjects were undergraduates with little experience and possibly low

,,,r.' ,r. i,:,r,, ;, r.:: r ir l'i l I'. rf_-:,n...i) ,T,.i r, i ._ t.f.in rcV., :mjli fr

.a j r jii.:l r ir : r c i. .I I r :,.* l J. i. : o r r. o Lr .r r r',i r . : l ,- ,,,n J

rij i .*. r. t f inf, r .:r : r. ,h : IL j C'.I l., 'J .l:d 1 .i) i 'n 1 t1 .J n

ii r. nT i ri r:. n t r I :. ri n .. r. i. :. .c t. *r li.b.j l C 1 .5IhIri i tSi r..i

a I ijj .-rc r. T i ,o r:,c[ ,i irc..C .: r. 1 r th .ri ,j r I ru i, n1 i ,- [h. i Fr r,.:.t

t h I : 4' ,: : h ,J i J. ', r -. r S ,1 .O l h ,' .i i: , r,.r r t h er o' r n i.:.t r.h p

Sh. .,^, t,,'. it r. :,- '" *,,r:- *: '- *: 1 , r. ,,n I ' ]: -. r, r i' 'ie n .

i l i'.. i i .,.) : ,j.j iJ r.-, :r.-: ,;,t i,,, ii i .n T ,Ji r.: rE .w I : 3C

C, inin,) .:. i,..'"o:: In:, r 1: 1 p : : :r. ,l i rle.i 1r':';.r) I [ Ar --


,,f ".':Cui c .jC.,. Th. c, 'i i:'. l .c ', fc .: : ,) r.. r.i. -a c :,.i I'.C

9 r, .,'..,v :. r.',r r,'_ la: ,' ,, ,'e,' .: n , rn rn rcrt.: t:,r,'Ca':







of low-accuracy judges showed substantial improvements for both pre-

dicted criteria; however, the training had no noticeable effect on the

judgments of the high or moderate-accuracy judges.

Perez (1970) studied the effects of feedback on a problem that

clinicians face in their practice -- that of predicting length of stay

in psychotherapy. Sixteen judges, eight clinical psychologists and

eight graduate students in clinical psychology were asked to predict

length of stay in psychotherapy from MMPI profiles. Judges were

randomly divided into a feedback condition and a no-feedback condition.

It was hypothesized that judges in the feedback condition would do

significantly better than judges in the no-feedback condition. The

results, while in the predicted direction, were not significant,

largely because of the high level of initial accuracy on this clinically

relevant task. In agreement with Watley's (1967) finding, inspection

of the performance of each judge individually revealed that feedback

was most beneficial for those judges starting at a low accuracy level.

The clinician behaving. --Clinical psychologists have produced a

widely varied number of hypotheses in response to the evidence that

they have not yet demonstrated their diagnostic prowess.

Hunt, Wittson, and Hunt (1953) have suggested that the confusion

may result not only from lack of ability in the diagnostician, but also

from poorly delineated diagnostic categories and the professional customs

which permit careless diagnosis or inaccurate diagnosis for administra-

tive reasons. Some (Holt, 1958; Sawyer, 1966; and Taft, 1959) have

proposed that the available research comparisons between clinical and

statistical methods are essentially not parallel and are therefore

meaningless. Others (Hunt, Arnhoff, and Cotton, 1954; Hunt and Jones,








1962) have pointed out that the application of formal scoring, mathe-

matical models, and certain statistical treatments to clinical data

tends to distort research findings.

Little (1967) and Rotter (1967) have pointed to such artificial-

ities in research studies as the use of undergraduate judges to re-

present experienced clinicians, the lack of adequate criterion data,

the use of inadequate test data for the prediction required, and in-

complete description of the criterion to be predicted. Some (Payne,

1958; Cole and Magnussen, 1966) have argued that diagnostic test

C.or.: ir.: u:-: :: :Ii. C r .M ..r jr.:.: ic c atl ':.r .: tr. ':: I .4:; r.

r,:[. r' l] t ,] ",,T; t, : ,,n',, I.- ., t [ .- tTnt, O:r (r: 'rn, :, i an [lh.at

.h pi r- -I; Cll i h:- i:.f i '. ; r. i, : lit,. I: h ,: mu.:lh I- : r: i-3riln..- itrn II:uli

th. pr-.- l [ i i.:.ri .f th b toh i.:.,i I (,:.rc ,i: r..:. : .:.o f r.l.e .. a l rn, r ..

cc.ntL r.u l I .h ir' att.E itL .: t .-. i -: l.:. p t. .trALr :t r.: ru.uin rir M. A h

(l'* l, ir.n con rr :t. 1.3: [pro., :..l t.in [ i nn.:.:i: b I.. l E'l :' VIE

up ri'~Cr c:-input r. 'I"1 [hi Lh clin ic ,n ic:ni.:nir."i 5. L .r r.I:,,,rr.h nr..

[. cJCl-oriC -ri :, .

it i: [r.'e;-n 1, r.u'J [ r. [he c.?nrilii[in. ] ri' ult: fourid

e lp: rio n riE.r: n r, r.i t"o cli ri :.:i prt.i: ct r. Er.: dui larI.1., L,:'

li~ t f AOr 3.a quiA' uT, r .ai l InthI:ldi .i. r. itu.l, .ii: corpl( ,

t, ." t chri.:h I' I'.' i l e. [I,. t r..: 0,. : i f :ci.' ric. re" -s l :.cri,-

iuon : p1 i ri r.:ir .i'-. : r :.r, .3nI c i ntrioi. Fr..'r t ih. pr lou: r-.i

of t h I t., i lr uri, it : c.i: .,ir trh t Mr., re: .,ar,:h ,3r'_ of clin.:al

pr..J;ction 1: fr t.rirrj itn rt...rhrn. rh. r :~ .- .:,.c l:. liehil 1 l? -l :| are-l

trj a PF ,: u. r. :.:.oc I rin i:.r l .i u.li in 3 :tu.], i: ne .iE ,Jl t find .u

.h4tlh .r r1.) r.:. i.naL .). r 'r : chn. c l ani iC :t it.l' l u.:h. r L.t r

ha.n r t ,., .]T reI th r.h .:. : tari 1, l"u.:-. .i :[ r'Fk oi j on' a r., oif .qu3al








or near-equal mediocre guessers." No attempt has yet been made to

provide an answer to this challenge. The precise experimental

methodology of free-operant conditioning, applied to the continuous

and direct recording of the clinical judgmental process, provides a

unique opportunity to study judges making their predictions on a

longitudinal basis.

Experimental analysis of clinical prediction. --Since Sidman's

(1960) influential book, Tactics of Scientific Research, psychology

has witnessed many innovative applications of operant methodology to

traditional problem areas (Ullmann and Krasner, 1965; Ulrich, Stachnik,

and Mabry, 1969, 1970). A most notable example is Lindsley's (1969)

application of the continuous and direct measurement of operant

methodology to the study of traditional psychotherapy. The underlying

thesis of these applications has been that variability is not intrinsic

to the subject matter but stems rather from discoverable and controllable

causes. Any sample of behavior is under the control of a

multiplicity of variables, some of them presumably held constant in a

given experiment, and others simply unrecognized. Sometimes the

variability in a set of data can be located among such factors. Two

subjects may be found to differ in their response to variable A, not

because there is intrinsic variability in the relation between

variable A and behavior, but because they differ in their response to

variable B, which interacts with variable A. The process of tracking

down sources of variability, and thus explaining variable data, is

characteristic of the scientific enterprise (Sidman, 1960).







Sidman (1960) believes that the control of data in research does

not depend upon the amassing of large groups of subjects, or even

large samples from an individual subject. He states that "We must

consider our science immeasurably enriched each time someone brings

another sample of behavior under precise experimental control."

Sidman believes that the adequacy of a technique in experimental

psychology should be evaluated in terms of the reliability and pre-

cision of the control it achieves over the independent variables.

According to Sidman (1960), experiments are often carried out

to test the fruitfulness of a new technique. Sometimes the technique

is developed deliberately in order to obtain information that could

not be gained by standard methods; sometimes the technique is simply

tried out of curiosity as to the kind of data it will yield. Technical

*jd olr"-.n:mnt L in ..P.erii.r, l .., i3, in.:i iT|:pro..em'En.r : nr.

T ri: urine i n: r. r'jn. r :. r.,dr, *:;.1i h j: .. r' re ':.r.1 t. [ ,l :c .h i : -

tjr. .jra ..n'l: : r.h,- d, :i nrn :, : r ii .j ] ,.F r.aru" [,:, .- E i

pi r i :u lr r I .:L.., r *.n: r r,3l i :e.3 ippi, r3 r.u: [,. p. rI'f.:,rr m. r in, ur'.: tio.n:,

jnd r.ei [A-n i.:n fr lid r.: I'rnniqu.j r.j rin jaCr :.

It : c l.,ar 'ijthat th- r fr:.r m r.:.e bj, ti v r: r.:i rch in th

fieid -:.f c:'inicai ju.A i i n hn L. ,r. rri cu l ei, e : .::ed (r,.. rer.

l l. Thi: : t Jd, ri-re re:ent: a br'o: d. but : ,: a[i c. I .ttr ip[t t*:

uid,-r:t rid r,.he pr, c.;:: r 1 c. linc i l jud.mjT rI 5 ',u r: fruitf',i and

ri,.:-,r.u:u ,ppro:. h to thi : .:.bl. i : r,'r,:, c. thr',i ugf the :.,ttew,]t c

inr i t 1 i.:.r ,i r Iei ar. t )riabl : rhe., iriluence the ju.1dqrernt

r1'0'..: :: r he '. .: lir .ti.:.r ,:. f Cirlinrou u n dir.:r liij.'.ureirent to

r'l: .n, .:3 j,'jiud rt, a urniu: opp:''r unit, e i:t: to bin1 ne l11 .hl in

r.hi: intrii ir i ai rei






Precise measurement of clinical prediction. --O.R. Lindsley and

his associates have developed the most powerful single tool to measure

human behavior: the Standard Behavior Chart. This chart permits the

daily recording of behavior frequencies ranging from 1000 per minute

to one per day; frequencies ranging from one per day to one per

twenty weeks may also be recorded without changing the coordinates of

the chart. This chart, therefore, provides us with a standardized

means of depicting and analyzing frequencies of virtually any human

behavior on a continuous basis. Further, because it is standardized,

it facilitates the comparison of data across different levels of

clinical information as well as comparing different experience levels

and how these variables affect clinical prediction. Even more

importantly, the Standard Behavior Chart greatly facilitates communi-

cation of research findings among scientists.
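
The frequency range of the chart described above can be put in a single unit. What follows is a minimal sketch in Python and not part of the original study: it converts the limits stated in the text (1000 per minute, one per day, one per twenty weeks) to events per minute. A 24-hour day is assumed, and the names per_minute and on_chart are hypothetical.

```python
# Sketch only: limits are taken from the text; a 24-hour day is an assumption.
MINUTES_PER_DAY = 24 * 60  # 1440 minutes


def per_minute(count: float, days: float) -> float:
    """Convert 'count events in `days` days' to events per minute."""
    return count / (days * MINUTES_PER_DAY)


CHART_TOP = 1000.0                        # 1000 per minute
ONE_PER_DAY = per_minute(1, 1)            # 1/1440 per minute
ONE_PER_20_WEEKS = per_minute(1, 20 * 7)  # 1/201600 per minute


def on_chart(freq_per_minute: float) -> bool:
    """True if a frequency lies within the range the text says can be recorded."""
    return ONE_PER_20_WEEKS <= freq_per_minute <= CHART_TOP
```

Expressing every limit per minute shows why a single set of coordinates can hold both very frequent and very rare behaviors.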

According to Wolking and Schwartz (1972), there are several

features of the precise behavioral measurement system which make it

distinct from traditional statistical measures. These are: (1)--

The basic unit of measurement is frequency, which is defined as the

number of behaviors emitted divided by the

minutes during which the behavior has been observed. Thus, the in-

escapable dimension of time is made an integral part of the basic

datum. (2)--A very important feature of this system is that it

measures behavior directly. Direct measurement involves: defining

the behavior so precisely that it can be counted with high reliability

(pinpointing); counting the number of occurrences of the behavior and

the number of minutes of observation and making a permanent record of

both (recording); and finally, calculating the frequency of the






behavior observed by dividing the number of movements by the number

of minutes. (3)--There should be continuous recording; that is, the

movement should be measured daily or every time the behaver engages

in the behavior, if it is less than daily. (4)--The most important

feature of this system is the graphic representation of the daily

rates, which provides a unique opportunity for rapid and accurate

communication and comparison of facts about behavioral processes.
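
The recording and calculating steps listed above might be sketched as follows. This is an illustrative Python fragment, not the procedure used in the study: the DailyRecord class is hypothetical, and the counts and minutes are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class DailyRecord:
    """One permanent record: what was counted and for how long (hypothetical)."""
    day: int
    count: int       # number of movements counted
    minutes: float   # minutes of observation

    def frequency(self) -> float:
        """Frequency = number of movements / number of minutes."""
        if self.minutes <= 0:
            raise ValueError("observation time must be positive")
        return self.count / self.minutes


# Invented example data, one record per day (continuous recording).
records = [DailyRecord(day=1, count=12, minutes=10.0),
           DailyRecord(day=2, count=18, minutes=12.0)]

daily_rates = [r.frequency() for r in records]  # movements per minute, by day
```

Keeping the count and the minutes together in each record means the time dimension is never separated from the datum, which is the point of feature (1).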




RESEARCH QUESTIONS


In considering the present multifaceted design, a number of

research questions arise. The present knowledge in the area of

,1 r[ic i l r', J r i.:.. ;2: [:1.., -i. iiL ar ..:.ir' iru i r.:.r, r.,: rnri, t.

f 1r'1l7ul.j 1 r, ir r .h .:.:.r. t, E r ,,f h ,. h :. : H i.-: .,-.:r th. r : r.;-:rh

rque l i.[i,',r: ;,,-lr., re lri p..rt ]r.r r. r..r, ii r,] r.hin i .5in.l : i sr.l

pr'e:;rt..ili rj [,. L iu i T..:,: :qj.: ir'.rn : r'e.


I. HO. : tl ... r'.;r er. 1311 r li,[ r.'r: ,i,1i : I.*, i u'l .
Ti : I: 3 .'. :.: r l .- r.r qu:: ri r. i. i crh r, ni L.rrT.
r.:.:. r ihe r r.he I:. p. t. h. r, r r. :- r. l,. .,. ':- ,ri..L
: r.: fl'r r. .,.; in.j .J lr i'r r.:-'.r.li., .. r r,[- .- ,r,',. .i rj rf lt
iri,.r,, r..[ .)1 p. .i .. [f i:.r,, :. r.:,..i.J 'u: rn f .rt p i,
.p l,:, r ,.] i ,,' : n :i. : u;.:r i':.r, .3r'~. Ir. ti. .u. i r, l I., : :-.

2. ,[ effct: i i in ,r, :. rE ir- .i i l ib c ir, for t ir.r
hbii E lI r,,.:.i1 pr _... i.;. [ i .o ri : r..h i h :. ,i r :
a i .] i .; r ir ..;r. 1, r ir. -n : ,i i i ; i r l...rIni c.n
in.. r: i : i ,.:.: r ii r ,, in- I .r: : l 3r i oin
pr'': l. i ur-ue .:*;.-3rrunr i r,., ,i ur.: h r: f r-: t
di r. rli ...,i i n. i r.. .I. lu.j 'l I lu. .: ell .: tc
,1 i:ui"E i lE :tlbi lit r t rn,.







3. What effect will experience level have on clinical
prediction? Previous research indicates, in general,
that as experience level increases, accuracy decreases.
The application of the present methodology
provides a more powerful technique to compare each
judge independently.

4. How does the time involved in making a clinical
prediction influence judgmental accuracy?
Researchers have not previously studied this
question. The use of rate in analyzing judgmental
accuracy provides a most sensitive and natural
measure, for it takes into consideration the
amount of time spent in making a clinical judgment.
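
The point of question 4 can be illustrated with a small worked example. The Python sketch below is an assumption-laden illustration, not an analysis from the study: the function names are hypothetical and the two judges' counts and times are invented. It shows two judges with the same percent correct whose rates differ because one took four times as long.

```python
def rates(correct: int, incorrect: int, minutes: float) -> tuple:
    """Return (frequency correct, frequency incorrect) in judgments per minute."""
    return correct / minutes, incorrect / minutes


def percent_correct(correct: int, incorrect: int) -> float:
    """Conventional accuracy measure, which ignores time."""
    return 100.0 * correct / (correct + incorrect)


# Invented data: both judges make 15 correct and 5 incorrect judgments.
judge_a = rates(15, 5, minutes=10.0)   # finishes in 10 minutes
judge_b = rates(15, 5, minutes=40.0)   # finishes in 40 minutes

same_percent = percent_correct(15, 5)  # identical for both judges
```

Percent correct treats the two judges as equal, while the rate measure separates them, which is the sense in which rate takes time into consideration.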












METHOD


Subjects

To insure the best control of this relevant variable, judges with

similar backgrounds and experience with diagnostics were chosen. Six

judges participated, representing three levels of experience. Level

1--Two first-year graduate students (F 1, F 2); Level 2--Two practicum

students (P 1, P 2); Level 3--Two interns (I 1, I 2). All judges

were chosen from the University of Florida and are currently enrolled

as graduate students in clinical psychology. Each judge was his own

control.



[I,, r er" i I1:

T L. ; i,,.:ri l: .r: : : t..J .. h' i.Y:.our., 114 1). H.. u,.li

C" ,ii ern i,,lr' :.: .l I,', "r. h, F i...ri.j5 i t. I P'r'i-.:.r F, ifr.rd f l i.:, i d +.

Thirt iinr,. icn. l ir.t .J .:.f fir:t- .:r : .:.:.rl 1-.t1ar.. iurd. r. r. ,:..TripiF, r J

.r -r c r r. *5,A.4.tl:, ;.'C .:,l.. :..i ra.j pr r::.r.. it, f ,t' ,,,: i 1r,. tt rt ,,ri

:lri i,.: .J .:.t .F i ir : i n : pr. r r 5 ur I ii.'. .:... i ..: I. i.:.-

.r ph:, F l irL i. i, l .tl.,r., lil lf.:. r:h ..:' l' :, rnd .;n r:.rn.., :.rre







f ''ier r., T Ll.1 i f.:.r :." i.;r.iri ,.:.r" th. 1. 41:ig lid.1.-e .i ,: ied

i' i, r. ,iCt JIn-rh r Vt' rir. eriatl 1r-: It, I .: .: trni t ot a n, 1. -i.:l.- .:.r




















non-homicide (for instructions, see Appendix A). Shagoury (1971)

found that a discriminant function analysis correctly classified 83%

of the total sample. Following the free operant methodology of con-

tinuous and direct recording, each judge was asked to make their pre-

dictions on a daily basis. The research consisted of two experiments.

Experiment I. --The purpose of this experiment was to study the

stability of the daily predictions made by judges using MMPI's only.

This area has received no attention in the experimental literature.

Operant methodology, with its unique feature of continuous and direct

recording, provides a most powerful tool to study this phenomenon.

Sidman (1960) defines a stable, or steady, state as one in which

the behavior in question does not change its characteristics over a

period of time. Two major types of experimental interest in steady-

state behavior have developed. One of these may be termed "descriptive"

and the other "manipulative." Experiment I is a purely descriptive

study in which a set of experimental conditions is maintained over an

extended period of time, providing an account of both the stable and

the transitory aspects of the resulting behavior. This form of re-

search is fundamental to the establishment of behavioral control

techniques, and of baselines from which to measure behavioral changes.

The data yielded by such an experiment do not relate an aspect of
behavior to several values of a manipulated independent variable.

Rather, the resulting curves show some aspect of behavior as a function

of time in the experimental situation. It is the characteristics of

behavior in time, under a constant set of maintaining conditions, which

are of major interest. According to Sidman (1960), the descriptive






investigation of steady-state behavior must precede any manipulative

study. Manipulation of new variables will often produce behavioral

changes, but, in order to describe the changes, we must be able to

specify the baseline from which they occurred. Otherwise, we face

insoluble problems of control, measurement, and generality.

A major problem faced in experiments involving the manipulation

of steady-states is that of deciding whether the behavior in question

has stabilized. According to Sidman (1960), there is no assuredly

final answer. He states that "The utility of data will depend not on

whether ultimate stability has been achieved, but rather on the reli-

ability and validity of the criterion. That is to say, does the

criterion select a reproducible and generalizable state of behavior?

If it does, experimental manipulation of steady states, as defined by

the criterion, will yield data that are orderly and generalizable to

other situations. If the steady-state criterion is inadequate,

failures to reproduce and to replicate systematically the experimental

findings will reveal this fact."

How does one select a r.E i.,-: r r .rit i I i,. i i :. n.

to Sidman (1960), no rule to f,,il.w, fr. L trh critri .' :r, ,ill .i.-;.r..i up-

on the phenomenon being inve:tcir- ~; t- iup:r tr- i. ..i i .:.r : pf ,r-ici.rr,

control that can be maintain. . I .er- cr iFi ,,i i.:,. t .rm .,u.i. -

steady-state behavior are e>.ter i u:eruit L, r:.l.-..in icCJ.i.r

over an extended period of .iT.-. ui rr n. char .- in ri. e.-.' ,,,enar,.

conditions, it is possible -o ,l vI r.iiTm- ;, i. :. i,f

stability that can eentuall, -be r :ir.r.inr l: ia irr.'trion ci r then ,ie

selected on trne c 1 :. of the:: :,r: ar.icr: Th. A. e l um O:f tli

criterion choscr, .:r, L.b- .o iriir .. .a t., r.h c.r,1arlirnL;: I , t hi L:ul il ]








data. If the steady-state criterion yields orderly and replicable

functional relations, it may be accepted as adequate.

Procedure for Experiment I. --Judges were presented with 20 MMPI

profiles, 10 belonging to homicides and 10 to non-homicides (base rate

of .5). They were asked to discriminate between the profiles of

homicides and non-homicides. Each judge was presented with the same

set of profiles on a daily basis until stability of prediction was

reached. The criterion of stability was orderliness in the data.

Experiment II. --This experiment consisted of a systematic

replication of Experiment I plus the study of the effects of adding

new information on clinical judgment. Four phases were involved:

Phase 1 -- Systematic replication of Experiment I. According to

Sidman (1960), the soundest empirical test of the reliability of data

is provided by replication. The application of continuous and direct

recording provided a unique opportunity to attempt to replicate the

findings of Experiment I.

Phase 2 -- Phase 1 was used as baseline data to study the effects

on clinical judgment of adding new information, in this case Rorschach

protocols, to MMPI profiles. The research previously reviewed (Gold-

berg, 1968; Shagoury and Satz, 1969; Moxley and Satz, 1970) seems to

indicate that as more information is available to the clinician,

judgment accuracy increases. The present methodology provided a more

powerful technique to study this phenomenon on a daily basis instead

of the previous one-session studies.

Phase 3 -- This phase was identical to the previous phase except

that biographical and EEG data were added to the existing information.

Phase 2 was used as baseline data.







Phase 4 -- This phase was identical to the two previous phases

except that a summary of the findings of a multivariate analysis on

the data as found by Shagoury (1971) was provided to each judge to

assist in making his judgment. Phase 3 was used as baseline data.

Orderliness of data was the criterion used for termination of this

phase.

Procedure for Experiment II. --Phase 1 -- Judges were asked on

a daily basis to predict homicides from non-homicides using a new set

of 20 MMPI profiles with base rate of .5. This phase was discontinued

when stability was reached. The criterion of stability was orderliness

in the data as well as comparison with stability in Experiment I.

Phase 2 -- Judges continued making their daily predictions. In

this phase, the MMPI profiles of Phase 1 and the appropriate Rorschach

protocols were utilized. Orderliness of data was the criterion for

stability.

Phase 3 -- This phase was identical to Phase 2, except that judges

made their daily predictions with the addition of biographical data

and EEG reports.

Phase 4 -- Judges continued making their daily predictions, but

this time a summary of the relevant findings, as found by multivariate

analysis on the previous personality, biographical and .:loloicil .jtij,

was given to each judge (Shagoury, 1971).













RESULTS


The measures used in this study, frequency of correct predictions

and frequency of incorrect predictions, were plotted on Standard

Behavior Charts (Behavior Research Co.). Plotting linear data on a

log scale provides one with a picture of proportional changes in

behavior frequencies rather than absolute changes (Koenig, 1972).

Information that the frequency of occurrence of a given behavior has

doubled or halved is considerably more valuable than information that

the frequency of occurrence of the behavior has changed by one

arbitrarily defined unit.

In order to understand the present results, it is necessary to

briefly familiarize the reader with the Standard Behavior Chart as

well as the current procedures of data analysis.

Chart scales. --The horizontal dimension across the bottom of the

chart represents calendar days. Each chart runs for 140 consecutive

days or 20 weeks. The vertical dimension up the left side of the chart

is the scale of frequencies or rates. The unit of measurement is

movements per minute.

Record floor. --The record floor is the lowest measurable per-

formance frequency other than zero. The record floor is found by

dividing the number of minutes in the time sample into one, the smallest

number of movement cycles that can be observed. The record floor sets

the lower limit of the sensitivity of the chart as a measurement system









for each day. Below the floor is an area of record blindness. The

symbol of the record floor is a horizontal dashed line at the computed

level of the floor for a given day.
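
The record floor computation described above can be illustrated with a short Python sketch. The function names and the 20-minute session used in the usage note are hypothetical and are not taken from the study; the sketch only restates the rule that the floor is one movement divided by the number of minutes in the time sample.

```python
def record_floor(minutes_observed: float) -> float:
    """Lowest nonzero frequency that can be recorded: one movement
    divided by the number of minutes in the time sample."""
    if minutes_observed <= 0:
        raise ValueError("minutes_observed must be positive")
    return 1.0 / minutes_observed


def frequency(count: int, minutes_observed: float) -> float:
    """Movements per minute for a given count and time sample."""
    if minutes_observed <= 0:
        raise ValueError("minutes_observed must be positive")
    return count / minutes_observed
```

For a hypothetical 20-minute session, record_floor(20) gives .05 movements per minute, so a shorter session raises the floor and a longer one lowers it.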

Celeration. --Few statistical measures are available for des-

cribing continuous changes in behavior over time. Therefore,

researchers interested in continuous observation and recording of

behavior have developed several new measures for this purpose (Koenig,

1972). Frequencies displayed on the Behavior Chart are usually either

5.:.:.r il i 1 i ,r I ) ir .:.:.: I ri j i rc -n 1 i i : k i ii .: C e,: : i. l i .: I : i .: n

t h *r :r': i t* im f.:.r rrh :: :*-l* )j i r :Ini .14 l r ie : ir j t c. r.:hip.

1',-l.-r i i 1 :,, i. :, ,, i:_ re ,: .:.t .:; r ,l, ,:, *:u lrr i 1. 1 t, fr,.. ,urii_ i '.,t r.: .p..i:.n .j ln ,

u ...r i .i ,- t. : 1 rri,.,d ':,f hn'i".. Tr.,: ,;. r:r1r. ,rl..n .:,:,-. fr r.: r. t ': fur, :r ,:.rr -

; k11. r, l ,.ed i ,: -Jth : f L r,]-. I .I ,I ,f L. : f i :r .. J i r. C i1r,,, .


:ir.; rh 1., : i .: ,qru, : j.r r i r r.o. .',: '.. .L,-


, h 11.,11nt ar, r:h.r ,:l, i l| the ,l.: .l:,er',pl.n.,-1[ .i",r,-. [n, ,lj 'i 'l.,, in i L J
.:ur j -:,, r .: i i :,, : : ,, -:.r ir. u i, ) ,,I : j L": u ': [ -. : .: f i I :..




,,iej:t, re. it> p :r. '.r:c.. ] ,.n l,.. ,-; ,:.j. r ,- i [ ,: ,i, i .r ic, ,,[ h I : ..:11 irlI. i ,:,ri

.r [.., r ,-:u : Tr, : .: r.ua l :h rr: ,:, t rh il. L pr. :i i E r.: r,'. -i. :h







Th. .i::ul ., 1 i i,:. 1 d;. filr 1 r, tr, t, reer iv ue i.,
.: i ri i i. i : T c:.:uL i;., ru j'E u,:, fj ,L Ic .:r,

.luJ, I: i CrI .l 'f F r -jrir. [ r l i .ir T cr r ti fr ) r,
,1udgq cs: j1 ~ *.*.r 3 L 5 dr- -1 -1 L. h). .. 5 1'rL 0, i,, f-1








indicates that the frequency correct is equal to the frequency in-

correct. A value less than one indicates that the frequency incorrect

is higher than the frequency correct, and a value greater than one

indicates that the frequency correct is higher than the frequency

incorrect. The reader might want to convert these values into

percentage (e.g., x 1.0 = 50%; x 9.0 = 90%). The accuracy ratio

celeration measure provides the opportunity to compare the celerating

effects of adding new information to the clinical judgment process for

each judge as well as across judges.
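
The conversion from accuracy ratio to percentage mentioned above can be sketched in Python as follows. The sketch assumes that the accuracy ratio is the frequency correct divided by the frequency incorrect, as the interpretation of its values implies; the function names are illustrative only.

```python
def accuracy_ratio(freq_correct: float, freq_incorrect: float) -> float:
    """Ratio of correct to incorrect frequency (assumed definition)."""
    if freq_incorrect <= 0:
        raise ValueError("freq_incorrect must be positive")
    return freq_correct / freq_incorrect


def ratio_to_percent(ratio: float) -> float:
    """Percentage correct implied by an accuracy ratio.

    If correct / incorrect = r, then correct / (correct + incorrect)
    equals r / (1 + r).
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 100.0 * ratio / (1.0 + ratio)
```

Under this assumption a ratio of 1.0 corresponds to 50% and a ratio of 9.0 to 90%, which agrees with the two examples given in the text.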

Figures 1 through 6 present graphically the daily accuracy ratios

for each judge. Table 2 shows a summary of the accuracy ratio celera-

tions per phase for each judge. Inspection of Table 2 shows that the

accuracy ratio celeration coefficients ranged from ÷ 1.56 Movements

per minute per week (M/m/w) to x 2.27 M/m/w. Figure 7 shows a graphical

summary of the accuracy ratio celerations across judges. Overall, there

was essentially no acceleration or deceleration of accuracy over time.

There were four exceptions. Figure 3 shows that P 1's accuracy

accelerated x 2.27 M/m/w in Exp 1 (MMPI's only) and x 1.6 M/m/w in

Phase 2 (MMPI's + Rorschachs). Figure 6 shows that I 2's accuracy

accelerated x 1.51 M/m/w in Phase 2 (MMPI's + Rorschachs) and decelerated

÷ 1.56 M/m/w in Phase 4 (MMPI's + Rorschachs + Biographical Data +

Formulas).
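
A weekly celeration coefficient of the kind reported above can be approximated as in the following Python sketch. It is an illustration under stated assumptions only: it fits a least-squares line to the logarithms of the daily values, which is not necessarily the fitting procedure used in this study, and all function names are hypothetical.

```python
import math


def celeration_per_week(days, values):
    """Multiplicative change per 7 days from a least-squares line
    fitted to log10(values) against days (an assumed fitting method)."""
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    logs = [math.log10(v) for v in values]  # values must be positive
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("days must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope_per_day = sxy / sxx
    return 10 ** (slope_per_day * 7)


def chart_notation(factor):
    """Express a factor as 'x N' (acceleration) or '÷ N' (deceleration)."""
    if factor >= 1.0:
        return f"x {factor:.2f}"
    return f"÷ {1.0 / factor:.2f}"
```

A factor near x 1.0 in this notation corresponds to the finding of essentially no acceleration or deceleration, while a factor of one half is written as ÷ 2.00.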


Accuracy Ratio Frequency Multiplier

To measure the effects of a new procedure on the first day of a

phase, the frequency multiplier, or step, is used. The frequency

multiplier gives a measure of the increase or decrease of frequency















correct or frequency incorrect the first day new information is added

to the clinical judgment process. It is a comparison of the last data

point of the old phase with the first data point of the new phase.

A frequency multiplier of x 1.0 Movements per minute per day (M/m/d)

indicates that there has been no increase or decrease in accuracy with

the introduction of new information to the clinical judgment process.

A step of x 2.0 M/m/d indicates that accuracy has doubled with the

introduction of a new phase. A step of ÷ 2.0 M/m/d indicates that

accuracy has halved with the introduction of a new phase.
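
The step computation described above can be sketched in Python as follows. The function names are hypothetical; the sketch simply divides the first data point of the new phase by the last data point of the old phase and reports the result in the x / ÷ notation used in the text.

```python
def frequency_multiplier(last_old_phase: float, first_new_phase: float) -> float:
    """Step between phases: first value of the new phase divided by
    the last value of the old phase."""
    if last_old_phase <= 0 or first_new_phase <= 0:
        raise ValueError("both values must be positive")
    return first_new_phase / last_old_phase


def describe_step(step: float) -> str:
    """'x N' when the value rose or stayed the same, '÷ N' when it fell."""
    if step >= 1.0:
        return f"x {step:.2f}"
    return f"÷ {1.0 / step:.2f}"
```

With an illustrative accuracy ratio of 1.5 on the last day of one phase and 3.0 on the first day of the next, the step is x 2.00; the reverse order gives ÷ 2.00.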

Figure 8 shows graphically the steps for each judge as new phases

were introduced. The measures for each phase from left to right belong

to: F-1; F-2; P-1; P-2; I-1; I-2. Table 3 presents a summary of the

accuracy ratio frequency multipliers. Inspection of Table 3 indicates

that the accuracy rjti.., fr-.qu-n.: mAultirpli;er rns-.a froi-m = ?.0 1 'n'd

to x 4.0 M/m/d. '"re:. :r. '...i ure: t l on. i _. Figu.- : in...d c a ;

that the maximum a.:.:~al- Trr, .-t :.: a.r. .... r. in r r .. .J i.t :., .:.

Phase 4 (formulas:. Tri, i1.1D, n ,:.f n .e- f.:., .:hi ) c.:. .ju .

overall the least cr, is. 5 ..:'u c, .::p[ i'.o, i : h,::- _CCurc,

decreased 3 M/m/ Tr,. j.j ir 1 :,'n I, n.: I in- .et ,i" lil i :i .

well as the additi':'" *,f hi :_ It i.: ~ c.i', i *l.j. .r':Ju,.:3 r :, t:

momentary decrease. in .j:ij.:.


Record Floor Celeration Efficiency

In r. (*r .r '. i r i .: r t t li r ::.r .j i' .: r i .]i,: [ : trt

amount of time a :;u1. :.:-.' i in.0 ni: .-Ji l pr.l-. tr. ir.n it I;

therefore, a meast ef i'. n.: e,:c rd fl .:.. :elr t i : :., ,: t.

judge are located ii' I..'"r.di F.. TI t.l J :h, ; m u r i ,i f t .- r .' o d'




















floor celerations per phase for each judge. Inspection of Table 4

indicates that the record floor celeration coefficients ranged from

÷ 1.21 M/m/w to x 3.22 M/m/w. Figure 9 shows a graphical summary of

the record floor celerations. Overall, there was clearly an increase

in the efficiency of the judges' daily predictions. The maximum

acceleration in efficiency (x 3.22 M/m/w) was obtained for I 2 in

Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas). The

maximum deceleration of efficiency (÷ 1.21 M/m/w) was observed for

F 1 in Exp 1 (MMPI's only).


(.c, .- l. 1 .: J" iu.; _, r, 1r 'ul i .; r

r ,i .: : : r i i i i t, ; r. r i rc f qi :l r
i r r.. i. ri r.- i .: ci i i:,, i 'i iC, i .r,. ,r 0 ir, I.: ,, .-j 1 ri.4


I' : 'liJ' c. 1 r i : i i :.r .; i.: ,... . : c.'on, r'. ,J. Tl. -, f..-.J, .: ,,


, i. i r[,. i i: [ -.:. : ) r".':., :,r ,h, n, ,-,r. if;l C f .

II) u r- I : 0 ,: ,' ,-r r.. 'l'-.i 11 r.r.., i.:,,. 1 [fi .-' ti .: t.:ir .:r l 'u .ji

,. r,.L i,-,, :. rr, i -,juc.-,. 5r, ,, : : ri' ,: ." ",= Tr', i_:

C.. nr m .! il --n- tri: F-1: -J. 1-1: r-: i- : i:. .. : -rjyr r.
,:, 1 L, ,: i r.,:,F: f-.: f0 i . i : -- '. -I : I -I T i0 ,i c ,: .:r r
. :u..3 =r , .:. rLr.:,: J r ,:, Ir .lu.in .:. fTu ti pl i r, :i,- Ir :,:.. I .: i.:. ,:, r

T.=t-Li '. ir.il :Cj .; t ': r.C r, i .,:...r.1 i"l,.....,r' fri" ,- u r,:, frn iCifii r: ("r:i[ -.an ._l

fr',:,. i ..' l l I [. In 'r .1. Fiq jr- 1.1 irirJic:.r : that in .=1i

iC .ip[ .1. l t r. .1:-.. o r .'f r y i *crc 'ai' n o u:.) iF i, t r3 wr t

r iu:r i *:r, in .ffi r,.- . Tr, : c :.t i nr, :'.: c .urr J ir. F ii r-in t -

dr [ I L.:n j_ f th:: I, i..r' i : 1 di ti f.i, :h p.r O C r.I a 1 t t ,

:i rii;'rcv [ in. .( r f I. II m -l in -: .ct .i n ..f F -i Le 10 t t

Ctet L .r i-*i ..:rall Jr.i a i; 5 in P1 I -.. ric r 0a -Jrr w1 iuth I....












addition of Phase 2 (Rorschachs). The second most noticeable decrease

in efficiency occurred with the addition of Phase 4 (formulas).


Frequency and Record Floor Growth Ratio Effectiveness

The growth ratio is used to assess the relationship between two

celerations. In the present study we are interested in the relation-

ship between celeration correct and celeration record floor as well

as celeration incorrect and celeration record floor. In other words,

the growth ratio provides a measure of the relationship between the

celeration of the time spent in making the daily predictions and the

celeration of the daily correct and incorrect frequencies. The growth

ratio is independent of both the initial frequencies and the two

celerations. It is, therefore, a measure of effectiveness. A growth

S i I.'. O r 1.1.r1 i i'..J r : j rh i rh ? .'. i i r .:.r f'. i .r .:.;-r I .: r ri.,n r. -


c .rr ', :. r F .. r ~ : i .. r r. r i .: r. ., ., e h qr r. h l, .r

r'c. E. ) r or.: Ir..: ir. j r -.:r r :r. r.:f c.; I r i r:. .:f I -. r .. i i .:nr
i n ,: : r'r. r i: ,jru r r : w r r i i:.r , r. s ,d T :, ii r, r i :


i ir [H ,- ,*.:.1 ic i ,r i p,-. -r n1 c:rr ,.: r r i r .1:. .

T i , : .'r,,:,; : 6 r ,,: T.;. u,: r ,:u ".,r'r.-r r. r'.J r.,:.r'd l' *,.,:.r t.lrt',i, [ nr r ntr,-,"
p.r F ,i:. rr ,.:,h iu.C. in: -..: :. .L. T I ,bi. :' rC .i r .: ,I 4 C

T,,. i t'. r ,nr ,.,: [,. ... t'r.C. -. r:l. ,,, r,, rh,:l r ,c .' -,,-i o t ; r r .i. : 1,




L ., I l,',i i. rd i.-4 *r fl, :. rl ;i : r ,,. 'liLh: ). i t: a i nr l

S*i '', I [ : Or' r -. i;, :- : ( l'r i l C.r:_ rn ,-h: I frn.1 i i n

f ] ( i.1 + :,r:.:h ,.-I, : I_,'C ,.,r +.r cl, i I.' t F .,r'TIjI : I.
















































Table 7 shows the frequency incorrect and record floor growth

ratios per phase for each judge. Inspection of Table 7 indicates that

the growth ratios ranged from .56 to 1.32. Overall, there was

essentially no difference between the incorrect celeration and record

floor celeration. There were five exceptions. P 1 obtained a

growth ratio of .56 in Exp 1 (MMPI), .78 in Phase 1 (MMPI's only) and

.74 in Phase 2 (MMPI's + Rorschachs). I 2 obtained a growth ratio

of .84 in Phase 2 (MMPI's + Rorschachs) and 1.32 in Phase 4 (MMPI's +

Rorschachs + Biographical Data + Formulas).


Accuracy Ratio Total Bounce Variability

In order to assess the variance around celeration lines on the

Behavior Chart, Koenig (1972) has developed the total bounce measure.

To find the total bounce, a line is drawn parallel to the celeration

line through the frequency farthest above it. Then a line is drawn

parallel to the celeration line through the frequency farthest below

it. The distance between these two lines, expressed as a ratio,

defines the total bounce around the celeration line. Koenig (1972)

has shown that the proportional variance around the line of

celeration in frequencies usually remains constant regardless of the

value of the frequencies. Thus, total bounce is used as a measure of

the stability of the daily predictions.

T.able ; pr.:ert.; :ucTT.j Eof [Ih ai ura, r'tiO [totr l tOunce per

phiL:e for e..:r, jUij rIn:ecti ori or f Tt.le r.:',.E : r tha ture alI

tiour..:, r.anqed fo ,n I ) to I I 11.': Th.: hi.jhE C total t.,uncEl of r

..aL t.r. in..l t.. i i r.r the aJd i r icn ...f Ph, a 2 I(PC.r.:ihach ) .






















A total bounce of x 1.0 indicates that there is no variance around

the line of best fit. P 1 obtained a total bounce measure of x 1.0

in all phases except Exp 1 (MMPI's). In comparison with Koenig's
(1972) data 1, the present results indicate that the accuracy ratio

total bounce for each judge is considerably below the average. This

can be taken as a powerful indication of stability of accuracy in

daily judgments.
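
The total bounce measure can be illustrated with the following Python sketch. It assumes that the celeration line is a least-squares line fitted to the logarithms of the daily values, which may differ from the fitting procedure used in this study, and it takes total bounce as the ratio between the two parallel lines through the points farthest above and farthest below that line. All names are hypothetical.

```python
import math


def total_bounce(days, values):
    """Ratio between the envelope lines drawn parallel to the fitted
    line through the highest and lowest residuals (log10 scale)."""
    n = len(days)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    logs = [math.log10(v) for v in values]  # values must be positive
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("days must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(days, logs)]
    # Vertical distance between the two parallel lines, as a multiplier.
    return 10 ** (max(residuals) - min(residuals))
```

When every point lies on the fitted line the function returns x 1.0, which matches the interpretation above of no variance around the line of best fit.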























1. Koenig (1972) investigated 13,941 human behavior projects
deposited in the Behavior Bank and found that the average
total bounce was x 5.9.













DISCUSSION


The present study provided the first experimental application of

continuous and direct recording of operant methodology to the clinical

judgment process. This novel application attempted to provide initial

answers to four questions: 1) How stable are the daily predictions

made by judges?; 2) How does the time involved in making a clinical

judgment influence accuracy?; 3) What is the effect of an increase in

available information on clinical prediction?; 4) What is the effect

of experience level on clinical judgment? The present results are

discussed within the framework of these questions.

Thie re ul..: it .',-n:r:tri[,: 1. nurit.r fli t'f r r, r r,-i:.tte .u In :o'r..

c :-'.:,, i .-t :ff. :t r cr O, r rr ri 1r- e.'u:L r 4-.r:hi; r'mr i h, iT, ,

trie o. r il i ,.i i..::ur .:, I-...1 ,i.r m t j : Ht'owe..F.r, :o eri of t~e

findings were unanticipated, particularly the overall negligible effect

of adding information to the clinical judgment process. The results

also demonstrated a number of reliable effects. These were: 1) The direct

and systematic replication across and within judges of the stability

of the daily predictions across phases; 2) The replication of the

increase in efficiency across phases for each judge; and, 3) The

replication of the lack of essential difference between celeration

frequencies and celeration record floor. These findings provide a

starting point for future research in clinical judgment. Systematic

and direct replication of the present study will provide







reliability and generality of these results.


Stability of Daily Predictions

It has been eighteen years since Meehl (1954) stated that "Pre-

sumably some kind of longitudinal study is needed to find out whether

and to what degree the 'good' clinician is stably such, rather than

being merely the momentarily luckiest fellow among a crew of equal or

near-equal mediocre guessers." The present study provides a partial

answer. The results indicate that in all cases (judges and phases)

except one, the individual predictions were stable. The exception
was I 2, with the addition of Phase 2 (Rorschachs). However, since

the accuracy level for most judges across phases was 50%, these results

have to be interpreted with caution. A stable 50% accuracy level is

easy to maintain. In our sample of judges, only P 1 (See Figure 3)

maintained stability above 50% accuracy across phases. He was the only

steady "good" clinician that could be identified. Inspection of each

chart indicates that F 2's predictions (See Figure 2) in Phase 4

(MMPI's + Rorschachs + Biographical Data + Formulas) were stable above

50% accuracy as well as P 2's (See Figure 4) in Phase 2 (MMPI's +

Rorschachs). These findings indicate that, in the present sample of

judges, when a judge was identified as "good" (identified by con-

sistently predicting correctly above chance) at least in one phase,

his predictions were stable across that phase. Future longitudinal

research should identify these "good" clinicians before attempting

to replicate the present findings.







Efficiency of Daily Predictions

The use of frequencies in analyzing judgmental accuracy provided

a most sensitive and natural measure of efficiency, for it considered

the amount of time spent in making a clinical judgment. The present

results showed that, overall, there was a clear increase in the

efficiency of the judges' daily predictions. It was also shown that

efficiency decreased when new information was added to the clinical

judgment process. Inspection of Table 10 shows that the maximum

decrease in efficiency was obtained with the addition of Phase 2

(Rorschachs). This can be taken as an indication that the integration

and interpretation of the Rorschachs combined with the MMPI's required

the most time and consequently produced the maximum drop in efficiency. It is

interesting to note that the maximum decreases in efficiency in Phase

2 (MMPI's + Rorschachs) occurred, in all cases (judges within Phase 2)

except one, with the medium (P 2) and high experienced (I 1,

I 2) judges. These judges had knowledge in the interpretation of

the Rorschach; therefore, a decrease in efficiency is an indication

that they were making use of this knowledge. The second most notice-

able decrease in efficiency occurred with the addition of Phase 4

(formulas). Once more, the maximum decrease in efficiency occurred

with the medium (P 1; P 2) and high experienced (I 1, I 2)

judges. It seems that the least experienced judges (F 1, F 2),

presented with a novel set of information, decided not to spend much

additional time in attempting to integrate the new information.








Effectiveness of Daily Predictions

The effectiveness ratio (growth) provided a measure of the

relationship between the celeration of the time spent in making the

daily predictions and the celeration of the daily correct and in-

correct frequencies. The present results indicate that there was

essentially no difference between either the correct celeration and

record floor celeration (See Table 6) or the incorrect celeration and

record floor celeration (See Table 7). This indicates that, overall,

most judges within each phase expended less time in making their

daily predictions as the phase progressed, but their accuracies were

uniquely stable within and across phases. That is, they became more

efficient without a concomitant increase or decrease in accuracy.

Two judges were the exception. P 1 (See Figure 3) increased in

efficiency and accuracy throughout Exp 1 (MMPI's) and throughout

Phase 2 (MMPI's + Rorschachs). I 2 (See Figure 6) increased in

efficiency and accuracy throughout Phase 2 (MMPI's + Rorschachs) but

decreased in accuracy and increased in efficiency in Phase 4 (MMPI's

+ Rorschachs + Biographical Data + Formulas).


Levels of Information Across Levels of Experience and Accuracy of

Daily Predictions

The accuracy ratio celeration measure and the accuracy ratio
frequency multiplier provided the opportunity to compare the celerating

effects of adding new information to the clinical judgment process

for each judge as well as across levels of experience. The present

results indicate, in general, that there was essentially no








acceleration nor deceleration of accuracy over time within a phase.

The results also indicate that across phases, the maximum accelerating

steps were obtained with the addition of Phase 4 (formulas). The

addition of Phase 2 (Rorschachs) produced overall the least change

in accuracy. The addition of Phase 1 (new set of MMPI's) as well as

the addition of Phase 3 (biographical data) produced the most initial

decreases in accuracy. Individual differences were observed across

phases between judges. These individual differences are discussed

according to levels of information.


Exp 1 MMPI's Only. --Most judges predicted at a 50% accuracy

1'>.-.1 (Fi-.,urjr 1 t.rou.)h '.l. Tr.r. c .- r.e t[wo e, C pri r,:, t.. h c:uL 'rri .

wi th mT.-iiu,. e rper' r ic, d jud. : P I I ,. F .)ur.- 2) nri '. e.: .e r, :

accuri' : r, un the : c:...1r. ,l i fl rri : [E 1 Ti r.I rh r, i ) U r. i .) r rn irp.J

3[taL.e till th .:n.j ,f tr,: prha::. F ( :. F i.u,.: 4) pr.:dict d .:or,-

:i: tenr. l, It.elow .:,rce r,.J rhi- ,cur' :., .lii.1 no:c c.:.ler I t, ;.:r- r. e


Phase 1 MMPI's Only. --Most judges predicted at a 50% accuracy
level (Figures 1 through 6). [Remainder of paragraph largely illegible
in source. The legible fragments indicate that P 1 was an exception,
and that the introduction of Phase 1 produced an immediate decrease in
accuracy for both of the non-experienced judges (F 1; F 2), for one
medium-experienced judge, and for one highly experienced judge (I 1).]








Phase 2 MMPI's + Rorschachs. --Three out of six judges pre-

dicted mostly at a 50% accuracy level (Figures 1 through 6). The

addition of Phase 2 (Rorschachs) had no initial effects (See Figure 8)

nor celerating effects (See Figures 1 and 2) on the non-experienced

judges (F 1; F 2). These judges predicted mostly at a 50% accuracy

level. The addition of this phase produced no initial effect in

P 1 (See Figure 8), but his accuracy increased above 50% on the
second week in this phase. P 2's accuracy increased initially with

the addition of this phase (See Figure 8), and remained at 60% (See

Figure 4) for the rest of the phase. The addition of Phase 2 had no

initial (See Figure 8) nor celerating (See Figure 5) effects on I 1.

I 2's judgments became unstable with the addition of Phase 2 (See

Table 8). The initial effect on I 2 was an immediate decrease in

accuracy (See Figure 8), but accuracy accelerated (See Figure 6) with-

in the phase to a terminal accuracy of 60%.

Phase 3 MMPI's + Rorschachs + Biographical Data. --Most judges

predicted at a 50% accuracy level (Figures 1 through 6). With the

non-experienced judges, the addition of Biographical Data produced

an initial decrease in accuracy for F 2 (See Figure 8), but no

effects were found for F 1 (See Figure 8). The addition of this

phase produced an initial decrease in accuracy for both of the medium-

experienced judges. P 1's predictions (See Figure 3) remained

stable at 60% accuracy and P 2's predictions (See Figure 4) remained

stable at 50% accuracy. The addition of this phase had no effect

on I 1 (See Figure 5); his predictions remained at 50%. I 2's

predictions (See Figure 6) decreased initially from 60% to 50% accuracy

with the new phase, and remained stable at this level.


Phase 4 MMPI's + Rorschachs + Biographical Data + Formulas.

--The addition of this phase produced overall the maximum increase in

accuracy of all phases. The addition of this phase elicited an initial

increase in accuracy for the two non-experienced judges (See Figure 8).

F 1's accuracy (See Figure 1) increased to a maximum of 70% but

decelerated within this phase. F 2's accuracy increased initially

(See Figure 8) to 60% (See Figure 2) and occasionally to 70%. The

addition of Phase 4 (formulas) produced the [...] in
accuracy for the medium-experienced judges. For P 1 (See Figure 3) the
addition of the formulas produced neither an initial effect in accuracy
(See Figure 8) nor a celerating effect. His predictions remained
stable [...]. The same occurred for P 2 (See Figure 4),
but in this case his predictions remained stable at [...] accuracy.
[...] Figure 8 indicates that the addition of the formulas produced
no initial effect on I 1 and an increase in accuracy for I 2.
I 1's accuracy remained constant (See Figure 5). I 2's
accuracy decelerated from an initial [...] to a terminal [...]
accuracy (See Figure 6).

To summarize, the addition of new levels of information to the
judgmental process did not [...] across
phases. The significant exceptions were [...] in
accuracy [...] for both non-experienced judges (F 1;
F 2) with the addition of Phase 4 (formulas). Also, with the addition
of Phase 4 (formulas) I 2's accuracy increased to a maximum of [...],
with a terminal of [...]. It is interesting to note that the








two non-experienced judges increased in accuracy from 50% to 70% with

the addition of Phase 4 (formulas). Shagoury (1971) found that these

formulas predicted accurately 83% of the sample of homicides. It

seems, from these results, that non-experienced judges (F 1; F 2)

tended to ignore the actual test protocols and looked for the relevant

cues provided by the formulas. The same could be said for I 2.

Nevertheless, no judge approximated the overall accuracy of the

formulas.

On the basis of the preceding findings, a few general comments

can be made. Some of these comments may, at present, lack generality.

This study is only a first attempt to apply the single-subject re-

search methodology of experimental analysis to study the clinical

judgment process. Future replications of these findings will provide

the final test of the reliability and generality of the present data.

The present results are in conflict with more traditional studies

of increase in levels of information. These studies (Shagoury & Satz,

1969; Moxley & Satz, 1970) found that accuracy increased as levels

of information increased. It should be pointed out, however, that

the kind of information presently used was in part different from the

two previous studies. In these two studies, the information used was

quantitative (Z scores; base rates; conditional probabilities; etc.),

and in the present study, some information (MMPI's; Rorschachs) was

qualitative, and some (biographical data; formulas) was quantitative.

It is interesting to note that, in the present study, Phase 4

(formulas), which was purely quantitative data, produced the maximum

increase in accuracy. Future applications of experimental analysis

to clinical judgment should use quantitative data only, so that a

better comparison between these studies can be accomplished.

The present results indicated that the "good" clinician

(identified by consistently predicting correctly above chance) is

stably "good" in his clinical judgment, and not merely the "momentarily

luckiest fellow among a crew of equal or near-equal mediocre

guessers" as stated by Meehl (1954). This finding was replicated

across phases for P 1, the only "good" clinician that could be

identified, as well as within phases for F 2 and P 2. This finding

is intriguing and warrants more longitudinal studies

of "good" judges.

A disconcerting result was the overall low accuracy of most
judges in the present sample. Shagoury (1971) found that the dis-
criminant function analysis determined accurately [...] of the total
sample. [Several lines illegible in source.] A number of
interpretations may be proposed to explain these results. First, [...]
could have been too complex [...] the discriminant function and, thus,
the task [...] difficult [...]. A second possibility is that
special training [...] in the available information
[...]. Future research should test this possibility
using specially trained judges to discriminate between
homicide and non-homicide [...] protocols and test the accuracy of
their predictions [...]. A third and more threatening






hypothesis, previously proposed by Meehl (1956), is that clinical

judgment is not one of the talents of the clinician and therefore he

should relinquish the role of clinical judgment to the more accurate

computer. Recent findings by Blumetti (1972) provide the most

convincing argument against this proposition.
Most importantly, the present study brought a new sample of

human behavior, in this case clinical judgment, under precise and

continuous measurement. This was accomplished through the uniqueness

of the Standard Behavior Chart.



























APPENDICES


























APPENDIX A


Instructions












INSTRUCTIONS

Phase I



This is a research study investigating clinical judgment. You

will be presented with 20 Minnesota Multiphasic Personality Inventory

(MMPI) profiles of inmates at Raiford State Prison. Ten of the twenty

MMPI profiles belong to men convicted of first or second degree murder.

The remaining ten profiles belong to men convicted of crimes against

property; as breaking and entering, robbery, forgery or arson, but

not of any crimes against the person (as assault). (That is, base
rate = .5).

Your task is to try to discriminate between the profiles as to which
belong to the homicide group and which do not. It is possible
to correctly classify all the profiles. It is hoped that your
prediction will in some way help us to understand one aspect of the
decision making process as it is applied by psychologists in clinical
settings.












INSTRUCTIONS

Phase II


In this phase you will be presented with 20 cases of inmates

at Raiford State Prison. Each case in the folder has the appropriate

MMPI and Rorschach protocol. Ten of the 20 cases belong to men

convicted of first or second degree murder. The remaining 10 profiles

belong to men convicted of crimes against property; as breaking and

entering, robbery, forgery or arson, but not of any crimes against

the person (as assault). (That is, base rate = .5).

Your task is to try to discriminate between the cases as to which

belong to the homicide group and which do not. It is possible to

correctly classify all the profiles. It is hoped that your prediction

will in some way help us to understand one aspect of the decision

making process as it is applied by psychologists in clinical settings.











INSTRUCTIONS

Phase III


This phase is similar to the previous phase except that each

case in the folder has the appropriate MMPI, Rorschach and biograph-

ical data. Ten of the 20 cases belong to men convicted of first or

second degree murder. The remaining 10 cases belong to men convicted

of crimes against property; as breaking and entering, robbery, forgery

or arson, but not of any crimes against the person (as assault).

(That is, base rate = .5).

Your task is to try to discriminate between the cases as to which

belong to the homicide group and which do not. It is possible to
correctly classify all the profiles. It is hoped that your prediction
will in some way help us to understand one aspect of the decision
making process as it is applied by psychologists in clinical
settings.












INSTRUCTIONS

Phase IV


In the past several weeks you have been making decisions based on

MMPI's, Rorschachs and biographical data. The purpose of the present

phase is to provide you with the optimal salient findings (Shagoury

and Satz, 1971) of the statistical analysis, performed on the 60

protocols of which the present 20 is a random sample, as found by the

computer. No variable by itself was discriminatory. However, when

the data was subjected to a multivariate analysis, the following

variables in some combination (i.e., linear) were shown to correctly

classify 80% of the sample (only 7 homicides and 3 controls were mis-

classified, yielding a valid positive rate of 70% and a false positive

rate of 10%).


These are the salient variables as found by the computer:

Variables                                 Confidence Value (T)

Goldberg Score                                   7.51 *
M Responses                                     17.74 *
Total Rorschach Responses                        1.70
Percentage of Human Content                      4.46
Percentage of Minus Responses                   20.72 *
Percentage of Whole Responses                  -28.50 *
Sum C                                           -8.60 *
Total Pathological Content Responses             8.41 *
I.Q.                                            -6.33 *
Grades Completed                                -4.27
Prior Felony Convictions                        -4.50
Prior Misdemeanor Convictions                   -3.13

T(12, 47) = 5.44, p < .05



Summary of Table: The homicide group showed a higher Goldberg
score (+7.51), more M responses (+17.74), a higher percentage of minus
responses (+20.72), a lower percentage of W responses (-28.50), a
lower Sum C (-8.60), more responses of pathological content (+8.41),
and a lower IQ (-6.33). No significant differences between homicide
and non-homicide control groups were found with respect to total number
of responses (+1.70), percentage of human content (+4.46), grades
completed (-4.27), and prior misdemeanor or felony convictions.





Interaction of EEG and personality variables

[Paragraph illegible in source. It introduces a finding, added after
the computer analysis, on the interaction of EEG abnormality and
personality disorganization.]







There were 11 cases of severe abnormalities in EEG. Five of them

occurred in the homicide group and six in the non-homicide group.

However, every homicide case with a severely abnormal EEG was

associated with personality disorganization, whereas only one case in

the non-homicide group with a severe abnormality in the EEG was

associated with personality disorganization. Borderline abnormalities

in EEG were not discriminatory with a trend toward more abnormality in

the EEG in the non-homicide group.

The following rule can be stated:

If the biographical-personality variables point to
disorganization and the EEG is severely abnormal,
consider higher probability of homicide behavior.
However, if the biographical-personality data is
not disorganized, and the EEG is severely abnormal,
consider the likelihood of non-homicide.
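
The rule stated above can be sketched as a small decision function. This is only an illustration of the rule's logic, not part of the original instructions given to the judges; the function name, the enum, and the "no guidance" outcome for cases without a severely abnormal EEG are assumptions introduced here, since the rule as stated covers only severely abnormal EEGs.

```python
from enum import Enum


class Lean(Enum):
    """Hypothetical labels for the direction in which the rule points."""
    HOMICIDE = "consider higher probability of homicide behavior"
    NON_HOMICIDE = "consider the likelihood of non-homicide"
    NO_GUIDANCE = "rule does not apply"


def eeg_interaction_rule(eeg_severely_abnormal: bool,
                         personality_disorganized: bool) -> Lean:
    """Apply the EEG x personality-disorganization rule.

    Assumption: the rule is silent when the EEG is not severely
    abnormal, so that case returns NO_GUIDANCE.
    """
    if not eeg_severely_abnormal:
        return Lean.NO_GUIDANCE
    if personality_disorganized:
        return Lean.HOMICIDE
    return Lean.NON_HOMICIDE


if __name__ == "__main__":
    # Severely abnormal EEG with disorganization points toward homicide.
    print(eeg_interaction_rule(True, True).value)
    # Severely abnormal EEG without disorganization points away from it.
    print(eeg_interaction_rule(True, False).value)
```

The rule is advisory ("consider"), so the sketch returns a direction rather than a final classification.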

Your task at this time is to consider these variables (Computer

and interaction) and make your predictions as to which profiles are

homicide and which are not.


Goldberg's Scores for MMPI Profiles

The MMPI data was evaluated for the degree of personality dis-

organization by means of Goldberg's (1965) formulation. The Gold-

berg formula is a quantitative equation based upon the following

scales:

X = L + Pa + Sc - (Hy + Pt)


If the Goldberg value is high (X > 55) the S is classified psychotic,

and if the Goldberg value is low (X < 35) the S is classified neurotic.

Intermediate Goldberg values are considered indeterminate. Using these

cut-off values, Goldberg found a hit rate of 74%, with a valid positive

rate of 62% and a false positive rate of 18%. It has proven to be one

of the better decision rules for differentiating psychotic from

neurotic profiles.
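
The Goldberg decision rule described above can be sketched in a few lines. This is a hedged illustration, not the procedure used in the study: it assumes the index takes its usual form, (L + Pa + Sc) - (Hy + Pt), computed on MMPI scale scores, and it assumes strict inequalities at the cut-offs of 55 and 35, because the comparison signs are not legible in the source. The function names and the sample scale scores are hypothetical.

```python
def goldberg_index(L: float, Pa: float, Sc: float, Hy: float, Pt: float) -> float:
    """Goldberg index: (L + Pa + Sc) - (Hy + Pt)."""
    return (L + Pa + Sc) - (Hy + Pt)


def classify_goldberg(x: float, high: float = 55, low: float = 35) -> str:
    """Classify an index value with the cut-offs given in the text.

    Assumption: values strictly above `high` are psychotic and values
    strictly below `low` are neurotic; everything else is indeterminate.
    """
    if x > high:
        return "psychotic"
    if x < low:
        return "neurotic"
    return "indeterminate"


if __name__ == "__main__":
    # Hypothetical scale scores, for illustration only.
    x = goldberg_index(L=50, Pa=60, Sc=70, Hy=55, Pt=60)
    print(x, classify_goldberg(x))
    # Index values of the kind listed for the sample protocols.
    for score in (34, 53, 64):
        print(score, classify_goldberg(score))
```

Keeping the cut-offs as parameters makes it easy to try other thresholds if the original comparison signs turn out to be inclusive.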


Goldberg scores for your sample of 20 protocols:


Protocol #          Score

44520               34
35661               53

[The remaining 18 rows of this table are illegible in source.]























APPENDIX B


Daily Correct and Incorrect Frequencies































[Standard Behavior Charts of daily correct and incorrect frequencies for each judge; vertical axis: movements per minute. Charts not reproduced.]













REFERENCES



Ammons, R.B. Effects of knowledge of performance: a survey and ten-
tative theoretical formulation. Journal of General Psychology,
1956, 54, 279-299.

Bachrach, A.J. Psychological Research: an introduction. Random
h ., ,u , i .:. r i r .

E. I 1. I*.- c ard E,,i i: u. I i. r .,,r i l i- irnlnr ..rr.,s


bIuiTi r i. nt..r.-, rEu ti :t .:f r Ii,;.: ii ..r uj :raLt:'. : i i.i .' r :ti..n.
Ll, .j rL.ii, r' L'[i .r.ir .1i w rr r.irn. n n .-i i. r I Fit .:.ri. I '

,' J.. .. r.j v.in r ,. h. -r'-t: -, ,r i'n r i I.u 'rn l .: f
IC, : r, u i a I ,h i I, ', -,, r..4t ..

'i-lidL-ir'n L.I TI .. rff .:r i. -r . l ,', ',i _n r:' )u ci l ,.' rnit : roe
i -j.nri .. r' i.rj -i.: r. I i T r r y I.-,., ,' -. r Ii r [ r.


1.,.1 .. : r L f. In .i .r : i r ..: ri. i ,.' : : i.n .j i 'ii. :i.; i
O f .:l ,., i ,. r' u. r-.- u r. i : [ t h.. i ol. i :
r.,:.-_r Ir , .:r,,;,'- r1 J ',:, l i,.:[, 1 . . h :,,' -,', "




i', l rin. Il fI l r i .: .; ,.: ur ;: f ,:ru ri ; ii .'I i.. i .r,. .;, riv-n.: ,J
ir. : i.: -1- .i,,n -r.a ua, r .r- .-. r.r i i ni.:. .iu ;.,,i-i, t -. I .
.,,.rr, i :-, ~'r,. i tini I :n..,I I , i '- -

Sa'i.iC. I.- H.j' i .:i, I .i mr. : r.:1. .31 -. j l i -. 1 [ ch< -.:.i -.:.o'ri.nt:


Hoffman, P.J. The paramorphic representation of clinical judgment.
Psychological Bulletin, 1960, 57, 116-131.

H:. , r ., i ,i i .: r.. i r t, ir. J .,:1 i -: rn.:ir l r .-.n i..n
.. . .. .:. i .n .J .: i : *:r .








Holt, R.R. Yet another look at clinical and statistical prediction:
or, is clinical psychology worthwhile? American Psychologist,
1970, 25, 337-349.

Hunt, W.A., Wittson, C.L. and Hunt, E.B. A theoretical and practical
analysis of the diagnostic process. In P.H. Hoch and J. Zubin
(Eds), Current problems in psychiatric diagnosis. New York,
Grune & Stratton, 1953, 53-65.

Hunt, W.A., Arnhoff, F.N. and Cotton, J.W. Reliability, chance and
fantasy in interjudge agreement among clinicians. Journal of
Clinical Psychology, 1954, 10, 292-296.

Hunt, W.A. and Jones, N.F. The experimental investigation of clinical
judgment. In A.J. Bachrach (Ed) Experimental foundations of
Clinical Psychology, 1962, 26-51.

Lindsley, O.R. Direct behavioral analysis of psychotherapy sessions by
conjugately programmed closed-circuit television. Psychotherapy:
Theory, Research and Practice, 1969, 6, 71-81.

Little, K.B. Research etiquette in the study of clinician's behavior.
Journal of Consulting Psychology, 1967, 31, 16-18.

Luft, J. Implicit hypotheses and clinical predictions. Journal of
Abnormal and Social Psychology, 1950, 45, 756-759.

Mann, R.D. A critique of P.E. Meehl's Clinical versus Statistical
Prediction. Behavioral Science, 1956, 1, 224-230.

Meehl, P.E. Clinical versus Statistical Prediction. Minneapolis:
University of Minnesota Press, 1954.

Meehl, P.E. Wanted a good cookbook. American Psychologist, 1956,
11, 263-272.

Meehl, P.E. A comparison of clinicians with five statistical methods
of identifying psychotic MMPI profiles. Journal of Counseling
Psychology, 1959, 6, 102-109.

Meehl, P.E. The cognitive activity of the clinician. American
Psychologist, 1960, 15, 19-27.

Meehl, P.E. Seer over sign: the first good example. Journal of
Experimental Research in Personality, 1965, 1, 27-32.

Moxley, A.W. The effects of statistical information on clinical judg-
ment. Unpublished Doctoral dissertation, University of Florida,
1970.







Moxley, A.W. and Satz, P. The effects of statistical information on
clinical judgment. Proceedings: 78th Annual Convention, APA,
1970.

Oskamp, S. Clinical judgment from the MMPI: simple or complex?
Journal of Clinical Psychology, 1967, 23, 411-415.

Payne, R.W. Diagnostic and personality testing in clinical psychology.
American Journal of Psychiatry, 1958, 115, 25-29.

Perez, F.I. The effects of feedback on clinical predictions.
Unpublished Master's Thesis, University of Florida, 1970.

Perez, F.I. and Satz, P. The effects of feedback on clinical predictions.
Proceedings: 79th Annual Convention, APA, 1971.

Rotter, J.B. Can the clinician learn from experience? Journal of
Consulting Psychology, 1967, 31, 12-15.

,lin. T -i.. Ta f P. an.]d I il.. .'. L i i ri,:,l .r. i.-n, n
:.:.ir it .,; r.ri.:. ,. .i .'.,l H .l I r,..- r, .ir.,l j Jiri;[..r. ic i'.

I, : c. al .:. *11,.:-.'rr I .: ,r.: i ri,'! ,1,.] :. r. r. :Ar 1
:._.'r :.l. ' j : Fi l .- i i. 19.?l-6. i,. 1.C -- ',.



,1. i-il. -_

L dl..J ri4 II. ij : t.: :,.r ,, r. r- | Er ,. i .'*. l:} : Ci .? i, ., Jr.: [l.n
I r l 1 '- ..

Si 3':Ijii, ? .i r. , r i irl f r "r: r '..f lr. 1. ir.i.. in r'uiTl, .:.E., or.
j l r, i,: iv,.-.ili ,,rl..r f. '',,;.:.l in : 7"[.h ;',rr,:u l i ,1,,'l..eri i"f. rEi, h ',/ .

I .1- : r. i r i r I
l lr.i a r. p. r r.. :. n] r r r i r .:, riT.i n .
I.Ir, .l-, ir,c.,j ,., ", ,i ,::.rr ri.:,n I.Iri .* r:1r. O r il.. 'r d. i' 1.




af P, I' ;!.P i,..i.. ,,: of .- r-:"',r. 1 r.., :'.:',- , r'"? ,ch.,hlo. ic'i1
1C u il l r. r \ :, l 7 'r, "----. '

Ul I,-, i,,n n, L.f J 1 lr I r :*;r,', '6i. 1 r i, r ic, rr.
i tl tI i ri,-: i r. [ ,iiJ Ili :,r. i.i, r, 1' ~ 1 .

Ulr,,Ih, f., "r..,:-i l, T1 a7 ri,] l''ir. i;:,',r ',:,l .:f r ,l.jin ri t.eh. i c.r.
:. r r. ,: .,', ,, r, C i .., l.r, i,- i Tl-ir :T -, 1 .- '; .






Ulrich, R., Stachnik, T. and Mabry, J. Control of human behavior.
Scott, Foresman & Co., Glenville, Illinois, 1970.

Underwood, B.J. Experimental Psychology. Appleton-Century-Crofts,
New York, 1966.

Watley, D.J. Feedback training and improvement of clinical forecasting.
Journal of Counseling Psychology, 1968, 15, 167-171.

Wiggins, N and Hoffman, P.J. Three models of clinical judgment.
Journal of Abnormal Psychology, 1968, 73, 70-77.

Wolking, W. and Schwartz, V.A. Applied Behavior Analysis and Learning
Disorders. In P. Satz and J. Ross (Eds) The Disabled Learner:
Early detection and intervention. Rotterdam, The Netherlands:
University of Rotterdam Press, 1972, In Press.













BIOGRAPHICAL SKETCH


Francisco I. Perez was born in Havana, Cuba, May 21, 1947.

He came to the United States in October, 1960. He graduated from

Belen Jesuit Preparatory School, Miami, Florida, in June, 1965, and

received his Bachelor of Arts in psychology from the University of

Florida in June, 1969.

In June, 1969, he enrolled in the Graduate School of the

University of Florida where until the present he has pursued his

. :l t. r'. e,.re o: Ilt r.,r *:.i ..rt ;n. r c,' ,'.:. :r F'n'. lru:.:.rr,.

ui r.nn; h i i r i .t r ,r.: r. i, n : iu ,. j i ru.: hi .. ,r : l.i:; t.-r I:., I


rr: ,t i ,r in F,. V:. ." ,:.hi in L,-ccTiL.r, I''"

C u r r .n l. i r t i i 3 rr i ,J r.I [ rig f'-5 r F 'r I . :.r .1 n i H1 1l ,:,n r. r, : ,









I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.




Paul Satz, Chairman
Professor of Psychology and Clinical
Psychology


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.



Professor of Psychology


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.



Professor of Psychology and Clinical
Psychology


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.




Jacquelin R. Goldman
Associate Professor of Psychology
and Clinical Psychology










I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is
fully adequate, in scope and quality, as a dissertation for the degree
of Doctor of Philosophy.




William D. Wolking
Associate Professor of Education

This dissertation was submitted to the Department of Psychology
in the College of Arts and Sciences and to the Graduate Council, and
was accepted as partial fulfillment of the requirements for the degree
of Doctor of Philosophy.

August, 1972




Dean, Graduate School







AN EXPERIMENTAL ANALYSIS OF CLINICAL JUDGMENT

By FRANCISCO IGNACIO PEREZ

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 1972


COPYRIGHT By FRANCISCO IGNACIO PEREZ 1972

ACKNOWLEDGMENTS

Over the past three years I have had the good fortune and honor of working with Dr. Paul Satz, chairperson of this committee. His long hours of hard work, his persistence, his patience and continuous support, but primarily his ideas and inspirations have been invaluable. As a teacher, colleague and friend his contribution to me has been great. Special gratitude is extended to Dr. Henry S. Pennypacker, a truly great colleague and friend, for his continuous support and encouragement throughout my training. His ideas and work helped me see the light. I would also like to express my sincere appreciation to all the members of my committee, Dr. William Wolking, Dr. Jacquelin Goldman, and Dr. Hugh Davis for their help in the organization and critique of this manuscript. The cooperation of the six judges was instrumental for the success and completion of this study. I am greatly indebted to Marcia Keener, Tom Van Den Abell, Lenay Suarez, Carlos Alvarez, Gerald Reynolds and Brian Lindner for their many hours of hard work. Many thanks are also extended to Mike Cruse and Bunny Wardlaw for their help in the preparation of this manuscript. To my wife, Ginny, I give a special thanks. Without her inspiration, suggestions and moral support both this research and I would be incomplete.

TABLE OF CONTENTS

Acknowledgements
List of Tables
List of Figures
Abstract
INTRODUCTION
METHOD
RESULTS
DISCUSSION
APPENDICES
  Appendix A Instructions
  Appendix B Daily Correct and Incorrect Frequencies
REFERENCES
Biographical Sketch

LIST OF TABLES

1 Experimental Design Schematic
2 Accuracy Ratio Celeration
3 Accuracy Ratio Frequency Multiplier
4 Record Floor Celeration
5 Record Floor Frequency Multiplier
6 Frequency Correct and Record Floor Growth Ratios
7 Frequency Incorrect and Record Floor Growth Ratios
8 Accuracy Ratio Total Bounce

LIST OF FIGURES

1 Daily Accuracy Ratio for F 1
2 Daily Accuracy Ratio for F 2
3 Daily Accuracy Ratio for P 1
4 Daily Accuracy Ratio for P 2
5 Daily Accuracy Ratio for I 1
6 Daily Accuracy Ratio for I 2
7 Summary Accuracy Ratio Celerations
8 Summary Accuracy Ratio Frequency Multiplier
9 Summary Record Floor Celerations
10 Summary Record Floor Frequency Multiplier

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

AN EXPERIMENTAL ANALYSIS OF CLINICAL JUDGMENT

By Francisco Ignacio Perez

August, 1972

Chairman: Paul Satz, Ph.D.
Major Department: Psychology

The present study provided the first experimental application of continuous and direct recording of operant methodology to the clinical judgment process. This novel application attempted to provide initial answers to four questions: 1) How stable are the daily predictions made by judges? 2) How does the time involved in making a clinical judgment influence accuracy? 3) What is the effect of an increase in available information on clinical prediction? 4) What is the effect of experience level on clinical judgment?

Six judges, two first year graduate students in Psychology (F 1; F 2), two Psychology practicum students (P 1; P 2), and two interns (I 1; I 2) were asked to make daily discriminations between test protocols belonging to men convicted of first or second degree murder and those belonging to men convicted of crimes against property. Levels of information were increased in four phases: Experiment 1 MMPI's only; Phase 1 New set of MMPI's; Phase 2 MMPI's and Rorschachs; Phase 3 MMPI's and Rorschachs and Biographical Data; Phase 4 MMPI's and Rorschachs and Biographical Data and Formulas. The frequency correct and frequency incorrect of the daily discriminations were plotted for each judge on a Standard Behavior Chart. The results were described in terms of how each new phase change in the experiment affected the daily clinical judgment of each judge. These effects were discussed in terms of accuracy, efficiency, step, growth and total bounce.

The results demonstrated a number of different effects. In some cases, the effects were consistent with previous research; for example, the overall low accuracy level for most judges. However, some of the findings were unanticipated, particularly the overall negligible effect of adding information to the clinical judgment process. The results also demonstrated a number of new effects. These were: 1) The direct and systematic replication, across and within judges, of the stability of the daily predictions across phases; 2) The replication of the lack of essential differences between celeration frequencies and celeration record floor. These findings provide a starting point for future applications of operant methodology to the study of the clinical judgment process.

INTRODUCTION

The major emphasis in the field of psychological assessment has recently turned away from construction and validation of tests toward a fuller consideration of other factors in the assessment situation. A large number of studies have been carried out focusing upon various aspects of the complex process of clinical prediction. Some studies have been concerned with issues of theoretical importance and others have also dealt with more practical contributions.

Clinical versus statistical prediction.--One of the main areas of research since Meehl's (1954) influential book has been to compare clinicians to statistical formulae. Meehl (1954) cited 20 relevant studies comparing actuarial with clinical methods and found that in all but one of these, actuarial methods consistently equaled or surpassed clinicians in accuracy of predicting a criterion. This finding provided a force potentially capable of crumbling the diagnostic foundations of clinical psychology as it is practiced today. In a later paper Meehl (1956) emphasized that, while the clinician has some useful and unique talents, clinical prediction is not one of them, and he concluded that for many diagnostic problems the clinician is a costly middleman who could be more productive doing research and therapy. Interest in the clinical versus actuarial issue continued, presumably in response to Meehl's challenge that those who found his "box score" disturbing should publish research which refutes it (Meehl, 1969). Meehl (1965) recently reviewed the relevant research literature published since 1954, and concluded again that two-thirds of the fifty studies showed a statistically significant superiority of actuarial methods.

Holt (1958) criticized Meehl's conclusions on the basis that several stages are involved in clinical prediction and that Meehl's comparison focused only on one of these. He argued that the cross-validation possible with actuarial techniques is not possible with clinical techniques, and that, therefore, the two methods are not logically comparable. Holt (1958, 1970) states that the evidence in favor of actuarial methods may be a function of the experimental design, which puts the clinician at a disadvantage, rather than the actual superiority of statistical prediction. An additional problem is that the clinician has seldom been given the opportunity to incorporate the actuarial information in formulating his final decision.

Available information.--The information available to the clinician often has been based on non-quantitative data such as interview material, case history data, and projective tests. Goldberg (1968) has shown that in such situations, the clinician has been inferior to the actuarial methods and his judgmental accuracy has decreased both with increased levels of test information and clinical experience. Shagoury and Satz (1969) examined the effects of levels of quantitative information on judgmental accuracy in a clinical statistical decision-making task (brain damage versus normal protocols). Judges were provided with increasing increments of statistical information. The results showed that judgmental accuracy increased substantially with increased levels of information. The increase in judgmental accuracy was also shown to vary with different strategies which the judges utilized during the experiment. Moxley and Satz (1970) asked judges to make postdictive judgments on the length of stay in psychotherapy (short or long) for a sample of mental health service clients. Judgments were made under four conditions in which tests and statistical information increased incrementally at each level. It was found that accuracy increased over levels of information. Moxley (1970) states that if "the clinician is able to incorporate the statistical information he may equal or surpass the accuracy of actuarial methods."

Effect of level of training.--Recent studies focusing on statistical and clinical prediction have also dealt with the question of experience. The evidence indicates, rather surprisingly, that as experience increases, prediction accuracy decreases. In his review of the qualities which contribute to the ability to judge people, Taft (1955) reported that persons without clinical training, such as physical scientists and experimental psychologists, are more accurate judges of people than are clinical psychologists or clinical psychology students. Goldberg (1959) showed that trained clinicians were not superior to graduate students or secretaries in predicting brain damage from Bender-Gestalt protocols. Sarbin, Taft, and Bailey (1960) brought the literature up to date when they reviewed fourteen additional similar studies. Most found no difference in accuracy between clinicians and other judges; the rest were equally distributed in finding the clinician either superior or inferior. Grebstein (1963) found no significant differences in accuracy in naive, semi-sophisticated, and sophisticated judges. In a study using judgments of psychosis or neurosis from MMPI data, Goldberg (1965) found that staff judges and trainees achieved the same accuracy on the average, that the four best and two worst judges were trainees, and that there was wide variability on each diagnostician's performance over different samples. In a more recent study, Perez and Satz (1971) found that there were slight differences in accuracy between graduate students in clinical psychology and clinicians in predicting length of stay in therapy from MMPI profiles. There was a higher overall hit rate for graduate students than clinicians. Taft (1955) advances the explanation that trained professionals, somewhere in their training, acquire a "set" which interferes with making unbiased objective decisions.

The process of clinical thinking: cognitive models.--Among the researchers who have attempted to study and describe the workings of the clinician's "mind" there seem to be two camps. There are those who hold that clinical thinking is a mystical, intuitive, and thus inexplicable process. There are others who theorize that the clinician's thoughts are orderly, logical, definable, and even, in some cases, deserving of mathematical models. Among the former is Luft (1950), who has described the process of clinical judgment as a sifting, screening, and synthesizing of case materials in "some intangible way." Another is Mann (1956), who in his review of Meehl's (1954) book emphasized the complexity of human decision-making and the importance of considering all the factors involved in judgments.

Perhaps one of the earliest major efforts to demonstrate that the diagnostic process is capable of truly rigorous investigation was Hoffman's (1960) study in which he reduced the diagnostic process to mathematical models. He proposed both linear and configural models, and suggested that a fruitful approach to research would involve focusing upon the individual as the unit of research, and studying his behavior as it relates to each of these mathematical models. In one analysis of components of clinical inference Hammond, Hursch, and Todd (1964) applied a multiple regression technique to Brunswik's lens model. The statistical components derived were used to examine several types of previous studies. Judges were found generally to combine cues linearly, but the authors argued that the simple rating tasks studied are conducive to a linear response system. They suggested that, for more complex judgments, the lens model of analysis can prove useful in studying human cognitive processes.

Studying the way in which the clinician processes data, Wiggins and Hoffman (1968) devised three statistical models, and compared them with clinical judgments of psychosis and neurosis from MMPI profiles. A linear, a quadratic, and a sign model were used. Results indicated that the sign model best described 13 judges, the quadratic model 3, and the linear model 12. There was no significant relationship between configural or linear style and amount of clinical training, or between style and accuracy. In an effort to study the complexity of the process of clinical judgment, Oskamp (1967) utilized both Hoffman's multiple regression procedures and an analysis of validity coefficients to investigate clinical judgments from MMPI profiles. Judges made a dichotomous decision of whether the patient was hospitalized for medical or psychiatric reasons. Oskamp found that the conclusion drawn regarding the complexity of the process was dependent upon the type of analysis used to study it. Multiple regression analysis suggested that the judgment process is simple, and that judges used predictor variables in a linear additive way. The validity coefficient analysis suggested that judgments are made in a complex, configural way, which agreed with the judges' subjective impressions of the way in which they utilized the data.

Feedback and clinical prediction.--Given the discouraging information that clinical judgments are often inaccurate, a number of researchers have directed their attention to the use of feedback as a device to improve accuracy. Feedback always involves giving the subject some information about his performance in successive trials. Bilodeau and Bilodeau (1961) say, "Studies of feedback or knowledge of results (KR) show it to be the strongest, most important variable controlling performance and learning." Ammons (1956), in surveying the effects of knowledge of performance, concluded that KR almost universally results in more rapid learning and a higher level of performance. There have been many ideas advanced about the role of feedback in terms of how it may influence work, learning, and performance (e.g., Ammons, 1956). Sechrest, Gallimore, and Hersh (1967) present two hypotheses: a) that feedback operates by providing information by means of which the subject can adjust his implicit hypotheses; or b) that feedback serves a motivational function by convincing and reminding the subject that the task is one in which improvement is expected and possible. According to Underwood (1966), the most dramatic effects of KR can be shown for tasks in which the precision of the response is initially very poor, and for which the subject can give himself at best minimal feedback.

Sechrest, Gallimore, and Hersh (1967) devised three experiments to provide feedback on predictive accuracy in the expectation that feedback could be used to improve performance. Their study stemmed from a recommendation by Holt (1958) that clinicians should have training which makes it possible for them to validate themselves as predictors in much the same way as tests are cross-validated. The "clinicians" studied were undergraduate students, and the prediction task involved interpretation of short sentence completion protocols. They found evidence in all three experiments for the superior performance of those subjects who received feedback, but the bulk of the evidence suggested that the feedback effect was attributable to enhancement of motivation of the subjects, rather than to specific informational value. Rotter (1967) points out, however, that the Sechrest, Gallimore, and Hersh study had a number of design limitations. These are: a) the subjects were undergraduates with little experience and possibly low motivation; b) test data (10 ISB responses) may have been too small for a valid judgment; c) while knowledge of how the criterion was determined was given to the subjects, this knowledge could only have provided a minimum of information to the subjects; d) feedback was given with an overall judgment for correct or incorrect without knowing whether or not the subjects had made their hypotheses explicit, or whether or not the hypotheses the subjects were relying upon were relevant or irrelevant.

Watley (1968) studied the effect of providing immediate feedback training to judges known from a previous study (Watley, 1966) to predict educational criteria at relatively high, moderate, or low levels of accuracy. The criteria predicted were freshman and overall college grades. In comparison to judges who received no training, the forecasts of low-accuracy judges showed substantial improvements for both predicted criteria; however, the training had no noticeable effect on the judgments of the high or moderate-accuracy judges.

Perez (1970) studied the effects of feedback on a problem that clinicians face in their practice: that of predicting length of stay in psychotherapy. Sixteen judges, eight clinical psychologists and eight graduate students in clinical psychology, were asked to predict length of stay in psychotherapy from MMPI profiles. Judges were randomly divided into a feedback condition and a no-feedback condition. It was hypothesized that judges in the feedback condition would do significantly better than judges in the no-feedback condition. The results, while in the predicted direction, were not significant, largely because of the high level of initial accuracy on this clinically relevant task. In agreement with Watley's (1967) finding, inspection of the performance of each judge individually revealed that feedback was most beneficial for those judges starting at a low accuracy level.

The clinician behaving.--Clinical psychologists have produced a widely varied number of hypotheses in response to the evidence that they have not yet demonstrated their diagnostic prowess. Hunt, Wittson, and Hunt (1953) have suggested that the confusion may result not only from lack of ability in the diagnostician, but also from poorly delineated diagnostic categories and the professional customs which permit careless diagnosis or inaccurate diagnosis for administrative reasons. Some (Holt, 1958; Sawyer, 1965; and Taft, 1959) have proposed that the available research comparisons between clinical and statistical methods are essentially not parallel and are therefore meaningless. Others (Hunt, Arnhoff, and Cotton, 1954; Hunt and Jones, 1962) have pointed out that the application of formal scoring, mathematical models, and certain statistical treatments to clinical data tends to distort research findings. Little (1967) and Rotter (1967) have pointed to such artificialities in research studies as the use of undergraduate judges to represent experienced clinicians, the lack of adequate criterion data, the use of inadequate test data for the prediction required, and incomplete description of the criterion to be predicted. Some (Payne, 1958; Cole and Magnussen, 1966) have argued that diagnostic test scores are useless, since the diagnostic categories themselves are not related to symptoms, etiology, treatment, or prognosis, and that the prediction of diagnostic labels has much less meaning than would the prediction of the behavioral consequences of them. Many have continued their attempts to develop better test instruments. Meehl (1960), in contrast, has proposed that diagnosis be left to the superior computer, and that the clinician concentrate on research and psychotherapy.

It is presently argued that the conflicting results found by experimenters in the area of clinical prediction are due largely to lack of an adequate experimental methodology to study this complex task. Bachrach (1965) states that the goals of science are: description, explanation, prediction, and control. From the previous review of the literature, it is clear that the research area of clinical prediction is far behind in reaching these goals. Meehl (1954) stated that "Presumably some kind of longitudinal study is needed to find out whether and to what degree the 'good' clinician is stably such, rather than being merely the momentarily luckiest fellow among a crew of equal or near-equal mediocre guessers." No attempt has yet been made to provide an answer to this challenge. The precise experimental methodology of free-operant conditioning, applied to the continuous and direct recording of the clinical judgmental process, provides a unique opportunity to study judges making their predictions on a longitudinal basis.

Experimental analysis of clinical prediction.--Since Sidman's (1960) influential book, Tactics of Scientific Research, psychology has witnessed many innovative applications of operant methodology to traditional problem areas (Ullmann and Krasner, 1965; Ulrich, Stachnik, and Mabry, 1969, 1970). A most notable example is Lindsley's (1969) application of the continuous and direct measurement of operant methodology to the study of traditional psychotherapy. The underlying thesis of these applications has been that variability is not intrinsic to the subject matter but stems, rather, from discoverable and controllable causes. Any sample of behavior is under the control of a multiplicity of variables, some of them presumably held constant in a given experiment, and others simply unrecognized. Sometimes the variability in a set of data can be located among such factors. Two subjects may be found to differ in their response to variable A, not because there is intrinsic variability in the relation between variable A and behavior, but because they differ in their response to variable B, which interacts with variable A. The process of tracking down sources of variability, and thus explaining variable data, is characteristic of the scientific enterprise (Sidman, 1960).

Sidman (1960) believes that the control of data in research does not depend upon the amassing of large groups of subjects, or even large samples from an individual subject. He states that "We must consider our science immeasurably enriched each time someone brings another sample of behavior under precise experimental control." Sidman believes that the adequacy of a technique in experimental psychology should be evaluated in terms of the reliability and precision of the control it achieves over the independent variables. According to Sidman (1960), experiments are often carried out to test the fruitfulness of a new technique. Sometimes the technique is developed deliberately in order to obtain information that could not be gained by standard methods; sometimes the technique is simply tried out of curiosity as to the kind of data it will yield. Technical developments in experimental psychology may include improvements in measuring instruments, advanced methods of recording data, sophisticated data analysis, the design of specialized apparatus to do a particular job, or generalized apparatus to perform many functions, and the extension of old techniques to new areas.

The need for more objective research in the field of clinical judgment has been articulately expressed (Rotter, 1967). This study represents a broad, but systematic, attempt to understand the process of clinical judgment. A more fruitful and rigorous approach to this problem is proposed through the systematic investigation of relevant variables as they influence the judgment process. By the application of continuous and direct measurement to clinical judgment, a unique opportunity exists to bring new light into this intriguing area.

Precise measurement of clinical prediction.--O. R. Lindsley and his associates have developed the most powerful single tool to measure human behavior: the Standard Behavior Chart. This chart permits the daily recording of behavior frequencies ranging from 1000 per minute to one per day; frequencies ranging from one per day to one per twenty weeks may also be recorded without changing the coordinates of the chart. This chart, therefore, provides us with a standardized means of depicting and analyzing frequencies of virtually any human behavior on a continuous basis. Further, because it is standardized, it facilitates the comparison of data across different levels of clinical information as well as the comparison of different experience levels and how these variables affect clinical prediction. Even more importantly, the Standard Behavior Chart greatly facilitates communication of research findings among scientists.

According to Wolking and Schwartz (1972), there are several features of the precise behavioral measurement system which make it distinctive from traditional statistical measures. These are:

(1)--The basic unit of measurement is frequency, which is defined as the number of behaviors emitted divided by the number of minutes during which the behavior has been observed. Thus, the inescapable dimension of time is made an integral part of the basic datum.

(2)--A very important feature of this system is that it measures behavior directly. Direct measurement involves: defining the behavior so precisely that it can be counted with high reliability (pinpointing); counting the number of occurrences of the behavior and the number of minutes of observation and making a permanent record of both (recording); and finally, calculating the frequency of the behavior observed by dividing the number of movements by the number of minutes.

(3)--There should be continuous recording; that is, the movement should be measured daily or every time the behaver engages in the behavior, if it is less than daily.

(4)--The most important feature of this system is the graphic representation of the daily rates, which provides a unique opportunity for rapid and accurate communication and comparison of facts about behavioral processes.

RESEARCH QUESTIONS

In considering the present multifaceted design, a number of research questions arise. The present knowledge in the area of clinical prediction is too limited and contradictory to permit formulation in the context of hypotheses. However, the research questions below are important in formulating the analysis and presentation of the data. These questions are:

1. How stable are the daily predictions made by judges? This is a most relevant question which has not been researched in the past. The present study, emphasizing continuous and direct recording of rate correct and rate incorrect of predictions, provides a first step in exploring this basic question on a longitudinal basis.

2. What effects will an increase in available information have on clinical prediction? Research in this area indicates, in general, that as available information increases, accuracy increases. The present design provides a unique opportunity to measure this effect directly on each individual judge, as well as to measure its stability over time.

3. What effect will experience level have on clinical prediction? Previous research indicates, in general, that as experience level increases, accuracy decreases. The application of the present methodology provides a more powerful technique to compare each judge independently.

4. How does the time involved in making a clinical prediction influence judgmental accuracy? Researchers have not previously studied this question. The use of rate in analyzing judgmental accuracy provides a most sensitive and natural measure, for it takes into consideration the amount of time spent in making a clinical judgment.
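The rate measure referred to in the fourth question follows the frequency definition given earlier (number of behaviors divided by minutes of observation). A minimal sketch in Python may make the role of time concrete. The function names and the session values (12 correct and 8 incorrect out of 20 profiles, judged in 10 or 20 minutes) are hypothetical illustrations and are not data from this study.

```python
def frequency(count, minutes):
    """Movements per minute: behaviors counted divided by minutes observed."""
    if minutes <= 0:
        raise ValueError("minutes of observation must be positive")
    return count / minutes


def session_rates(correct, incorrect, minutes):
    """Return (rate correct, rate incorrect) for one daily judging session."""
    return frequency(correct, minutes), frequency(incorrect, minutes)


if __name__ == "__main__":
    # Hypothetical session: 20 profiles, 12 judged correctly, 8 incorrectly.
    fast = session_rates(12, 8, minutes=10.0)
    slow = session_rates(12, 8, minutes=20.0)
    # The counts are identical, but the slower session has half the rates.
    print("10-minute session:", fast)
    print("20-minute session:", slow)
```

Two sessions with the same number of hits therefore plot at different heights on the chart when the judge takes a different amount of time, which is the sense in which rate "takes into consideration the amount of time spent."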

METHOD

Subjects

To ensure the best control of this relevant variable, judges with similar backgrounds and experience with diagnostics were chosen. Six judges participated, representing three levels of experience. Level 1--Two first-year graduate students (F 1, F 2); Level 2--Two practicum students (P 1, P 2); Level 3--Two interns (I 1, I 2). All judges were chosen from the University of Florida and are currently enrolled as graduate students in clinical psychology. Each judge was his own control.

Materials

Test materials were collected by Shagoury (1971). He studied 60 men imprisoned in the Florida State Prison, Raiford, Florida. Thirty men, convicted of first- or second-degree murder, were compared on certain genetic, biological and personality factors with thirty men convicted of crimes against property. Shagoury (1971) collected biographical information, MMPI's, Rorschachs, EEG's, and chromosome analyses. These materials were used in the present study.

Procedures

Refer to Table I for a schematic of the design.

[Table I, the schematic of the design, is not recoverable from the scanned pages.]

Judges were asked to predict whether the material presented was that of a homicide or non-homicide (for instructions, see Appendix A). Shagoury (1971) found that a discriminant function analysis correctly classified 83% of the total sample. Following the free-operant methodology of continuous and direct recording, each judge was asked to make his predictions on a daily basis. The research consisted of two experiments.

Experiment I.--The purpose of this experiment was to study the stability of the daily predictions made by judges using MMPI's only. This area has received no attention in the experimental literature. Operant methodology, with its unique feature of continuous and direct recording, provides a most powerful tool to study this phenomenon. Sidman (1960) defines a stable, or steady, state as one in which the behavior in question does not change its characteristics over a period of time. Two major types of experimental interest in steady-state behavior have developed. One of these may be termed "descriptive" and the other "manipulative." Experiment I is a purely descriptive study in which a set of experimental conditions is maintained over an extended period of time, providing an account of both the stable and the transitory aspects of the resulting behavior. This form of research is fundamental to the establishment of behavioral control techniques, and of baselines from which to measure behavioral changes. The data yielded by such an experiment do not relate an aspect of behavior to several values of a manipulated independent variable. Rather, the resulting curves show some aspect of behavior as a function of time in the experimental situation. It is the characteristics of behavior in time, under a constant set of maintaining conditions, which are of major interest. According to Sidman (1960), the descriptive investigation of steady-state behavior must precede any manipulative study. Manipulation of new variables will often produce behavioral changes, but, in order to describe the changes, we must be able to specify the baseline from which they occurred. Otherwise, we face insoluble problems of control, measurement, and generality.

A major problem faced in experiments involving the manipulation of steady-states is that of deciding whether the behavior in question has stabilized. According to Sidman (1960), there is no assuredly final answer. He states that "The utility of data will depend not on whether ultimate stability has been achieved, but rather on the reliability and validity of the criterion. That is to say, does the criterion select a reproducible and generalizable state of behavior? If it does, experimental manipulation of steady-states, as defined by the criterion, will yield data that are orderly and generalizable to other situations. If the steady-state criterion is inadequate, failures to reproduce and to replicate systematically the experimental findings will reveal this fact."

How does one select a steady-state criterion? There is, according to Sidman (1960), no rule to follow, for the criterion will depend upon the phenomenon being investigated and upon the level of experimental control that can be maintained. Here, descriptive long-term studies of steady-state behavior are extremely useful. By following behavior over an extended period of time, with no change in the experimental conditions, it is possible to make an estimate of the degree of stability that can eventually be maintained; a criterion can then be selected on the basis of these observations. The adequacy of the criterion chosen can be confirmed by the orderliness of the resulting data. If the steady-state criterion yields orderly and replicable functional relations, it may be accepted as adequate.

Procedure for Experiment I.--Judges were presented with 20 MMPI profiles, 10 belonging to homicides and 10 to non-homicides (base rate of .5). They were asked to discriminate between the profiles of homicides and non-homicides. Each judge was presented with the same set of profiles on a daily basis until stability of prediction was reached. The criterion of stability was orderliness in the data.

Experiment II.--This experiment consisted of a systematic replication of Experiment I plus the study of the effects of adding new information on clinical judgment. Four phases were involved:

Phase 1--Systematic replication of Experiment I. According to Sidman (1960), the soundest empirical test of the reliability of data is provided by replication. The application of continuous and direct recording provided a unique opportunity to attempt to replicate the findings of Experiment I.

Phase 2--Phase 1 was used as baseline data to study the effects on clinical judgment of adding new information, in this case Rorschach protocols, to MMPI profiles. The research previously reviewed (Goldberg, 1968; Shagoury and Satz, 1969; Moxley and Satz, 1970) seems to indicate that as more information is available to the clinician, judgment accuracy increases. The present methodology provided a more powerful technique to study this phenomenon on a daily basis instead of the previous one-session studies.

Phase 3--This phase was identical to the previous phase except that biographical and EEG data were added to the existing information. Phase 2 was used as baseline data.

Phase 4--This phase was identical to the two previous phases except that a summary of the findings of a multivariate analysis on the data as found by Shagoury (1971) was provided to each judge to assist in making his judgment. Phase 3 was used as baseline data. Orderliness of data was the criterion used for termination of this phase.

Procedure for Experiment II.--Phase 1--Judges were asked on a daily basis to discriminate homicides from non-homicides using a new set of 20 MMPI profiles with base rate of .5. This phase was discontinued when stability was reached. The criterion of stability was orderliness in the data as well as comparison with stability in Experiment I.

Phase 2--Judges continued making their daily predictions. In this phase, the MMPI profiles of Phase 1 and the appropriate Rorschach protocols were utilized. Orderliness of data was the criterion for stability.

Phase 3--This phase was identical to Phase 2, except that judges made their daily predictions with the addition of biographical data and EEG reports.

Phase 4--Judges continued making their daily predictions, but this time a summary of the relevant findings, as found by a multivariate analysis on the previous personality, biographical and biological data, was given to each judge (Shagoury, 1971).
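The procedure above relies on "orderliness in the data" rather than on a numerical stability rule. As a purely illustrative aid, the sketch below shows how the celeration measure used in the Results (a least-squares line fitted to the daily frequencies as they appear on the chart's logarithmic scale, expressed as a multiplicative change per week) could be computed for a run of daily frequencies. It is an assumption-laden sketch and not the author's criterion or computation: the function names, the use of base-10 logarithms, and the example values are hypothetical.

```python
import math


def celeration(days, freqs):
    """Multiplicative change in frequency per week.

    Fits log10(frequency) against calendar day by ordinary least squares
    and converts the daily slope to a seven-day multiplier.
    """
    if len(days) != len(freqs) or len(days) < 2:
        raise ValueError("need at least two (day, frequency) pairs")
    if any(f <= 0 for f in freqs):
        raise ValueError("frequencies must be positive to take logarithms")
    logs = [math.log10(f) for f in freqs]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("days must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope_per_day = sxy / sxx
    return 10 ** (slope_per_day * 7)


def chart_notation(c):
    """Write a multiplier as 'x 2.00' (acceleration) or '÷ 2.00' (deceleration)."""
    return f"x {c:.2f}" if c >= 1 else f"÷ {1 / c:.2f}"


if __name__ == "__main__":
    # Hypothetical frequencies that double each week.
    print(chart_notation(celeration([0, 7, 14], [1.0, 2.0, 4.0])))
    # Hypothetical flat run, the pattern a stable phase would show.
    print(chart_notation(celeration([0, 1, 2, 3, 4], [1.5] * 5)))
```

Under these assumptions a value near x 1.0 corresponds to a flat run of daily frequencies, and values such as x 2.0 or ÷ 2.0 correspond to a doubling or halving per week.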

PAGE 32

RESULTS The measures used in this study, frequency of correct predictions and frequency of incorrect predictions, were plotted on Standard Behavior Charts (Behavior Research Co.). Plotting linear data on a log scale provides one with a picture of proportional changes in behavior frequencies rather than absolute changes (Koenig, 1972). Information that the frequency of occurrence of a given behavior has doubled or halved is considerably more valuable than information that the frequency of occurrence of the behavior has changed by one arbitrarily defined unit. In order to understand the present results, it is necessary to briefly familiarize the reader with the Standard Behavior Chart as well as the current procedures of data analysis. Chart scales . --The horizontal dimension across the bottom of the chart represents calendar days. Each chart runs for 140 consecutive days or 20 weeks. The vertical dimension up the left side of the chart is the scale of frequencies or rates. The unit of measurement is movements per minute. Record floo r. --The record floor is the lowest measurable performance frequency other than zero. The record floor is found by dividing the number of minutes in the time sample into one, the smallest number of movement cycles that can be observed. The record floor sets the lower limit of the sensitivity of the chart as a measurement system 22

PAGE 33

23 for each day. Below the floor is an area of record blindness. The symbol of the record floor is a horizontal dashed line at the computed level of the floor for a given day. Celeratio n. --Few statistical measures are available for describing continuous changes in behavior over time. Therefore, researchers interested in continuous observation and recording of behavior have developed several new measures for this purpose (Koenig, 1972). Frequencies displayed on the Behavior Chart are usually either accelerating (x) or decelerating (*) as time passes. Celeration is the general term for these accelerating and decelerating relationships. Celeration is a measure of change occurring in frequency of responding over a week's period of time. The celeration coefficient is functionally related to the slope of the line of best fit and is obtained by using the least squares method of regression. The celeration coefficients are the main measures employed in the present data analyses. The results of the present study are described in terms of how each new phase change in the experiment affected the daily clinical judgments of each judge. These effects are discussed in terms of accur acy , efficiency , step , growth and total bounce . Each of these measures are discussed under separate headings with the presentation of the results. The actual charts of the daily predictions for each judge are located in Appendix B. Accuracy Ratio Celeration The accuracy ratio is defined as the ratio between frequency correct and frequency incorrect. The daily accuracy ratio for each judge was plotted on a Standard Behavior Chart. A value of one

PAGE 34

indicates that the frequency correct is equal to the frequency incorrect. A value less than one indicates that the frequency incorrect is higher than the frequency correct, and a value greater than one indicates that the frequency correct is higher than the frequency incorrect. The reader might want to convert these values into percentages (e.g., x 1.0 = 50%; x 9.0 = 90%). The accuracy ratio celeration measure provides the opportunity to compare the celerating effects of adding new information to the clinical judgment process for each judge as well as across judges.

Figures 1 through 6 present graphically the daily accuracy ratios for each judge. Table 2 shows a summary of the accuracy ratio celerations per phase for each judge. Inspection of Table 2 shows that the accuracy ratio celeration coefficients ranged from ÷ 1.56 movements per minute per week (M/m/w) to x 2.27 M/m/w. Figure 7 shows a graphical summary of the accuracy ratio celerations across judges. Overall, there was essentially no acceleration or deceleration of accuracy over time. There were four exceptions. Figure 3 shows that P 1's accuracy accelerated x 2.27 M/m/w in Exp 1 (MMPI's only) and x 1.6 M/m/w in Phase 2 (MMPI's + Rorschachs). Figure 6 shows that I 2's accuracy accelerated x 1.51 M/m/w in Phase 2 (MMPI's + Rorschachs) and decelerated ÷ 1.56 M/m/w in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas).

Accuracy Ratio Frequency Multiplier

To measure the effects of a new procedure on the first day of a phase, the frequency multiplier, or step, is used. The frequency multiplier gives a measure of the increase or decrease of frequency

PAGE 35

[Printed pages 25 through 32 contained charts and tables that did not survive text extraction. The only recoverable label is the axis title "ACCURACY RATIO".]

PAGE 43

correct or frequency incorrect the first day new information is added to the clinical judgment process. It is a comparison of the last data point of the old phase with the first data point of the new phase. A frequency multiplier of x 1.0 movements per minute per day (M/m/d) indicates that there has been no increase or decrease in accuracy with the introduction of new information to the clinical judgment process. A step of x 2.0 M/m/d indicates that accuracy has doubled with the introduction of a new phase. A step of ÷ 2.0 M/m/d indicates that accuracy has halved with the introduction of a new phase.

Figure 8 shows graphically the steps for each judge as new phases were introduced. The measures for each phase from left to right belong to: F-1; F-2; P-1; P-2; I-1; I-2. Table 3 presents a summary of the accuracy ratio frequency multipliers. Inspection of Table 3 indicates that the accuracy ratio frequency multipliers ranged from ÷ 3.0 M/m/d to x 4.0 M/m/d. These two measures belong to I 2. Figure 8 indicates that the maximum accelerating steps were obtained with the addition of Phase 4 (formulas). The addition of Phase 2 (Rorschachs) produced overall the least change in accuracy, except for I 2, whose accuracy decreased ÷ 3 M/m/d. The addition of Phase 1 (new set of MMPI's) as well as the addition of Phase 3 (biographical data) produced the most momentary decreases in accuracy.

Record Floor Celeration Efficiency

In the present experiment the daily record floor indicates the amount of time a judge spent in making his daily predictions. It is, therefore, a measure of efficiency. Record floor celerations for each judge are located in Appendix B. Table 4 shows a summary of the record

PAGE 44

[Printed pages 34 through 36 contained charts or tables that did not survive text extraction.]

PAGE 47

floor celerations per phase for each judge. Inspection of Table 4 indicates that the record floor celeration coefficients ranged from ÷ 1.21 M/m/w to x 3.22 M/m/w. Figure 9 shows a graphical summary of the record floor celerations. Overall, there was clearly an increase in the efficiency of the judges' daily predictions. The maximum acceleration in efficiency (x 3.22 M/m/w) was obtained for I 2 in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas). The maximum deceleration of efficiency (÷ 1.21 M/m/w) was observed for F 1 in Exp 1 (MMPI's only).

Record Floor Frequency Multiplier

To assess the immediate effects on efficiency of adding new information to the clinical judgment process, the record floor frequency multiplier for each judge was computed. The frequency multiplier is a comparison of the last record floor of the old phase with the first record floor of the new phase. Figure 10 shows graphically the record floor steps for each judge as new phases are introduced. The measures for each phase from left to right belong to: F-1; F-2; P-1; P-2; I-1; I-2. Table 5 presents a summary of the record floor frequency multipliers. Inspection of Table 5 indicates that the record floor frequency multipliers ranged from ÷ 20.8 M/m/d to x 1.21 M/m/d. Figure 10 indicates that in all cases except one the addition of new information produced an immediate reduction in efficiency. The exception occurred with F 1 with the addition of Phase 3 (biographical data), which produced an almost insignificant increment of x 1.2 M/m/d. Inspection of Figure 10 indicates that the greatest overall decrease in efficiency occurred with the

PAGE 48

[Printed pages 38 through 40 contained charts or tables that did not survive text extraction.]

PAGE 51

addition of Phase 2 (Rorschachs). The second most noticeable decrease in efficiency occurred with the addition of Phase 4 (formulas).

Frequency and Record Floor Growth Ratio Effectiveness

The growth ratio is used to assess the relationship between two celerations. In the present study we are interested in the relationship between celeration correct and celeration record floor as well as celeration incorrect and celeration record floor. In other words, the growth ratio provides a measure of the relationship between the celeration of the time spent in making the daily predictions and the celeration of the daily correct and incorrect frequencies. The growth ratio is independent of both the initial frequencies and the two celerations. It is, therefore, a measure of effectiveness. A growth ratio of 1.00 indicates that the celeration correct or celeration incorrect is the same as the celeration record floor. A growth ratio greater than one indicates that the celeration correct or celeration incorrect is greater than the celeration record floor. A growth ratio less than one indicates that the celeration record floor is greater than the celeration frequency correct or incorrect.

Table 6 shows the frequency correct and record floor growth ratios per phase for each judge. Inspection of Table 6 indicates that the growth ratios ranged from .85 to 1.48. Overall, there was essentially no difference between correct celeration and record floor celeration. There were four exceptions. P 1 obtained a growth ratio of 1.48 in Exp 1 (MMPI) and 1.24 in Phase 2 (MMPI's + Rorschachs). I 2 obtained a growth ratio of 1.26 in Phase 2 (MMPI's + Rorschachs) and .85 in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas).
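The celeration and growth ratio measures described in this section can be illustrated with a minimal sketch. It assumes that celeration is the weekly multiplicative change implied by a least squares line fitted to the base-10 logarithm of the daily frequencies, and that the growth ratio is the simple quotient of two celerations. The text states only that a growth ratio of 1.00 means the two celerations are equal, so the quotient is an assumption. The function names are hypothetical and are not taken from Koenig (1972).

```python
import math


def celeration(days, freqs):
    """Weekly multiplicative change from a least squares fit of log10(frequency) on day.

    Assumes every frequency is greater than zero (a log scale has no zero).
    """
    logs = [math.log10(f) for f in freqs]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope_per_day = sxy / sxx
    # Seven days of slope on the log scale, converted back to a multiplier.
    return 10 ** (slope_per_day * 7)


def as_chart_notation(value):
    """Format a multiplier as 'x 2.00' (accelerating) or '÷ 2.00' (decelerating)."""
    if value >= 1:
        return f"x {value:.2f}"
    return f"÷ {1 / value:.2f}"


def growth_ratio(cel_frequency, cel_record_floor):
    """Assumed definition: quotient of frequency celeration and record floor celeration."""
    return cel_frequency / cel_record_floor
```

Under these assumptions, a frequency that doubles every seven days yields a celeration of x 2.00, and two equal celerations yield a growth ratio of 1.00.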

PAGE 52


PAGE 53

Table 7 shows the frequency incorrect and record floor growth ratios per phase for each judge. Inspection of Table 7 indicates that the growth ratios ranged from .56 to 1.32. Overall, there was essentially no difference between the incorrect celeration and record floor celeration. There were five exceptions. P 1 obtained a growth ratio of .56 in Exp 1 (MMPI), .78 in Phase 1 (MMPI's only) and .74 in Phase 2 (MMPI's + Rorschachs). I 2 obtained a growth ratio of .84 in Phase 2 (MMPI's + Rorschachs) and 1.32 in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas).

Accuracy Ratio Total Bounce Variability

In order to assess the variance around celeration lines on the Behavior Chart, Koenig (1972) has developed the total bounce measure. To find the total bounce, a line is drawn parallel to the celeration line through the frequency farthest above it. Then a line is drawn parallel to the celeration line through the frequency farthest below it. The distance between these two outer lines, expressed as a ratio, defines the total bounce around the celeration line. Koenig (1972) has shown that the proportional variance around the straight line of celerating frequencies usually remains constant regardless of the value of the frequencies. Thus, total bounce is used as a measure of homogeneous variability of the daily predictions.

Table 8 presents a summary of the accuracy ratio total bounce per phase for each judge. Inspection of Table 8 reveals that the total bounce ranged from x 6.00 to x 1.00. The highest total bounce of x 6 was obtained by I 2 with the addition of Phase 2 (Rorschachs).
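The total bounce construction described above can be sketched numerically. The sketch assumes that the celeration line is a least squares line fitted to log10(frequency), so that the two parallel outer lines correspond to the largest and smallest residuals on the log scale and their distance converts back to a ratio. The function names are hypothetical, and this is an illustration under those assumptions, not Koenig's (1972) own procedure.

```python
import math


def fit_log_line(days, freqs):
    """Least squares slope and intercept of log10(frequency) against day."""
    logs = [math.log10(f) for f in freqs]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def total_bounce(days, freqs):
    """Ratio between the parallel lines through the farthest points above and below the fit.

    Assumes every frequency is greater than zero.
    """
    slope, intercept = fit_log_line(days, freqs)
    residuals = [
        math.log10(f) - (intercept + slope * d) for d, f in zip(days, freqs)
    ]
    # Vertical distance between the outer lines on the log scale, as a multiplier.
    return 10 ** (max(residuals) - min(residuals))
```

With this reading, frequencies that fall exactly on the fitted line give a total bounce of x 1.0, which matches the interpretation in the text that x 1.0 means no variance around the line of best fit.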

PAGE 54

[Printed pages 44 and 45 contained charts or tables that did not survive text extraction.]

PAGE 56

A total bounce of x 1.0 indicates that there is no variance around the line of best fit. P 1 obtained a total bounce measure of x 1.0 in all phases except Exp 1 (MMPI's). In comparison with Koenig's (1972) data (see footnote 1), the present results indicate that the accuracy ratio total bounce for each judge is considerably below the average. This can be taken as a powerful indication of stability of accuracy in daily judgments.

1. Koenig (1972) investigated 13,941 human behavior projects deposited in the Behavior Bank and found that the average total bounce was x 5.9.
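As a recap of the simpler chart measures used in this section, the following sketch works through the record floor, the accuracy ratio, its conversion to percent correct, and the frequency multiplier (step). The formulas follow the definitions given in the text. The function names and the sample numbers are hypothetical and are not data from the study.

```python
def record_floor(minutes):
    """Lowest measurable nonzero frequency: one movement in the whole time sample."""
    return 1.0 / minutes


def accuracy_ratio(freq_correct, freq_incorrect):
    """Ratio between frequency correct and frequency incorrect."""
    return freq_correct / freq_incorrect


def ratio_to_percent(ratio):
    """Percent correct implied by an accuracy ratio (x 1.0 -> 50%, x 9.0 -> 90%)."""
    return 100.0 * ratio / (1.0 + ratio)


def frequency_multiplier(last_old_phase, first_new_phase):
    """Step: first data point of the new phase relative to the last point of the old phase.

    A value of 2.0 reads as 'x 2.0' (doubled); 0.5 reads as '÷ 2.0' (halved).
    """
    return first_new_phase / last_old_phase


# Hypothetical day: 12 correct and 8 incorrect predictions in a 20-minute session.
ratio = accuracy_ratio(12 / 20, 8 / 20)
percent = ratio_to_percent(ratio)
floor = record_floor(20)
```

For this hypothetical day the accuracy ratio is about x 1.5, which corresponds to about 60% correct, and the record floor is .05 movements per minute.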

PAGE 57

DISCUSSION

The present study provided the first experimental application of continuous and direct recording of operant methodology to the clinical judgment process. This novel application attempted to provide initial answers to four questions: 1) How stable are the daily predictions made by judges? 2) How does the time involved in making a clinical judgment influence accuracy? 3) What is the effect of an increase in available information on clinical prediction? 4) What is the effect of experience level on clinical judgment? The present results are discussed within the framework of these questions.

The results demonstrated a number of different effects. In some cases, the effects were consistent with previous research; for example, the overall low accuracy level for most judges. However, some of the findings were unanticipated, particularly the overall negligible effect of adding information to the clinical judgment process. The results also demonstrated a number of new effects. These were: 1) the direct and systematic replication, across and within judges, of the stability of the daily predictions across phases; 2) the replication of the increase in efficiency, across phases, for each judge; and 3) the replication of the lack of essential differences between celeration frequencies and celeration record floor. These findings provide a starting point for future research on clinical judgment. Only systematic and direct replication of the present study will provide

PAGE 58

reliability and generality of these results.

Stability of Daily Predictions

It has been eighteen years since Meehl (1954) stated that "Presumably some kind of longitudinal study is needed to find out whether and to what degree the 'good' clinician is stably such, rather than being merely the momentarily luckiest fellow among a crew of equal or near-equal mediocre guessers." The present study provides a partial answer. The results indicate that in all cases (judges and phases) except one, the individual predictions were stable. The exception was I 2, with the addition of Phase 2 (Rorschachs). However, since the accuracy level for most judges across phases was 50%, these results have to be interpreted with caution. A stable 50% accuracy level is easy to maintain. In our sample of judges, only P 1 (See Figure 3) maintained stability above 50% accuracy across phases. He was the only steady "good" clinician that could be identified. Inspection of each chart indicates that F 2's predictions (See Figure 2) in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas) were stable above 50% accuracy as well as P 2's (See Figure 4) in Phase 2 (MMPI's + Rorschachs). These findings indicate that, in the present sample of judges, when a judge was identified as "good" (identified by consistently predicting correctly above chance) at least in one phase, his predictions were stable across that phase. Future longitudinal research should identify these "good" clinicians before attempting to replicate the present findings.

PAGE 59

Efficiency of Daily Predictions

The use of frequencies in analyzing judgmental accuracy provided a most sensitive and natural measure of efficiency, for it considered the amount of time spent in making a clinical judgment. The present results showed that, overall, there was a clear increase in the efficiency of the judges' daily predictions. It was also shown that efficiency decreased when new information was added to the clinical judgment process. Inspection of Figure 10 shows that the maximum decrease in efficiency was obtained with the addition of Phase 2 (Rorschachs). This can be taken as an indication that the integration and interpretation of the Rorschachs combined with the MMPI's required the most time and consequently produced the maximum drop in efficiency. It is interesting to note that the maximum decreases in efficiency in Phase 2 (MMPI's + Rorschachs) occurred, in all cases (judges within Phase 2) except one, with the medium-experienced (P 2) and high-experienced (I 1; I 2) judges. These judges had knowledge in the interpretation of the Rorschach; therefore, a decrease in efficiency is an indication that they were making use of this knowledge.

The second most noticeable decrease in efficiency occurred with the addition of Phase 4 (formulas). Once more, the maximum decrease in efficiency occurred with the medium-experienced (P 1; P 2) and high-experienced (I 1; I 2) judges. It seems that the least experienced judges (F 1; F 2), presented with a novel set of information, decided not to spend much additional time in attempting to integrate this new information.

PAGE 60

Effectiveness of Daily Predictions

The effectiveness ratio (growth) provided a measure of the relationship between the celeration of the time spent in making the daily predictions and the celeration of the daily correct and incorrect frequencies. The present results indicate that there was essentially no difference between either the correct celeration and record floor celeration (See Table 6) or the incorrect celeration and record floor celeration (See Table 7). This indicates that, overall, most judges within each phase expended less time in making their daily predictions as the phase progressed, but their accuracies were uniquely stable within and across phases. That is, they became more efficient without a concomitant increase or decrease in accuracy. Two judges were exceptions. P 1 (See Figure 3) increased in efficiency and accuracy throughout Exp 1 (MMPI's) and throughout Phase 2 (MMPI's + Rorschachs). I 2 (See Figure 6) increased in efficiency and accuracy throughout Phase 2 (MMPI's + Rorschachs) but decreased in accuracy and increased in efficiency in Phase 4 (MMPI's + Rorschachs + Biographical Data + Formulas).

Levels of Information Across Levels of Experience and Accuracy of Daily Predictions

The accuracy ratio celeration measure and the accuracy ratio frequency multiplier provided the opportunity to compare the celerating effects of adding new information to the clinical judgment process for each judge as well as across levels of experience. The present results indicate, in general, that there was essentially no

PAGE 61

acceleration or deceleration of accuracy over time within a phase. The results also indicate that across phases, the maximum accelerating steps were obtained with the addition of Phase 4 (formulas). The addition of Phase 2 (Rorschachs) produced overall the least change in accuracy. The addition of Phase 1 (new set of MMPI's) as well as the addition of Phase 3 (biographical data) produced the most initial decreases in accuracy. Individual differences were observed across phases between judges. These individual differences are discussed according to levels of information.

Exp 1 MMPI's Only.--Most judges predicted at a 50% accuracy level (Figures 1 through 6). There were two exceptions, both occurring with medium-experienced judges. P 1 (See Figure 3) increased his accuracy in the second week of this task to a high of 70% and remained stable until the end of the phase. P 2 (See Figure 4) predicted consistently below chance and his accuracy did not accelerate across time.

Phase 1 MMPI's Only.--Most judges predicted at a 50% accuracy level (Figures 1 through 6). There was one exception. P 1 (See Figure 3) predicted at a 60% accuracy level on four of the five days in the second week of this phase. Figure 8 indicates that the introduction of Phase 1 produced a momentary decrease in accuracy in both of the non-experienced judges (F 1; F 2); in one medium-experienced judge (P 1); and in one high-experienced judge (I 1). Since there was essentially no celeration in accuracy in this phase (See Figure 2), these effects were not permanent.

PAGE 62

Phase 2 MMPI's + Rorschachs.--Three out of six judges predicted mostly at a 50% accuracy level (Figures 1 through 6). The addition of Phase 2 (Rorschachs) had neither initial effects (See Figure 8) nor celerating effects (See Figures 1 and 2) on the non-experienced judges (F 1; F 2). These judges predicted mostly at a 50% accuracy level. The addition of this phase produced no initial effect on P 1 (See Figure 8), but his accuracy increased above 50% in the second week of this phase. P 2's accuracy increased initially with the addition of this phase (See Figure 8), and remained at 60% (See Figure 4) for the rest of the phase. The addition of Phase 2 had neither initial (See Figure 8) nor celerating (See Figure 5) effects on I 1. I 2's judgments became unstable with the addition of Phase 2 (See Table 8). The initial effect on I 2 was an immediate decrease in accuracy (See Figure 8), but accuracy accelerated (See Figure 6) within the phase to a terminal accuracy of 60%.

Phase 3 MMPI's + Rorschachs + Biographical Data.--Most judges predicted at a 50% accuracy level (Figures 1 through 6). With the non-experienced judges, the addition of Biographical Data produced an initial decrease in accuracy for F 2 (See Figure 8), but no effects were found for F 1 (See Figure 8). The addition of this phase produced an initial decrease in accuracy for both of the medium-experienced judges. P 1's predictions (See Figure 3) remained stable at 60% accuracy and P 2's predictions (See Figure 4) remained stable at 50% accuracy. The addition of this phase had no effects on I 1 (See Figure 5); his predictions remained at 50%. I 2's predictions (See Figure 6) decreased initially from 60% to 50% accuracy

PAGE 63

with the new phase, and remained stable at this level.

Phase 4 MMPI's + Rorschachs + Biographical Data + Formulas.--The addition of this phase produced overall the maximum increase in accuracy of all phases. The addition of this phase elicited an initial increase in accuracy for the two non-experienced judges (See Figure 3). F 1's accuracy (See Figure 1) increased to a maximum of 70% but decelerated within this phase. F 2's accuracy increased initially (See Figure 8) to 60% (See Figure 2) and occasionally to 70%. The addition of Phase 4 (formulas) produced the maximum increase in accuracy for the non-experienced judges. For P 1 (See Figure 3) the addition of the formulas produced neither an initial step in accuracy (See Figure 8) nor a celeration effect. His predictions remained stable at 60% accuracy. The same occurred for P 2 (See Figure 4), but in this case his predictions remained stable at 50% accuracy. Figure 8 indicates that the addition of Phase 4 (formulas) produced no initial effect on I 1 and an increase in accuracy for I 2. I 1's accuracy remained constant at 50% (See Figure 5). I 2's accuracy decelerated from an initial 70% accuracy to a terminal 60% accuracy (See Figure 6).

To summarize, the addition of new clinical information to the judgmental process did not substantially increase accuracy across phases. The only exceptions were the replication of the increase in accuracy (sometimes to 70%) for both non-experienced judges (F 1; F 2) with the addition of Phase 4 (formulas). Also with the addition of Phase 4 (formulas), I 2's accuracy increased to a maximum of 70% with a terminal accuracy of 60%. It is interesting to note that the

PAGE 64

two non-experienced judges increased in accuracy from 50% to 70% with the addition of Phase 4 (formulas). Shagoury (1971) found that these formulas predicted accurately 83% of the sample of homicides. It seems, from these results, that non-experienced judges (F 1; F 2) tended to ignore the actual test protocols and looked for the relevant cues provided by the formulas. The same could be said for I 2. Nevertheless, no judge approximated the overall accuracy of the formulas.

On the basis of the preceding findings, a few general comments can be made. Some of these comments may, at present, lack generality. This study is only a first attempt to apply the single-subject research methodology of experimental analysis to study the clinical judgment process. Future replications of these findings will provide the final test of the reliability and generality of the present data.

The present results are in conflict with more traditional studies of increase in levels of information. These studies (Shagoury & Satz, 1969; Moxley & Satz, 1970) found that accuracy increased as levels of information increased. It should be pointed out, however, that the kind of information presently used was in part different from the two previous studies. In these two studies, the information used was quantitative (Z scores; base rates; conditional probabilities; etc.), and in the present study, some information (MMPI's; Rorschachs) was qualitative, and some (biographical data; formulas) was quantitative. It is interesting to note that, in the present study, Phase 4 (formulas), which was purely quantitative data, produced the maximum increase in accuracy. Future applications of experimental analysis

PAGE 65

to clinical judgment should use quantitative data only, so that a better comparison between these studies can be accomplished.

The present results indicated that the "good" clinician (identified by consistently predicting correctly above chance) is stably "good" in his clinical judgment, and not merely the "momentarily luckiest fellow among a crew of equal or near-equal mediocre guessers" as stated by Meehl (1954). This finding was replicated across phases for P 1, the only "good" clinician that could be identified, as well as within phases for F 2 and P 2. This finding is intriguing and warrants more longitudinal studies of "good" judges.

A discouraging result was the overwhelmingly low accuracy of most judges in the present sample. Shagoury (1971) found that a discriminant function analysis discriminated accurately 83% of the total sample. Most judges discriminated between homicides and non-homicides at 50% accuracy, with the best judges reaching a ceiling of 70% accuracy. A number of interpretations can be provided to explain these results. One is that the random sample of cases chosen could have been the ones missed by the discriminant function, and, thus, the most difficult to discriminate. A second possibility is that special training may be needed to combine the available information to make an accurate discrimination. This possibility is warranted by the observation that experience level had no noticeable effect on judgmental accuracy. Future research should test this possibility by using feedback to specifically train judges to discriminate between homicide and non-homicide test protocols and test the accuracy of their predictions with a new sample. A third and most threatening

PAGE 66

hypothesis, previously proposed by Meehl (1956), is that clinical judgment is not one of the talents of the clinician and therefore he should relinquish the role of clinical judgment to the more accurate computer. Recent findings by Blumetti (1972) provide the most convincing argument against this proposition.

Most importantly, the present study brought a new sample of human behavior, in this case clinical judgment, under precise and continuous measurement. This was accomplished through the uniqueness of the Standard Behavior Chart.

PAGE 67

APPENDICES

PAGE 68

APPENDIX A Instructions

PAGE 69

INSTRUCTIONS

Phase I

This is a research study investigating clinical judgment. You will be presented with 20 Minnesota Multiphasic Personality Inventory (MMPI) profiles of inmates at Raiford State Prison. Ten of the twenty MMPI profiles belong to men convicted of first or second degree murder. The remaining ten profiles belong to men convicted of crimes against property; as breaking and entering, robbery, forgery or arson, but not of any crimes against the person (as assault). (That is, base rate = .5).

Your task is to try to discriminate between the MMPI profiles as to which belong to the homicide group and which do not. It is possible to correctly classify all the profiles. It is hoped that your prediction will in some way help us to understand one aspect of the decision making process as it is applied by psychologists in clinical settings.

PAGE 70

INSTRUCTIONS

Phase II

In this phase you will be presented with 20 cases of inmates at Raiford State Prison. Each case in the folder has the appropriate MMPI and Rorschach protocol. Ten of the 20 cases belong to men convicted of first or second degree murder. The remaining 10 profiles belong to men convicted of crimes against property; as breaking and entering, robbery, forgery or arson, but not of any crimes against the person (as assault). (That is, base rate = .5).

Your task is to try to discriminate between the cases as to which belong to the homicide group and which do not. It is possible to correctly classify all the profiles. It is hoped that your prediction will in some way help us to understand one aspect of the decision making process as it is applied by psychologists in clinical settings.

PAGE 71

INSTRUCTIONS

Phase III

This phase is similar to the previous phase except that each case in the folder has the appropriate MMPI, Rorschach and biographical data. Ten of the 20 cases belong to men convicted of first or second degree murder. The remaining 10 cases belong to men convicted of crimes against property; as breaking and entering, robbery, forgery or arson, but not of any crimes against the person (as assault). (That is, base rate = .5).

Your task is to try to discriminate between the cases as to which belong to the homicide group and which do not. It is possible to correctly classify all the profiles. It is hoped that your prediction will in some way help us to understand one aspect of the decision making process as it is applied by psychologists in clinical settings.

PAGE 72

INSTRUCTIONS

Phase IV

In the past several weeks you have been making decisions based on MMPI, Rorschachs and biographical data. The purpose of the present phase is to provide you with the optimal salient findings (Shagoury and Satz, 1971) of the statistical analysis, performed on the 60 protocols of which the present 20 is a random sample, as found by the computer.

No variable by itself was discriminatory. However, when the data was subjected to a multivariate analysis, the following variables in some combination (i.e., linear) were shown to correctly classify 80% of the sample (only 7 homicides and 3 controls were misclassified, yielding a valid positive rate of 70% and a false positive rate of 10%). These are the salient variables as found by the computer:

Variables                                Confidence Value (T)
Goldberg Score                              7.51 *
M Responses                                17.74 *
Total Rorschach Responses                   1.70
Percentage of Human Content                 4.46
Percentage of Minus Responses              20.72 *
Percentage of Whole Responses             -28.50 *
Sum C                                      -8.60 *

PAGE 73

Variables (Continued)                    Confidence Value (T)
Total Pathological Content Responses        8.41 *
I.Q.                                       -6.33 *
Grades Completed                           -4.27
Prior Felony Convictions                   -4.50
Prior Misdemeanor Convictions              -3.13

* T (12, 47) = 5.44, p < .05

Summary of Table: The homicide group showed a higher Goldberg score (+7.51), more M responses (+17.74), a higher percentage of minus responses (+20.72), a lower percentage of W responses (-28.50), a lower Sum C (-8.60), more responses of pathological content (+8.41), and a lower IQ (-6.33). No significant differences between homicide and non-homicide groups were found with respect to total number of Rorschach responses (1.70), percentage of human content (4.46), education level (-4.27), and prior misdemeanor or felony convictions (-3.13; -4.50).

Interaction of EEG and Personality Variables

The following analysis was added by Shagoury and Satz after the computer analysis, and suggests the possibility of a non-linear combination of data. Although considered altogether the abnormal and normal EEG groups showed no difference on the personality measures, the question of severity of EEG abnormality and personality disturbance remained.

PAGE 74

There were 11 cases of severe abnormalities in EEG. Five of them occurred in the homicide group and six in the non-homicide group. However, every homicide case with a severely abnormal EEG was associated with personality disorganization, whereas only one case in the non-homicide group with a severe abnormality in the EEG was associated with personality disorganization. Borderline abnormalities in EEG were not discriminatory, with a trend toward more abnormality in the EEG in the non-homicide group.

The following rule can be stated: If the biographical-personality variables point to disorganization and the EEG is severely abnormal, consider higher probability of homicide behavior. However, if the biographical-personality data is not disorganized, and the EEG is severely abnormal, consider the likelihood of non-homicide.

Your task at this time is to consider these variables (Computer and interaction) and make your predictions as to which profiles are homicide and which are not.

Goldberg's Scores for MMPI Profiles

The MMPI data was evaluated for the degree of personality disorganization by means of Goldberg's (1965) formulation. The Goldberg formula is a quantitative equation based upon the following scales:

X = L + Pa + Sc - (Hy + Pt)

If the Goldberg value is high (X > 55) the S is classified psychotic, and if the Goldberg value is low (X < 35) the S is classified neurotic. Intermediate Goldberg values are considered indeterminate. Using these

PAGE 75

cut-off values, Goldberg found a hit rate of 74%, with valid positive rate of 62% and false positive rate of 18%. It has proven to be one of the better decision rules for differentiating psychotic from neurotic profiles.

Goldberg scores for your sample of 20 protocols:

Protocol # Score 44520
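The Goldberg decision rule described in these instructions can be sketched as follows. The formula is the Goldberg (1965) index, L + Pa + Sc - (Hy + Pt), computed from MMPI scale scores. The cut-off values of 55 and 35 are taken from the instructions above, and treating them as strict inequalities is an assumption, since the comparison signs are not legible in the source. The function names and the sample scores are hypothetical.

```python
def goldberg_index(L, Pa, Sc, Hy, Pt):
    """Goldberg (1965) index from MMPI scale scores: L + Pa + Sc - (Hy + Pt)."""
    return L + Pa + Sc - (Hy + Pt)


def classify(x, high=55, low=35):
    """Classify a Goldberg value with the cut-offs stated in the instructions.

    Assumption: values strictly above `high` are psychotic and values strictly
    below `low` are neurotic; everything in between is indeterminate.
    """
    if x > high:
        return "psychotic"
    if x < low:
        return "neurotic"
    return "indeterminate"


# Hypothetical profile, not a protocol from the study.
x = goldberg_index(L=50, Pa=70, Sc=80, Hy=60, Pt=65)
label = classify(x)
```

For the hypothetical profile the index is 75, which falls above the upper cut-off and is therefore classified psychotic under these assumptions.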

PAGE 76

APPENDIX B Daily Correct and Incorrect Frequencies

PAGE 77

[Printed pages 67 through 78 contained the charts of daily correct and incorrect frequencies; the charts did not survive text extraction. The only recoverable axis label is "MOVEMENTS PER MINUTE".]

PAGE 89

REFERENCES

Ammons, R.B. Effects of knowledge of performance: a survey and tentative theoretical formulation. Journal of General Psychology, 1956, 54, 279-299.
Bachrach, A.J. Psychological Research: an introduction. Random House, New York, 1965.
Bilodeau, E.A. and Bilodeau, I.M. Motor-skills learning. Annual Review of Psychology, 1961, 12, 243-280.
Blumetti, Anthony. A true test of clinical versus statistical prediction. Unpublished Doctoral dissertation, University of Florida, 1972.
Cole, J.K. and Magnussen, M.G. Where the action is. Journal of Consulting Psychology, 1966, 30, 539-543.
Goldberg, L.R. The effectiveness of clinicians' judgments: the diagnosis of organic brain damage from the Bender-Gestalt test. Journal of Consulting Psychology, 1959, 23, 25-33.
Goldberg, L.R. Diagnosticians versus diagnostic signs: the diagnosis of psychosis versus neurosis from the MMPI. Psychological Monographs: General and Applied, 1965, 79 (whole #602).
Goldberg, L.R. Simple models of simple processes: some research on clinical judgments. American Psychologist, 1968, 23, 483-496.
Grebstein, L. Relative accuracy of actuarial prediction, experienced clinicians and graduate students in a clinical judgment task. Journal of Consulting Psychology, 1963, 27, 127-132.
Hammond, K.R., Hursch, C.J. and Todd, F.J. Analyzing the components of clinical inference. Psychological Review, 1964, 71, 438-456.
Hoffman, P.J. The paramorphic representation of clinical judgment. Psychological Bulletin, 1960, 57, 116-131.
Holt, R.R. Clinical and statistical prediction: a reformulation and some new data. Journal of Abnormal and Social Psychology, 1958, 56, 1-12.

PAGE 90

80

Holt, R.R. Yet another look at clinical and statistical prediction: or, is clinical psychology worthwhile? American Psychologist, 1970, 25, 337-349.
Hunt, W.A., Wittson, C.L. and Hunt, E.B. A theoretical and practical analysis of the diagnostic process. In P.H. Hoch and J. Zubin (Eds), Current problems in psychiatric diagnosis. New York, Grune & Stratton, 1953, 53-65.
Hunt, W.A., Arnhoff, F.N. and Cotton, J.W. Reliability, chance and fantasy in interjudge agreement among clinicians. Journal of Clinical Psychology, 1954, 10, 292-296.
Hunt, W.A. and Jones, N.F. The experimental investigation of clinical judgment. In A.J. Bachrach (Ed) Experimental foundations of Clinical Psychology, 1962, 26-51.
Lindsley, O.R. Direct behavioral analysis of psychotherapy sessions by conjugately programmed closed-circuit television. Psychotherapy: Theory, Research and Practice, 1969, 6, 71-81.
Little, K.B. Research etiquette in the study of clinician's behavior. Journal of Consulting Psychology, 1967, 31, 16-18.
Luft, J. Implicit hypotheses and clinical predictions. Journal of Abnormal and Social Psychology, 1950, 45, 756-759.
Mann, R.D. A critique of P.E. Meehl's Clinical versus Statistical Prediction. Behavioral Science, 1956, 1, 224-230.
Meehl, P.E. Clinical versus Statistical Prediction. Minneapolis: University of Minnesota Press, 1954.
Meehl, P.E. Wanted a good cookbook. American Psychologist, 1956, 11, 263-272.
Meehl, P.E. A comparison of clinicians with five statistical methods of identifying psychotic MMPI profiles. Journal of Counseling Psychology, 1959, 6, 102-109.
Meehl, P.E. The cognitive activity of the clinician. American Psychologist, 1960, 15, 19-27.
Meehl, P.E. Seer over sign: the first good example. Journal of Experimental Research in Personality, 1965, 1, 27-37.
Moxley, A.W. The effects of statistical information on clinical judgment. Unpublished Doctoral dissertation, University of Florida, 1970.

PAGE 91

81

Moxley, A.W. and Satz, P. The effects of statistical information on clinical judgment. Proceedings: 78th Annual Convention, APA, 1970.
Oskamp, S. Clinical judgment from the MMPI: simple or complex? Journal of Clinical Psychology, 1967, 23, 411-415.
Payne, R.W. Diagnostic and personality testing in clinical psychology. American Journal of Psychiatry, 1958, 115, 25-29.
Perez, F.I. The effects of feedback on clinical predictions. Unpublished Master's Thesis, University of Florida, 1970.
Perez, F.I. and Satz, P. The effects of feedback on clinical predictions. Proceedings: 79th Annual Convention, APA, 1971.
Rotter, J.B. Can the clinician learn from experience? Journal of Consulting Psychology, 1967, 31, 12-15.
Sarbin, T.R., Taft, R. and Bailey, D.E. Clinical inference and cognitive theory. New York: Holt, Rinehart and Winston, 1960.
Sawyer, J. Measurement and prediction, clinical and statistical. Psychological Bulletin, 1966, 66, 178-200.
Sechrest, L., Gallimore, R. and Hersch, P.D. Feedback and accuracy of clinical predictions. Journal of Consulting Psychology, 1967, 31, 1-11.
Sidman, M. Tactics of Scientific Research. Basic Books, Inc., New York, 1960.
Shagoury, P. and Satz, P. The effect of statistical information on clinical prediction. Proceedings: 77th Annual Convention, APA, 1969, 310-311.
Shagoury, P. An exploratory investigation of homicidal behavior. Unpublished Doctoral dissertation, University of Florida, 1971.
Taft, R. The ability to judge people. Psychological Bulletin, 1955, 52, 1-28.
Taft, R. Multiple methods of personality assessment. Psychological Bulletin, 1959, 56, 333-351.
Ullmann, L.P. and Krasner, L. Research in behavior modification. Holt, Rinehart and Winston, New York, 1965.
Ulrich, R., Stachnik, T. and Mabry, J. Control of human behavior. Scott, Foresman & Co., Glenview, Illinois, 1956.

PAGE 92

82

Ulrich, R., Stachnik, T. and Mabry, J. Control of human behavior. Scott, Foresman & Co., Glenview, Illinois, 1970.
Underwood, B.J. Experimental Psychology. Appleton-Century-Crofts, New York, 1966.
Watley, D.J. Feedback training and improvement of clinical forecasting. Journal of Counseling Psychology, 1968, 15, 167-171.
Wiggins, N. and Hoffman, P.J. Three models of clinical judgment. Journal of Abnormal Psychology, 1968, 73, 70-77.
Wolking, W. and Schwartz, V.A. Applied Behavior Analysis and Learning Disorders. In P. Satz and J. Ross (Eds) The Disabled Learner: Early detection and intervention. Rotterdam, The Netherlands: University of Rotterdam Press, 1972, In Press.

PAGE 93

BIOGRAPHICAL SKETCH

Francisco I. Perez was born in Havana, Cuba, May 21, 1947. He came to the United States in October, 1960. He graduated from Belen Jesuit Preparatory School, Miami, Florida, in June, 1965, and received his Bachelor of Arts in psychology from the University of Florida in June, 1969. In June, 1969, he enrolled in the Graduate School of the University of Florida where until the present he has pursued his work toward degrees of Master of Arts and Doctor of Philosophy. During this period he has held a USPHS Fellowship and Second-level Veterans Administration Traineeship. He received his Master of Arts degree in psychology in December, 1970. Currently, he is married to the former Georgina M. Montero.

82

PAGE 94

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Paul Satz, Chairman
Professor of Psychology and Clinical Psychology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Henry S. Pennypacker
Professor of Psychology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Hugh C. Davis
Professor of Psychology and Clinical Psychology

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Jacquelin R. Goldman
Associate Professor of Psychology and Clinical Psychology

PAGE 95

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

William D. Wolking
Associate Professor of Education

This dissertation was submitted to the Department of Psychology in the College of Arts and Sciences and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

August, 1972

Dean, Graduate School
