Title: Clinical judgment and the use of psychological reports
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00099596/00001
 Material Information
Title: Clinical judgment and the use of psychological reports
Physical Description: vii, 209 leaves : ; 28 cm.
Language: English
Creator: Siwy, James Martin, 1951-
Copyright Date: 1984
Subject: Psychological tests -- Case studies   ( lcsh )
Clinical Psychology thesis Ph. D
Dissertations, Academic -- Clinical Psychology -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
Case studies   ( lcsh )
Statement of Responsibility: by James Martin Siwy.
Thesis: Thesis (Ph. D.)--University of Florida, 1984.
Bibliography: Bibliography: leaves 195-208.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00099596
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 000476222
oclc - 11750192
notis - ACP2476










This dissertation owes its existence in no small

part to the encouragement and tutelage of Roger Blash-

field, whose enthusiasm, suggestions and criticisms

guided me through every step of the process. Roger's

research acumen helped discern the essential from the

nonessential; his patient nurturance sustained the

belief that the project was not only worthwhile, but

also possible.

In like manner, Lynn Robbins deserves great praise

for her generous assistance with the data analysis.

Her special effort in enabling a thorough, efficient

analysis to be done was marked by personal encouragement

and unselfish donation of her time and talent.

I would also like to extend my appreciation to

fellow colleagues, June Sprock, Mark Gilbertson, Roberta

Isleib, Rich Shapiro and Marna Cohen, for their contri-

butions and helpful ideas in the development of this

project. The thoughtful criticisms of Eileen Fennell,

Mary McCaulley and Scott Miller assisted me in con-

structing a feasible research design. The wise advice


of Austin Creel was finally heeded and much appreciated

for the completion of the manuscript.

Special recognition goes to Bonnie Jo DeCourcey

and Larry Leshan for their creative portrayals of the

interviewed patients. Others who provided skillful

assistance were Susan Linn, Hank Rowland, and Dawn

Lopresto. And, of course, the project was made possible

by the effort of the clinicians who volunteered their

time amidst busy schedules.

Finally, my deepest appreciation goes to my family

for the motivation they provided: Janet, with her

loving support and encouragement, and Elizabeth Anne,

with her spirited insistence on entering the world just

prior to the completion of this paper.



ACKNOWLEDGEMENTS

ABSTRACT

Introduction
The Significance of Judgment
Definition of Judgment
The Domain of Psychological Research on Judgment
Results and Criticisms of Research on Judgment
Research Suggestions

Introduction
Historical Background
The Referral Process as an Interprofessional Relationship
The Referral Process
Empirical Research on the Use of Psychological Reports
Applications in the Present Study

CHAPTER III. METHOD

Summary
Hypotheses
Subjects
Materials
Measures
Procedures

Subject Characteristics
The Hypothetical Cases
Hypothesis #1
Certainty Measures
Hypothesis #2
Hypothesis #3
Hypothesis #4

DISCUSSION

ADMITTING NOTES

SYMPTOM CHECKLIST

RECALL NOTES

FINAL QUESTIONS

BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



James Martin Siwy

August, 1984

Chairman: Roger K. Blashfield, Ph.D.

Major Department: Clinical Psychology

Psychological test reports have long been a common

component of the information used to make psychiatric

judgments. A review of the literature on human judg-

ment and on the use of psychological reports revealed

that reports had received little empirical study and

that some of the common findings in judgment research

had not been tested in the clinical area. Judgment

research has emphasized persistent biases such as over-

confidence and inability to use disconfirming informa-

tion. Using hypothetical psychiatric cases in a

naturalistic manner, this study tested how psychiatrists

and clinical psychologists made use of psychological

reports that either confirmed or disconfirmed their

previous judgments. In two sessions, clinicians were

given increments of clinical information about two

patients whose descriptions were designed to maximize

diagnostic ambiguity. In the second session the

psychological report for one patient confirmed the

subject's first judgment about the presence of thought

disorder; the report for the other patient disconfirmed

that judgment. Contrary to the main hypothesis, the

32 subjects tended to follow the reports rather than

to ignore disconfirming evidence. Clinicians judged in

a rational, discriminating and conservative manner.

Confidence did not increase over time as predicted. No

significant differences in judgment were found according

to profession or level of clinical experience. The

results support a more rational model of human judgment

and suggest that psychological reports can have a signi-

ficant influence on diagnostic judgment.



The status of psychological testing within clinical

psychology has generated controversy through much of

the past thirty-five years. Questions of the appropriate-

ness and usefulness of testing have often been linked

to the issue of professional identity, particularly

concerning the psychologist's role in the mental health

field vis-a-vis other mental health professionals. In

contrast to the apparent importance of psychological

test reporting through the years, the literature has

revealed very little substantial research on the use

of psychological reports.

The activities that comprise a psychological assess-

ment are complex. Numerous factors affect the referral,

testing and consulting processes. These will be discussed

in the latter part of the literature review. Of particular

interest will be the use of test reports by the referring

party or report "consumer." As the survey of the litera-

ture will reveal, little empirical work has been done on

how information obtained from a psychological consulta-

tion is integrated with other clinical information.

Consideration of the use of psychological test

reports is informed by the more general studies of the

use of information in person perception, judgment and

decision making. These areas of research have become

prominent in the burgeoning literature of cognitive

psychology, which is the general domain of the present

topic. The reading of a psychological report can be

viewed as a psychological event with many variables

that would potentially relate to the effects of the report.

These include such processes as attention, comprehension,

evaluation, information processing, memory and, finally,

action. Another way of framing the topic is to say that

the clinical use of test reports is a particular type of

integration of information on person perception. Rather

than extending the discussion to encompass the many topics

of cognition, the following review will address the study

of judgment as the most pertinent context in which to

consider the proposed study. A discussion of definitions

will help clarify and delimit the topic.

The Significance of Judgment

Judgment has long held an esteemed position as one

of the processes of the mind. Philosophers have, through

the centuries, commented upon it and described it,

often as a simple, straightforward activity, unmuddled

by the complexities attributed to it by modern psychologi-

cal research. Cassirer (1944), for example, in his

discussion of the Stoic view of life, stated:

Judgment is the central power in man, the common
source of truth and morality. For it is the
only thing in which man entirely depends on himself;
it is free, autonomous, self-sufficing. (p. 8)

Further citations would be unnecessary in establishing

that the judgmental process has had a vital place in

human affairs. Indeed, the use of judgment has been

recognized through the ages as one of the distinctive

demands of being human.

Clinical judgment, therefore, is not merely an

academic expression of an opinion. It involves elements

of commitment and responsibility. When clinicians make

their judgments, they are in a sense establishing precedent

as to what sorts of patients will be described in particu-

lar ways. To a certain extent these judgments may be

evaluated by comparison with established criteria. However,

the means to evaluate judgmental validity scientifically

is limited. Thus, clinical judgment will be accepted partly,

if not predominantly, according to the authority bestowed

upon the clinician. The clinician must shoulder the commit-

ment and responsibility for his judgment in much the same

manner as Polanyi (1962) has described judicial and

research judgment:

The course of scientific discovery resembles the
process of reaching a difficult judicial decision . . .
in both cases a passionate search for a
solution that is regarded as potentially pre-
existing, narrows down discretion to zero and
issues at the same time in an innovation claiming
universal acceptance. In both cases the original
mind takes a decision on grounds which are
insufficient to minds lacking similar powers
of creative judgment. The active scientific
investigator stakes bit by bit his whole profes-
sional life on a series of such decisions and
this day-to-day gamble represents his most
responsible activity. (pp. 309-310)

As will be discussed further, this personal element

suggests that research on judgment should begin with

examination of what clinicians are doing rather than

leaping ahead to an assessment of how validly they are

judging in comparison with a contrived standard. What

is central to the judgmental process is the personal

commitment of the judge:

Though every choice in a heuristic process is
indeterminate in the sense of being an entirely
personal judgment, in those who exercise judgment
competently it is completely determined by their
responsibility in respect to the situation con-
fronting them. (Polanyi, 1962, p. 310)

The point here is not to debate whether such

responsibility and authority should be granted to clinicians. In

most clinical situations action does depend on such personal

judgment. What is required is a deeper understanding of

what judgment is and how it in fact is used in regard

to patients today.

Definition of Judgment

Discussions of clinical judgment in the psychologi-

cal literature seldom refer to definitions outside that

literature. Before looking at how psychologists define

judgment, it will be helpful to examine more general,

formal definitions.

Dictionaries reveal that "judgment" has multiple

meanings and usages. Most definitions are of a general

nature, such as "that function of the mind whereby it

arrives at a notion of anything; the critical faculty;

discernment" (Oxford English Dictionary, 1971), or "the

power or ability to decide on the basis of evidence"

(Webster's Third New International Dictionary, 1976).

Webster's divided the philosophical consideration of

judgment into the Scholastic and Kantian approaches. The

former concerns the realm of value: "the capacity to

arrive at a decision about the value of things." The

Kantian view is of two types. The first involves "the

power of relating particulars to general terms or concepts."

This can be "reflective judgment," which "proceeds from

given particulars to the discovery of a general concept

or universal principle under which the particulars may be

subsumed." Alternatively, the first type may be

"determinative," which "proceeds from a general concept

or universal principle and designates the particulars

which are to be subsumed under the general." This

first type would seem to describe inductive and deductive

reasoning, respectively. The second Kantian definition

in Webster's described judgment as "a capacity mediating

between reason and the understanding; broadly the

critical faculty" (emphasis in original).

A review of psychologists' definitions of judgment

reveals that the Scholastic or value issue has tended

to be avoided and that the Kantian definitions have been

employed. Much attention has been given to the reasoning

process, both reflective and determinative. In addition,

there has been a recognition of judgment as something

beyond reasoning.

Some psychologists who study judgment avoid explicit

definitions. For instance, Rappoport and Summers (1973)

expressed the presuppositions of a number of researchers

who had attended a conference on judgment. They saw

judgment as an aspect of thinking, based upon the goal

of adaptation to uncertainty, that allows maintenance of

organization and continuity in behavior while "going

beyond perceptual and cognitive 'givens'" (p. 4). Judg-

ment is used to mediate between a person's intentions and

the uncertain environment; it is thus centered upon the

relationship of a cognitive system with an environmental

system. Similarly, in lieu of a definition, Newell

(1968) referred to judgment as "an umbrella term,"

comparable to such terms as "perception," "learning"

and "cognition." These terms need not be well defined

because they designate a general class of phenomena

about which theories are developed. He preferred to

classify judgment within "pretheoretical language" that

eschews precise definitions. Instead, he offered a list

of definitional strands that serve to circumscribe the


Other psychologists have been more explicit in

defining the term. For instance, Bieri, Atkins, Briar,

Leaman, Miller and Tripodi (1966) saw judgment as the

assignment of a stimulus to a response category. Johnson

(1972) subscribed to a similar approach, with the emphasis

that judgment functions to bring order to matters of


Judgment begins with unordered objects, events, or
persons, assigns them to specified response cate-
gories so as to maximize the correspondence between
the responses and the critical dimension of the
stimulus objects, and thus ends with a more orderly
situation. (p. 340)

An essential element in most definitions of judgment

has distinguished the process from mere application of

rules of reasoning. Newell (1968) pointed out that

Judgment fills the gaps in rational calculation.
If the calculation could do it all, then no
judgment is required. (p. 4)

He added that judgment has "a flavor of immediacy, and

. . . nonrationality" (p. 4). Responses to complex situations

cannot be formally programmed:

The upshot is that whatever can be analyzed, becomes
part of the formal apparatus, and the remainder is
left to be subjective judgment. Thus, although one
may still assert that there is good judgment and
bad--that a man can somehow arrive at his inputs
to the system appropriately or not--judgment occurs
in precisely those situations where the scientist
is least able to attribute rationality to them.
(Newell, 1968, p. 5)

Perhaps this aspect of judgment was best captured by

Hammond, Stewart, Brehmer and Steinmann (1975), who

described human judgment as "a cognitive activity of

last resort" (p. 272).

Different views have existed as to whether judgment

is to be distinguished from decision making. Slovic and

Lichtenstein (1973) observed such a distinction to be

"a tenuous one" (p. 16). Bieri et al. (1966) tended

to agree with this view.

Others have preferred a separation of the topics

(Einhorn, Kleinmuntz & Kleinmuntz, 1979; Hogarth, 1980,

1981). For example, Hogarth, who used "choice" in place

of decision, drew the following relationship between

judgment and choice:

Judgment can be thought of as providing a temporal
background of mental activity that is punctuated
by particular choices. This is exemplified in
the target model, where aiming is used as the
analogy for judgment, and the actual shooting is
analogous to choice. This does not, of course,
deny that one can shoot without aiming, just as
one can ignore judgment in choice. (1981, p. 201)

In a more recent paper, Einhorn and Hogarth

(1983) distinguished "diagnostic inference" from predictive

clinical judgment. The former was defined as the

inferences of the causal process that could have pro-

duced an observed set of "outcomes/results/symptoms"

(p. 1). The authors were apparently dealing with the

question of etiology as one aspect of diagnosis. Although

not explicitly stated by these authors, it would seem

that diagnostic inference is one part of clinical judg-

ment to be considered in relation to the essential

predictive aspect.

From the above discussion, several points apply

to the consideration of clinical judgment in the proposed

study. First, judgment is perhaps the most crucial

cognitive activity of the professional. Judgment involves

reasoning and assumption of responsibility that cannot

rely on calculations alone. Second, because judgment is

so closely related to decision, it should not be relegated

to the refined category of cognition, but must be studied

in the context of action. Finally, although questions of

value have not generally been the focus in most research on

judgment, including the present study, the influence of

a judge's valuing or devaluing has to be incorporated

as an implicit background for judgments and choices.

The effect of evaluation will be more evident in the

consideration of judgmental biases. Although it may

at times be helpful to distinguish evaluation from pre-

diction, the person as judge never ceases to be an evalu-

ative person. This point may seem obvious; it is

included here as a matter of emphasis to contrast with the

objectivist bias that has dominated twentieth century

psychology (Polanyi, 1962).

The Domain of Psychological Research on Judgment


In recent years the study of judgment has been a

popular area for psychological research. As is true

for any relatively new topic, the study of judgment

has been approached from many perspectives. Among the

reviews of this literature, there does not appear to

be any consensus on how the topic may be divided. Newell

(1968) gave a reminder of the most basic division between

psychophysical judgment and the more "general" study of

judgment. Different authors have employed various

outlines in the discussion of the latter type, which is

the subject of this proposal.

A "Systems" Approach

One method of review has been the consideration of

the judgmental task according to the number of known

"systems" under observation. Using this approach,

Rappoport and Summers (1973) classified human judgment

according to a "general systems theory hierarchy" (p. 5).

This taxonomy includes four classes: the single-system

case, two-system case, three-system case and n-system case.


The single-system case consists of judgment made

under uncertain conditions which lack adequate feedback.

The authors cited both foreign policy decisions and

psychodiagnosis as examples. The latter was described:

Given ambiguous symptomatic information, the
diagnostician must categorize patients and
recommend treatment. He may then never see
the outcome or he may receive outcome informa-
tion that can be attributed to many factors
other than those leading to the diagnosis.
(Rappoport & Summers, 1973, p. 5)

The authors provided support for the focus of research

on the clinician's thinking rather than on characteris-

tics of patients:

The single-system case must therefore be under-
stood as largely focused on the cognitive
system of the judge, because the environmental
or task system remains obscure. (p. 5)

Hammond et al. (1975) noted that the single-system case

has been the one most studied in the research literature.

In the double-system case, the task outcomes, and

thus the task structure, are known. One person forms

judgments about one task system. The task system may

be a person. Accuracy of judgment and learning to improve

judgment have been topics frequently studied in this

case. Hammond et al. noted that research in this format

has shown the usefulness of "cognitive feedback" in

contrast to outcome feedback in improving judgmental

performance. When subjects were given information on

their own judgmental system or the task system, their

accuracy in judgment improved faster than if they were

given only outcome feedback on their performance. This

is an example of the use of "cognitive aids."

The triple-system case involves two judges and a

task. In social judgment theory, the focus has been on

cognitive differences between judges rather than the

differential gain as the source of conflict (Hammond

et al., 1975).

The final case is that of the n-system, in which

more than three persons are studied and the task system

may or may not be known. This case has examined the

characteristics of different factions and their policies.

Thus, it has encompassed the cognitive bases of conflict

within a group.

A Historical Perspective

A useful historical perspective was offered by

Bieri et al. (1966), who examined the study of judg-

ment as applied to clinical inference. The authors

listed three phases of this research. The first was

introspective reporting exemplified by Freud and, to

use a more recent example, Erikson's (1964) account.

The second phase, labelled the "reliability-validity

phase," refers to the period of intensive study of

psychological tests that took place during the decade

following World War II. The preoccupation of clinical

psychological research at that time was the reliability

and validity of clinical inferences based upon pro-

jective methods of assessment. The third phase was

the period of debate about clinical versus statistical

prediction, led by Meehl's (1954) book. Bieri et al.

(1966) concluded with what they called the "contemporary"

phase of development of theoretical models of the judg-

ment process. This phase can be divided into three

domains of psychological research.

The first domain has been that of cognitive social

psychology, which has studied clinical judgment in the

context of social perception. Implicit personality theory

has been a particular topic within this domain that is

applicable to judgment theory (Schneider, 1973; Wegner &

Vallacher, 1977).

The second contemporary domain has consisted of

the historical influences of the psychophysical approach

as applied to social judgment. This domain has not been

easily separated from the others because of the enduring

influence of psychophysical conceptual models. For

instance, Bieri et al. (1966) drew heavily upon psycho-

physical constructs in conceptualizing processes of social

judgment. They referred to a cue as an "input" which has

multidimensionality, or "differentiation," that can be

calibrated along intervals within a dimension. This last

measurement is "articulation" (p. 16). More recently,

Einhorn and Hogarth (1983) used a psychological

model in their discussion of diagnostic inference. They

began with the perceptual concept of figure/ground,

which includes the importance of an "assumed causal

background" to salient information. The concept of "net

strength" of a pattern of cues is invoked. This refers to

the relative amount of evidence for a particular diagnosis,

analogous to the detection of a signal among competing

signals. As in perception, diagnosis is a "constructive"

process dependent in part on variables related to the

diagnostician, such as expectations. Finally, like percep-

tion, diagnosis often is a rapid judgment that is made

without awareness of underlying processes.

The third and final domain suggested by Bieri et al.

(1966) has been the application of mathematical models to

judgmental processes. The literature within this domain has

been so extensive that some reviewers see this approach

as equivalent to the study of judgment. Most of the

following research models, although not precisely delineated

by Bieri et al., have constituted this third domain.

Research Models

Slovic and Lichtenstein (1973) provided one of the

most comprehensive reviews of this area. Although focus-

ing on the contrast and comparison of the regression and

Bayesian approaches, the authors also delineated several

other models of research. These models include process

tracing (Kleinmuntz, 1968); multidimensional scaling

(Wiggins, 1973); application of information theory (Bieri

et al., 1966); "probability learning" and signal detection

theory (Green & Swets, 1966). Slovic and Lichtenstein's

discussion revealed the overlapping of different models.

Their focus on the regression and Bayesian models accurately

represented the main direction of the literature.

They divided the regression or correlational approach

into two streams. The first has focused on the judge and

his or her ways of combining information into a judgment.

Hoffman (1960) was an early exemplar of this stream. Slovic

and Lichtenstein saw Anderson's (1965, 1974) extensive application

of integration theory as a continuing part of this stream.


The other stream has followed Brunswik's (1952,

1956) philosophy of "probabilistic functionalism," which

is centered upon the adaptive interrelationship between

the organism and the environment. Slovic and Lichtenstein

noted that Brunswik was "the foremost advocate" of

"'representative design'" (1973, p. 20), which calls for

the study of an organism in realistic settings so that

experiments may represent the "usual ecology" of the


They described Bayes' Theorem as a normative model

that prescribes optimal judgment based on specification

of "certain internally consistent relationships among

probabilistic opinions" (p. 31). The Bayesian approach

produces a "distribution of probabilities over a set of

hypothesized states of the world" (p. 30). It is aimed

at enabling maximization of utility in decision making.

Slovic and Lichtenstein noted that Brunswik could have

chosen to use conditional probabilities (Bayesian approach),

rather than correlations, for the assessment of relation-

ships in a probabilistic environment. The implementation

of either approach has involved mathematical modeling,

which has been the dominant preoccupation in judgment research.
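
The Bayesian revision of opinion described above can be illustrated with a minimal sketch. It is not drawn from Slovic and Lichtenstein; the function name, the two hypothesized states and every probability below are hypothetical values chosen only to show how a prior distribution over states is revised by one piece of evidence, such as a psychological report.

```python
def posterior(prior, likelihood):
    """Revise a distribution over hypothesized states with Bayes' Theorem.

    prior:      dict mapping each state to P(state)
    likelihood: dict mapping each state to P(evidence | state)
    """
    joint = {state: prior[state] * likelihood[state] for state in prior}
    total = sum(joint.values())  # P(evidence), the normalizing constant
    if total == 0:
        raise ValueError("the evidence is impossible under every state")
    return {state: p / total for state, p in joint.items()}


# Hypothetical first judgment: the clinician leans toward thought disorder.
first_judgment = {"thought disorder": 0.7, "no thought disorder": 0.3}

# Hypothetical likelihoods of a report that describes no thought disorder.
disconfirming_report = {"thought disorder": 0.2, "no thought disorder": 0.8}

revised = posterior(first_judgment, disconfirming_report)
for state, probability in revised.items():
    print(f"{state}: {probability:.3f}")
```

Under these assumed numbers the normative model shifts most of the probability to the state favored by the report. Actual judges may revise their opinions more or less than the model prescribes.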


As in any topic of psychological study, some researchers

have focused on individual differences. Wiggins (1973)

presented a "general individual difference model" that uses

multivariate statistics and multidimensional scaling

of stimuli. The model attempts to derive ideal "types"

of judges that reflect judgment by individuals more

informatively than a group mean and more parsimoniously

than purely idiographic analyses. Kaplan (1975) also

looked at individual differences, using Anderson's (1974)

basic model of information integration. In this model,

each judge assigns a "scale value" to information along

a particular judgmental dimension. This scale value can

be described as the "meaning" given to the information

in terms of the dimension of judgment. Information may

have different meanings according to different evaluative

dimensions; thus, stimuli are considered to be multi-

dimensional. The importance for judgment is to what extent

the information lies upon the same response dimension as

does the judgment.
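
One way to picture the integration of scale values is a weighted average, sketched below. The text above mentions only scale values; the weights, the averaging rule, the function name and all numbers are assumptions added for illustration. Here the weight stands for the extent to which a piece of information lies on the dimension being judged.

```python
def integrate(items):
    """Combine (scale_value, weight) pairs into one judgment by weighted averaging."""
    total_weight = sum(weight for _, weight in items)
    if total_weight == 0:
        raise ValueError("at least one item must carry nonzero weight")
    return sum(value * weight for value, weight in items) / total_weight


# Hypothetical scale values on a 1-to-9 dimension of "thought disorder".
interview_impression = (7.0, 1.0)  # (scale value, weight)
test_report = (2.0, 2.0)           # assumed to bear more directly on the dimension

judgment = integrate([interview_impression, test_report])
print(round(judgment, 2))
```

With these assumed values the more heavily weighted report pulls the integrated judgment toward its own scale value.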

The Contextual Paradigm

A summary of the various models of judgment research

indicates a general consensus on the use of a contextual

approach (Jenkins, 1974). The diversity of models may be

more apparent than substantial. Johnson (1972) observed

that many models cannot be compared because of their

specificity: "Most models are applicable only to those

situations in which they were developed" (p. 350).

Most authors have paid homage to Brunswik's idea of

representative design, if not the Lens Model itself. At

times new vocabularies have been applied. For instance,

in their presentation of social judgment theory, Hammond

et al. (1975) restated Tolman's idea of the "cognitive

map" and Brunswik's concept of the "texture" of the

organism and the environment. Simply put, relationships

are the "fundamental units of cognition" (p. 272). What

Brunswik referred to as "texture," these authors denoted

as "causal ambiguity." This concept refers to

the relationships among variables or cues. Social judg-

ment theory distinguishes between "surface," which con-

sists of given cues and "depth," which refers to the

inferred conditions. An example would be a person's

pattern of speech as the surface condition and thought

disorder as the depth condition. The region between sur-

face and depth is referred to as the "zone of ambiguity."

This zone encompasses the often complex entanglements of

interrelationships involving surface and depth. The

authors noted that a single effect may have multiple causes

and that one cause may have multiple effects. The causal

ambiguity is due to three inherent conditions: (1) there

is an imperfect relation between surface data and depth

variables; (2) the relation between these two sets of

variables may assume different forms, e.g. linear or

curvilinear; (3) this relation may be organized according

to a variety of principles, e.g. additivity or pattern.

The environment is dominated by "causal ambiguity." People

apply their cognitive processes to reduce this ambiguity

as much as possible. However, this active processing,

which includes such processes as perception, thinking

and learning, at times reaches its limit without resolu-

tion of the ambiguity. What remains is a "passive" means

to arrive at a conclusion. This is the essential situation

in which judgment occurs. Hammond et al. (1975) noted that

social judgment theorists have made the reduction of causal

ambiguity their primary goal. The method for doing this

has been the externalization of the properties of the zone

of ambiguity in each system, organismic and environmental.

This research paradigm is probably the most

widely followed one today. The most extensively published

group that has continued in this tradition is located at

the Center for Decision Research at the University of

Chicago (Einhorn & Hogarth, 1978; Hogarth, 1980; Einhorn &

Hogarth, 1982, 1983).

Results and Criticisms of Research on Judgment


A brief review of the research will focus upon judg-

mental biases. First, some general conclusions about the

status of knowledge on judgment will be quoted.

General Conclusions

Hammond et al. (1975) concluded that the following

"empirical regularities" exist both inside and outside

the laboratory:

(1) People do not describe accurately and completely
their judgmental policies, (2) people are often
inconsistent in applying their judgmental policies,
(3) only a small number of cues are used, (4) it is
difficult to learn another person's policy simply
by observing his judgments or by listening to his
explanations of them, (5) . . . cognitive aids . . . can
reduce conflict and increase learning, and (6) linear,
additive organizational principles are often adequate
to describe judgment processes. (p. 305)
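
Regularity (6) can be illustrated with a minimal sketch of what is often called "policy capturing": a judge's ratings are regressed on the available cues to recover the weights of a linear, additive policy. The cue values, the weights, and the function name below are hypothetical illustrations, and ordinary least squares is only one possible fitting method; none of them is taken from Hammond et al. (1975).

```python
def fit_linear_policy(cues, judgments):
    """Return least-squares weights (intercept first) for a linear, additive policy."""
    rows = [[1.0] + list(c) for c in cues]  # prepend a constant term to each cue profile
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y)
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, judgments))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical data: six cases described by two cues, rated by a judge who
# (by construction) follows the policy 1.0 + 2.0 * cue1 + 0.5 * cue2.
cues = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (2, 5)]
judgments = [1.0 + 2.0 * c1 + 0.5 * c2 for c1, c2 in cues]

weights = fit_linear_policy(cues, judgments)
print(weights)  # intercept, weight of cue 1, weight of cue 2
```

Because the simulated judge is perfectly consistent, the fitted weights reproduce the generating policy; with real judges, the residual variance would reflect the inconsistency noted in regularity (2).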

Similarly, Slovic and Lichtenstein (1973) offered

four generalizations about the state of knowledge of human

judgment. First, responses of judges are highly predictable

and can be represented in a quantifiable manner, often

more accurately than the judge can describe his own processes.

Second, because weighing and combining information are very

difficult, judges use simplified decision strategies to

"reduce cognitive strain" (p. 89). Third, the structure of

the situation being judged has a major effect on judgment.

This includes the sequences of information, manner of dis-

play and the nature of required response. Fourth:

Despite the great deal of research already completed,
it is obvious that we know very little about many
aspects of information use in judgment. (p. 89)

The authors added that "an enormous task" remains in

integrating research and theory on judgment into "the

mainstream of cognitive psychology," including such

topics as attention, concept formation, learning, memory

and problem solving (p. 89).

In more concise terms, Hogarth and Makridakis (1981)

noted that cognitive psychological studies of human judg-

ment have resulted in two major conclusions: "(1) ability

to process information is limited; and (2) people are

adaptive" (p. 116). Understanding of the context is a

paramount factor. The authors emphasized that people are

highly motivated to understand and control their environ-

ments. However, much error in human decision making can

be attributed to superficial information acquisition and

processing biases.

Common Shortcomings and Biases

Most reviews of judgment research have emphasized the

failings of human judges. Einhorn and Hogarth (1978) cited

several literature reviews and summarized the empirical

documentation of the limited capacity in human judgment:

Although the study and cataloguing of judgmental
fallibility have had a long history in psychology
. . . an accumulating body of recent research on
clinical judgment, decision making, and proba-
bility estimation has documented a substantial
lack of ability across both individuals and sit-
uations. (p. 395)

They noted the low predictive ability shown in clinical

settings, adding

It is apparent that simple statistical models for
combining information consistently provide more
accurate predictions than the judgments of clini-
cians. (p. 395)

Much of the research literature has been dominated by

the documentation of common biases in human judgment.

Hogarth (1980) presented an extensive list of such biases,

e.g. primacy/recency effects (Luchins, 1957, 1958; Leach,

1974) and the ignoring of base rate information (Kahneman &

Tversky, 1973). In lieu of reiterating the rest of the

list, the following comment on clinical judgment can serve

as a forthright reminder of the judgmental distortions

which clinicians are commonly prone to commit:

There is always enough evidence in a rich source
of data to nurture all but the most outlandish
diagnosis. (Arkes, 1981, p. 326)

In contrast, Christensen-Szalanski and Beach (1984)

presented evidence that a citation bias has existed in the

literature on human decision making. Studies which have

demonstrated the rationality of subjects' judgments have

been undercited as compared to studies which have emphasized

subjects' inadequacy in judgment. These authors argued

that a balanced view of the research demonstrated mixed

results in human judgment reliability, rather than only the

more publicized biases.

In the present study, the bias of interest is the

inability to use disconfirming evidence. Rappoport and

Summers (1973) stated that problems in cue congruency,

or disconfirming information, represented a "critical

issue" in the study of judgment. They referred to such

problems as "common." "How a judge responds to cue

inconsistency is not only theoretically important, but also

of considerable practical significance" (p. 143).

Hogarth (1980) noted that the disregard for inconsis-

tent information is a strategy born of the search for con-

sistency. He cited the phrase, "'thirst for confirming

redundancy,'" used by Bruner, Goodnow and Austin (1956).

Confidence in Judgment: Measurement and Findings

The difficulty shown by judges in adequately using

disconfirming information has been one factor related to

the problem of inappropriate confidence in judgment. The

definition of confidence used here is taken from Blaser

(1978): "the individual's emotional satisfaction and

certainty about a judgment at which he has arrived" (p. 1276).

The study of confidence in judgment has received recent

empirical attention within psychology. A brief review of

this research will include measurement issues, empirical

findings and explanatory discussion.

Johnson (1972) has been one of the few authors who has

devoted any discussion to how confidence in judgment can be

measured. He surmised that confidence can be measured either

semantically, e.g. with such terms as "very confident" or

"quite certain," or on a rating scale, which is the usual

procedure. Johnson suggested that subjects would feel

most comfortable with a rating scale extending from 0 to

100 because people are able to conceptualize confidence

in terms of percentage of certainty. He noted that dis-

tributions of confidence ratings are often U-shaped or

J-shaped because of the frequent usage of end points. His

review stated that in studies of psychophysical judgments,

confidence has been shown to increase in concordance with

an increase in the difference between the variable stimulus

and the standard.

Slovic (1966) measured confidence in judgment by

having subjects assign a probability estimate of the accuracy

of their ratings. This represented essentially the same

mode and range of response as suggested by Johnson (1972).

Most studies involving confidence ratings have employed

a Likert-type rating scale. For example, Cantor, Smith,

de Sales and Mezzich (1980) used confidence ratings made by

clinicians who rated how poorly or how well, on a scale from

1 to 7, written case descriptions of patients fit into the

diagnostic category chosen by the subjects. Eker (1981) and

Blashfield and Sprock (1983) also used 7-point rating

scales in the measurement of clinicians' confidence in diag-

nosis of hypothetical patients. In more general, non-

clinical studies of the construct, "decisiveness," Weissman

(1980) used a 10-point scale for "Confidence in Decision,"

one of six components of decisiveness. Other studies have

employed the same essential self-report rating (Oskamp,

1965; Ryback, 1967).

A major finding of a number of studies has been the

negative relationship between confidence and accuracy.

This has been obtained in both non-clinical and clinical


In a study that debunked popular attitudes about

"women's intuition," Valentine (1929) reported a tendency

for a negative relation between confidence in judgment

and accuracy:

A feeling of unusual confidence, then, gives no
clue to the reliability of the judgment: rather
it might seem to act as a warning that the judg-
ment is particularly unreliable. (p. 230)

Ryback (1967) found that, in a study of judgment involving

comparisons of geometric designs in the absence of systematic

feedback, undergraduates' confidence increased with

experience. Accuracy did not increase; there was no rela-

tionship between confidence and accuracy. Malpass and

Devine (1981) found that confidence and accuracy in eye-

witnesses' selections of vandals from a lineup were unrelated.

Witnesses who made choices tended to indicate high confidence

in their selections, whereas those who rejected the lineup

had low scores in self-ratings of confidence in their deci-

sions. The authors noted that this is consonant with

the findings of several other studies of witness con-

fidence and accuracy.

Oskamp (1965) studied psychologists' confidence in

their clinical judgment. He reported that confidence

increased steadily and significantly with increasing

information about a case, but that accuracy did not in-

crease. Similarly, Schinka (1976) found no significant

relationship between clinicians' confidence and their

accuracy in judgments.

Blaser (1978) studied the influence of different vari-

ables on the level of confidence in person perception. He

confirmed previous empirical findings that increased infor-

mation was related to increased confidence in person per-

ception. Also the type of personality characteristic being

judged was related to level of confidence. In his study

of ratings by ten experienced psychotherapists (five

analysts and five non-analysts), the rated variables ranked

from most to least confident were

intelligence, likeability, suffering, contact,
exhibitionism, endurance, maturity, defensive-
ness, suggestibility, readiness to accept inter-
pretations, introspection, reality conformity.
(p. 1278)

Blaser observed that each of these was a complex variable not

accessible to direct observation, so it was not clear

why they would differ in terms of judges' confidence

in their ratability. He also found that confidence level

was related to the personalities of the judges; that is,

some judges tended to be more confident than others in

their ratings generally. The author suggested that this

factor could relate to previous findings of no relation

between accuracy and confidence of clinical judgment.

Eker (1981) found that confidence in diagnosis for

American and Turkish clinicians was above the midpoint on

a 7-point scale even when cases had inadequate diagnostic

information. The high confidence was based upon knowledge

of present symptoms. Eker criticized the implied tendency

to not feel a need to search for other information, such

as historical data.

One consistent finding in research on clinical judgment

has been the rapidity with which clinicians arrive at a

diagnosis (Gauron & Dickinson, 1969; Sandifer, Hordun &

Green, 1970; Kendell, 1973; Clavelle & Turner, 1980). In

these studies, clinicians often produced diagnoses within

a very few minutes of being presented information. Although

these findings did not necessarily imply overconfidence,

they did raise the hypothesis of hasty decision making.

Several authors have offered explanations of the over-

confidence phenomenon, often in relation to the problem of

inattention to inconsistent information. Kahneman and

Tversky (1973) demonstrated that people made predictions by

selecting outcomes that were most representative of inputs.

They suggested that confidence in predictions varied directly

with the extent to which a selected outcome was more

representative of the input as compared with other out-

comes. The authors stated that consistent information

would increase confidence, even though it had no rela-

tionship to predictive accuracy. They concluded:

The foregoing analysis shows that the factors which
enhance confidence, for example, consistency and
extremity, are often negatively
correlated with predictive accuracy. Thus,
people are prone to experience much confidence
in highly fallible judgments, a phenomenon that
may be termed the illusion of validity. Like
other perceptual and judgmental errors, the
illusion of validity often persists even when
its illusory character is recognized. (p. 249;
emphasis in original)

Einhorn and Hogarth (1978) posed the following ques-

tion in their review:

How can the contradiction between the considerable
evidence on the fallibility of human judgment be
reconciled with the seemingly unshakable confi-
dence people exhibit in their judgmental ability?
In other words, why does the illusion of validity
persist? . . . Why does experience not teach
people to doubt their fallible judgment? (p. 396)

The authors answered their questions by focusing on the

environmental context in which judgments take place,

specifically on the relationship of judgmental tasks and

their outcomes. They noted:

In real-world situations, judgments are made for
the purpose of choosing between actions. This
means that outcome information, which is available
only after actions are taken, is frequently the
only source of feedback with which to compare
judgments. Therefore, to understand how people
learn about their judgmental ability, it is
necessary to consider judgments, actions, and
outcome feedback together. (pp. 396-397)

The authors presented a mathematical model for confidence

in judgment, which they redefined as "the strength of

the learned concept 'my judgment is accurate'" (p. 401).

Using learning theory principles, they derived an

equation representing confidence as a function of the

total feedback effect, which in turn was a function of the

number of decisions made, the probability of selecting a

particular action, the positive hit rate and the reinforcing

value of negative feedback (pp. 401-402). They suggested

that confidence in judgment was acquired slowly, then would

experience a relatively rapid rise and finally would level

off with large amounts of experience. The model suggested

that once confidence in judgment was achieved, it would

be highly resistant to extinction.

In another article, Einhorn (1980) offered the concept,

"outcome irrelevant learning structures (OILS)" (p. 4),

as an explanation for the pervasiveness of overconfidence in

judgment. A major source of these structures is the inabil-

ity to make use of disconfirming information. To overcome

this common bias, a judge had to be willing to try to falsify his

hypotheses, even when he felt certain of their correctness.

Einhorn noted that seeking disconfirmation is a relatively

recent methodology in the history of science. In general,

Einhorn posited three main factors to explain overconfi-

dence in judgment:

(1) lack of search for and use of disconfirming
evidence, (2) lack of awareness of environmental
effects on outcomes, and (3) the use of unaided
memory for coding, storing and retrieving out-
come information. (p. 13)

Einhorn (1980) offered a number of solutions to the

overconfidence problem. First, formal training in experi-

mental design, use of control groups, and awareness of

base rates could help offset the tendency to discount

disconfirming evidence. Adopting a comprehensive model

such as the Brunswik Lens Model could be helpful in gaining

awareness of environmental effects on outcomes. Finally,

keeping a record of judgments and outcomes could alleviate problems

due to memory deficiency.

In general, judgment appears to be a complex task

involving distillation of large amounts of information into

a structured scheme of reasoning to aid the act of decision.

Given the stress of the judgmental task, the judge tends to

seek out reinforcing information that would confirm his

efforts and conclusions. Perhaps Hogarth (1980) best

described the motive for overconfidence in his statement

that people were motivated to avoid "psychological regret"

for either taking or failing to take an action.

Einhorn (1980) tried to temper the problem with the


Perhaps overconfidence in judgment is in
some ways functional--lack of confidence
in judgment might result in too much
analysis and a crippling of the ability
to make quick choices. (p. 14)

However, the thrust of his writing delineated the essential

dilemma of fallible judgment:

We are in the common but unenviable position
of having to continually judge our judgments.
Furthermore, if judgmental biases are as frequent
as much research suggests . . . we must
seriously entertain the hypothesis that a great
deal of what we believe about our judgmental
ability is in error. (pp. 1-2)

Given these findings, the continuing empirical study

of judgment and judges' confidence would seem to be an

important project for clinical research. Adams and Adams

(1961) articulated a compelling rationale for making clini-

cians and other judges aware of their biases:

It may be that realism of confidence is in many
situations a more important variable than level
of performance itself. To take one example, it
might be more important, in terms of his future
work, social interactions, confidence in himself,
etc., for a student to be able to discriminate
realistically between what he knows and what he
does not know than it would be for him to know
considerably more than he does know without such
discrimination. Realism of confidence seems
obviously of importance, not only for scholars
but in general for individuals making decisions
having consequences of importance, either for
themselves or for others. (pp. 36-37)
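
The notion of "realism of confidence" can be made concrete with a minimal sketch. It assumes confidence is rated on the 0 to 100 scale discussed above and that each judgment can be scored as correct or incorrect. The data and function names are hypothetical, and the two indices shown (mean confidence minus proportion correct, and the correlation between confidence and correctness) are common illustrative choices rather than the specific measures used by Adams and Adams (1961) or the other studies reviewed.

```python
from statistics import mean


def overconfidence(confidence_pct, correct):
    """Mean stated confidence (as a proportion) minus the proportion correct.

    Positive values indicate overconfidence; zero indicates realistic confidence.
    """
    return mean(confidence_pct) / 100 - mean(correct)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Hypothetical ratings from one judge across eight cases:
# confidence on a 0-100 scale, and whether each judgment was correct (1) or not (0).
confidence_pct = [90, 80, 95, 70, 85, 60, 100, 75]
correct = [1, 0, 0, 1, 1, 0, 0, 1]

print(overconfidence(confidence_pct, correct))  # gap between confidence and accuracy
print(pearson_r(confidence_pct, correct))       # confidence-accuracy relationship
```

A judge with realistic confidence would show a gap near zero and a positive correlation; the pattern reported in much of the research reviewed here corresponds to a positive gap with a correlation near zero or below.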

Research Suggestions

From the authors reviewed, a number of recommendations

for research may be distilled. The most repeated recommenda-

tion has been for representative research designs. This

could involve a number of factors often overlooked in pre-

vious studies. For example, Kleinmuntz (1968) suggested

that a simulation experiment could better approximate a

real-life situation if "noise," in the form of irrelevant

or insufficient information, were included. This same

idea was expressed by Hammond et al. (1975) in terms of

the intrinsic ambiguity of the environment:

The methodological corollary is that such
ambiguity among relations must be represented
in the judgment tasks used to study human
judgment. (p. 273)

Einhorn, Kleinmuntz and Kleinmuntz (1979) observed

that most research on judgment involved "deliberative"

responses in which subjects were given enough time and

were encouraged to "think through the problem" (p. 469).

Also, the response was usually in the form of a quantita-

tive rating, which contributed to the use of compensatory

rules and "focuses attention on the trade-offs between

cues in order to be quantitatively precise" (p. 469).

One could infer that a less deliberative and quantitative

situation would be more representative of common judgmental

tasks. The goal for research would be to ascertain the

nature of the judgmental process.

Anderson captured the essence of such process by

noting that a judgment was dependent upon a "train of

inferential thought" (p. 91). Valid theory and research

on judgment were likely to be an elusive goal, as was

suggested by his major point:

In a very real sense, therefore, people do not
know their own minds. Instead, they are con-
tinuously making them up. Knowledge and belief
are not static memories, but typically involve
active, momentary cognitive processing. (p. 89)

Along these same lines, Hogarth (1981) wrote of the

importance of feedback and redundancy in the continuous

judgmental processes that characterize everyday life.

He reminded the reader that the main purpose of judgment

was to facilitate action. Discrete judgmental tasks

studied in laboratories have often suffered from a lack

of both redundancy and feedback. He summarized the "one

result" that "dominates" descriptive research in decision

making over the years:

Judgment and choice depend crucially upon the
context in which they occur and the cognitive
representation of that context. However, both
environment and mind interact in continuous
fashion. . . . Theories of judgment and choice
that lack a continuous perspective exclude one
of the most important determinants of the be-
havior they purport to explain. (p. 213)

The objectives of Hammond et al. for social judgment

reflected one of the most emphatic programs of

representativeness. This pragmatic approach was meant

to be (1) "life relevant," (2) descriptive, rather than

aimed at establishing laws of behavior or judgment,

and (3) designed to create "cognitive aids" to help

people make judgments.

Another recommendation for research has been the

use of multiple methods. Slovic and Lichtenstein (1973)

criticized the myopic view of some researchers who have

been unaware of other methods of investigation. In the

same spirit, Einhorn et al. (1979) compared linear

regression and process tracing models. They concluded:

Although we believe that both approaches treat the
underlying process at different levels of detail,
some process models may not be seeing the forest
for the trees, whereas some statistical modelers
may not see any trees in the forest. (p. 483)

More specifically relating to clinical judgment,

Eker (1981) recommended further research on "the amount

and type of information" that clinicians view as satis-

factory for diagnosis. Bieri et al. (1966) also dis-

cussed the crucial aspect of structuring response

alternatives, the range of which is an important parameter

in judgment:

The clinician who in an interview poses for him-
self the problem of determining what the client
is like has a much broader array of response
alternatives potentially open to him than does
the clinician who must decide if the client is
schizophrenic. ... In the research situation,
in order to achieve comparability of conditions
across judges, it is often necessary to confine
all judges to the same set of response
alternatives even though any given judge may
feel that this is an irrelevant domain or not
the most important one for this particular
client. (pp. 10-11)

There is a lack of other specific recommendations

to cite because of the limited literature on clinical

judgment, which, in turn, has traditionally had little

effect on clinical practice (Hammond, Hursch & Todd,

1964). A reason for this may be, as Slovic and

Lichtenstein (1973) suggested, that most people regard

judgment as an intuitive act that cannot be didactically

taught. However, this attitude is counter to the whole

enterprise that endeavors to create "expert" opinion.

As Bieri et al. (1966) succinctly described:

Professional education in clinical psychology,
social work, and psychiatry is based upon the
assumption that it is possible to develop with-
in the clinician capacities to render judgments
which are both accurate and therapeutic. (p. 11)

In conclusion, Hogarth (1980) made three observations

about decision making that can be borrowed as a good

summary for the justification of research on judgment:

(1) people are generally unaware of how they
make decisions and often why they prefer one
alternative to others; (2) they show little
concern for the quality of their own decision
making processes (although the failures of
others are often indicated with haste); and
(3) the scientific study of decision making
has not in my view attracted the attention it
merits. (p. ix; emphasis in original)



The processes of psychological assessment and

consultation offer a practical arena in which to examine

clinical judgment. As in most applied research, consid-

eration of potential influencing factors reveals a host

of variables that may be implicated in the relation

being studied, e.g. in this case, between a psycholo-

gical report and particular judgments or choices. A

historical perspective is necessary for an appreciation

of the meaning of the referral situation and the use

of reports. This review will look at the emergence of

testing in the history of clinical psychology. Included

will be an examination of the interprofessional relation-

ship between psychiatry and clinical psychology and

the controversy that has surrounded clinical testing in

the past three decades. A detailed description of the

referral process will include discussion of the reasons

for referral and potential factors influencing the use

of assessment. Finally, empirical research will be

reviewed, including both surveys and the few studies

that have endeavored to study the use of test reports

directly. The need for more research will be emphasized,

with general suggestions followed by an outline of the

present study.

Historical Background


The function and significance of psychological

testing in clinical psychology can be understood only

in its historical context. Changes in the use of tests

parallel developments in the growth of the profession.

A key factor in these developments has been psychology's

relationship to psychiatry. Although other professionals,

such as educators and nonpsychiatric physicians, have

had an impact on testing, the relationship of interest

here is with psychiatry, which has probably had the great-

est influence on applied clinical testing.

The Emergence of Testing in Clinical Psychology

Tallent (1965) provided an excellent discussion of

the historical roots of clinical testing. He cited three

sources: academic psychology, clinical psychiatry and

clinical psychology. He compared the two differing

approaches: (1) the academic-psychometric, based upon

operationalism and physicalism and (2) the clinical,

based upon an idiographic and more impressionistic

tradition. Similarly, Watson (1953) traced the develop-

ment of clinical psychology to two general roots: the

psychometric test tradition and the influence of dynamic


In the early part of the century, intelligence test-

ing was the primary occupation of those psychologists

who could be called "clinical." Watson gave an even

more delimiting picture: "In fact, for years, the major

task of the clinical psychologist was to administer the

Stanford-Binet" (1953, p. 323). Until the 1940's, this

role as "mental tester" remained the center of the

psychologist's functional identity vis-a-vis psychiatry.

An expanded role for psychologists was signified

by Brown and Rapaport's (1941) article in the Bulletin

of the Menninger Clinic. The authors provided an over-

view of personality testing and explained the use of a

psychological test battery in a psychiatric setting.

Although the article focused on the various tests and

their uses, its title, "The Role of the Psychologist in

the Psychiatric Clinic," demonstrated the identification

of clinical psychologists with the testing function.

The authors emphasized that this role, which would now

seem limiting to most psychologists, was a broadening of

professional opportunity:

Recent advances in experimental psychology and
the psychology of personality establish the
psychologist in a new role in clinical psychiatry.
Instead of limiting his activities to intelli-
gence testing, the psychologist functions as
an investigator of the personality by means of
projective personality tests, concept formation
tests, Lewinian techniques, and intelligence
tests. (pp. 82-83)

The subsequent fifteen years brought about the

complete acceptance of, if not infatuation with, personality

assessment by clinical psychology. Watson (1953) cited

the consensually agreed upon role of "diagnostic appraisal

as a task of the clinical psychologist" (p. 341) as one

of the stabilizing factors in the emergence of clinical

psychology as a profession following World War II. Part

of the initial enthusiasm for psychodiagnostic assess-

ment emphasized the advantages of the "objectivity" of

testing. For example, Brown and Rapaport (1941) declared:

To the clinical methods of the psychiatrist
and the psychoanalyst [the psychologist] adds
tools having a considerable degree of objec-
tivity in investigating all the aspects of the
total personality. (p. 83)

Psychologists were soon regarded by many as

wielders of powerful projective techniques, further en-

hanced by the scientific responsibility of psycholo-

gical research. Kubis' summary of clinical practice

and research was a poetic testimony to the optimism of

that era:

The projective tools offer the clinician an
opportunity for the free exercise of his in-
tuitions in a complicated life-space where
the interactions of all forces cannot be
controlled by exacting experiment. Though
more restrictive, standardized psychologi-
cal tests present the functioning individual
side by side with a norm, thus enabling the
psychologist or psychiatrist to evaluate
this specific behavior in a more objective
light. There is finally, the experimental
procedure, uncompromisingly rigid and
assuming the double role of alert sentry
and discerning skeptic with respect to all
claims demanding entrance into the citadel
of science. It is clearly apparent
that these diverse approaches to scientific
truth are closely interrelated and often
merge into a productive pattern well illus-
trative of the scientific method. Promising
clinical insights are temporarily molded into
a working tool, such as a projective test or
questionnaire. These are, sooner or later,
thrown into the experimental furnace and
with the impurities dissolved, the remainder
is often molded into a standardized form,
a psychological test. Note here the torturous
paths science compels clinical intuitions to
take so that (ultimately) they may emerge in a
purified form, independent of their creator
and his special abilities. (1951, pp. 105-106)

Other authors were more skeptical. For example,

Diethelm and Knehr (1951) emphasized that tests were

aids to diagnosis and not substitutes for a clinician's


The well-trained clinician who takes sufficient
time to study the patient will have little need
for projective tests to establish the diagnosis.
(p. 78)

These authors also relegated the use of psychologi-

cal tests to diagnosis, "with only incidental usefulness

in therapy" (p. 73). They reasoned that, because most

test procedures were cross-sectional in nature, they

would be of limited use to dynamic psychiatry, which

emphasized longitudinal aspects of etiological factors.

The subsequent disenchantment with testing will

be detailed further. However, the point here is that

psychologists' campaign for the acceptance of personality

assessment was too much of a success. Rosenwald (1963)

described the situation, albeit in somewhat extreme


Psychodiagnosis became the signal function of
clinical psychologists, even though some were
active in other ways as well, and even though
they were not the only ones to avail themselves
of these tools. Before the mental health
team became the practical reality which it
now is in many hospitals and agencies, the
boundaries of the psychologist's activity
were tightly and impermeably drawn around
testing. (p. 222)

Ivnik (1977) also stated that reviewers of the disputed

value of psychological testing have agreed that "histori-

cally psychological tests have been intertwined with the

clinical psychologist's professional identity" (p. 206).

He added:

Tests provided an assessment procedure that
differentiated the psychologist from other
mental health professionals. Testing also
defined a unique domain of professional
functioning wherein the psychologist was the
unquestioned expert. Unfortunately for
some, the testing boon soon became the
clinical psychologist's bane. (p. 206)

Perhaps the most incisive remark has been that of Kahn,

a psychiatrist, who lamented, "To see a whole generation

of psychologists becoming testing technicians is one of

our painful experiences" (1965, p. 1025).

In the developments since that time, psychologists

have expanded their roles in many ways, particularly in

the realm of psychotherapy. In the area of diagnosis,

they have emphatically endeavored to establish themselves

as "consultants" on an equal footing with referring

professionals (e.g. Crary & Steger, 1972; Tallent, 1983).

This concern for status has most often centered upon

psychologists' relationships with psychiatrists (Shakow,

1949; Ausubel, 1956; Brody, 1956, 1959; Shectman & Harty,


The Referral Process as an Interpersonal Relationship

As stated previously, the use of psychological

tests has often been a focal point of defining inter-

professional relationships. Bellak (1959) described

psychological testing as "the main area of contact between

psychiatrist and clinical psychologist" (p. 76). Brody

(1959), who was writing as a psychiatrist, redefined the

question of referral for psychological testing as a ques-

tion of referral for the consultation of a psychologist:

The issue, then, is not why a patient should or
should not receive psychological tests. It is
why the psychiatrist does or does not refer his
patient to a psychologist. (p. 88)

Brody underscored what may be called the social

psychology of the psychological referral:

The psychiatrist's request for psychological
testing is not simply a dispassionate, objec-
tive effort to obtain scientific evidence which
will be useful in his attempts to understand
and treat his patient. It is a social process
involving his attitude toward psychologists,
toward psychological testing, his identifica-
tion with his own profession, and his insecuri-
ties about his own status. (pp. 88-89)

Brody asked, "What are some of the hidden functions of

psychological test referral for the psychiatrist?"

(pp. 89-90). He suggested that sometimes psychiatrists

wanted to impress their patients with an approach that

appeared to be scientific. Also, he noted that

Psychological testing provides the clinician
with a rapid feedback without risking some
loss of status in his patients' eyes by
asking for consultation from another physi-
cian. (p. 90)

Another hidden function of the test referral, suggested

by Brody, was the use of the consultation for the psychia-

trist's defensive needs, such that the psychologist could

become the target of the patient's aggression or libidinal


Brody also speculated about hidden functions of

the psychiatrist's decision not to refer a patient

for testing. The delay in decision making due to

testing may have detracted from the psychiatrist's

sense of prestige as a professional who acted immediately.

Also, a psychiatrist might feel unconscious guilt

in "trapping" a patient into revealing information about

himself in a way of which he was unaware, i.e. by

testing. Another reason for not referring for testing

was that a psychiatrist might see the nonobjective

nature of testing as meaning that he was merely request-

ing the clinical judgment of a psychologist; some

psychiatrists would not want to take such a judgment

above their own. Brody suggested a final reason for non-

referral: to reduce the overload of information process-

ing for the psychiatrist, who already would have plenty

of data to integrate.

Brody summed up his discussion:

In summary, we are dealing here not with the
established value of psychological tests as
behavior-sampling devices. We are dealing
with the generally unidentified factors
which may result in the psychiatrist's de-
priving his patient of their benefit, or
in his developing an unwarranted dependence
on them. Some of these factors are economic.
There are also circumstances in which avoid-
ance of psychological testing or unusual
dependence upon tests is justified and rational.
Many of the factors, however, are involved in
the psychiatrist's perception of the person

of the psychologist and of the nature of his
tools. These influence the social process
which is referral. (1959, p. 91)

Schafer (1954) also highlighted salient "nonscienti-

fic" factors in the testing referral. He discussed some

common problems in referrals from psychiatric residents:

To some extent, and not always subtly, the
residents ambivalently transfer to the tester
major or full responsibility for clarification
of diagnostic, dynamic, prognostic, therapeutic
or dispositional problems in the case. It
becomes the tester's "job" to settle the pro-
blematical issues. This difficulty becomes
especially acute when cases are not routinely
tested but are referred for testing only
where major confusions or uncertainties exist
in the psychiatrist's mind. (p. 10)

In general, he noted that the psychologist may be

overvalued or undervalued in a given clinical setting.

Both situations would affect his attitude towards his

task and, ultimately, the responses of the patient.

Such comments advocate an increased awareness of

the subtleties involved in a referral. In conjunction

with this, psychologists have also urged each other to

improve their written response to these various demands.

Sargent (1951) presented different styles of test report

writing and concluded with several questions, including:

What is the purpose of a psychological test
report? Do we who write them share the
purposes of those who read them? If not,
who is to educate whom to their fullest
usefulness? (p. 186)

She answered her own questions:

The writer's position regarding all of these
questions is that few can be answered out-
side the context in which a test is requested,
given, and reported. (p. 186)

Sargent advocated improved communication, as did Klopfer

(1959): "In a real sense the sole and only legitimate

purpose of the psychological report is that of communica-

tion" (p. 88).

Thus, taken in a historical context of interprofessional

conflict and problems in communication, the psychological

testing referral can be seen as a complex social psycho-

logical event. A review of the vicissitudes of testing,

from critics and proponents, will be followed by a

summary of the empirical study of the prevalence of

assessment. This in turn will be followed by a descrip-

tion of the processes and factors that may influence

the referral situation. Finally, this second half of the

literature review will conclude with an examination of

the empirical research on the use of test reports.

The Reported Decline of Testing

The decline of psychodiagnostic testing as a task

performed by psychologists has been much discussed,

although not very well documented empirically, in the

literature. Evidence for psychologists' devaluation

of testing has been generally indirect, but nonetheless

accepted by many authors. One of the first comments

about this trend was made by Carson (1958), in his

letter to the editor of the American Psychologist.

He voiced his concern that psychologists were devaluing

their roles as diagnosticians by frequently relegating

the assessment task to trainees.

A review of published research reveals a decline

in articles related to diagnostic testing. Crenshaw,

Bohn, Hoffman, Matheus and Offenbach (1968) compre-

hensively surveyed ten journals over an eighteen year

period and concluded that the use of projective techniques

in research peaked in 1955, dropped sharply in 1956 and

1957, and then stabilized through 1965. Tolor (1973)

documented the declining interest in psychodiagnosis by

reviewing five major clinical journals between 1951 and

1970. He found consistent significant decreases in the

number of diagnostically related articles. These decreases

were especially striking when compared with the general

trends of scientific literature. Most scientific topics

have exhibited an exponential growth in related publica-

tions (Price, 1963; MacRae, 1969; Griffith, Small, Stonehill

& Dye, 1974; Small & Griffith, 1974).

What is most salient is not the decreased amount of

testing activity that may or may not be inferred from the

studies of Crenshaw et al. (1968) and Tolor (1973). The

implied devaluation of testing from psychologists' points

of view is what is of interest here. The present section

of this review is concerned with highlighting the

rhetoric of the assessment controversy; empirical

studies will be examined further on.

As an example, the following comment by Blatt

(1975) suggested that such a devaluation had become


Testing remains, by and large, a second-class
clinical activity rather than a highly valued
way of studying cognition, perception, affect
and the representation of interpersonal rela-
tionships that can contribute to both the
clinical process and the systematic investi-
gation of clinical phenomena. (Blatt, 1975,
p. 328)

The reasons for this devaluation have been offered

by a number of authors (Mosak & Gushurst, 1972; Tolor,

1973; Cleveland, 1976; Lewandowski & Saccuzzo, 1976;

Ivnik, 1977; Korchin & Schuldberg, 1981). In the most

comprehensive review, Korchin and Schuldberg (1981)

listed seven "themes" in the decline of interest in

diagnostic testing: (1) new alternative roles have

emerged for psychologists; (2) some theoretical orienta-

tions, such as the behavioral and humanistic-existential

approaches, have opposed testing on principle; (3)

psychodiagnosis was initially overrated; (4) the teaching of

assessment skills to students declined, due to inexperienced

or uninterested faculties; (5) assessment is "difficult

work that can strain the clinician's identity" (p. 1149);

(6) society has challenged the use of tests; (7) testing

is expensive or not sufficiently profitable. To this

list could be added factors mentioned by the other

reviewers: negative research findings; overemphasis on

pathology in reports while downplaying personality strengths;

beliefs that reports are ignored; the low quality of

psychodiagnosticians. Of these themes, the main emphasis

in the literature has been the critique of testing's lack

of utility compared to its expense.

In Defense of Assessment

The mounting criticism and apparent decline in testing

have prompted a continuing defense of psychodiagnostics by

psychologists (Zubin, 1951; Klopfer, 1962, 1964; Rosenwald,

1963; Towbin, 1964; Tallent, 1965, 1983; Holt, 1967;

Appelbaum, 1970, 1976; Gough, 1971; Craddick, 1972;

Freedheim, 1972; Weiner, 1972; Blatt, 1975; Levy & Fox,

1975; Korchin & Schuldberg, 1981). The content of these

rebuttals will not be detailed here. A sample of comments

will serve to represent their positions.

Some writers have attacked the critics' arguments

directly. Rosenwald (1963) attributed much of the criticism

of the psychodiagnostic role of psychologists to the critics'

discomfort with the "nonphysical, artistic and intrusive

processes" (p. 237) involved in assessment. Ivnik (1977)

noted that most criticisms of testing have referred to the

misuse of tests rather than to the utility of the tests

themselves. He asserted that opinions by other profes-

sionals as to the value of psychological reports were

not adequate criteria for assessing the value of tests


Other writers have reiterated the raison d'etre

for testing. In the current Comprehensive Textbook of

Psychiatry, Carr (1980) noted that psychological tests

had two common features that made them advantageous over

an interview: (1) they provided

a fairly objective means for comparing a rela-
tively controlled sample of the patient's
behavior with available normative data repre-
sentative of a larger reference group (p. 940)

and (2) they could elicit responses by a patient to "a

broad range of stimuli on the continuum of structure-

ambiguity" (p. 940). In contrast to these behaviorally

phrased arguments, most published proponents of testing

have spoken from a psychodynamic point of view, e.g.

Allison, Blatt and Zimet (1968).

Regardless of theoretical orientation, supporters

of assessment have often deflected criticisms of the

tests themselves with the common argument that assessment

is a refined skill that leaves itself open to criticism

when the results of mediocre practitioners are examined.

As Klopfer (1962) emphatically stated:

There is no aspect of the job of the clinical
psychologist which is more complex, more
difficult, and more challenging. (p. 298)

Thus, from such a perspective, the object of criticism

should be the incompetent assessing psychologist rather

than assessment per se. This raises the empirical

question of how many competent assessors are in practice,

assuming such competency could be validly assessed. If

such an evaluation could be made and there turned out

to be very few psychologists truly competent to make

the best use of assessment, then the value of assessment

would still be questionable. Assessment would then properly

belong only in the hands of an elite few, unless resources

could be mobilized to transfer such skill on a more wide-

spread basis. Theoretically, criteria for graduate degrees

and professional licensure exist to ensure such competency

in assessment. However, the possibility remains that the

use of these criteria has not succeeded in producing the

caliber of psychodiagnosticians desired by the profession.

Whether or not this hypothetical situation accurately

describes the current state of affairs in clinical psycho-

logy is a question beyond the scope of this review.

Returning to the literature that has defended the use

of testing, the reader will discover that, in general,

supporters of testing have not denied the problems raised

by critics. The most common review articles on testing

have acknowledged the "decline" or "uncertain status"

of testing and have offered new approaches to improve

the use of tests (Holt, 1967; Neuringer, 1967; Appelbaum,

1970; Hertz, 1970; Gough, 1971; Beutler, 1973; Lewandowski

& Saccuzzo, 1976; Ivnik, 1977; Petzelt & Craddick, 1978;

Sloves, Docherty & Schneider, 1979; Korchin & Schuldberg,

1981; Sterling, 1982). The many ideas of these authors

will not be detailed here. What is significant, however,

is that their discussions have demonstrated a continued

desire by those who support assessment to address the

limitations of assessment and to adapt to new clinical


Empirical Study of the Prevalence of Testing

In response to the widespread notion that the use

of testing has declined in clinical psychology, a number

of surveys have been taken in the past decade. Prior to

then, most surveys dealt with comparative frequency of

use of particular tests, e.g. Sundberg (1961); Lubin,

Wallis & Paine (1971). Sundberg (1961) did extrapolate

from his results to estimate that, annually, nearly

7,000 clinical and counseling psychologists were seeing

1,300,000 persons for various services, including the

administration of psychological tests to 700,000 clients.

These figures did not include industrial, military,

educational and private clinical settings; thus, they

represented a fraction of total services. The more

recent surveys have endeavored to look at the role of

assessment in psychologists' daily work.

Garfield and Kurtz (1974) reported, in a survey

of one-third of the members in the Division of Clinical

Psychology of the American Psychological Association,

that an average of 10% of the professional clinical

psychologist's time was devoted to diagnostic assessment.

This was compared to time devoted to three other activities:

individual psychotherapy (25%); teaching (14%) and

administration (13%). Ivnik (1977) observed that this

survey indicated that 24% of psychologists' direct

clinical services was devoted to assessment. He argued

that, although assessment may have been taking a smaller

proportion of clinicians' time than previously, it

still occupied a significant portion of their professional


Levy and Fox (1975) surveyed clinical employers who

had advertised job openings during the 1971-72 year.

They found, contrary to some discussions of the decline of

diagnostic work, that testing was "alive and well." In

fact, 90.5% of their respondents indicated that they

expected job applicants to have skills in testing.

Wade and Baker (1977) surveyed five hundred clinical

psychologists about their attitudes toward and use of

psychological tests. They reported that "the great

majority of responding clinicians devoted substantial

time to psychological testing" (p. 879). This was true

for clinicians of every theoretical orientation. The

most frequently endorsed reason for testing and for recommending

that students learn testing was the information that testing

could provide on personality structure. Clinicians

reported that their personal experience with tests was

the main reason contributing to their decisions to use

tests for a given client. This experience apparently was

more influential than the negative research findings

reported in the literature. Clinicians often reported

that they used their own personalized assessment methods

and that testing was an insightful process rather than an

objective skill. Another reported factor in the use of

tests was the lack of an appropriate alternative for assess-

ment. Clinicians also justified their continued use of

tests with methodological criticisms of assessment research.

Reynolds (1979) studied the relationship between fre-

quency of test usage and "overall quality of psychometric

refinement" (p. 326), as rated by 31 academic psychologists.

These quality ratings were applied to the ten most frequently

used tests. Although a significant rank-order correlation

was found for data based on a 1974 survey, only one-third

of the variance in test usage was accounted for by test

quality. This supported other findings, e.g. Wade and Baker

(1977), as to the importance of nonscientific reasons for

the use of tests.

In another study, Piotrowski and Keller (1978)

surveyed 93 clinics and mental health centers in the

Southeastern United States. Responses from 61 centers

indicated that testing was an important function in

these outpatient facilities. They contrasted this fact

with the reported decline in emphasis on assessment

training in doctoral programs (Shemberg & Keeley, 1970;

Thelen & Ewing, 1970; Thelen, Varble & Johnson, 1968).

The above review suggests that a number of

psychologists vigorously support assessment, which continues to

be a significant activity for many in the profession.

Forces in opposition to testing also continue to press

their case. Thus, no dogmatic conclusions about the out-

come of this controversy would be appropriate. Howes'

(1981) conclusion about the use of the Rorschach can be

applied to testing generally:

It would be quite apparent that the use of the
Rorschach is dying hard, if it is dying at all,
although there are undoubtedly many who are
eager to attend the funeral. (p. 347)

Korchin and Schuldberg (1981) gave a somewhat mixed

message in their review. They concluded that assessment

will continue to play a vital role in clinical activity,

but, in answer to the question as to whether clinical

assessment was alive and well, they responded, "Alive,

without doubt, though only in moderate, but we believe

improving, health" (p. 1156). Ivnik's conclusion was

probably the most accurate:

Few definite conclusions can be drawn regarding
the status of psychodiagnostic testing in the
field of clinical psychology today. Many diver-
gent opinions exist, none of which has received
unequivocal research support. It would appear
that despite its detractors, psychological
testing continues to be a frequent activity for
the clinical psychologist. (1977, p. 212)

Having established that psychological assessment is

a controversial yet important clinical activity, this

review will now examine the referral process more closely.

The following section will suggest the many reasons for

referral and factors that potentially influence the referral

process. Included will be the author's perceptions of a

general dichotomy in approach to assessment and then an

emphasis on confirmation in the service of increasing the

referrer's confidence in judgment as a primary usage of

psychological reports.

The Referral Process

General Description

The process of psychological consultation has

received some attention in the research literature, e.g.

Rhodes (1974); Swenson (1974). However, detailed analyses

of the step-by-step procedure of what is involved in a

referral for consultation have seldom been offered.

The referral may be seen in terms of a series of steps

involving interaction between the referrer, consultant

and client. The author offers a general conception of

these steps in Figure 1. It should be noted that this

outline omits potential active involvement of the person

being assessed, e.g. in the sharing of test results.

This omission is for purposes of economy in the current

discussion. The author otherwise endorses such active

involvement and open communication with the client, as has

been discussed by several psychologists (Richman, 1967;

Fischer, 1970, 1972; Brodsky, 1972; Mosak & Gushurst, 1972;

Craddick, 1975; Tallent, 1983).

Reasons for Referral

An Overview. The third step in Figure 1, concerning

the referral question, has received much attention in the

literature. Table 1 is presented as a list representa-

tive of possible reasons for referral. The list is probably

not exhaustive, but should encompass most referrals, based

upon the author's experience and survey of the literature.

Listed reasons are not meant to be mutually exclusive, as

will become clear in the following discussion.

(1) Establishment of an institutional relationship
so that a psychologist becomes available for
(2) Referrer decides to refer
(3) The referral itself, including specific questions
(4) Psychologist's initial response (acceptance,
clarification, negotiation of referral issues,
(5) Psychologist's assessment of the patient, i.e.
interview and testing
(6) Psychologist's interpretation and integration of
assessment results
(7) Communication of results to the referrer (verbal
and/or written)
(8) Referrer's comprehension of report
(9) Referrer's agreement/disagreement with report
(10) Referrer's new understanding of the patient
(11) Referrer's treatment decisions (may include
communication of test results to the patient)
(12) Referrer's attitude, e.g. confidence, in regards
to his understanding and treatment of the patient
(13) Summary of net payoffs to referrer and
(14) New status of the interpersonal relationship
(back to step #1)

Figure 1. The Referral Process for Psychological

The first type of referral, the seeking of new infor-

mation about the client, is the most common and plainly

stated reason in virtually all referrals. A second reason,

seldom officially admitted, is that of bureaucratic

requirement (Section "II" of Table 1). Psychological

testing is done merely because of routine and not because

the referrer or consultant believes it to be of intrinsic

value for the referrer or the patient. An example of

this might be a situation in which a test report is re-

quired as part of an application for admission to a facility,

even though the test report itself contains no new material

on the client. Testing for purposes of staff training would

also fit into this category. Another reason for testing

would be as a therapeutic intervention itself (Section

"III"). This function has been gaining notice in recent

discussions (Richman, 1967; Mosak & Gushurst, 1972; Tallent,

1983). A fourth reason for referral is for confirmation

of current clinical judgment. The referrer seeks

consensual validation for what he already believes. He is

less interested in new information and more interested in

redundant feedback, although he or she may not be aware

of this. Confirmation would serve to simplify the

clinician's judgment and probably would be reinforcing or

anxiety reducing for him or her. This reason, which will

be elucidated further below, would not be stated explicitly

in a referral. It could refer to any of the aspects of

clinical judgment mentioned in the first section of Table 1.

Table 1.
Potential Uses of a Psychological Consultation

I. Acquisition of new information
   A. Descriptive diagnostic issues
      1. Differential diagnosis
         a. DSM-III categories
         b. Other dimensions, e.g.
      2. Intellectual evaluation
         a. IQ
         b. Assessment of particular cognitive
      3. Detection of organicity
         (neuropsychological assessment)
      4. Detection of thought disorder
      5. Assessment of psychodynamics (structure,
         defenses, conflicts)
      6. Assessment of level of depression
      7. Assessment of etiology
      8. Assessment of psychosexual level of
      9. Assessment of impulsivity
      10. Assessment of sexual orientation
      11. Assessment of possible malingering
      12. Assessment of assets, strengths
      13. General description of patient
   B. Predictive diagnostic issues related to
      1. Prognosis
      2. Choice of treatment
      3. Progress in current treatment
      4. Assessment of suicide potential
      5. Assessment of potential transference
         reactions or defenses likely to surface
         in treatment
      6. Vocational/educational recommendations
II. Required bureaucratic procedure
III. Therapeutic procedure
IV. Confirmation of the referrer's judgment

Utility versus Understanding. The preponderance of

literature on assessment has been devoted to the issues

listed under the first section, acquisition of new infor-

mation. Although not precisely reflected in the divisions

of Table 1's outline, a dichotomy of attitudes towards

the purpose of assessment can be gleaned from an examina-

tion of the debate about the value of testing. As will

be demonstrated, this dichotomy, which may be grossly

contrasted as "utility" versus "understanding," is a

subtle one that has not been previously clarified in the

literature. The distinction between those who see assessment

as serving utilitarian purposes and those who advocate the

broader goal of "understanding" is a matter of degree.

Both groups emphasize the new information that assessment

brings. Those who fit in the "understanding" category

would say that what they seek in testing is not only that

which will be immediately useful, but also information

that may not reveal its usefulness until a later time

during the course of therapy. The differences between the

two groups lie in what is considered useful, which is

often a reflection of theoretical orientation. The meaning

of the two approaches will be explained through descrip-

tion and example.

A number of advocates of revision in testing have

fit clearly in the utilitarian group (Fulkerson, 1965;

Cole & Magnussen, 1966; Breger, 1968; Arthur, 1969;

Lanyon, 1972; Howe, 1981). Fulkerson (1965) stated,

"If tests are not being used to make decisions about

patients, they should not be given" (p. 193). He

recommended that tests should be developed by individual

clinicians to correspond to criteria related to the

specific decisions that need to be made. Cole and

Magnussen (1966) proposed replacing traditional assess-

ment with a model devoted to disposition and specific

treatment for the patient. Thus, disposition and action

would be the ultimate goal of the assessment, rather

than diagnosis. The authors pointed out that such an

orientation would serve to break down the artificial

barrier between diagnosis and treatment. Such a "heuristic"

approach, based on decision making theory, would take the

context of the assessment into account and would involve

the limited number of dispositions that are available in

real situations. Arthur (1969) and Howe (1981) endorsed

this approach, as did Lanyon (1972), who outlined an

assessment scheme designed for efficiency.

Some observers who have not necessarily advocated

this approach have acknowledged its impact. Klopfer (1968)

observed that the typological interests of Rorschach and

Jung were no longer as salient to practitioners as were

concerns for predictive efficiency:

Thus the matter of predictive efficiency is more
highly valued today as a criterion of validity

than is the concurrence of one test inter-
pretation with another. (p. 404)

Tallent (1983) similarly described a recent trend

in psychological reports:

Many reports now are sharply focused and
highly prescriptive. What is the problem,
and what can we do about it? In this
tradition the behavior-oriented report is
strictly a no-nonsense, business-like
approach. Psychopathologically oriented
reports increasingly reflect the fact that
certain classes of medication target
certain behavior symptoms. (p. 11)

One type of utilitarian approach has been the

accommodation of testing to psychiatric classification.

As psychiatry establishes a diagnostic category, psycho-

logical tests are soon applied in an attempt to develop

test criteria for identification of patients who fit

the new category. For instance, with the establishment

of criteria for the borderline personality disorder in

DSM-III, Snyder, Pitts, Goodpaster, Sajadi and Gustin

(1982) developed an MMPI profile to discriminate a group

of borderline patients from a group of dysthymic patients.

Millon's (1982) inventory, applicable to Axis II of DSM-

III, may also be seen as a direct response to a new


As contrasted to the emphasis on utility, some

authors have stressed the need for a more general under-

standing of the person as the purpose of psychological

testing (Allison et al., 1968; Hertz, 1970; Holt,

1970; Weiner, 1972; Schwartz & Lazar, 1979).

Tallent (1965) noted a switch in emphasis from the goal

of prediction to that of understanding: "The role of

prediction is downgraded from its historical position

of scientific royalty" (pp. 429-430). As noted above,

however, his most recent review suggested that this

trend was only temporary (Tallent, 1983). His earlier

summary expressed the concept of interest here. At

that time he cited the rise of construct validity as a

more meaningful standard for clinical practice than

predictive validity. Also, he quoted Cantril, Ames,

Hastorf, and Ittelson (1949):

Emphasis on prediction alone can easily obscure
the more fundamental aim of science covered by
the word understanding. (Tallent, 1965, p. 430;
emphasis in original)

Weiner (1972) referred to understanding as knowing "what

people are like and how they got to be that way" (p. 536).

He advocated the incorporation of personality constructs

as intervening variables that generated explanations

for behavior which may or may not be predictable. He

stated that the purpose of psychodiagnostic assessment

was the appraisal of personality processes rather than

the prediction of behavior.

As a part of the continuing debate over the value

of clinical versus statistical prediction, Holt (1970)

voiced a similar opinion. He observed that, in his

view of science, prediction was not an end in itself,

but a means to a larger end, which was "explanation

through understanding" (p. 346). He acknowledged that

other authors viewed prediction and control as the

fundamental goals of science. This division of opinion,

according to Holt, could be further linked to the basic

dispute between phenomenology and positivism.

Schwartz and Lazar (1979) may have been speaking

of the same dichotomy as they distinguished semantic

interpretation from probabilistic interpretation. The

former approach attempted to ascertain meaning, whereas

the latter was interested in prediction. The authors

asserted that projective assessment was based on semantic

interpretation and should not be judged according to

statistical, predictive criteria.

The aim of the clinician is to understand and
describe, not to predict some future event.
Prediction requires random sampling; descrip-
tion requires informal observation (p. 7) . .
[T]he clinician's tool is psycholinguistics,
not statistical inference. When dealing with
the responses of patients, we can rightfully
speak of the art of clinical psycholinguistics.
This approach applies to any clinical enter-
prise that attempts to understand another
person's thought through his verbal productions.
(Schwartz and Lazar, p. 11)

Similarly, Allison et al. (1968) described the

purpose of an evaluation:

We believe that the major role of psychological
testing should not be prediction, but, instead,
the development of hypotheses which best describe
the given individual and which constitute a "best
fit" for the data and the observations which are
available. The broader purpose of these hypotheses
is to enrich and extend the understanding of the
patient or client and to make available dimensions
of the individual which may have been unavailable
or not considered previously. (p. 2)

Such an emphasis on understanding over specific

prediction should not be misconstrued as a rejection of

utility altogether. Particularly in a long term inten-

sive treatment situation, most statements made in a clini-

cal test report can be construed as relating to useful

clinical predictions. For example, Appelbaum's (1977)

description of the Menninger Foundation tradition of

psychological testing embodied this approach. Following

in the legacy of Rapaport and Schafer, Menninger

psychologists have given essentially the same extensive

battery of tests to each patient. This has included

the Wechsler-Bellevue, Rorschach, TAT, Word Association,

BRL Sorting and Story Recall (p. 24). A comprehensive

report has been composed for each patient. The Menninger

method would appear to be the prototypic descriptive

evaluation that attempts to map the internal psycholo-

gical functioning of an individual. Specific utility

would seem to be a secondary consideration. Appelbaum,

however, insisted that testing should be done with pre-

dictions for treatment as the foremost goal:

In our view, the true measure of "the validity
of tests" is how well the psychologist who
uses them is able to make useful clinical
predictions. (p. 244)

What Appelbaum was referring to reflects the fact that

a comprehensive, descriptive report will be utilitarian

in such an intensive treatment context. In a different

setting, such as the community clinic, the limited

treatment options will not enable as much use to be

made of such extensive interpretive descriptions.

The question of utility, then, is a matter of the

nature of the treatment situation. Tallent (1965) cited

Payne's (1958) ideas that pointed to a major refocusing

of the validity question to that of utility:

Stated simply, the question is "What sort of
clinical information is needed (The answer
to this apparently simple question is not
yet at hand and has hardly been investigated),
and can the psychologist supply it?" (p. 430)

Tallent (1983) saw the ideal "case-focused" report as

being able to answer Paul's (1967) specifications:

What treatment, by whom, is most effective
for this individual with that specific
problem, and under which set of circum-
stances. (p. 111; emphasis in original)

Tallent added two additional requirements: "understanding

of the person" and the "why" of the report's recommendations

(1983, p. 26; emphasis in original).

In setting up such an ideal, Tallent seemed to

gloss over the distinction made here of the "utility"

versus "understanding" groups. However, this synthesis

would be appropriate if one recognizes that different

referral situations require different levels of under-

standing and criteria for utility.

Communication centered upon the referral questions

thus seems to be the essential problem of the assessment.

Few authors in this area would dispute the following:


The usual referral for psychological testing
should clearly specify exactly and what the
referral source hopes to obtain from the
consultation. The real value of the psycho-
logist's report will lie in the emphasis
which is placed on answering the specific
referral questions. (Harlow & Salzman, 1958,
p. 231)

This implies that the referrer would do well to be fully

aware of his needs and that the assessing psychologist

would also be aware of these. This follows the earlier

discussion of the referral process as a social psycholo-

gical event. As applied to the utility/understanding

dichotomy, referrers will likely be placed into one group

or another, reflecting the sort of questions they would

like to have answered.

Confirmation and confidence in judgment. As pre-

viously stated, referral for testing has been influenced

by factors other than a rational, scientific search

for new information. The referral is an interpersonal

event multi-determined by the personalities involved,

including especially the needs of the professional who

requests the consultation. The main hypothesis of this

study is that a primary factor in requesting a consulta-

tion is the desire by the referrer to have confirmed

what he or she already believes, thereby increasing his/

her confidence in his/her judgment. This corresponds

to the fourth use of psychological reports listed in

Table 1.

Very few writers have mentioned this use of an

assessment. Allison et al. (1968) endorsed the worth

of tests as confirmatory evidence because of the

reassurance value for the staff to make decisions with

more confidence. Rosenwald (1963) acknowledged the value

of confirmation, but gave it secondary status:

A psychological report should partly confirm
clinical findings already known, but in large
part it should provide new information uniquely
available through it at that time and not
confirmable until perhaps much later. Finally,
it may provide information which is perhaps not
directly confirmable in any other way at any
time. (p. 229)

Fulkerson (1965), who advocated an immediate decision

making approach for assessment, did not see much value

in assessment as confirmation. However, his analysis

suggested that much of assessment was done precisely

for this purpose. He stated that the "mapping of

an internal state of affairs" (p. 193) was, at best,

only an intermediate goal. If this were the ultimate

goal of testing, then a psychiatrist's judgment could

serve as the validating criterion. Fulkerson concluded

that the only value of testing in such a case would

be to increase the psychiatrist's confidence in his judgment.


The point here, contrary to those who regard

assessment from Fulkerson's perspective, is that this

confirmatory function is valuable in itself. Appelbaum

(1977) expressed this proposition in his discussion of

assessment at the Menninger Foundation:

Tests have become a routine part of personality
assessment at the Menninger Foundation as a
result of many years of empirical observation
of their utility. One of their uses is to
corroborate other clinical judgments, an impor-
tant though perhaps undramatic contribution.
Clinical responsibility demands a high degree of
confidence in the making of decisions that may
profoundly affect other people's lives, and
corroboration of inferences increases the confi-
dence with which inferences and the recommenda-
tions based on them are held. (p. 2)

The proposed study will focus upon this "undramatic

contribution." As discussed in the first part of this

review, the seeking of confirmatory information is a common

judgmental bias which has been found to correlate with an

increase in confidence in that judgment. Before pro-

posing how this phenomenon will be tested, the present

discussion will return to a summary of factors that

influence the referral process and then a review of

empirical research on the use of psychological reports.

Factors Influencing the Use of Assessment

A salient feature of the schema in Figure 1 is that

it begins and ends with the status of the interpersonal

relationship. This is a recognition of the point already

emphasized, that is, the interpersonal aspects of the referral process.


Given that the referral process represents a

complex interaction between individuals and, sometimes,

institutions, then one can assume that numerous factors

influence whether or not a referral is made for a psycho-

logical consultation. In view of a lack of any previous

comprehensive literature discussion of factors that

influence testing, the author offers a list of common

influences (Table 2). Again, this list is not exhaustive.

It is presented to further illustrate the complexity of

the referral process. It should be noted that most of

the options in Table 1 can be subsumed under part "C" of

section "III" in Table 2.

Table 2.
Factors That Influence the Use of Psychological

I. Factors related primarily to the referrer
    A. Referrer's attitude toward psychological
        1. Referrer's attitude toward clinical
        2. Referrer's knowledge of testing
    B. Type of treatment performed by referrer
    C. Referrer's theoretical orientation
    D. Referrer's experience
    E. Referrer's self-confidence
    F. Appropriateness of referrer's referral
    G. Quality of communication between referrer and psychologist
    H. Type of clinical setting where referrer
II. Factors related primarily to the consultant
    A. Availability for referrals
    B. Consultant's attitude toward testing
    C. Consultant's theoretical orientation
    D. Consultant's expertise in assessment
    E. Quality and utility of previous reports
        1. Clarity in report writing
        2. Extent to which referral questions were
        3. Time lag in answering previous
    F. Time constraints for assessment
    G. More profitable or desirable alternative activities, e.g. psychotherapy
    H. Quality of communication between referrer and consultant
III. Other factors
    A. Cost and affordability to the patient
    B. Testability of the patient, e.g. willingness
    C. Type of patient, i.e. specific problem; referral questions
    D. Scientific validity of tests

Empirical Research on the Use of Psychological Reports


The empirical research reviewed thus far has

concerned mainly the percentage of psychologists' time

devoted to assessment. Turning to the empirical study

of psychological reports themselves, the limited litera-

ture may be divided into surveys and studies that directly

used reports. Before examining these studies, an impor-

tant observation must be made. Compared to the amount of

published research on particular psychological tests

such as the MMPI or Rorschach, the empirical study of

psychological test reports is a sparse literature. In

1956, Tallent observed that much clinical time was devoted

by psychologists to preparing reports. In contrast, he noted:


It is surprising then, that investigators con-
cerned with the elicitation of psychological
data number in the many thousands whereas only
a handful of writing have dealt with the problem
of how such data may be communicated so as
to be maximally useful. (p. 103)

Almost three decades later he observed that there was

still a "dearth" of research on the utility of psychologi-

cal reports. He noted that the few available studies

showed contradictory evidence that was open to multiple

interpretation. This was based in part on the widely

varied research designs employed (1983, p. 19).

Bellak's statement that "amazingly little systematic

attention has been paid to the writing of the psycholo-

gical report" (1959, p. 76) remains applicable today.


Much of the reported research on the use of

psychological reports has taken the form of surveys,

usually conducted by mail (Tallent & Reiss, 1959a,b,c;

Lacey & Ross, 1964; Hinkle, Nelson & Miller, 1968; Mintz,

1968; Moore, Boblitt & Wildman, 1968; Smyth & Reznikoff,

1971; Olive, 1972; Wiedemann & Mintz, 1974; Tallent, 1983).

In most of these studies, professionals who use reports,

such as psychiatrists, social workers and other psycholo-

gists, have been asked to rate the value of reports generally.

Data have been gathered as to how frequently reports are

used, what sorts of information reports provide and what

shortcomings they typically have. Most of these studies

have been essentially uncontrolled attitude surveys.

Respondents have given their opinions about typical reports

without having to consider any report in particular. An

exception was Mintz' (1968) survey of twenty-five student

therapists who rated the value of reports in current

psychotherapy caseloads. These subjects presumably could

think in terms of specific reports that they had read in

regards to a limited number of current patients. How-

ever, even this study was relying upon general opinions,

influenced by biasing factors, such as inaccurate memory.

There was no control or consideration of the reports

themselves. The limitations of this survey method were

seldom acknowledged by authors. Not only were the usual

pitfalls of self-report research not mentioned, but also

there was a noticeable omission of any discussion con-

cerning potential demand characteristics of these sur-

veys. Such discussion would seem to be appropriate

because the surveys involved critical evaluation of an

important activity of the surveyors' profession. It

is reasonable to hypothesize different results if the

subjects had believed that the studies were conducted

by persons who were not psychologists.

Setting aside these methodological problems for the

moment, one interesting conclusion emerged from two

independent studies. Smyth and Reznikoff (1971) and

Wiedemann and Mintz (1974) each concluded that reports

functioned mainly as confirmation of diagnostic judgment.

This finding supported the main hypothesis of the current

proposal. However, given the methodological drawbacks

already mentioned, one must be cautious in asserting

that most or many reports are indeed used for this purpose.

Experimental Studies

There are a few studies that have attempted to

use psychological reports in an experimental manner.

The author could locate only ten such references that

had subjects directly use reports. Of these, eight

attempted to measure what Meehl (1959/1977) has termed

"incremental validity." This refers to the extent

of new information that a reported procedure, in this

case psychological assessment, provides. Meehl's dis-

cussion of incremental validity was a part of a larger

exposition on the concurrent validity of psychological

tests. He posited four levels of concurrent validity

in the "phenotypic" characterization of a person. By

"phenotypic," he was referring to the "descriptive or

surface features of the patient's behavior, including

his social impact" (p. 91). These levels varied in terms

of the rigorousness in criteria for validity. The eight

studies in question here generally aspired to measure

the utility of test reports according to Meehl's second

level of concurrent validity, which he described in the

form of a question:

To what extent does the test enable us to
make, reliably, accurate statements which
we cannot concurrently and readily (that
is, at low effort and cost) obtain from
clinical personnel routinely observing
the patient who will normally be doing so
anyway (that is, whose observations and
judgments we will not administratively

eliminate by the introduction of the test)?
(Meehl, 1959/1977, p. 104; emphasis in original)

Thus, this level of validity refers to what unique

information may be derived from a psychological report.

Meehl's first level of concurrent validity, or "valida-

tion demand" (p. 106), referred simply to the accuracy

of information, regardless of whether it added to the

user's knowledge. The third level asked, in addition

to the criteria and uniqueness of the first two levels,

"how much earlier in time" (p. 106; emphasis in original)

that the psychometric assessment enabled judgments to

be made. The fourth, and most practical, incorporated the

criteria of the previous three with the additional stand-

ard of how much the procedure helped in the treatment

of the person being assessed. Meehl observed that, to

his knowledge, no validity studies of psychological tests

had been carried out at the third and fourth levels; very

few had been attempted at the second level.

One striking aspect about the eight studies to be

discussed is their variety of methodology. No two are

alike, either in overall design or in specific criteria

used to assess the utility of reports. Also, with one

or two exceptions, these studies have suffered from

significant methodological flaws or have been published

in such sketchy form that a complete critical analysis

is not possible. Keeping in mind their diversity and

limitations, they can be examined as follows, grouped

loosely into five different types.

Two studies can be grouped together because of

their survey methodology. They are distinguished from

the survey studies already reviewed in that they involved

either the ratings of actual reports by clinicians who

use reports or ratings of material derived from actual

reports. This is in contrast to the other survey lit-

erature which assessed general attitudes without reference

to particular report material.

Affleck and Strider (1971) sampled a total of 340

psychological reports on inpatient and outpatient services

for both children and adults. Referring sources were sur-

veyed upon receipt of the reports and then again two months

later, when a more extensive interview was conducted. The

surveys revealed that about two-thirds of the referral ques-

tions were answered by "new and significant information"

(p. 177) or with confirmatory information that was previously

suspected but not well established. Followup interviews re-

vealed that 52% of the reports provided information that

resulted in some modification of patient management or dis-

position. Also 18% of the reports confirmed current thinking

about the patient. In addition, 4.5% were vaguely "help-

ful"; 2.3% had "minimal effect"; 18% had no effect;

1.9% were "erroneous, incomplete or detrimental";

3.5% of the reports could not be completed because

the patient went "AWOL" or quit (p. 178). Affleck

and Strider cautioned that the usefulness of reports

was evaluated by co-workers of the psychologists;

as colleagues, they may have been biased in favor of

the reports. The authors concluded that, considering

an average of three questions for a typical psycholo-

gical referral, there was a high probability of useful

information derived from psychological reports.

Hartlage and Merck (1971) found that when psycho-

logists were made aware of what sorts of statements were

valuable to the users of reports, their future reports

improved in utility for their users. The study involved

psychological reports from a large Goodwill Industries

rehabilitation facility. Their criterion for utility

consisted of ratings of reports, on a five-point scale

from "very poor" to "very good" (p. 460), by the super-

visors for whom they were intended. The authors concluded

that psychologists tended to "grind out reports with good

theoretical consistency but little decisional value" (p.

460) until they became familiar with the actual uses of

reports by those for whom they wrote.

A second type of methodology used for the assessment

of incremental validity was an archival examination of

reports in the files of a particular agency. Two

studies employed this method, albeit in quite different

ways using different criteria.

Adams (1972) assessed the impact of psychological

evaluations by comparing them with the initial and

final psychiatric diagnosis in 137 psychiatric inpatients'

records. His criterion for utility thus depended on his

own judgment as to the similarity of psychological report

conclusions and the different psychiatric diagnoses.

He concluded that the evaluations had an impact on no

more than 16% of the diagnoses. In 52% of the cases, the

report was judged to have either confirmed the diagnosis

or was consistent with it without definite statements of

confirmation. In 30% of the cases, the report's disagree-

ment with the initial diagnosis was not influential in

changing the diagnosis. Adams concluded from these

statistics that psychological evaluations did not have

much influence on psychiatric diagnosis. He suggested

that the data supported Gauron and Dickinson's (1966,

1969) findings that psychiatrists derived emotional com-

fort from confirmation of their diagnostic opinions by

psychological reports. Adams admitted that psychological

evaluations provided more information than diagnosis.

He speculated that psychiatrists found psychological evalua-

tions useful for information not requested in the referral.

He also suggested that psychiatrists were not aware

of the effort involved in a psychological evaluation.

In conclusion, Adams recommended that referrals for

psychological evaluation should be accepted only if

tangible results can be anticipated. He suggested

two criteria for referral questions:

(a) there is more than one answer possible
and that (b) different alternatives can be
expected to lead to importantly different
behaviors on the part of the referring
person. (p. 566)

Howe (1981) applied Cole and Magnussen's (1966)

dispositional perspective through a three-step "disposi-

tional assessment evaluation scheme" (p. 112). In this

model, a sequence of three questions was asked: (1) Was

the referral meaningful? (2) Did the assessment provide

an accurate understanding of the patient? (3) Were the

resulting recommendations useful? If the referral

question was not meaningful, which typically occurred

if testing was ordered merely to rubberstamp what was

already known, the author placed the referral in a

"bureaucratic" category. In the study presented by

Howe, 31.4% of the vocational referral cases fell into

this category, causing the consultant to recommend that

future referrals be examined for their appropriateness

before creating the expense of unnecessary assessment.

If the second question, concerning accuracy of the

assessment, was not answered positively, then the case

would belong to the "nonpredictive" category, which

corresponded to the traditional "miss" category for

diagnostic assessment. If the assessment were accurate,

however, the third question would be the essential one

that determined the usefulness of the whole assessment

effort. To decide whether an assessment were useful

would require knowledge of the context of the assessment,

the available choices for action and some means to assess

whether an appropriate choice had been made as a result

of the assessment findings. In his study of 51 vocational

assessment cases, Howe determined that 62.9% of the

"nonbureaucratic" cases were predictive and useful, 31.4%

were predictive, i.e. accurate, but not useful, and 5.7%

were nonpredictive, i.e. inaccurate. As in Adams (1972),

the criterion for utility depended upon the author's

judgment. In this study, Howe's judgment in terms of the

answers to each of the three questions for each case was

crucial to the assessment of utility.
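
The three-question sequence described above can be read as a simple decision procedure. The following is a minimal sketch of that reading, not Howe's own procedure; the function name and category labels are illustrative. The case counts (16, 22, 11 and 2) are not reported in the text. They are an assumption, back-calculated as the whole numbers that reproduce the reported percentages for 51 cases.

```python
from collections import Counter
from typing import Optional


def classify_referral(meaningful: bool,
                      accurate: Optional[bool] = None,
                      useful: Optional[bool] = None) -> str:
    """Apply the three questions in order and return a category label."""
    if not meaningful:
        return "bureaucratic"            # question 1 failed
    if not accurate:
        return "nonpredictive"           # question 2 failed
    if not useful:
        return "predictive, not useful"  # question 3 failed
    return "predictive and useful"


# ASSUMPTION: hypothetical counts chosen to match the reported percentages.
cases = ([(False, None, None)] * 16
         + [(True, True, True)] * 22
         + [(True, True, False)] * 11
         + [(True, False, None)] * 2)

counts = Counter(classify_referral(*case) for case in cases)
total = len(cases)
nonbureaucratic = total - counts["bureaucratic"]


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


bureaucratic_pct = pct(counts["bureaucratic"], total)
useful_pct = pct(counts["predictive and useful"], nonbureaucratic)
not_useful_pct = pct(counts["predictive, not useful"], nonbureaucratic)
nonpredictive_pct = pct(counts["nonpredictive"], nonbureaucratic)
```

Note that, under this reading, the first percentage is taken over all 51 cases, whereas the remaining three are taken over the nonbureaucratic cases only.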

The studies of Lakin and Lieberman (1965) and Hartlage,

Freeman, Horine and Walton (1968) could be grouped together

as laboratory studies that assessed the utility of reports

in very indirect ways. This indirectness, coupled with

basic inadequacies in the studies, made these approaches

less attractive than others.

Lakin and Lieberman (1965) obtained Q-sort

descriptions of a patient by 18 psychoanalysts who

were given varying increments of information about the

patient. A psychodiagnostic report Q-sort was one of

the types of information presented, in addition to mini-

mal identifying data, psychiatric intake report, social

history and a therapy protocol Q-sort. The authors

used a factor analytic method to ascertain different

aspects of the analysts' conceptualizations of the patient.

Unfortunately, their methodological description was un-

clear; three tables mentioned in the text were missing

from the article. In their reported results, the authors

stated that the information from the psychological report,

as compared with information from the therapy protocol,

tended to increase the homogeneity of the analysts' ratings

of the patient. The test report had the effect of causing

the raters to view the patient as more disturbed, with

a less hopeful prognosis. The raters emphasized psycho-

sexual issues in their conceptualizations of the patient,

whereas the various pieces of information provided were

primarily concerned with interpersonal issues, emotional

status and self-concept. The authors concluded that

changes in conceptualization appeared to depend mainly

upon the analysts' theoretical orientation. Diagnostic

information, although not directly related to these

psychosexual issues, likely served as a "release"

for predisposed response sets. Each type of information,

including the test report, mattered little compared to

the influence of theoretical orientation. These findings

suggested that if information in the psychological

report had been conveyed in psychosexual terms concordant

with analysts' theoretical orientation, perhaps the

report would have accounted for more of the variance

in ratings.

In the other laboratory study, Hartlage, Freeman,

Horine and Walton (1968) extracted 4370 different content

statements from 1000 psychological reports selected

randomly from the files of a state hospital covering a

ten-year period, 1957-67. The statements were condensed

into 55 content statements that were rated independently

by four psychiatrists in terms of their decisional utility

for a patient's treatment plan. These ratings were

found to be negatively correlated (r=-.50, p<.001) with

the frequency of occurrence of the statements, using a

Spearman correlation coefficient. The authors recommended

that specific referral questions be asked in future

assessments so that psychologists could supply more

relevant information. They also recommended that psycho-

logists learn what information is of high utility value

to psychiatrists.
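
For readers unfamiliar with the statistic, a Spearman coefficient is a Pearson correlation computed on ranks. The following minimal sketch illustrates the computation. The frequency and utility values are hypothetical and are not taken from Hartlage et al.; they merely show the direction of the reported relationship, in which more frequent statements receive lower utility ratings.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL data: how often six content statements occurred, and the
# mean utility rating each received from the raters.
frequency = [120, 95, 60, 40, 22, 10]
utility = [1.5, 2.0, 2.0, 3.5, 3.0, 4.5]

rho = spearman_rho(frequency, utility)
```

A negative value of rho in this sketch corresponds to the pattern the authors reported: the statements that appeared most often were those rated least useful for treatment decisions.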

A fourth type of study, quite distinct from the

others, was the assessment of the value of an extensive

psychological assessment vis-a-vis long-term psycho-

therapy. As reported by Appelbaum (1977), a major

finding in the 20-year psychotherapy research project

of the Menninger Foundation was that psychological test

reports were better predictors of patients' response

to treatment and better assessors of global diagnostic

understanding and treatment recommendations as compared

to psychiatrists' judgments. This involved the study

of 42 patients, mostly severely disturbed, in psycho-

analytic or psychoanalytically oriented treatment. One

"dismaying" aspect of this finding was that the psychia-

trists' judgments were based in part on test reports,

whose findings they must have disregarded or disagreed

with when making their own judgments.

The fifth and final method of incremental validity

assessment was represented by the oldest study of the

group, reported by Dailey (1953), who endeavored to

measure the utility of psychological reports in a VA

psychiatric service. Utility was explicitly defined

in terms of decision making usefulness.

Dailey divided his project into several steps,

the first of which was a compilation of a list of 32

treatment decisions that frequently had to be made by

the professional staff on the unit. This list was

obtained by simply asking the psychiatrists, psycholo-

gists and social workers to suggest items for such a

list, which was then edited by the author. The second

step involved the creation of an index of "stereotypy,"

which was to serve as a base rate control for decisions

made about an "average" patient. This index was estab-

lished by the ratings of ten psychologists familiar with

the treatment unit. They indicated their answers ("yes,"

"no," or "don't know") to each of the 32 treatment deci-

sions for a hypothetically average patient on the unit.

The index thus was meant to reflect a priori judgments

about a patient, in the absence of specific information

on that patient.

Using two clinicians as judges, Dailey then calculated

the clinical utility of psychological reports for nine

patients. The judges indicated their 32 treatment decisions

on the basis of reading the reports alone. Dailey then

applied a weighted scoring system based upon the index

of stereotypy to establish a measure of "new, useful

decisions" (p. 294) made on the basis of psychological reports.


Dailey also assessed the clarity of reports by totaling

the number of decisions on which the two judges agreed.

He found that, on the average, the two judges could agree

upon 53% of the 32 decisions after reading a report. Taking

base rate information into account via the index of

stereotypy, 26% of the list of therapeutic decisions

could be considered clear, new and useful decisions

resulting from the use of psychological reports.
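
The logic of Dailey's base rate control can be illustrated with a simplified sketch. Dailey's actual weighted scoring system is not described here, so the code below substitutes an unweighted stand-in and should not be taken as his method. It assumes that a decision is "clear" when both judges give the same definite answer, and "new" when that agreed answer differs from the modal answer given by the rating panel for the average patient. The function names, the "dk" code for "don't know" and all data are hypothetical; four decisions are used instead of 32 for brevity.

```python
from collections import Counter

DONT_KNOW = "dk"


def stereotype_answers(panel_ratings):
    """Return the modal answer to each decision across the rating panel."""
    n_decisions = len(panel_ratings[0])
    return [
        Counter(rater[d] for rater in panel_ratings).most_common(1)[0][0]
        for d in range(n_decisions)
    ]


def summarize_report(judge_a, judge_b, stereotype):
    """Return (proportion clear, proportion clear and new) for one report."""
    n = len(stereotype)
    clear = 0
    new = 0
    for a, b, base in zip(judge_a, judge_b, stereotype):
        if a == b and a != DONT_KNOW:   # both judges agree on yes or no
            clear += 1
            if a != base:               # agreed answer departs from base rate
                new += 1
    return clear / n, new / n


# HYPOTHETICAL panel ratings for the "average" patient (3 raters, 4 decisions).
panel = [
    ["yes", "no", "yes", "dk"],
    ["yes", "no", "no", "dk"],
    ["yes", "yes", "yes", "no"],
]
stereotype = stereotype_answers(panel)

# HYPOTHETICAL decisions by two judges after reading one report.
judge_a = ["yes", "yes", "no", "no"]
judge_b = ["yes", "yes", "yes", "no"]

clarity, novelty = summarize_report(judge_a, judge_b, stereotype)
```

In this sketch the first decision is clear but not new, because the judges' agreed answer matches what would have been assumed for any patient on the unit. Only decisions that are both agreed upon and contrary to the stereotype count toward the report's incremental contribution.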

Although Dailey acknowledged that there existed no

criteria to judge whether such a proportion was satis-

factory or not, the thrust of his concluding discussion

was that this percentage indicated an insignificant

effect for reports. He attributed the results to "the

operation of clinical bias or rigidity" (p. 301).

Dailey cautioned that his measures of utility and clarity

needed improvement in reliability and that they may have

been dependent upon the particular institution in ques-

tion. Also, he noted that psychological reports had

value not only in "gross decisions" (p. 302), such as

those considered in his study, but also for "subtler

problems" (p. 302) relating to psychotherapy. Another

purpose of a report, "establishing in the therapist or

decision maker a clear and memorable image of the patient"

(p. 302), was not included in the study. Dailey also

admitted that the study did not assess whether the deci-

sions were valid or not. He concluded that the study's

most important implication was that the usefulness of

a report could be measured, although such methodology

needed refinement.

The two other studies that directly employed reports

but did not primarily assess incremental validity were

by Garfield, Heine and Leventhal (1954) and Cuadra

and Albaugh (1956). Garfield et al. surveyed

psychiatrists, psychologists and social workers in

a VA setting. Each clinician provided letter grades

(A, B, C, D and F) and spontaneous comments for two

psychological reports randomly chosen from clinical

records. One report was given to every clinician, each

of whom also rated a second report given only to him or

her. The authors' conclusions about the utility of

reports consisted mainly in summary comments of what

aspects of reports should be improved, such as stylistic

features and matters of content, e.g. the overuse of

speculative inferences without supporting data.

Cuadra and Albaugh (1956) did not address the issue

of utility, but instead examined the effectiveness of

communication in reports. They compared report writers'

intended meanings and emphasis with perceptions of

clinical judges who read the reports. The authors found

that the correspondence between the report writers'

intentions and judges' interpretations was only 53%.

The greatest source of communication breakdown was

related to matters of emphasis and degree or amount of

personality characteristics described.

Altogether, these ten studies represented a diversity

of research methods attempting to answer diverse questions.

Methodologically, the oldest study (Dailey, 1953) was

probably the most exemplary because of its control

for base rates in incremental utility. For that reason

it was described in more detail than the others. In

general, the research findings are mixed. Some authors

seemed more pessimistic in their conclusions than their

data warranted. The studies may have been biased

against the real value of reports. They tended to focus

on the immediate measurable utility. The one study that

examined utility from a long-term perspective (Appelbaum,

1977) placed reports in a very favorable light.

Tallent (1983) pointed out another important factor

that these studies mostly ignored. He suggested that

any evaluation of the utility of psychological reports

should be done with a comparative reference to the utility

of other diagnostic procedures. He noted that the utility

of such procedures as the social history, mental status

examination and nursing reports has not been empirically

demonstrated. Borrowing Dailey's (1953) finding that

psychological reports contributed to clinical decisions

in 26% of referred cases, Tallent asserted:

But even if only 26 percent of psychological
reports make a difference in patient manage-
ment and disposition, we might regard the
psychological report as a very effective
instrument indeed. (1983, p. 21)

Another significant feature of the studies reviewed

above was that the authors generally did not indicate

an awareness of literature on judgment, decision making

or the problems of combining information in clinical

judgment. This may in large part be due to the dates

of these publications; most of the research literature on

judgment is more recent than the literature on

psychological reports. Also, although the use

of reports for confirmation of judgment was a theme that

was raised by these studies, none of them endeavored to

examine the issue of confidence in judgment and the value

of confirmation. Such omissions suggest opportunities

for exploratory research, of which the present study is

one example.

Applications in the Present Study

Although the breadth of the field of judgment

research has been reviewed, only some of the conclusions

can be applied to any one study. The present experiment

obtains its impetus from various currents in the litera-

ture. Foremost is the vacuum of empirical research on

the use of psychological reports. Second is the need to

apply research on clinical judgment to representative

situations. As such, this study belongs

more to the applied, as opposed to the basic, pole

of the research spectrum. However, this need not

detract from potential scientific value. As Hammond

et al. (1964) noted:

Clinical psychology provides an excellent
research site for the study of man as an
inferring organism. (p. 438)

The study does not employ mathematical models

in the manner of contemporary research on judgment

(Slovic & Lichtenstein, 1973; Hogarth, 1980). This

is due to the limited number of subjects compared to

the multiplicity of response variables. Also, the

ecological validity of psychological reports is unknown.

Assessment of accuracy or optimality of judgments is

not the concern here. This study is concerned with cue

utilization, that is, how subjects will respond to a

given set of cues. Ecological validity of the cues or

overall functional validity of responses is not at

issue because there are no established criteria with

which such validities could be assessed.

Therefore, the present study can be classified as

a "single-system" case (Rapoport & Summers, 1973; Hammond

et al., 1975). It is also best described as "discovery,"

as opposed to a systematically controlled experiment. The

general issues are how clinicians use a psychological

report and how different information influences clinical

judgment and confidence in that judgment.

Questions of professional differences are of

secondary interest, prompted by the inclusion of two

professional groups among the subjects. The study does

not look at institutional relationship factors in the

use of psychological tests, i.e., it is not, for example,

a study of psychiatrists' attitudes toward psychologists.

However, the author does hypothesize that psychiatrists

will tend to be more confident than psychologists because

the latter group's training places more emphasis on acquiring

skeptical attitudes in clinical thinking. This is in line

with Brody's observation:

The graduate student in psychology lives in an
atmosphere of research and skepticism, in which
theories are constructed and abandoned, in
contrast to that of the medical student, intern,
and psychiatric resident who learn to apply
already established knowledge and are reluctant
to attack established theories, especially those
espoused by respected figures in the field.
(1956, p. 107)

A multi-method approach, involving structured and

unstructured response tasks, is followed. As is true

for the multiple sources of information on patients,

the variety in the response format is intended to reflect

the variety common in real clinical situations. This

includes both descriptive and predictive judgment. An

audiotape is employed to approximate an actual clinical

interview, which is the central activity in the

diagnostic judgment process.

In general, this two-part literature review has

revealed the need to apply laboratory-based methods of

studying judgment and decision making to the practical

issue of how psychological reports are used. The present

study necessarily looks at only a small part of the

many issues touched upon here. The complexities of the

topic have been delineated to justify the breadth of

experimental measurement and to inform the interpretation

of subsequent findings.
