Interpretational ambiguities in conjunction problems


Material Information

Interpretational ambiguities in conjunction problems
Physical Description:
vi, 127 leaves : ill. ; 28 cm.
Messer, Wayne S., 1954-


Subjects / Keywords:
Heuristic   ( lcsh )
Problem solving -- Psychological aspects   ( lcsh )
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )


Thesis ( Ph. D.)--University of Florida, 1990.
Includes bibliographical references (leaves 122-126).
Statement of Responsibility:
by Wayne S. Messer.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001640243
notis - AHR5328
oclc - 24229076

Full Text








My thanks are extended to Dr. Richard Griggs for

serving as my committee chair.

Thanks also go to Drs. Michael Levy, Ira Fischler,

Merle Meyer, and James Algina for serving as committee members.




ACKNOWLEDGMENTS

ABSTRACT

INTRODUCTION

HISTORICAL OVERVIEW

The Use of Metaphor in Psychological Thinking
Thinking as Perception
Thinking as Statistics

REVIEW OF THE LITERATURE

Brief Introduction to Conjunction Problems
and the Heuristics and Biases Approach
Survey of Alternative Explanations
Survey of Manipulations and Their Results
Overview of Experiments and Their Rationale

EXPERIMENT 1

Method
Results
Discussion

EXPERIMENT 2

Method
Results
Discussion

EXPERIMENT 3

Method
Results
Discussion

EXPERIMENT 4

Method
Results
Discussion

GENERAL DISCUSSION

REFERENCES

BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Wayne S. Messer

August 1990

Chairman: Dr. Richard A. Griggs
Major Department: Psychology

Tversky and Kahneman have used conjunction problems to

support their general contention that people utilize the

representativeness heuristic in place of appropriate

statistical principles when making judgments under

uncertainty. In such problems, subjects are offered a

choice between a compound event and one of the constituent

events. In violation of sound extensional reasoning as

represented by the conjunction rule, which states that the

conjunction of two events can never be more probable than

either of its components, subjects most often choose the

compound event as the more probable. An alternative

explanation for such choices in terms of linguistic

misinterpretation was examined in the present study. This

misinterpretation may occur at two levels. At Level I, the

subject does not even recognize the relevance of an

extensional interpretation. At Level II, the subject may

consider an extensional interpretation, but inappropriately

represent the sample space of the constituent event.

Ambiguities exist in the problems at both levels and turn,

in part, on common conversational conventions of language.

In four experiments, involving 485 subjects, four

possible interpretational facilitators were manipulated. A

betting scenario and an instructional wording variable were

utilized as Level I manipulations, and survey information

and a response option clarifying phrase as Level II

manipulations. Neither Level I factor led to any

significant effects. At Level II, the clarifying phrase

proved robust across experiments, but there was a

significant interaction between this factor and survey

information. Each of these factors was effective only when

the other was absent. Responses to an Euler circle

interpretational task in Experiment 4 indicated that the

clarifying phrase was effective in creating the correct

interpretation of the response option sample space. The

results of these four experiments indicate a significant

interpretational component to performance on conjunction

problems and suggest further experiments to contrast the

heuristics and misinterpretation explanations of such



INTRODUCTION

The conjunction problems developed by Tversky and

Kahneman (1982, 1983) are a set of word problems, each of

which requires subjects to apply the conjunction rule

correctly in order to arrive at what the authors of

the problems argue is the "correct" answer. The conjunction

rule is a fundamental rule of probability that states that

the probability of two events occurring conjointly cannot be

higher than the probability of occurrence of either event

independently. For instance, the probability of rolling

a six on the first of a pair of dice and a one on the second

cannot exceed the independent probabilities of rolling either

the six alone or the one alone, and in fact, the joint

probability in this particular case (1/36) is much less than

either of the independent probabilities (which in each case is 1/6). The

conjunction rule can also be considered to apply in cases of

category membership, where one category subsumes the other.

For example, the probability of someone having both blond

hair and blue eyes cannot exceed the probability of that

person simply having blue eyes. This is because the set

"having blue eyes" contains people of all hair

colors, including blond. The two kinds of applications of

the conjunction rule seem to me to be fundamentally

different, although Tversky and Kahneman treat them as

equivalent. All of Tversky and Kahneman's conjunction

problems involve reasoning about categories that consist of

the latter kind of conjunction. In this regard, they are

more appropriately called category membership problems and

the term "conjunction problems" may be a misnomer.
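
The two kinds of application described above can be checked by direct enumeration. The following is a minimal sketch, not part of the original study: it assumes the two dice are distinguished (a six on the first die, a one on the second), which is the reading under which the figures 1/36 and 1/6 hold, and the small population of hair and eye colors is invented purely for illustration.

```python
from fractions import Fraction
from itertools import product

# First kind: two separate dice, giving 36 equally likely ordered outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def probability(event):
    """Exact probability of an event, given as a predicate over (first, second)."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


p_six = probability(lambda o: o[0] == 6)                 # first die shows a six
p_one = probability(lambda o: o[1] == 1)                 # second die shows a one
p_both = probability(lambda o: o[0] == 6 and o[1] == 1)  # both at once

# The conjunction can never be more probable than either constituent.
assert p_both <= min(p_six, p_one)
print(p_six, p_one, p_both)

# Second kind: category membership. This population is hypothetical.
people = [
    ("blond", "blue"),
    ("brown", "blue"),
    ("black", "brown"),
    ("blond", "brown"),
    ("red", "blue"),
    ("brown", "brown"),
]
blue_eyed = {i for i, (hair, eyes) in enumerate(people) if eyes == "blue"}
blond_and_blue_eyed = {i for i in blue_eyed if people[i][0] == "blond"}

# The conjunction picks out a subset, so it cannot have more members.
assert blond_and_blue_eyed <= blue_eyed
print(Fraction(len(blond_and_blue_eyed), len(people)),
      Fraction(len(blue_eyed), len(people)))
```

In the dice case the inequality follows from multiplying two probabilities; in the category case it follows from set inclusion alone, whatever the composition of the population.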

In any case, the majority of subjects presented with

conjunction problems do not choose the designated "correct"

answer, and therefore their responses are discrepant with

the responses that would result from the application of the

conjunction rule (Tversky & Kahneman, 1982, 1983).

Kahneman and Tversky's examination of conjunction

problems is part of a larger program which indicates that

the decisions which subjects render on a variety of problems

do not conform to those that result from the application of

a number of probability principles (e.g., Kahneman, Slovic,

& Tversky, 1982). Other than the conjunction rule, these

principles include Bayes's Theorem, regression to the mean,

and the law of large numbers.

Kahneman and Tversky have argued (see Kahneman &

Tversky, 1972, 1973; Tversky & Kahneman, 1974, 1983) that

these discrepancies are the result of people assessing

uncertainty in a way that is fundamentally different from

that indicated by probability theory. On their account,

subjects instead reason

by means of a small handful of favored "heuristics," or

judgmental rules-of-thumb. Reliance on these heuristics

leads to the systematic neglect of variables indicated as

relevant by probability theory and opens the door for

debates about whether or not man is a "rational" being.

This dissertation will examine the use of conjunction

problems in Tversky and Kahneman's arguments that human

reasoning is inadequate in its use of certain aspects of

problem information and proceeds irrespective of

fundamental statistical principles. Evidence and arguments

will be presented that the difficulties that subjects

experience with conjunction problems are not the result of

inadequate understanding of the conjunction rule or an

inability to apply it but result instead from

misinterpretations of the problem task. These

misinterpretations are in turn caused in part by the overall

problem structure, which invites a particular interpretation

that is at odds with that which the developers of the

problem see as the correct one, and in part by ambiguity in

the interpretation of the response options.

The first part of the dissertation will provide an

overview of the history and philosophy necessary for a full

understanding of the place of conjunction problems in

contemporary research. The second part of the dissertation

will provide a review of the literature that has accumulated

around conjunction problems. This review will culminate in

an overview of the experiments performed as part of the


dissertation to investigate ambiguities in one of the most

well known of the conjunction problems. The third part will

present four experiments that indicate a misinterpretational

component in subjects' responses to conjunction problems.

The fourth part will provide a general discussion that

incorporates the experimental results and calls into

question the continued use of conjunction problems as

examples of human judgmental incompetency.


HISTORICAL OVERVIEW

In this section I identify several analogies that have

played a role in psychological thinking, then focus on two

metaphors that are shaping contemporary reasoning research,

and that bear directly on the research on conjunction

problems.


The Use of Metaphor in Psychology

It is a difficult task to characterize the mind, and

philosophers and scientists throughout history have often

resorted to analogy. Unable to say exactly what the mind

is, they have settled for saying what the mind is like.

Two of the most important classical metaphors for

memory come from Plato. In the Theaetetus (translated by

Waterfield, 1987), Plato has Socrates asking his listeners

to imagine the mind as containing a block of wax.

Perceptions and ideas make impressions on the wax block, as

would a signet-ring, and as long as the impression remains,

we can remember that which caused the impression. As the

impressions in the wax become less distinct, so do our

memories for those events. The wax blocks can vary in size,

consistency, and purity, which accounts for the differences

in people's ability to learn and remember.

Later, Socrates considers the mind as possessing an

aviary. The aviary of an infant is empty, but as pieces of

knowledge are acquired, it is like capturing birds and

confining them in the aviary. The birds in the aviary are

like accessible memories, but in order to actually remember

something, the bird must be once again gotten hold of.

William James (1890) drew an analogy between memory and

a theater back-drop, upon which objects are painted to give

an illusion of continuous perspective. He wrote that "we

paint the remote past, as it were, upon a canvas in our

memory, and yet often imagine that we have direct vision of

its depths" (p. 643).

Analogy has been used for mental processes other than

memory. In his second lecture at Clark University in 1909,

Freud (1910/1965) asked his audience to imagine an

ill-mannered and disruptive person who is removed from the

lecture hall by several strong men and who then

continues his disturbance by shouting and banging on the

door. The continued efforts of the strong men are required

in order to prevent his return, or a mediator must act as

peacemaker to directly confront the unruly person and talk

with him until he can be readmitted on better behavior. By

this analogy, Freud attempted to create a picture of the

conscious, the unconscious, and repression, as well as of

the psychoanalytic process.

Roediger (1980) writes how both these early

psychologists also utilized the metaphor of the house. For

James, the house was a memory metaphor, with objects to be

searched for in its rooms, and sometimes misplaced. For

Freud, the house, with its reception room and antechamber,

served as another metaphor for consciousness and


More recently, there have been analogies of mental

processes to a purse (Miller, 1956) and to an attentional

"pool" (Kahneman, 1973).

Roediger (1980) discusses how short-term memory has

been described, among other things, as a work bench, long-

term memory as a library, and how metaphors used in

psychology often reflect the latest technology: the mind as

gramophone, telephone switchboard, tape recorder, Paris

metro subway map, and hologram. Neither did Freud escape

the influence of scientific advances contemporary to his

time. The mind for him adhered to the laws of

thermodynamics in the same way that the newly invented steam

engine adhered to the laws of thermodynamics.

A recent powerful metaphor in psychology has been the

computer metaphor, which, born at the end of World War II and

allied with information theory, gave rise to

cognitive psychology. This metaphor suggested the use of

flow charts to characterize cognitive processes (Broadbent,

1958), and engendered several decades of fruitful research.

The impact of the digital computer metaphor is well

documented (e.g., Gardner, 1985; Lachman, Lachman, &

Butterfield, 1979). However, like all metaphors, the

digital computer one may be short-lived in its influence

(Roediger, 1980).

The flow-chart model of human mental activity was based

on the operations of the serial-digital computer. There is

a new approach to computer modelling, however, that has

suggested to some a current metaphor shift within the

cognitive psychology field (e.g., Palmer, 1987; Schneider,

1987). These new models are called parallel distributed

processing, neural network, or connectionist models,

and are loosely based on the characteristics of the neuron

(McClelland & Rumelhart, 1986; Rumelhart & McClelland,

1986). It may be too early to tell to what degree this new

approach will supersede the digital computer metaphor, but

Palmer (1987) has said that "the current 'program metaphor of

the mind' will be replaced with something that might be

called, only somewhat facetiously, the 'brain metaphor of

the mind'" (p. 925).

There are other metaphors of mind that are currently

influential in shaping psychological thought. Two metaphors

in particular are interesting because of their influence on

research in the area of thinking. These two metaphors are

thinking as perception, and thinking as statistics, and

there are historical precedents for both.

Thinking as Perception

Wilhelm Wundt is considered the founder of scientific

psychology. In 1875, he established in Leipzig, Germany,

one of the first two laboratories of

experimental psychology. The other was established in the

same year by William James at Harvard (Watson, 1978).

Wundt divided psychology into two broad areas. One of

these was experimental psychology which used introspection

as its method, and attempted to analyze conscious experience

into its elements by means of conducting experiments on

individuals. Some of the research topics investigated

experimentally at the Leipzig lab included aspects of

sensation and perception such as peripheral vision, color

contrast, negative after-images, color blindness, visual

size, optical illusions, and time-sense, or the perception

of the passage of time. Other topics were reaction time,

attention, feeling, imagery, span of apprehension (how much

can be taken in at a single glance), and verbal associations

(Schultz, 1981). Many of these latter topics form chapter

titles in contemporary textbooks of cognition (e.g., Matlin,


Wundt's second area of psychology, which he called

Volkerpsychologie, or "cultural psychology", was the study

of the higher mental processes. The higher mental processes

of thinking, problem solving, and memory could not be

studied experimentally because they resulted from or were

profoundly influenced by the collective psychology of

groups, and were not simply the experiences of the

individual. The higher mental processes were to be studied

observationally by means of their products, which are

language, myth, and custom (Leahey, 1980).

With this division of psychology into an experimental

psychology and an observational psychology, with perception

in one area and thinking in the other, it was impossible for

a comparison to be drawn between perception and thinking.

However, developments occurred that prepared the way for the

emergence of the perception metaphor in thinking.

Oswald Kulpe was a student of Wundt's who disagreed

that introspection could not be used to experimentally study

the higher mental processes. As early as 1879 Ebbinghaus

had begun to show that the experimental study of the higher

process of memory was possible, and by 1885 his results were

published and available. By 1896, Kulpe had established his

own laboratory at Wurzburg, Germany, where introspective

investigations of the higher process of thinking were

regularly being carried out. While Wundt's introspective

methods called for a more immediate, descriptive account of

a simple perceptual experience, Kulpe's methods often

involved more of a retrospective account, interrupted by

questions from the experimenter, following performance on a

complex problem-solving type task. In addition to

differences in the use of the introspective method, the

Wurzburg psychologists diverged from the Wundtian program by

stressing the existence of unconscious elements of thinking,

imageless thought, and unconscious "determining tendencies"

or dispositional sets that were established by task

instructions and guided conscious thought activity. The at-

times-bitter debate between the Leipzig and Wurzburg schools

called into question the validity of introspection as an

experimental tool and prepared the way for the acceptance of

John B. Watson's behaviorism. Though the Wurzburg school

itself essentially dissolved following the departure of

Kulpe in 1909, the ideas of the Wurzburg psychologists

influenced and evolved into the school of Gestalt

psychology, which offered a more sustained and systematic

approach to the study of thinking (Leahey, 1980).

The Gestalt psychologists expounded a holistic approach

to perception and explained perception as the act of

restructuring elements within the visual field until some

stable configuration was obtained. The restructuring was

guided by unconscious principles of perception such as

closure, similarity, and figure-ground perception. Thinking

was also seen holistically, with a similarly successful

restructuring of problem elements necessary in order for

problem solving to occur. When the visual elements of a

scene were suddenly restructured into an appropriate

grouping, the subject "saw" the scene. Likewise, when the

problem elements of some problem were suddenly restructured

into an appropriate grouping, the subject "saw" the

solution. In thinking, this sudden successful restructuring

was called "insight".

The approach of the Gestalt psychologists to thinking

is important because they stressed not only the idea of a

restructuring of problem elements into potential alternative

interpretations but also the importance of problem

presentation, task instructions, linguistic phrasings, and

the examples used as factors which guided and constrained

problem restructuring. Additionally, these factors were

thought to operate in an unconscious fashion.

The Gestalt position on thinking as perception probably

reached its highest expression in the work of Karl Duncker

(1945). In contrast to many contemporary approaches to

thinking, which are driven by the statistical metaphor, and

see task instructions as something neutral which merely

serve to elicit the thought process, Duncker saw task

instructions as a major factor in facilitating and

inhibiting particular restructurings--one might say

interpretations--of problem elements. Consider, for

example, Duncker's tumor problem, which requires subjects to

solve the problem of destroying a stomach tumor with

radiation without damaging intervening tissue. When the

instructions contained the active phrasing "How could one

prevent the rays from injuring the healthy tissue?", 43% of

subjects dealt with varying the intensity of the radiation.

When the instructions contained the passive phrasing "How

could one protect the healthy tissue from being injured by

the rays?", only 14% of the subjects dealt with radiation

intensity.


Unfortunately, Duncker's experimental approach and the

Gestalt program of the experimental study of thinking were

to be prematurely curtailed. One reason was the rise

of Nazism in Germany. Many of the Gestalt psychologists or

their wives were Jewish, with the result that some went into

hiding, some died in the concentration camps, and many fled

the country. Duncker came to the U.S. He found the country

immersed in the behaviorist doctrines of John B. Watson and

little interested in a psychology of the mind. In 1940,

dismayed by the turn of events in Germany and the lack of

academic acceptance of his ideas in the U.S., he committed

suicide (Gigerenzer & Murray, 1987).

Thinking as Statistics

From its beginnings, statistics has been used as a

metaphor for thinking. Laplace called the new science of

probability "only common sense reduced to calculus."

The early 1900s saw the rapid development of

statistics into an inferential tool. Pearson developed the

chi-square (χ²) test in 1900, and Fisher set forth asymmetrical

hypothesis testing and the analysis of variance (ANOVA) in his books,

Statistical Methods for Research Workers and Design of

Experiments in 1925 and 1935, respectively. Neyman and

Pearson published their first joint paper on symmetrical

hypothesis testing in 1928. Between 1940 and 1955, these

newly developed statistical tools became indispensable to

psychologists. Not long after, the first statistical

metaphors began appearing in psychological theorizing

(Gigerenzer & Murray, 1987).

As an example, consider Neyman and Pearson's

symmetrical hypothesis testing approach. This approach was

important because it allowed for the subjective and

mathematical aspects of hypothesis testing to be separated.

The experimenter set a criterion, based on a subjective

assessment of the risks and payoffs involved in making

errors or being correct, and was able to determine

acceptable levels of type I error and power. The sampling

distribution was determined according to the mathematical

laws of probability theory.

This model first appeared as a psychological theory in

the form of the theory of signal detection (Tanner & Swets,

1954). It was used again by Wickelgren and Norman (1966) as

a model of recognition memory.
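
The criterion-setting logic described above can be sketched numerically. This is an illustrative sketch only: the two normal distributions, their means, and the candidate criteria are assumptions chosen for the example and do not come from Tanner and Swets or from Neyman and Pearson. The subjective step is the choice of criterion; the error rates then follow mathematically from the distributions.

```python
from statistics import NormalDist

# Hypothetical distributions of the observed quantity:
# noise alone (the null hypothesis) and signal plus noise (the alternative).
noise = NormalDist(mu=0.0, sigma=1.0)
signal = NormalDist(mu=1.5, sigma=1.0)


def error_rates(criterion):
    """Return (type_i_error, power) for the rule 'respond yes if x > criterion'."""
    type_i_error = 1.0 - noise.cdf(criterion)  # false alarm rate
    power = 1.0 - signal.cdf(criterion)        # hit rate
    return type_i_error, power


# A stricter criterion lowers the Type I error rate but also lowers power.
for criterion in (0.5, 1.0, 1.645):
    alpha, power = error_rates(criterion)
    print(f"criterion={criterion:.3f}  type I error={alpha:.3f}  power={power:.3f}")
```

Where the criterion should sit depends on the risks and payoffs attached to each kind of error, which is why the same observer can show different hit and false alarm rates without any change in sensitivity.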

Another example of a statistical method being used as a

psychological theory is Harold Kelley's (1973)

causal attribution theory, which is analogous to a Fisherian

three-way ANOVA.

From the very beginnings of the translation of

statistical methods into psychological theories, there has

been a basic confusion between a descriptive approach and a

prescriptive approach. Attempts to descriptively

characterize human thinking as analogous to statistical

methods often became confused with direct comparisons of

human thinking to statistical methods. It was often assumed

that statistical methods were standards by which rational

human thought should be judged. This prescriptive approach

reached its highest expression in the work of Kahneman and

Tversky (e.g., Kahneman, Slovic, & Tversky, 1982).

These researchers and others have painted a bleak

picture of human rationality, often characterizing their

subjects as judgmentally incompetent. Subjects are

postulated to use a small number of simple, context- and

content-independent heuristics, avoiding the utilization of

information that would be important from a statistical

standpoint. The tasks and instructions themselves are seen

as unimportant, merely as vehicles that conform to a

statistical interpretation and should therefore elicit a

statistically influenced response.

In recent years, however, there has been a new approach

to research in human reasoning that is very similar to the

Gestalt tradition in the study of thinking. It signifies a

return to the perceptual metaphor in thinking research.

The Rebirth of the Gestalt Tradition

The overtly pejorative tone of Kahneman and Tversky's

work engendered several protesting responses. A position

taken by Cohen (1981) is to deny the normative status of

particular statistical principles for particular problems.

Cohen attempts to salvage human rationality by arguing that

subjects are appropriately using statistical principles in

their reasoning, but different ones than those chosen by the

experimenter. For instance, Cohen argues that subjects are

using Baconian probabilities in certain instances rather

than Bayesian ones in order to account for the discrepancies

between subjects' responses and a Bayesian outcome. Cohen

is thus seen as remaining within the confines of the

statistical metaphor.

A different position is to question the very basis of

Kahneman and Tversky's explanation and propose alternative

explanations based on a perceptual metaphor. Explanations

brought forth from the perceptual metaphor focus on a

subject's interpretation of a problem and the elements of

the problem which contribute to that interpretation. Under

the Kahneman and Tversky position, one must accept the human

inference process as imperfect. Under the perceptual

metaphor, the possibility exists that poor performance is

driven by a misrepresentation of the problem or a

misinterpretation of the evidence provided, or of problem

elements, in ways other than that which the experimenter

intended. Frisch (1988) has pointed out that just because

people's judgments systematically deviate from probability

theory, it doesn't necessarily mean that they are using a

common heuristic, as Kahneman and Tversky have assumed. It

could mean that they are misinterpreting something about the

problem in a common way based on cues that invite those

particular mistakes. Likewise, the finding that formal training in

statistics does not improve the tendency to answer certain

problems correctly (Tversky & Kahneman, 1983) could mean

that the heuristics used are exceedingly robust, as Kahneman

and Tversky have concluded, or that particular

misinterpretations are so strongly cued that subjects never

become aware of the relevance of statistical principles in

the first place.

The last several years have seen a reemergence of the

perceptual metaphor in theoretical accounts of reasoning,

and an emphasis on the subject's interpretation of problems.

Recent theoretical works on reasoning (e.g., Evans, 1989;

Margolis, 1987) have stressed the importance of

investigating those factors and processes that guide a

subject's interpretation of a problem, instead of merely

focusing on the comparison of a subject's response to the

answer a relevant statistical method would yield. For

instance, Margolis (1987) has postulated a two-stage theory

of reasoning that includes an initial stage of pattern

recognition and a subsequent rationalization stage. Evans

(1989) has likewise theorized an initial automatic selection

stage, followed by manipulation of the selected aspects of

the problem. This second stage is also guided in an often

unconscious manner. Both authors stress the importance of

the content of reasoning problems and subjects' familiarity

with it, as well as the influence that context and language

play in resolving ambiguities of interpretation and

directing attention. This emphasis on the unconscious

processes that constrain inference is congruent with the

Wurzburg idea of "determining tendencies" and the Gestalt

approach in general, as discussed by Gigerenzer and Murray

(1987) and Gardner (1985).

Likewise, empirical work has gained momentum in

investigating the effects of content, context, and

linguistics on the reasoning process. This has taken place

in work on logical reasoning (e.g., Cheng & Holyoak, 1985;

Cosmides, 1990; Griggs & Cox, 1982) as well as statistical

reasoning (e.g., Nisbett, Krantz, Jepson, & Kunda, 1983).

In the next section, I examine both theoretical and

empirical work on Kahneman and Tversky's conjunction

problems that falls within the general bounds of being

guided by the perceptual metaphor.


REVIEW OF THE LITERATURE

In this section of the dissertation I begin by giving a

fuller account of the heuristics approach of Kahneman and

Tversky. Their studies with the Linda problem--the best

known of the conjunction problems--are closely

examined. Several alternative explanations for performance

on the Linda problem are then considered. The alternative

explanations have suggested several experimental

manipulations and these manipulations and their results are

discussed. The section ends with an overview of my own

experimental manipulations.

Brief Introduction to Conjunction Problems and the
Heuristics and Biases Approach

Conjunction problems were first introduced in

"Judgments of and by Representativeness" (Tversky &

Kahneman, 1982) as examples of subjects' utilization of the

representativeness heuristic. According to Kahneman and

Tversky (1973), when people are faced with tasks of

assessing probabilities and making predictions under

conditions of uncertainty, they rely on a limited number of

simple, but abstract and content-independent heuristic

strategies that supplant the more appropriate but complex

operations of statistical and probability theory. These

heuristics consist of mere assessments of similarity, in the

case of the representativeness heuristic, and of ease of

recall or scenario generation, in the case of the

availability heuristic. Reliance on these heuristics leads

people to ignore prior probabilities (base rates), neglect

considerations of predictive accuracy and evidence

reliability, be insensitive to sample size differences, fail

to consider regression to the mean, misperceive the

fundamental notions of chance and be susceptible to illusory

correlations and unwarranted confidences (Tversky &

Kahneman, 1974).

The Linda problem, among other conjunction problems,

was constructed to point out that even one of the most basic

and fundamental rules of probability, the conjunction rule,

would be ignored and violated in situations where the

representativeness heuristic would indicate a contrary

judgment. The conjunction rule states that a compound or

joint event cannot have a higher probability of occurrence

than either of its constituent events. The conjunction rule

can be stated in the terms P(A) ≥ P(A+B). In the Linda

problem, subjects violate the conjunction rule by giving a

joint event (that includes a very representative constituent

event) a higher probability of occurrence than a constituent

event that is very unrepresentative, i.e., by considering

A+B more probable than A. (It has already been noted in the

Introduction that Kahneman and Tversky's conjunction

problems involve joint and independent events that are all

actually category membership decisions.)
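
One standard way to see why the conjunction rule must hold is sketched below. The notation is conventional rather than the author's: the conjunction that the text writes as A+B is written here as A ∧ B, and the conditional probabilities are assumed to be defined, that is, P(A) > 0 and P(B) > 0. If either is zero, the conjunction also has probability zero and the rule holds trivially.

```latex
\begin{align*}
P(A \wedge B) &= P(A)\,P(B \mid A) \le P(A), \\
P(A \wedge B) &= P(B)\,P(A \mid B) \le P(B),
\end{align*}
since $0 \le P(B \mid A) \le 1$ and $0 \le P(A \mid B) \le 1$.
```

Equality holds only when one event guarantees the other, which is why a judgment that ranks the conjunction strictly above a constituent cannot be reconciled with the rule.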

Tversky and Kahneman's (1983) approach to the

conjunction fallacy is somewhat different from their earlier

approaches to the base-rate fallacy, the fallacy of the law

of small numbers, etc., in that in their earlier writings

(Tversky & Kahneman, 1974; Kahneman & Tversky, 1972, 1973)

the authors not only stated that subjects digressed from the

principles of probability theory in favor of use of the

representativeness heuristic, but also that people evidently

did not develop intuitions anywhere near analogous to the

law of large numbers, the principle of regression to the

mean, or Bayes's Theorem. Thus, the layperson was not a

naive, but adequate statistician at all, as was indicated by

Peterson and Beach (1967), but an incompetent one,

possessing not even rudiments of normative probability

theory. In fact, expert practitioners such as physicians

and mathematical psychologists were shown to be as prone to

biases engendered by the representativeness heuristic as the

layperson (Tversky & Kahneman, 1983).

Since their earlier work, however, Tversky and Kahneman

(1982) have backed away from their extreme hypothesis that

some judgments of likelihood are arrived at solely by the

representativeness heuristic, and representativeness is now

seen as only one of any number of possible procedures useful

for retrieving, interpreting and evaluating information. It

is still seen as a highly favored one, however. They are

now willing to admit that subjects are capable of utilizing

sample size information, reliability information, and base

rates, but that such utilization is dependent on problem-

specific variables, design characteristics, the subject's

statistical sophistication, and demand characteristics or

other suggestive clues. In this, it is clear that their

position has been influenced not only by a growing volume of

evidence unsupportive of the heuristics approach but also by

the characteristics of the perceptual metaphor.

Accordingly, Kahneman and Tversky (1982) now

classify errors into two basic types. Whereas

previously all errors were discussed as if they were errors of

comprehension, i.e., errors which resulted from failure to

recognize or understand a statistical rule, now many errors

are seen as errors of application, in which the subject

knows, understands, and accepts a statistical rule as valid

but does not apply it in a case where the experimenter

considers it to be normatively appropriate to do so. Such a

moderation of position significantly weakens the heuristics

and biases approach and focuses attention on the nature,

strength, and degree of abstraction of statistical

principles, the role of intelligence in inducing such

principles from education or experience, whether such

principles remain content bound or are abstracted, and the

variables which aid or hinder the encoding of a problem such

that it makes contact with these principles. These

considerations are being explored by Nisbett and colleagues

(e.g., Holland, Holyoak, Nisbett, & Thagard, 1986; Jepson,

Krantz, & Nisbett, 1983; and Nisbett, Krantz, Jepson, &

Kunda, 1983).

The conjunction problem experiments (Tversky &

Kahneman, 1982, 1983) indicate the influences of this

different approach to judgment under uncertainty in that

Kahneman and Tversky characterize the conjunction fallacy as

more likely an error of application than an error of

comprehension, and search for conditions that mitigate the

bias, being concerned with such variables as the effects of

statistical sophistication and rephrasings of the problem on

commission of the error. Kahneman and Tversky are also much

more willing to address criticisms of their view and discuss

alternative explanations, though in each case, they argue

that the outcomes of their manipulations continue to support

the interpretation that the representativeness heuristic is

the dominant strategy used by subjects.

The Linda problem is an English-language equivalent of

four problems tested by Kahneman and Tversky in Israel in

1974 (Tversky & Kahneman, 1982). In these early between-

subjects versions of conjunction problems, a brief

personality sketch was presented that matched the stereotype

of a particular occupation (e.g., cab driver), followed by a

list of five or six target events to be rank ordered. For

half of the subjects, the list contained a simple event of

which the sketch was highly representative (e.g., "is a cab

driver") and a simple event of which the sketch was

highly unrepresentative (e.g., "is a member of the Labor

party"), plus four other items. For the other half of the

subjects the list contained a joint event made up of these

two constituents (e.g., "is a member of the Labor party and

drives a cab"), and the same four remaining items. Half of

each group then ranked the items according to the

probability of the person described being a member of that

class, while the remaining half ranked the items according

to the degree to which the description was representative of

a person belonging to that class. In both cases, the

compound event was ranked higher than the unrepresentative

constituent, which for the probability ranking group is a

violation of the conjunction rule. In fact, the

representativeness ranking and the probability ranking of

each set of targets were almost identical, a finding which

Tversky and Kahneman interpreted as meaning subjects were

assessing probability by means of representativeness.

Tversky and Kahneman also developed within-subjects

versions of conjunction problems consisting of eight target

statements. The following is the within-subjects version of

the Linda problem as it was originally introduced (Tversky

and Kahneman, 1982):

Linda is 31 years old, single, outspoken, and
very bright. She majored in philosophy. As a
student, she was deeply concerned with issues of
discrimination and social justice, and also
participated in antinuclear demonstrations.

Please rank the following statements by their
probability, using 1 for the most probable and 8
for the least probable.

(5.2) Linda is a teacher in elementary school.
(3.3) Linda works in a bookstore and takes Yoga classes.
(2.1) Linda is active in the feminist movement.
(3.1) Linda is a psychiatric social worker.
(5.4) Linda is a member of the League of Women Voters.
(6.2) Linda is a bank teller. (B)
(6.4) Linda is an insurance salesperson.
(4.1) Linda is a bank teller and is active in the
feminist movement. (A+B)

The numbers before the target statements are the mean

ranks as they were assigned by 173 subjects. Note that the

statement "Linda is a bank teller and is active in the

feminist movement" is ranked as more probable than the

statement "Linda is a bank teller". This conjunction

fallacy of ranking a compound target above the less

representative simple target was exhibited by 89% of the

subjects. Rankings were similar for a second group asked to

rank order each statement by its representativeness.

Another version of the Linda problem with only the two

targets B and A+B was tested (Tversky & Kahneman, 1982).

Each target served as a response option and subjects were

instructed to choose the option that was more probable:

Linda is 31 years old, single, outspoken, and
very bright. She majored in philosophy. As a
student, she was deeply concerned with issues of
discrimination and social justice, and also
participated in antinuclear demonstrations.

Which of the following is more probable? (Check one)

a) Linda is a bank teller
b) Linda is a bank teller and is active in
the feminist movement.

Tversky and Kahneman had hypothesized that this would

make the logical relationship between the targets more

transparent. However, the conjunction fallacy was not

reduced, with 87% of subjects tested (n=86) selecting the

compound target as the more probable of the two options.

This was replicated by another group of subjects

(n=147), with 85% of respondents committing the conjunction

fallacy (Tversky & Kahneman, 1983).

In all cases, it was argued that probability judgments

were being driven by representativeness. Another possible

explanation that Tversky and Kahneman did not consider was

that "probability" was being interpreted by their subjects,

not in the statistical sense of relative frequency, which is

the one that the experimenter intended, but in an everyday

sense of "plausibility," or "believability." This more

parsimoniously explains the identical responses by subjects

to the two kinds of instructions to rank by probability and

to rank by representativeness: Subjects were simply giving

the two instructions the same reading! A fuller discussion of

the way that the structure of the problem itself contributes

to this linguistic ambiguity and the implications for all

manipulations of conjunction problems will be delayed until

below, when alternative explanations are considered, but

another important point must be made here. The argument that

the conjunction rule is the appropriate normative model for

these problems can succeed only if a particular framework is

imposed upon the task: Each target

item to be ranked or each response option, depending on the

form of the problem, must be seen from the perspective of

the probability of membership in a category given a

description. Considering the target item or response option

as a hypothesis and the description as data, this yields a

P(Hypothesis/Data) perspective. This perspective is not the

same as one of P(Data/Hypothesis) or the probability of the

description being accurate given membership in a particular

category. A reversal makes each consideration an

independent assessment of representativeness and no longer

yields a framework within which the conjunction rule is applicable.


Now, consider that Tversky and Kahneman (1982) stress

representativeness as a directional relation between a

process and a model or an instance and a model and that it

only makes sense to speak of a sample as being

representative of a population, an act representative of a

person, or an instance representative of a class and not

vice versa. This directionality thus leads naturally to an

assessment of the representativeness of a sample, act, or

instance given the characteristics of the population,

person, or class. When Tversky and Kahneman asked their

subjects to rank the items according to "the degree to which

X [the described person] is representative of that class" or

"the probability that X is a member of that class" (Tversky

& Kahneman, 1982, p. 90), they imposed a directionality on

the task. Both instructions impose the framework of

assessing probability or representativeness of the

description given the class of the target item. Again,

considering the description as data and the target item as

hypothesis, this yields a P(Data/Hypothesis) perspective,

the reversal of the P(Hypothesis/Data) perspective necessary

for the conjunction rule to be normatively applicable. So

we see that as constructed these problems are flawed and


Tversky and Kahneman were perhaps aware of this

directionality cue and its invalidation of the conjunction

problems presented in 1974, because in the Linda problem

(Tversky & Kahneman, 1982) and all subsequent conjunction

problems they constructed, such instructions were deleted;

subjects were subsequently asked to "rank order the

statements by their probability," from most probable to

least probable. This makes the misdirection of the problem

implicit rather than explicit, but does not correct the

basic difficulty.
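
The difference between the P(Hypothesis/Data) and P(Data/Hypothesis) perspectives can be made concrete with counts. The following is a minimal sketch in Python; every number in it is a hypothetical assumption chosen for illustration, not a figure from Tversky and Kahneman or from any other study. It shows that the compound class can be the one the description fits best, in the P(D/H) sense, while still being the less probable class given the description, in the P(H/D) sense.

```python
from fractions import Fraction

# Hypothetical counts (assumptions for illustration only, not study data).
# Each class is recorded as (members of the class, members who fit description D).
TELLER_NOT_FEMINIST = (900, 9)
TELLER_AND_FEMINIST = (100, 30)
TOTAL_FITTING_D = 2000  # everyone in the wider population who fits D

# The simple class "bank teller" includes both kinds of teller.
teller_members = TELLER_NOT_FEMINIST[0] + TELLER_AND_FEMINIST[0]
teller_fitting = TELLER_NOT_FEMINIST[1] + TELLER_AND_FEMINIST[1]

# P(D/H): how typical the description is of each class.
p_d_given_teller = Fraction(teller_fitting, teller_members)
p_d_given_joint = Fraction(TELLER_AND_FEMINIST[1], TELLER_AND_FEMINIST[0])

# P(H/D): how probable class membership is, given the description.
p_teller_given_d = Fraction(teller_fitting, TOTAL_FITTING_D)
p_joint_given_d = Fraction(TELLER_AND_FEMINIST[1], TOTAL_FITTING_D)

compound_more_typical = p_d_given_joint > p_d_given_teller
compound_more_probable = p_joint_given_d > p_teller_given_d
```

Under these assumed counts, compound_more_typical is true while compound_more_probable is false. The conjunction rule constrains only the second comparison, so a subject who answers from the P(D/H) perspective violates no rule of probability by ranking the compound class higher.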

Several other difficulties in the structure of the

Linda problem have been suggested by a number of different

authors, each serving as an alternative explanation for the

pattern of responses usually seen. These alternative

responses to the heuristics approach are now considered.

Survey of Alternative Explanations

There are several alternative explanations as to why

the conjunction fallacy occurs. Though some of the

explanations can be seen as distinct and separate from each

other, others can be seen as being related. Most

alternative explanations for the Linda problem do not

question the normative status of the conjunction rule or

attempt to support some other statistical principle as being

more normatively appropriate to the problem, as Cohen (1979,

1980, 1981) has done in the case of the base rate and sample

size problems used by Kahneman and Tversky. Instead, most

explanations are concerned with subjects' possible

misinterpretations of elements of the problem or a

misrepresentation of the problem in its entirety. These

concerns are congruent with an acceptance of the perceptual metaphor.


One alternative to the use of the representativeness

heuristic in explaining the responses to the Linda problem

and other conjunction problems has been called the

"linguistic confusion" hypothesis (Tversky and Kahneman,

1982; 1983; Wells, 1985). However, since linguistic

confusion could refer to any of several aspects of the Linda

problem, I will rename this hypothesis the "mutually

exclusive or" hypothesis. According to this argument, the

meaning of one of the response options is being

misinterpreted. The response options could be read within a

framework of "which is more probable, this option or this

one", with an implied "or" connecting the statements.

Within conventional language usage, "or's" are often

mutually exclusive (Margolis, 1987). For instance, the

phrase "are you having pie or pie a la mode?" implies pie

without ice cream versus pie with ice cream as your choice.

Therefore, reading the statements in the Linda problem as

"Which is more likely, that Linda is a bank teller, or that

Linda is a bank teller active in the feminist movement,"

subjects are prone to read the first choice as "Linda is a

bank teller and not active in the feminist movement." This

destroys the inclusive relationship of the two response

options and turns the latter, most often chosen statement,

"Linda is a bank teller and active in the feminist

movement", into an arguably appropriate response.
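
The effect of the mutually exclusive reading can be illustrated numerically. The following is a minimal sketch in Python, assuming hypothetical counts among 1,000 women who fit Linda's description; the figures are invented for illustration only.

```python
from fractions import Fraction

# Hypothetical counts (assumptions for illustration only, not study data).
POPULATION = 1000
TELLER_AND_FEMINIST = 15
TELLER_NOT_FEMINIST = 5

# Inclusive reading: "bank teller" covers both kinds of teller.
p_teller_inclusive = Fraction(TELLER_AND_FEMINIST + TELLER_NOT_FEMINIST, POPULATION)

# Exclusive reading: "bank teller" is taken as "bank teller and not feminist".
p_teller_exclusive = Fraction(TELLER_NOT_FEMINIST, POPULATION)

p_joint = Fraction(TELLER_AND_FEMINIST, POPULATION)

inclusive_favors_simple = p_teller_inclusive >= p_joint
exclusive_favors_compound = p_joint > p_teller_exclusive
```

Under the inclusive reading the simple target must be at least as probable as the compound target, whatever the counts. Under the exclusive reading the two targets are disjoint classes, and which one is more probable depends entirely on the counts, so choosing the compound target is no longer a logical error.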

A second postulated linguistic confusion (Paulos, 1988)

is that within the context of being asked if Linda is a bank

teller, subjects take the statement of Linda as a bank

teller and feminist as possible additional information.

Instead of reading the "and" as indicative of a conjunction,

subjects read the statement as "Given that Linda is a bank

teller, what are the chances of her also being a feminist."

Such a reading would also justify attaching a higher

probability to this situation than to Linda being strictly a

bank teller. Subjects are in effect preferring a

conditional probability interpretation to a joint

probability one. (Evidence for this will be discussed in

the next section, which deals with the empirical results of

experimental manipulation.)
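
The gap between the joint and the conditional readings can also be shown with counts. The following is a minimal sketch in Python, again assuming hypothetical figures for 1,000 women who fit the description; none of the numbers come from the studies discussed.

```python
from fractions import Fraction

# Hypothetical counts (assumptions for illustration only, not study data).
POPULATION = 1000
BANK_TELLERS = 20
TELLER_AND_FEMINIST = 15

p_teller = Fraction(BANK_TELLERS, POPULATION)
p_joint = Fraction(TELLER_AND_FEMINIST, POPULATION)

# Conditional reading: "given that Linda is a bank teller, what are the
# chances of her also being a feminist?"
p_feminist_given_teller = p_joint / p_teller

joint_reading_exceeds_teller = p_joint > p_teller
conditional_reading_exceeds_teller = p_feminist_given_teller > p_teller
```

With these assumed counts the joint probability is 15 in 1,000 and cannot exceed the probability of being a bank teller, whereas the conditional probability is 15 in 20 and exceeds it easily. A subject who adopts the conditional reading can therefore rank the compound statement higher without contradicting the conjunction rule.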

Another alternative to representativeness is the

"sample-space" hypothesis (Markus & Zajonc, 1984; Morier &

Borgida, 1984; Wells, 1985). This explanation assumes that

there is some ambiguity to the way that the problem is

stated that steers the subject away from forming the proper

sample space for dealing with the response options.

Normative responding requires the subject to form sample

spaces which are at hierarchically different levels, with

one category representation subsuming another. Most

subjects may spontaneously be attempting to resolve all

sample spaces at the same hierarchical level. The

relationship between the "sample-space" hypothesis and the

"mutually exclusive or" hypothesis is that the

linguistically ambiguous structure of response alternatives

in the Linda problem invites improper interpretations of

sample space (Markus & Zajonc, 1984), so that the direct

cause of the conjunction error is in the formation of an

inappropriate sample space, but this is in turn implicitly

generated by the misinterpretation of the linguistically

ambiguous individual response options (Wells, 1985).

Another explanation is that the overall structure of

the problem as well as the practical tendency of the subject

leads to an assessment of P(D/H) when the assessment

normatively called for is P(H/D), as previously discussed.

Given such an interpretation, again we see that responses

are normatively appropriate. The misinterpretations of the

problem can now be seen to be operating at two different

levels. At one level are the various misinterpretations of

the target statements that can detract from formation of an

appropriate sample space. At a higher level is the

ambiguity in the overall structure of the entire problem. I

will refer to this ambiguity of the entire problem as Level

I ambiguity, and the ambiguities at the level of the target

statements (or response options) as Level II ambiguities.

Several alternative explanations are subsumed under

Margolis's (1987) discussion of the problem. Margolis

(1987) explains the responses on the Linda problem as

resulting from an interaction of what he calls "semantic

ambiguity" and "scenario ambiguity." Semantic ambiguity is

the susceptibility of a word or phrase to multiple

meaningful interpretations. Scenario effects constitute the

larger context within which the problem is seen and which

guides resolution of semantic ambiguities. Scenario effects

themselves can be ambiguous.

According to Margolis, the Linda problem contains both

semantic ambiguities and ambiguous (or downright misleading)

scenario effects. It is the combination of these effects

that leads to the proportion of incorrect responses usually

seen. The scenario effects spring from alternative possible

contexts. On the one hand, the problem can be seen from a

viewpoint of relative frequency assessment, and this is what

the authors of the problem intended. (Actually, it is my

contention that the authors constructed their task in such a

way as to purposefully mislead. Consider: "Our problems, of

course, were constructed to elicit conjunction errors, and

they do not provide an unbiased estimate of the prevalence

of these errors" [Tversky & Kahneman, 1983, p. 311]).

On the other hand, the problem can be seen within the

context of the assessment of the believability or

plausibility of a situation. Unfortunately, the word

"probable" that Tversky and Kahneman chose for their

instructions encompasses both meanings, as does its synonym,

"likely." Here, the single word "probable" has semantic

ambiguity, which is resolved according to the way in which

the scenario ambiguity is resolved. This is what Margolis

means by the interaction of scenario and semantic

ambiguities. I will later discuss how the overall structure

of the problem overwhelmingly leads unalerted readers,

even statistically sophisticated ones, to a "plausibility"

reading of "probable."

According to Margolis (1987), subjects' answers make

sense if it is taken that they are interpreting the problem

within this "believability" framework, if they are also

influenced by two other semantic ambiguities. One of these,

says Margolis, is that the two statements about Linda are

interpreted as implying a mutually exclusive relationship,

as has already been discussed. The implied "or" in "Linda

is a bank teller or Linda is a bank teller active in the

feminist movement," implies an interpretation of "Linda is a

bank teller" as "Linda is a bank teller and not active in

the feminist movement."

Furthermore, even without these other difficulties in

interpretation, the small difference in frequency between

the classes of the two target items is negligible in

practical terms. With such a description, it is highly

unlikely that Linda would be a bank teller, and even less

likely that she is also a feminist. But both probabilities

are vanishingly small. Such a difference is for all

important practical purposes unimportant and is easily lost

in the larger contexts and ambiguities that the problem

presents. It is, unfortunately, this difference that

subjects are being asked to assess.

Margolis's scenario ambiguities I consider to be

equivalent to my Level I ambiguities, and his concern with

misinterpretations of mutual exclusivity in the target

statements I consider to be Level II ambiguities.

Margolis (1987) is the first author to contend that

performance on the Linda problem is the result of the

convergence and interaction of multiple ambiguities or

misleading influences at different levels, and that

debiasing of the Linda problem will require manipulations

along each of the three lines he discusses. Previous

authors seem to have taken the view that only one approach

to debiasing is required, and that only one alternative

explanation for the conjunction fallacy should be considered

at a time.

It appears, however, that this small problem presents

complications beyond, but related to, the three that

Margolis discusses, though I am in complete agreement that

compounded effects are what make the problem so resistant to

facilitative manipulations. The difficulties I would add

are those already discussed: (a) that there is a confusion

between the conditional probability of the description given

the hypothesis of membership in a target class, or P(D/H),

and the conditional probability of the hypothesis of

membership in a target class given the description, or

P(H/D), which is what is supposedly being asked for; (b)

that there is confusion between the joint probability of

Linda being a bank teller and a feminist, or P(B&F), which

is how the authors mean the phrase to be taken, and the

conditional probability of Linda being a bank teller given

that she is a feminist, or P(B/F), or its converse, P(F/B)

(Paulos, 1988); and, (c) that the normative approach that

Tversky and Kahneman prescribe for this problem makes the

descriptive and class information provided completely

irrelevant to the solution of the problem, in complete

violation of the Gricean conversational cooperativeness

principle (Grice, 1975).

This third problem can be understood more clearly by

briefly considering the field of pragmatics. Pragmatics may

be considered a subarea of linguistics, but for our purposes

it might be defined as the psychology of utterance

interpretation, or of how disambiguation of utterance

meaning is achieved, how reference is assigned, how

implicatures (inferences intended by the speaker) are

arrived at, etc. (Sperber and Wilson, 1981). Much of

pragmatic theory hinges on the social cooperativeness

achieved by two speakers. To the extent that what passes

between experimenter and subject can be conceived of as a

conversational exchange, theories of pragmatics are


Perhaps most germane to the Linda problem is Grice's

(1975) Cooperative Principle (CP). Under the CP, speakers

will strive to be relevant and informative. According to

Adler (1984), the very structure of the Linda problem is

uncooperative in the Gricean sense, because there is nothing

about the description of Linda offered or the particular

classes to which we are asked to consider she might belong

that is differentially relevant to the judgment we are being

asked to make. In other words, any description of Linda,

and any category or conjunction of categories offered would

make absolutely no difference in arriving at the normative

response. In normal exchange, however, what people offer

each other is differentially relevant. We are led to

assume, by every experience we have ever had, that what is

being offered about Linda and these categories is somehow

relevant to the judgment we are required to make. This is

the main reason why subjects misinterpret the problem and

why nearly all unalerted subjects will more plausibly read

"probable" as "believable," because it is only this

interpretation that makes the description of Linda offered

differentially relevant.

It is here that a basic difference between the

statistical metaphor and the perceptual metaphor becomes

more apparent. Kahneman and Tversky say we ought to solve

the Linda problem by applying the conjunction rule, ignoring

the context and content of the problem. Their explanation

of why we don't do this is that we apply a heuristic that

makes an abstract assessment of similarity between a

description and a category and in so doing also ignore the

content and context of the problem. Both their prescription

and their explanation are based on an acceptance of the

position that reasoning is driven by formal rules. The

perceptual metaphor stresses the actual content of any

situation and how a problem solver attempts to wrest

meaning from that situation, stressing that while resolution

of ambiguities often appears to be a rule-like behavior, the

resolution is always bound to the content and context of the


The tendency is for subjects to interpret the

particular content of the Linda problem according to the

socially acceptable rules of conversational exchange. It is

this overall pragmatic difficulty in the problem that I

believe would qualify as a Margolian scenario ambiguity and

that I categorize as a Level I ambiguity. While Margolis

(1987) recognizes the importance of pragmatics and sees them

as contributing to the general class of scenario effects, he

does not see the two as being synonymous. However, it is my

contention that the "believability" misinterpretation is

fostered by the very structure and nature of the problem

itself and such an inherent fault may not be correctable

without substantially altering the task itself, making it no

longer a test between representativeness and other

judgmental strategies. There is a strong link between all

such problems of representativeness and pragmatics. Where

Kahneman and Tversky see the ubiquitous and insidious

conjunction fallacy, affecting judges, physicians,

journalists, psychologists, etc., and unmitigated even by

prolonged statistical training, I see only the ordinary

workings of pragmatics. The pragmatic "pull" of the Linda

problem appears to be stronger than that of most of the

other conjunction problems that Tversky and Kahneman have

constructed, and so we would expect the Linda problem to be

more intransigent when it comes to facilitating

manipulations than the other problems, and in fact, this

appears to be the case (e.g., Fiedler, 1988; Tversky &

Kahneman, 1983). Still, by understanding the conjunction of

effects that make the Linda problem so difficult given the

pragmatic framework, it should be possible to get

significant reductions in "error" by making the appropriate

manipulations and to be able to see why those manipulations

that have worked do so. Margolis (1987) has suggested

various changes in the Linda problem which would directly

address the scenario and semantic ambiguities he has

pinpointed. A consideration of these proposed manipulations

served as the rationale for the design of Experiment 1.

However, before discussing any new attempts at facilitation,

the results of some previous manipulations will be


Survey of Manipulations and Their Results

Tversky and Kahneman (1982) originally gave the eight-

statement version of the Linda problem to three groups

differing in statistical sophistication, and found that more

than 80% of subjects in each group, regardless of

statistical sophistication, exhibited the conjunction

fallacy. This included a group of graduate students in a

decision science program who had taken a number of advanced

probability and statistics courses!

Since these experiments, a number of researchers have

examined the Linda problem, and several other conjunction

problems introduced subsequently by Tversky and Kahneman

(1983), all of which are easily replicated in their original

form (e.g., Fiedler, 1988; Macdonald & Gilhooly, 1990;

Morier & Borgida, 1984; Wolford, Taylor, & Beck, 1990).

Tversky and Kahneman (1983) themselves were among the

first authors to further investigate the conjunction

fallacy. After moving from the eight-statement version to

one with two response options, they considered whether the

relationship between the compound target and simple target

in the two statement version might be interpreted as that of

an implied mutually exclusive "or", so that "Linda is a bank

teller" was being read as "Linda is a bank teller and not a

feminist". Such an interpretation would of course be

appropriately considered less likely than that "Linda is a

bank teller and a feminist" and turn the 85% incorrect

responses into correct ones. To test for this, Tversky and

Kahneman replaced the possibly misinterpreted statement with

"Linda is a bank teller whether or not she is active in the

feminist movement", which the researchers affirmed to

"emphasize the inclusion of T&F [teller and feminist] in T

[teller]" (p. 299). One could argue whether their

rephrasing was, in fact, the best way such an inclusion

could have been emphasized, or whether it, in fact, clearly

resolved the ambiguity for most subjects. But still, even

such an attempt lowered the percentage of subjects

committing the conjunction fallacy from 85% to 57%! Even

with such an improvement, Tversky and Kahneman report that

they were still surprised that their subjects could so

blatantly violate what was now, to them, perfectly clearly

an extensional situation. That conjunction errors still

occurred can be explained as being due to the still

ambiguous wording of the supposedly clarified statement, and

the still present scenario effects in which the

differentially relevant description and the word "probable"

created a plausibility context.

Moving on to their next modification, Tversky and

Kahneman included the following in the task instructions for

the two-statement version: "If you could win $10 by betting

on an event, which of the following would you choose to bet

on?" (p. 300).

Here again, violations of the conjunction rule were

driven down by more than 30% to 56%, which Tversky and

Kahneman still considered "much too high for comfort" (p.

300). Conjecturing that the betting context drew attention

to conditions under which such a bet would pay off, Tversky

and Kahneman moved on to other manipulations. One wonders

why Tversky and Kahneman acted as if there could be only one

possible alternative explanation for incorrect responding,

committing a "fallacy of monocausality" in the process, and

did not combine the two manipulations of target statement

clarification and betting scenario. Such a combination, of

course, is just what Margolis (1987) suggests is necessary

to correct the confounding influences in the problem.

Next, Tversky and Kahneman gave the eight-statement

version of the Linda problem to social science graduate

students with several statistics courses under their belts

in a rating scale version, where target items are rated as

to likelihood rather than rank ordered. Here, unlike their

earlier (Tversky & Kahneman, 1982) results, only 36% of the

statistically educated graduate students committed the

fallacy. Such a result is much more in keeping with Jepson,

Krantz, and Nisbett's (1983) idea that statistical reasoning

is a skill like piano-playing, or chess, that can be

learned, and not a competence, as Tversky and Kahneman seem

to see it, that is uniformly lacking from all subjects. It

is possible, however, that facilitation occurred because

subjects had to rate each target item individually as to its

probability rather than rank order all items. This will be

discussed further when considering similar results from

another (Morier & Borgida, 1984) study that directly

addresses the issue.

In the rest of their 1983 paper, Tversky and Kahneman

go on to show how the conjunction fallacy is committed by

practicing physicians on problems of a medical nature, is

encountered in prediction problems that involve conjunctions

as well as problems of gaming probabilities, causal

situations, motives and crimes, and forecasts and scenarios.

In a section of the paper on extensional cues, Tversky

and Kahneman (1983) admit that although people have an

"affinity for nonextensional reasoning, it is nonetheless

obvious that people can understand and apply the extension

rule" (p. 308). Here, they clearly part company from their

earlier positions, when they were wont to say that subjects

lacked such statistical rules.

Tversky and Kahneman (1983) report several other

manipulations that reduce the conjunction fallacy. What

Tversky and Kahneman call "a seemingly inconsequential

change" (p. 309) of having subjects assess percentages of

people belonging to each of the simple targets separately

before assessing the relative frequency of the compound

target brought the rate of the conjunction error down to

31%. Estimating how many out of 100 patients fit in each of

the categories of the two constituent and a compound target

brought down the incidence of error to 11%! I would suggest

that there are two influences occurring here, one at Level I

and one at Level II. These manipulations work at Level I by

making a frequency interpretation unavoidable, and also by

making the P(D/H) versus P(H/D) confusion less likely to

occur, since estimating frequencies of a class forces one to

assume a P(H/D) perspective. The manipulations work at

Level II by indicating the appropriate sample space. This

is accomplished by means of explicitly stating the two

constituents prior to the compound event and thus making the

mutual exclusivity error less likely to occur. Though what

Tversky and Kahneman termed nonextensional reasoning

prevailed to 11% even in these very transparent problems,

this is nonetheless a 74% reduction in error from base rate!

Though Tversky and Kahneman (1983) report these

reductions in error, they lack a framework within which to

interpret them. They fall back on saying that "[it] appears

that extensional considerations are readily brought to mind

by seemingly inconsequential cues" (p. 309). They go on to

note the contrast between extensional cues being effective

in certain of the conjunction problems and the relative

inefficacy of extensional cues in the Linda problem.

Without asking why it might be so, they conclude that the

contrast in effectiveness of cues exists because some

conjunction problems are concerned with classes and others

with properties and that "although classes and properties

are equivalent from a logical standpoint, ... [this

equivalence] is apparently not programmed into the lay mind"

(p. 309).

The overall mood of the paper is a very condemning one,

with nonextensional reasoning made out to be a great danger,

human reasoning being a battleground between appropriate

statistical and logical rules and "seductive nonextensional

intuition" (p. 314).

Subsequent research by others has shown the

conjunction fallacy to be very responsive to a

number of task specific variables, to the degree that

Tversky and Kahneman's contention that a general inferential

judgmental heuristic is being applied seems very unlikely.

Locksley and Stangor (1984) found that the conjunction error

was reduced somewhat by formal statistical training and

knowledge of the conjunction rule, but that performance was

more strongly affected by task-specific cues. Locksley and

Stangor imputed conjunction problem results to causal

reasoning on the part of the subject, which has been shown

to occur with great spontaneity and facility (e.g.,

Michotte, 1963; Schustack, 1988). The authors reasoned that

while common events may more often be reasoned about with a

multiply sufficient causal schema (i.e., any one of several

factors being sufficient), rare events would more likely cue

a multiply necessary causal schema (i.e., several factors

operating in concert being necessary). They tested this

prediction with two problems similar in structure to the

original Linda problem. It was stated in one problem that

Bob is married, and in the other that John has committed

suicide. Each statement was then followed by a number of

target items, some of which were conjunctions. It was found

that, in fact, in the problem with the rarer outcome of

suicide, 72% of subjects committed the conjunction error, as

compared to 29% for the common outcome problem. This may

have some bearing on the Linda problem, because as Margolis

(1987) states, given her description, Linda is so unlikely

to be a bank teller that either option of bank teller, or

bank teller and feminist, would be an unusual thing.

Using the eight statement version of the Linda problem,

Morier and Borgida (1984) found that when subjects directly

estimated the probability of occurrence of each target

statement, performance was improved compared to performance

when subjects rank ordered the statements. The conjunction

error was reduced by about 15%, from 95% to 80%. The

experimenters had predicted that some small reduction would

result from ties being possible between the compound target

and the two constituent elements. And, in fact, 8 of the 60

subjects in the probability estimation condition assigned

equivalent probabilities to the compound event and

constituent event, which is allowable within the parameters

of the conjunction rule. A second, additional explanation

for the small improvement is that a portion of the

conjunction error is due to the previously mentioned

possibility of confusion between different conditional

probabilities, i.e., between P(event/description) and

P(description/event). Such a confusion is understandable

and though still possible under the probability estimation

condition, may be somewhat ameliorated.
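
The difference between the two conditional probabilities can be seen in a small frequency count. The following is a minimal sketch in Python. Every count in it is hypothetical and chosen only for illustration; none comes from Morier and Borgida (1984) or from any other study discussed here.

```python
# All counts are HYPOTHETICAL, chosen only to illustrate the distinction.
population = 10_000        # imagined reference population
fits_description = 200     # people whose profile resembles the description
bank_tellers = 100         # people who are bank tellers
tellers_who_fit = 1        # bank tellers whose profile resembles the description

# P(description/event): how typical the description is of bank tellers.
p_description_given_teller = tellers_who_fit / bank_tellers

# P(event/description): how likely a person fitting the description is a teller.
p_teller_given_description = tellers_who_fit / fits_description

# Bayes' rule links the two through the base rates of each category.
p_teller = bank_tellers / population
p_description = fits_description / population
via_bayes = p_description_given_teller * p_teller / p_description

print(f"P(description/event) = {p_description_given_teller:.3f}")
print(f"P(event/description) = {p_teller_given_description:.3f}")
print(f"P(event/description) via Bayes' rule = {via_bayes:.3f}")
```

With these made-up counts the two quantities differ by a factor of two, so a subject who supplies one when asked for the other gives a different answer even if the estimate itself is accurate.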

Including the statement "Linda is a bank teller who is

not a feminist" along with the statement "Linda is a bank

teller" reduced the bias from baseline by about 18% (Morier

and Borgida, 1984). More interestingly, bias was reduced by

around 47% by a four-statement version that included the

option "Linda is a bank teller or is active in the feminist

movement". Such a manipulation was thought to cue more

clearly the logical structure of the task. Of course, if

one wanted to fully follow up on this course, the approach

would be to offer explicit statements about every possible

state of class membership of Linda. Such an exhaustive list

of alternatives would be as follows:

1. Linda is a bank teller.

2. Linda is a bank teller and not an active feminist.

3. Linda is both a bank teller and an active feminist.

4. Linda is an active feminist.

5. Linda is both an active feminist and a bank teller.

6. Linda is an active feminist and not a bank teller.

7. Linda is neither bank teller nor feminist.

8. Linda is either a bank teller or a feminist.

Because such a list of statements is exhaustive, the

problem space should be very clearly cued and very clearly

delimited, and linguistic errors involving mutually

exclusive "or's" should be kept to a minimum.
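
The logical structure that such a list is meant to cue can be sketched by treating each statement as a set of mutually exclusive states. The code below is an illustration only: the state probabilities are hypothetical, and the "either ... or" statement is assumed to be read inclusively.

```python
from itertools import product

# A state is (is_bank_teller, is_feminist). The four states are mutually
# exclusive and exhaustive, so every statement is a set of states.
STATES = set(product([True, False], repeat=2))

STATEMENTS = {
    "teller": {s for s in STATES if s[0]},
    "teller and not feminist": {s for s in STATES if s[0] and not s[1]},
    "teller and feminist": {s for s in STATES if s[0] and s[1]},
    "feminist": {s for s in STATES if s[1]},
    "feminist and not teller": {s for s in STATES if s[1] and not s[0]},
    "neither": {s for s in STATES if not s[0] and not s[1]},
    # ASSUMPTION: "either ... or" is read inclusively here.
    "teller or feminist": {s for s in STATES if s[0] or s[1]},
}

# HYPOTHETICAL probabilities for the four states (they sum to 1).
WEIGHTS = {
    (True, True): 0.04,
    (True, False): 0.01,
    (False, True): 0.60,
    (False, False): 0.35,
}

def probability(name):
    """Add up the weights of the states in which the statement is true."""
    return sum(WEIGHTS[s] for s in STATEMENTS[name])

# The conjunction is a subset of its constituent, so whatever weights are
# chosen it can never be the more probable of the two.
assert STATEMENTS["teller and feminist"] <= STATEMENTS["teller"]

for name in STATEMENTS:
    print(f"{name:26s} {probability(name):.2f}")
```

Under these illustrative weights the conjunction is less probable than "teller" but more probable than "teller and not feminist". A subject who reads "Linda is a bank teller" in the mutually exclusive sense could therefore prefer the conjunction without violating the conjunction rule.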

Crandall and Greenfield (1986) found that linguistic

training in the explicit and implicit roles of the

conjunctive "and" did not improve performance while

probability training did. However, the probability training

provided by Crandall and Greenfield may have served less to

educate subjects in the conjunction rule--which most

subjects will already endorse in the abstract form--than to

provide them with examples of encoding the situation of the

Linda problem in such a way that it makes mappable contact

with such abstract rules (see Nisbett et al., 1983, for a

discussion of when statistical rules will be cued by a

problem). In addition, the linguistic training of the

meanings of the conjunctive "and" may not have directly

addressed the mutually exclusive "or" problem.

Nahinsky, Ash, and Cohen (1986) allow that biases in

probabilistic reasoning result in part from the use of

judgmental heuristics, but they also maintain that some

errors are due to incorrectly understood concepts of

probability as well as difficulties in information

processing. They presented subjects with some problems for

which a relative frequency interpretation was explicit and

found that nonetheless there were errors due to a confusion

between joint and conditional probabilities. They suggest

the conjunction fallacy may appear as a by-product of this

confusion. Such findings should alert us to the fact that

we should be vigilant against over-simplified approaches to

the Linda problem. Such a phenomenon as the conjunction

error may not be done justice by seeing it simply through a

performance/competence distinction framework (Evans, 1988),

or an error of comprehension/error of application framework

(Kahneman & Tversky, 1982).

Recently, Fiedler (1988) has reported significant

reductions of anywhere from 40 to 60% in the amount of

conjunction errors across a range of conjunction problems,

including the Linda problem, merely by making a single

change in the task instructions. This change was from the

instruction to rank order a list of eight items according to

their probability, as in the original eight-statement

version of the Linda problem, to an instruction to estimate

to how many out of 100 people each of the statements would

apply, as was suggested originally by Tversky

and Kahneman (1983). The author concluded that the

conjunction fallacy turns for the most part on the semantic

ambiguity of the word "probability" in the original problem,

which might be interpreted as meaning "typicality",

"subjective certainty", or "expectedness."

Markus and Zajonc's (1988) results led them to consider

that at least part of the conjunction fallacy is due to

inability of the subjects to form a proper sample space, and

that the inability to form the correct sample space was in

turn caused by a misunderstanding of the target statements.

All of these manipulations indicate a significant

misinterpretational component to responses to the Linda

problem. What is missing from all of these experiments,

however, is an approach that combines manipulations that

address different ambiguities in a single problem version as

Margolis (1987) has suggested. What is also missing from

these studies is any clear-cut evidence that the

manipulations which facilitate correct responding do so by

creating a different interpretation than that which would

exist without the manipulation. That a certain manipulation

causes a different pattern of responding is not direct

evidence that the different pattern of responding is being

mediated by a different problem interpretation.

Four experiments were designed to address each of these

two concerns. Experiment 1 was designed to investigate

simultaneous manipulations of different aspects of the Linda

problem as suggested by Margolis (1987). Experiments 2 and

3 were follow-up experiments to Experiment 1. Experiment 4

was designed to provide direct evidence that one particular

facilitating manipulation found to be robust and replicable

in Experiments 1 and 3 was accompanied by changes in an

interpretation of one of the response options. An overview

of these four experiments is provided in the next section.

Overview of Experiments and Their Rationale

Like Margolis and many other experimenters, I agree

that subjects are misunderstanding the question posed by the

Linda problem and that the main difficulty is that subjects

think they are being asked to make some sort of a typicality

judgment instead of a judgment of relative frequency. I

also agree with Margolis's two semantic ambiguities, but

feel that there are also others, as I have already

mentioned. Even with clarification of the relative

frequency context, there will still be some ambiguity

resulting from other parts of the problem.

Let me set forth explicitly my conceptualization of the

sources of difficulty in the Linda problem: Level I

difficulties consist of the confusion between assessments of

P(H/D) and P(D/H), and also between a typicality assessment

and an extensional assessment. Level II difficulties

consist of the mutual exclusivity problem of the response

options and the conditional versus joint probability

interpretation of the second option. If a Level I typicality

interpretation is made, then clarifications at Level II will

have little effect, unless they are successful in reversing

the Level I interpretation. If a Level I interpretation of

P(H/D) or extensionality is made, then the subject might

still check the incorrect response because of Level II

difficulties. Therefore, there should be an interaction

between Level I and Level II manipulations, just as there

should be an interaction between Margolis's scenario and

semantic manipulations. Because Level I difficulties are

inherent in the problem as constructed, they will be very

difficult to correct without changing the problem in some

fundamental way. Because the efficacy of changes at Level

II relies on reaching the appropriate interpretation at Level

I, ambiguities at Level II will also be difficult to

resolve, though they should have some effect. Therefore,

one might expect only moderate levels of facilitation as a

result of any manipulation that does not change the basic

structure of the problem.
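
Why a two-level account implies an interaction rather than additive effects can be illustrated with a toy model in which a correct response requires an extensional reading at Level I and, given that reading, a resolved Level II ambiguity. The sketch below is hypothetical throughout: the function names and all four probabilities are invented for illustration and are not estimates from any data.

```python
def p_correct(level1_cued, level2_cued,
              p_level1=(0.3, 0.6), p_level2=(0.4, 0.8)):
    """Probability of a correct response under a two-stage gate.

    p_level1: HYPOTHETICAL chance of an extensional Level I reading,
              given as (without cue, with cue).
    p_level2: HYPOTHETICAL chance of resolving the Level II ambiguity once
              the extensional reading is made, given as (without cue, with cue).
    """
    p1 = p_level1[1 if level1_cued else 0]
    p2 = p_level2[1 if level2_cued else 0]
    return p1 * p2

def level2_gain(level1_cued):
    """Improvement produced by a Level II cue, with or without a Level I cue."""
    return p_correct(level1_cued, True) - p_correct(level1_cued, False)

print(f"Level II gain without Level I cue: {level2_gain(False):.2f}")
print(f"Level II gain with Level I cue:    {level2_gain(True):.2f}")
```

Because the two stages multiply, the same Level II cue helps more when the Level I reading is more likely to be appropriate, which is the pattern of interaction described above. The gate also caps overall performance at the Level I probability, consistent with the expectation of only moderate facilitation.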

Margolis (1987) identified three ambiguities in the

Linda problem. These were discussed in detail in a previous

section, A Survey of Alternative Explanations. Margolis

suggested changes in the Linda problem to address each of

these three ambiguities. His suggested version of the Linda

problem is as follows:

Linda is 31 years old, bright and outspoken.
As an undergraduate she majored in philosophy and
was active in the environmental and civil rights
movements. A personnel survey showed that of
clerical workers in banks (including tellers)
fewer than 1% have personality profiles that sound
similar to Linda's.

If you stood to win $10 if the statement you
choose turns out to be true (whether or not the
other statement is also true), which choice is
more likely to win you the $10? Circle one:

(a) Linda is a bank teller.
(b) Linda is a bank teller active in the
feminist movement.

The betting scenario was added to cue subjects to an

extensional interpretation. The "whether or not" phrase--

embedded in the betting scenario--was added to address the

mutual exclusivity problem of the response options. The

survey information was added to alert subjects to the fact

that, in the case of either option, values are quite small.

These are the three ambiguities that concerned Margolis and

the three changes he added to address them.

However, it should be noted that the description of

Linda that Margolis uses is somewhat less strongly stated

than that used by Kahneman and Tversky. The betting

scenario is also worded differently.

Margolis's suggestions were used as a framework in

designing Experiments 1 through 3. In order to make the

results of the experiments more interpretable, the standard

Linda description utilized by Kahneman and Tversky was used

throughout. The Kahneman and Tversky betting scenario was

also substituted when used. Because it seemed as if the

"whether or not" phrase of Margolis did not address the

mutually exclusive "or" problem of the response options as

would a phrase added directly to the response option, the

phrase "regardless of whether or not she is also a feminist"

was added to "Linda is a bank teller" as a manipulation.

Experiment 1 was designed to investigate the

relationship between three changes to the Linda problem,

each of which addressed one of the three ambiguities

identified by Margolis (1987). A main effect was found for

the "regardless phrase" along with its interaction with each

of the two other manipulations--the betting scenario and

survey information.

It was considered that the main effect of the

"regardless" phrase might be due to either of two reasons.

One is that the phrase acted to disambiguate the response

option. The other is that the additional length of the phrase

served as an attentional cue. In Experiment 2, length of

both response options was systematically varied and yielded

null results, both for changes in length and for the

"regardless" phrase that worked in Experiment 1. In order

to attempt to replicate the results of the "regardless"

phrase found in Experiment 1 and to investigate whether a

slight inadvertent wording change in Experiment 2--using

"likely" instead of "probable"--had caused the "regardless"

phrase to be ineffectual, Experiment 3 was designed.

Experiment 3 used the "regardless" phrase, the betting

scenario, and "likely" versus "probable" as experimental

factors. Experiment 3 replicated the main effect of the

"regardless" phrase and gave no indication that "likely"

used in place of "probable" made any difference. Experiment

3 did not replicate the interaction of the betting scenario

with the "regardless" phrase that was indicated in

Experiment 1.

In order to be certain that the "regardless" phrase

would replicate again, and also to ascertain whether the

facilitating effects of the phrase were due to differences

in interpretation of the response options, Experiment 4 was

designed. Experiment 4 utilized Euler circles along with

the "regardless" phrase as an experimental factor. The

results indicated differences in interpretation of the

response options as a result of the "regardless" phrase.

The next four chapters describe in detail each of the four

experiments.

EXPERIMENT 1

Margolis (1987) is the first author to have suggested

that subjects' difficulties with the Linda problem are the

result of three simultaneous ambiguities. If this is the

case, then a problem version that contains manipulations

that address all three ambiguities should result in

significant facilitation. The combined effects of the

manipulations could be either additive or interactive. Seen

from the perspective of Level I/Level II ambiguities,

however, there should be an interaction between the

different manipulations. Specifically, if a Level I

ambiguity is not resolved appropriately, then manipulations

at Level II will have no effect.

Experiment 1 was designed to investigate the degree of

facilitation three different wording changes would have on

performance on the Linda problem, and to see whether the

effects that might result would be additive or interactive

in nature.

Three wording changes were independently varied. One

of these--the addition of a betting scenario--was utilized

by Tversky and Kahneman (1983) with a resultant 32%

reduction in the conjunction fallacy. Margolis (1987) also

advocated the use of the betting scenario to cue subjects to

a relative frequency interpretation of probability, rather

than probability in the plausibility sense of the word.

This distinction between a relative frequency interpretation

and a typicality or plausibility interpretation, I consider

to occur at Level I, so the betting scenario addresses a

Level I ambiguity.

The second wording change is the addition of some

survey information which Margolis (1987) says cues subjects

not to overlook the small but relevant differences between

two categories. The effects of survey information have not

been empirically tested before. Survey information, because

it emphasizes the sampling space of the two categories, is

considered a Level II manipulation.

The third wording change is similar, but not identical,

to one utilized by Tversky and Kahneman (1983). Tversky and

Kahneman added "whether or not she is active in the feminist

movement" to the "Linda is a bank teller" response option

and found a 25% reduction in the conjunction error. The

phrase added to response option "a" in Experiment 1 was

"regardless of whether or not she is also active in the

feminist movement," which was considered to be a more

effective clarification. Both phrases serve the purpose of

clarifying the sample space relationship between the two

response options and militating against the implied mutually

exclusive "or."

Margolis also suggested a "whether or not" phrase for

the same purpose, but embedded it in the betting scenario.

It was considered that this might not be as effective as the

phrase attached directly to the response option and so was

not used. The "regardless" phrase, because it was an

attempt to clarify the sample space of the response options,

was considered a Level II manipulation.

It was predicted that the problem version having all

three manipulations would show the greatest degree of

facilitation. Margolis makes no claim as to whether the

manipulations would show additive or interactive effects,

but under a Level I/Level II perspective, it was predicted

that when a Level I manipulation was present, cuing a

relative frequency interpretation, all Level II

manipulations would be effective. When the Level I

manipulation was not present, it was predicted that Level II

manipulations would have little effect.

Method
Subjects. Two hundred introductory psychology students

participated in return for partial satisfaction of the

course's research requirement. Subjects were enlisted by

means of sign-up sheets posted in the psychology department


Materials and design. All problem versions offered the

same description of Linda, which read as follows:

Linda is 31 years old, single, outspoken and
very bright. She majored in philosophy. As a
student, she was deeply concerned with issues of
discrimination and social justice, and also
participated in antinuclear demonstrations.

Each version also offered two response options. Option

"a" always consisted of "Linda is a bank teller", with or

without an additional phrase, to be described below, and

option "b" always consisted solely of "Linda is a bank

teller and is active in the feminist movement".

The design was a three-factor design. Each factor had

two levels consisting of the absence or presence of a

specific wording change.

The change in wording for the Regardless factor was the

addition of a phrase intended to clarify the appropriate

category boundaries of the "Linda is a bank teller" option,

or response option "a". With the "regardless" phrase added,

the complete response option read as follows:

a) Linda is a bank teller, regardless of
whether or not she is also active in the feminist
movement.

The wording change for the Betting factor was the

addition of a betting scenario which was previously utilized

by Tversky and Kahneman (1983). The purpose of the betting

scenario was to attempt to discourage a "plausibility"

interpretation of the word "probable" and instead cue a

statistical interpretation of the problem. The betting

scenario read as follows:

If you could win $10 by betting on an event,
which of the following would you choose to bet on?
(check one)

These instructions were immediately followed by the two

response options. For the 100 subjects who received

versions of the problem in which the betting scenario was

absent, the instructions read as follows:

Which of the following is more probable?
(check one)

The wording change for the Survey factor was the

addition of some survey information, taken verbatim from

Margolis's (1987) suggested version, which advises subjects

that small differences between groups are not to be treated

as negligible. The survey information, when present, was

inserted between the description of Linda and the

instructions and read as follows:

A personnel survey showed that of clerical
workers in banks (including tellers) fewer than 1%
have personality profiles that sound similar to
Linda's.

The eight versions of the Linda problem which resulted

from all possible combinations of the presence and absence

of these three wording changes are presented in Appendix A.
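
The way the three two-level factors combine into eight problem versions can be sketched as follows. The text fragments are those quoted in this chapter; the order of assembly and the punctuation joining the fragments are assumptions, since the exact layouts appear only in Appendix A, and the function and variable names are hypothetical.

```python
from itertools import product

DESCRIPTION = (
    "Linda is 31 years old, single, outspoken and very bright. "
    "She majored in philosophy. As a student, she was deeply concerned "
    "with issues of discrimination and social justice, and also "
    "participated in antinuclear demonstrations."
)
SURVEY = (
    "A personnel survey showed that of clerical workers in banks "
    "(including tellers) fewer than 1% have personality profiles that "
    "sound similar to Linda's."
)
BETTING = (
    "If you could win $10 by betting on an event, which of the following "
    "would you choose to bet on? (check one)"
)
STANDARD = "Which of the following is more probable? (check one)"
OPTION_A = "Linda is a bank teller"
REGARDLESS = (
    ", regardless of whether or not she is also active in the "
    "feminist movement"
)
OPTION_B = "Linda is a bank teller and is active in the feminist movement"

def build_version(regardless, betting, survey):
    """Assemble one problem version (layout is an ASSUMPTION, see Appendix A)."""
    parts = [DESCRIPTION]
    if survey:
        parts.append(SURVEY)          # inserted between description and instructions
    parts.append(BETTING if betting else STANDARD)
    parts.append("a) " + OPTION_A + (REGARDLESS if regardless else "") + ".")
    parts.append("b) " + OPTION_B + ".")
    return "\n\n".join(parts)

# Keys are (regardless, betting, survey) flags: 2 x 2 x 2 = 8 versions.
VERSIONS = {
    flags: build_version(*flags)
    for flags in product([False, True], repeat=3)
}

print(len(VERSIONS), "versions")
print(VERSIONS[(True, True, True)])
```

Note that, as in the design described above, the betting scenario replaces the instruction to choose the more probable option rather than being added to it.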

Procedure. Subjects were administered the problems in

groups of from 5 to 15 in a small classroom. The eight

versions were distributed across subjects such that one

subject received only one version of the problem, and so

that each of the eight cells gradually and uniformly filled

to 25.

The data were analyzed using a three-factor ANOVA.

Probability levels were set at .05 for main effects. Due to

the reduction in power resulting from fewer subjects in the

comparison groups as groups were subdivided, probability

values for interaction effects were set at .10.

Results
The proportions of incorrect responses for each of the

eight versions of the Linda problem are given in Table 1.




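Table 1. Percentages of incorrect responses for the eight versions of the Linda problem

                            Regardless
                        Absent     Present

Survey absent
    Betting absent        88         40
    Betting present       84         68

Survey present
    Betting absent        68         52
    Betting present       68         64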

The only significant main effect was for the Regardless

factor, F(1,192) = 10.50, p < .05. There were 21% fewer

conjunction errors when the "regardless" phrase was present

than when it was absent. However, this factor was involved

in two significant interactions, Regardless X Survey and

Regardless X Betting, F(1,192) = 2.88, p < .10, in both

cases.


Multiple comparison follow-ups of the Regardless X

Survey interaction indicate that when survey information was

absent, the addition of the "regardless" phrase facilitated

a significant 32% reduction in error, t(192) = 3.73, p <

.10. However, when survey information was present, there

was no significant difference in the percentages of

incorrect responses between the conditions of "regardless"

present and "regardless" absent.

Similar multiple comparison follow-ups were performed

on the groups in the Regardless X Betting interaction. When

the betting scenario was absent, the presence of the

"regardless" phrase significantly facilitated correct

responding by 32%, t(192) = 3.49, p < .10. However, when

the betting scenario was present, the presence or absence of

the "regardless" phrase made no significant difference in

the proportion of correct responses.
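
The reported percentages and F ratios can be checked against the cell values of Table 1. The sketch below treats each response as a 0/1 score and applies the standard equal-n formulas for a 2 x 2 x 2 between-subjects ANOVA; whether the original analysis was computed in exactly this way is not stated. It also assumes that the two columns of Table 1 are Regardless absent and present and that the two blocks of rows are Survey absent and present, an assignment that is inferred rather than given.

```python
N_PER_CELL = 25  # subjects per problem version, as stated in the Procedure

# Percent incorrect, keyed by (regardless, survey, betting); 0 = absent, 1 = present.
# ASSUMPTION: this mapping of the Table 1 entries to conditions is inferred.
CELLS = {
    (0, 0, 0): 88, (1, 0, 0): 40,
    (0, 0, 1): 84, (1, 0, 1): 68,
    (0, 1, 0): 68, (1, 1, 0): 52,
    (0, 1, 1): 68, (1, 1, 1): 64,
}
REGARDLESS, SURVEY, BETTING = 0, 1, 2  # positions within each key

def mean_pct(fixed):
    """Mean percent incorrect over cells matching {factor: level}."""
    vals = [pct for cell, pct in CELLS.items()
            if all(cell[f] == level for f, level in fixed.items())]
    return sum(vals) / len(vals)

def ms_error():
    """Pooled within-cell variance of 0/1 scores; error df = 200 - 8 = 192."""
    ss = sum(N_PER_CELL * (p / 100) * (1 - p / 100) for p in CELLS.values())
    return ss / (N_PER_CELL * len(CELLS) - len(CELLS))

def f_ratio(factors):
    """F(1, 192) for the main effect or interaction of the listed factors."""
    contrast = 0.0
    for cell, pct in CELLS.items():
        sign = 1
        for f in factors:
            sign *= 1 if cell[f] == 0 else -1
        contrast += sign * pct / 100
    ss_effect = N_PER_CELL * contrast ** 2 / len(CELLS)
    return ss_effect / ms_error()

main_drop = mean_pct({REGARDLESS: 0}) - mean_pct({REGARDLESS: 1})
drop_no_survey = (mean_pct({REGARDLESS: 0, SURVEY: 0})
                  - mean_pct({REGARDLESS: 1, SURVEY: 0}))
drop_no_betting = (mean_pct({REGARDLESS: 0, BETTING: 0})
                   - mean_pct({REGARDLESS: 1, BETTING: 0}))

print(f"Regardless main effect: {main_drop:.0f} points, "
      f"F = {f_ratio([REGARDLESS]):.2f}")
print(f"Regardless X Survey:  F = {f_ratio([REGARDLESS, SURVEY]):.2f}, "
      f"drop with survey absent = {drop_no_survey:.0f} points")
print(f"Regardless X Betting: F = {f_ratio([REGARDLESS, BETTING]):.2f}, "
      f"drop with betting absent = {drop_no_betting:.0f} points")
```

Under the assumed cell assignment, this computation reproduces the 21% main-effect difference, the 32% simple-effect differences, and the reported F ratios of 10.50 and 2.88.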

Discussion
It was predicted that the problem version having all

three wording changes present would result in the largest

degree of facilitation. This was not the case, and such

results present difficulties for Margolis's theory.

There was an interactive effect among the three wording

changes instead of an additive effect. The "regardless"

phrase manifested a strong facilitative effect, which

replicates the results of Tversky and Kahneman (1983), who

reported a 25% reduction in error when adding a similar

phrase. However, the effect did not operate under all

wording conditions. For example, when survey information

was present, it made no significant difference if the

"regardless" phrase was present or not. Likewise, when the

"regardless" phrase was present, it mattered little if

survey information was present or not. There was, however,

a nonsignificant difference of 18% in the predicted

direction of facilitation by survey information when

"regardless" was absent. The tentative conclusion is drawn

that survey information and "regardless" both facilitate

correct responding to some degree, but that "regardless" is

much more effective at doing so. When both are present,

their effects are not additive and therefore no further

facilitation is seen.

This can be understood if both manipulations are seen

as operating at Level II and both are involved in clarifying

the sample space of the response option, though both may be

operating in slightly different ways. The sample space is

more likely to be clarified by the "regardless" phrase than

survey information, but together they work no better than

either alone.

The interpretation of the Regardless X Betting

interaction is slightly different. When the betting

scenario was absent, "regardless" had a strong facilitative

effect. However, when both the betting scenario and

"regardless" were present, the observed proportions

suggest that the betting scenario detracted from the effects

of "regardless." If, as already mentioned, the effect of

survey information is somewhat facilitative in the absence

of the "regardless" phrase, then the Survey X Betting

interaction should show that the betting scenario inhibits

the facilitation of survey information as it does for

"regardless". Although the Survey X Betting interaction was

not a significant one, the observed proportions suggest that

this was the case. When survey information was present,

there was slightly higher incorrect responding with the

betting scenario present than with it absent.

Such findings are difficult to reconcile with the

predictions of the Level I/Level II perspective. If

operating at Level I, the presence of Betting should have

enhanced the effects of Survey and Regardless. It seems

instead that its presence detracted. The tentative

conclusion is that Betting does not contribute to

appropriately clarifying a Level I ambiguity.

The lack of main effects of the betting scenario in

this experiment is at odds with the facilitative results of

the betting scenario reported by Tversky and Kahneman

(1983). It should be noted, however, that there is, both in

this experiment, and in the Tversky and Kahneman (1983)

problem, a confound involved with the betting scenario.

Adding the betting scenario results in the removal of the

instructions to choose the most probable option. The

presence and absence of these instructions and the presence

and absence of the betting scenario should have been

manipulated independently of one another. This does not,

however, explain why this experiment did not replicate the

betting scenario results of Tversky and Kahneman (1983).

It is possible that there is an alternative explanation

to the facilitative effects of the "regardless" phrase of

Experiment 1 other than that the phrase operates to

clarify an ambiguous interpretation of option "a". The

other explanation is that the addition of the phrase

lengthens option "a" beyond that of option "b" and that it

is this additional length which determines option choice.

This interpretation would also explain why subjects choose

option "b" when the "regardless" phrase is absent from

option "a", because then option "b" would be the longer


option. In order to test this hypothesis, Experiment 2 was

EXPERIMENT 2

The results of Experiment 1 indicated that the

"regardless" phrase played a role in facilitating correct

responses to the Linda problem. It was assumed that any

such role the phrase played would be the result of

diminishing the role of the mutually exclusive "or" by

explicitly indicating the inclusive nature of option "a."

However, there is an alternative explanation for why the

"regardless" phrase was effective. It is possible that

length is a factor that works to cue the attention of the

subject. The longer phrase might seem more important or

appear to contain more information. Length could work as an

attentional pointer which causes subjects to choose option

"a" when the "regardless" phrase is present for reasons

other than that the phrase is clarifying sample space.

Evans's (1988) theory of reasoning posits an initial

stage during which aspects of a problem are selected to be

subsequently manipulated. This selection takes place at an

unconscious level. Length could work as a factor that

increases the probability of an item's being selectively

attended to. Option "a" with the "regardless" phrase could

be being chosen more often merely because it is more


In order to determine whether this was the case,

Experiment 2 was conducted. A lengthening "regardless"

phrase was independently added to options "a" and "b." It

was predicted that if length was the deciding factor in

subjects' choices, then when a "regardless" phrase was added

to option "a," percentages of subjects choosing option "a"

would increase. When a "regardless" phrase was added to

both option "a" and "b," then "b" would be the longer option

and choices of option "b" would increase. When a

"regardless" phrase was added to "b" alone, the percentages

of subjects choosing "b" should be very high. If, on the

other hand, the "regardless" phrase was effective because it

clarified interpretations of option "a," it would cause

increased selection of option "a" regardless of whether or

not a "regardless" phrase was also attached to option "b."


Method

Subjects. Subjects were 84 introductory psychology

students who were enlisted by means of sign-up sheets.

Participation partially satisfied the course's research requirement.


Materials and design. The design was a two-factor

between-subjects design. The levels of each factor

consisted of the absence or presence of a lengthening

phrase. For option "a", the phrase was the "regardless"

phrase used in Experiment 1. For option "b", the phrase was

as follows:

regardless of what other activities she also
participates in

The description of Linda used was the same as that used

in Experiment 1, and the instructions read as follows:

Which is more likely? (check one):

The four problem versions used in Experiment 2 are

presented in Appendix B.

Procedure. Subjects were run in small groups of 5

to 15 in a small classroom. The data were analyzed using a

two-factor ANOVA. Probability levels were set at .05 for

main effects and .10 for interaction effects.


Results

There were no significant main or interaction effects.

Proportions of incorrect responses for the four cells are

given in Table 2.


Discussion

The results of Experiment 2 indicate that length was

not a factor in selection of an option. However, the

facilitative effects of the "regardless" phrase as found in

Experiment 1 were not replicated either.

It was noted after the experiment was conducted that

"likely" had been used in the instructions in place of the

"probable" that was used in Experiment 1. The reason for

this substitution was that while "probable" was used by

Tversky and Kahneman (1982, 1983), "likely" was used by

Margolis (1987). An oversight on my part caused me to

substitute "likely" for "probable" in all four problem

versions of Experiment 2.



Table 2. Incorrect responses (%) for the four cells of Experiment 2

                          "Regardless" in "a"

"Regardless" in "b"       Absent      Present

Absent                      81          76

Present                     67          71

It was considered whether the "regardless" phrase could

possibly interact with the word "likely" in the instructions

of Experiment 2 in a different way than with the word

"probable," which was used in Experiment 1. To follow up on

this possibility, an additional 21 subjects were tested with

a problem version which contained "probable" in the

instructions, and the "regardless" phrase added to option

"a." There was 57% incorrect responding to this version, a

value not significantly different from the version that used

"likely" (76%), t(40) = 1.33, p > .05, but in the direction

of the suspected interaction and similar enough to the 21%

main effect reduction seen in Experiment 1 to warrant

further inquiry. In order to resolve the possibility of

there being different effects of the "regardless" phrase

when used with "likely" and "probable," to attempt to

replicate the main effects of "regardless" seen in

Experiment 1, and to further examine the effects of the

betting scenario used in Experiment 1 in an unconfounded

design, Experiment 3 was conducted.


EXPERIMENT 3

Experiment 3 had three experimental factors: (a) the

"regardless" phrase in option "a," (b) the betting scenario,

and (c) use of either "likely" or "probable" in the instructions.


The design of the experiment was such that the addition

of the betting scenario was not confounded with the removal

of the instructions to choose the most probable option, as

it was in Experiment 1.

It was predicted that Regardless would facilitate and

would also interact with Betting, replicating the results of

Experiment 1. It was also predicted that Regardless would

interact with Likely/Probable such that Regardless would

facilitate with the presence of "probable" but would be

ineffective with the presence of "likely." An admittedly

post hoc explanation for this expected Regardless X

Likely/Probable interaction was that "probable" has

connotations of a slightly more statistical nature, leaving

more room for an influence by Regardless. "Likely" seems

more synonymous with "plausible" or "believable" and more

likely to cue a "plausibility" Level I scenario. Given the

Level I/Level II perspective, the Level II manipulation of

Regardless would be relatively ineffectual with that Level I interpretation.



Method

Subjects. One hundred and sixty introductory

psychology students participated voluntarily. The students

were enlisted by means of sign-up sheets placed in the lobby

of the psychology building. Participants were given credit

toward the completion of the course research requirement.

Materials and design. The design was a three-factor

between-subjects design similar to that utilized in

Experiment 1. The Regardless factor consisted of the same

two levels of the presence or absence of the "regardless"

phrase in option "a" and the Betting factor of the presence

or absence of the same betting scenario as was used in

Experiment 1. The description of Linda was the same as that

used in Experiments 1 and 2. The Likely/Probable factor had

two levels: either the use of the word "likely" in the

instructions, or the use of the word "probable". When the

betting scenario was absent, the instructions read as follows:


Which of the following is more probable [or
likely]? Check one:

When the betting scenario was present, the instructions

read as follows:

If you could win $10 by betting correctly,
which of the following would you bet is more
probable [or likely]? Check one:

In this way there was no confounding between the

addition of the betting scenario and the removal of the

words "likely" or "probable" as there was in Experiment 1.

The eight versions of the Linda problem that resulted from

all possible combinations of these three manipulations are

collected in Appendix C.
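
The fully crossed design described above can be sketched in a few lines. This is only an illustration: the variable and function names are hypothetical, the instruction wording is copied from the text above, and the actual problem versions are those in Appendix C. The point of the sketch is that every one of the eight versions contains either "probable" or "likely," so the betting scenario is no longer confounded with the removal of that word.

```python
from itertools import product

# Factor levels as described in the Materials and design paragraph.
REGARDLESS = (False, True)        # "regardless" phrase in option "a"
BETTING = (False, True)           # betting scenario in the instructions
WORDING = ("probable", "likely")  # Likely/Probable factor


def instructions(betting: bool, wording: str) -> str:
    """Return the instruction text for one cell, using the wording quoted above."""
    if betting:
        return ("If you could win $10 by betting correctly, which of the "
                f"following would you bet is more {wording}? Check one:")
    return f"Which of the following is more {wording}? Check one:"


# Fully crossed 2 x 2 x 2 between-subjects design: eight problem versions.
versions = [
    {"regardless": r, "betting": b, "wording": w,
     "instructions": instructions(b, w)}
    for r, b, w in product(REGARDLESS, BETTING, WORDING)
]

for v in versions:
    print(v["regardless"], v["betting"], v["wording"], "|", v["instructions"])
```
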

Procedure. Subjects were run in small groups of 5

to 15 in a small classroom. The results were analyzed using

a three-factor ANOVA. Probability levels were set at .05

for main effects and .10 for interaction effects.


Results

The proportions of incorrect responses for each of the

eight versions of the problem are given in Table 3.

There was a main effect of the Regardless factor,

F(1,152) = 8.81, p < .05, with a 19% reduction in error in

those cells where the "regardless" phrase was present

compared to those cells in which it was not. There were no

significant main effects for the Betting factor or for the

Likely/Probable factor, and there were no significant

interaction effects.




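Table 3. Incorrect responses (%) for the eight problem versions of Experiment 3

                          "Regardless" phrase

Probable/Likely           Absent      Present

"Probable" present

   Betting absent           95          65

   Betting present          90          65

"Likely" present

   Betting absent           90          80

   Betting present          80          70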
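
The 19% main effect of Regardless can be recovered from the cell percentages reported for Experiment 3. The following is a minimal arithmetic sketch; it assumes equal cell sizes (the text gives 160 subjects and eight versions but does not state the cell sizes), so the marginal values are taken as unweighted means of the four cells in each column.

```python
# Percent incorrect per cell, keyed by (wording, betting). Each value is
# (regardless absent, regardless present), copied from the Experiment 3 cells.
cells = {
    ("probable", "betting absent"): (95, 65),
    ("probable", "betting present"): (90, 65),
    ("likely", "betting absent"): (90, 80),
    ("likely", "betting present"): (80, 70),
}

# Assumption: equal cell sizes, so each marginal is an unweighted mean.
absent_mean = sum(a for a, _ in cells.values()) / len(cells)
present_mean = sum(p for _, p in cells.values()) / len(cells)
reduction = absent_mean - present_mean

print(f"regardless absent:  {absent_mean:.2f}% incorrect")
print(f"regardless present: {present_mean:.2f}% incorrect")
print(f"reduction:          {reduction:.2f} percentage points")
```

Under that assumption the difference is 18.75 percentage points, which rounds to the 19% reported.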


Discussion

The effects of "regardless" seen in Experiment 1 were

replicated here in Experiment 3. The degree of effect was

very similar, with 19% here, compared to the 21% seen in

Experiment 1.

The betting scenario once again defied predictions and

was ineffectual in facilitating correct responding, this

time in an unconfounded design. Neither did Betting

interact with Regardless as was seen in Experiment 1.

However, due to the confounds of Experiment 1, it is

difficult to know whether the Regardless X Betting

interaction seen in that experiment was due to the effects

of the addition of the betting scenario, or to the effects

of the removal of the instructions to choose the most

probable option. The results of Experiment 3 indicate that

in an unconfounded design, there is no interaction between

Betting and Regardless. Lack of main effects and lack of

interaction effects call into question the efficacy of the

betting scenario as a manipulation capable of bringing about

a shift in Level I interpretation. Tversky and Kahneman's

(1983) results are not replicated and Margolis's (1987)

choice of the betting scenario as a manipulative factor is

called into question.

No evidence was found in the results of Experiment 3

for any difference between "likely" and "probable" in the

instructions, either in main effects or interaction effects.

The Likely/Probable factor is therefore also ruled out as an

effective manipulation at Level I.

At this point the only effective and replicated

manipulation seemed to be the "regardless" phrase. To

further investigate the role of Regardless in facilitating

correct responding to the Linda problem, Experiment 4 was

designed. A way was sought that would reveal whether

subjects who chose option "a" when "regardless" was present

were interpreting that option any differently than those

subjects for whom "regardless" was not present. It was

considered that the best way to ascertain this was to have

subjects select a visual representation of the category

membership relationships indicated by the response options.

Euler circles are a convenient way to represent category

membership. Therefore in Experiment 4 an additional task

was assigned to the subject: the selection, from an array

of Euler circle pairs, of the pairs that best represented

options "a" and "b" in the Linda problem. It was assumed

that these choices would reflect interpretational

differences. In order to ascertain whether the Regardless

factor would affect the choice of differing visual

representations for option "a," Experiment 4 was conducted.


EXPERIMENT 4

To further investigate the role of the "regardless"

phrase in facilitating correct responding to the Linda

problem, Experiment 4 was designed. In Experiment 4 an

additional task was assigned to the subject. Seven

pairs of overlapping Euler circles were added to the Linda

problem, one circle in each pair representing the category of

bank teller and the other the category of feminist. Subjects

were asked to choose among the particular shadings. All

possible shading combinations were represented. In other

words, the particular shading of one circle pair would

represent the category "all bank tellers", another, the

category "all feminists except those who are also bank

tellers," etc. The additional task assigned the subject was

to identify those two circle pairs that represented the

categories indicated by options "a" and "b." This

arrangement allowed a test of whether the facilitating

effects of the "regardless" phrase were associated with

differences in interpretation of option "a." Specifically,

if "Linda is a bank teller" is being interpreted as "Linda

is a bank teller and not a feminist," such an interpretation

should result in the choice of the circle pair which is

shaded in such a way as to represent "bank tellers who are

not also feminists" or a half-moon shape. If the addition

of the "regardless" phrase acts to clarify this ambiguity,

then the augmented option "a" that reads "Linda is a bank

teller, regardless of whether or not she is also active in

the feminist movement" should be reflected in the choice of

the Euler circle pair whose shading represents all bank

tellers, or a completely shaded circle.
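
The count of seven shadings follows from the geometry: two overlapping circles divide into three regions, and every non-empty combination of shaded regions is one diagram. The sketch below enumerates them. The region labels and constant names are illustrative assumptions, not the wording used on the response sheets in Appendix D.

```python
from itertools import combinations

# Two overlapping circles divide into three regions.
REGIONS = ("teller only", "teller and feminist", "feminist only")

# Every non-empty subset of regions is one possible shading.
shadings = [
    frozenset(combo)
    for size in range(1, len(REGIONS) + 1)
    for combo in combinations(REGIONS, size)
]

# The shadings that matter for the predictions (names are illustrative).
ALL_TELLERS = frozenset({"teller only", "teller and feminist"})  # fully shaded teller circle
HALF_MOON = frozenset({"teller only"})                           # tellers who are not feminists
OVERLAP = frozenset({"teller and feminist"})                     # tellers who are also feminists

print(len(shadings))
for s in shadings:
    print(sorted(s))
```

Both the half-moon and the overlap region are proper subsets of the fully shaded bank teller circle, which is the inclusion relation that the "regardless" phrase is meant to make explicit.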

It was predicted that the presence of the "regardless"

phrase would result in a higher percentage of subjects

choosing option "a." It was further predicted that Euler

circle choices would reflect differences in interpretation

due to the presence of the "regardless" phrase.

Specifically, it was predicted that under both Regardless

conditions, subjects choosing option "b" as the correct

answer would check the half-moon circle for option "a,"

indicating they had interpreted "Linda is a bank teller" as

"Linda is a bank teller and not a feminist." Subjects

choosing option "a" as the correct answer would check the

fully shaded circle for option "a," indicating that they had

understood the inclusive nature of "Linda is a bank teller."

Furthermore, it was predicted that the increase in the

choice of option "a" caused by the "regardless" phrase

would be accompanied by an identical increase in the choice

of the fully shaded bank teller circle to represent option

"a." It was predicted that there would be no difference

between the two groups in their selection of the Euler

circle patterns representing option "b."


Method

Subjects. Twenty introductory psychology students were

enlisted by means of sign-up sheets, and participated in

return for partial credit toward the course's research requirement.


Materials and design. The design was a one-factor

between-subjects design. The "regardless" phrase was either

present or absent in option "a." At the bottom of the same

page on which the Linda problem was printed were these

additional instructions:

The circles on the left represent all bank
tellers. The circles on the right represent all
feminists. The area where two circles overlap
represents all those who are both bank tellers and
feminists. Place the letters "a" and "b"
corresponding to the two statements above next to
those two diagrams which you think capture
visually the meanings of the statements.

Following these instructions were the seven pairs of

differently shaded Euler circles, each preceded by a blank

for responses. The order of the shading was presented in

different random sequences, counterbalanced across subjects.

Representative examples of the two problem versions with

Euler circle pairs are presented in Appendix D.

Procedure. Subjects were run in two small groups in a

small classroom. A brief explanation of overlapping Euler

circles involving an example of people with blond hair and

blue eyes preceded distribution of the problems. Data were

analyzed using one-tailed t-tests, with probability values set

at .05.


Results

Of the ten subjects who received the problem without

the "regardless" phrase, 80% chose option "b." Of the ten

subjects who received the problem with the "regardless"

phrase, 40% chose option "b," a significant 40% reduction in

error, t(18) = 2.00, p < .05, which replicates the

"regardless" results of Experiments 1 and 3.

In the no "regardless" group, 10% chose the fully

shaded circle which represented "all bank tellers," 70%

chose the shaded half moon which represented "bank tellers

who are not feminists," and 20% chose other shaded


In the "regardless" group, 60% chose the fully shaded

circle which represented "all bank tellers," 20% chose the

shaded half moon which represented "bank tellers who are not

feminists," and again, 20% chose other shaded


The 50% difference between the "regardless" and no

"regardless" groups in choosing the fully shaded "all bank

tellers" circle was significant, t(18) = 2.75, p < .05.
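
The text does not say how the two t values were computed. One assumption that reproduces both of them is an independent-groups t on 0/1 scores with the pooled variance divided by the total N rather than by N - 2. The sketch below implements that assumption; the function name and the unbiased flag are hypothetical, and the more usual N - 2 divisor gives somewhat smaller values.

```python
import math


def two_group_t(k1: int, n1: int, k2: int, n2: int, unbiased: bool = False) -> float:
    """t statistic for two independent groups of 0/1 responses.

    k1 and k2 are counts of the response of interest; n1 and n2 are group sizes.
    With unbiased=False the pooled variance divides by n1 + n2 (assumed here);
    with unbiased=True it divides by n1 + n2 - 2 (the usual pooled estimate).
    """
    p1, p2 = k1 / n1, k2 / n2
    # Sum of squared deviations of 0/1 scores around each group mean.
    ss = n1 * p1 * (1 - p1) + n2 * p2 * (1 - p2)
    divisor = (n1 + n2 - 2) if unbiased else (n1 + n2)
    se = math.sqrt(ss / divisor * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Option "b" choices: 8 of 10 without "regardless", 4 of 10 with it.
t_option = two_group_t(8, 10, 4, 10)
# Fully shaded circle: 6 of 10 with "regardless", 1 of 10 without it.
t_circle = two_group_t(6, 10, 1, 10)

print(f"{t_option:.2f} {t_circle:.2f}")
```

With the N divisor the two statistics come out at about 2.00 and 2.75, matching the reported values. With the N - 2 divisor they are about 1.90 and 2.61.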


Discussion

Experiment 4 replicates the effect of Regardless as

shown in Experiments 1 and 3. It appears that the

"regardless" phrase's effects are a robust phenomenon.

The significant difference between Euler circle choices

indicates that the "regardless" phrase is instrumental in

driving an appropriate interpretation of option "a."

However, it is not accurate to say that those who chose

option "a" all picked the fully shaded circle, and those who

chose option "b" all picked the half-moon shaded circle.

Two of the four people in the "regardless" group who chose

option "b" nonetheless picked the fully shaded circle to

represent option "a," and one of the six in that group who

correctly chose option "a" nonetheless picked the partially

shaded circle for option "a." Likewise, in the no

"regardless" group, the two people who correctly chose "a"

nonetheless picked the partially shaded circle, and one

person who chose "b" picked the fully shaded circle. It

appears that although "regardless" facilitates option "a" as

the correct choice, and also facilitates the choice of the

fully shaded circle as representing "Linda is a bank

teller", the two effects are not identical or perfectly correlated.



In an earlier section of this dissertation, the

historical and philosophical overview criticized the

Kahneman and Tversky approach to reasoning research as being

overly restrictive and ignoring variables of content and

context. Their approach is also a highly pejorative one

when comparing the reasoning competence of the layperson to

the workings of formal statistical principles. Such an

approach was shown to have historical precedents, but was

also shown to have a historical counterpoint. There is

currently renewed interest in considering the role of

content in the reasoning process, the background context

in which it occurs, and the goals of the reasoner

(e.g., Baron, 1988; Cosmides, 1990; Evans, 1989; Margolis,

1987). There is a shift away from the statistical metaphor

to the perceptual metaphor.

Not only has there been a redressing of balances within

the larger context of general approach, but on a more

microscopic scale much has been shown to be amiss about the

details of the Kahneman and Tversky program. It is at this

level that this dissertation has attempted to contribute

some empirical evidence.

It has been argued that with the Linda problem as it is

constructed, there is an overwhelming tendency to see the

task in terms of matching a set of characteristics to a

prototype. The pragmatic invitation to make this

interpretation is built into the very structure of the

problem and the information offered. Though I have called

this a Level I ambiguity, there is little hesitation on the

part of the subject in seeing the problem this way. It has

been the argument of this dissertation that even when

subjects can be brought to appropriately resolve the Level I

ambiguity of the problem and make an appropriate

interpretation of the problem in terms of overlapping

category sets, Level II ambiguities can still detract

from their performance.

Margolis (1987) has argued that multiple ambiguities

beset the Linda problem and offered manipulations to address

those ambiguities. How do the experimental results of this

dissertation contribute to any discussion concerning the

three viewpoints of Tversky and Kahneman, Margolis, and the

Level I/Level II perspective?

A summary of the experimental results is as follows:

(a) there is no evidence that a betting scenario plays any

direct facilitating role; (b) there are indications that

survey information acts as a somewhat successful

facilitation; (c) there is no evidence that "likely"

operates any differently than "probable"; and (d) there is

evidence that the "regardless" phrase is a moderate

facilitator and that its effects are not due to option

length but to a shift in the sample space interpretation of

the response options by the subject.

On the whole, no manipulation resulted in a very large

reduction in conjunction error: The Linda problem is robust

in its resistance to facilitating manipulations. However,

it has already been suggested that the very structure of the

problem will make it resistant in this way.

Changing the task structure of conjunction problems can

have impressive effects. Tversky and Kahneman (1983)

developed a health-survey conjunction problem which asked

subjects for two estimates: the percentages of men who have

had one or more heart attacks, and the percentages of men

who are over 55 and have had one or more heart attacks. The

pragmatic structure of this question is not as overwhelming

as that of the Linda problem, so conjunction effects were

not as high, but still, 65% of respondents assigned a higher

value to the second estimate. When the task was changed to

one of estimating how many out of 100 men fell into the two

categories, the conjunction error fell to 25%. To my mind,

this change in task structure qualifies as a Level I

manipulation or as a scenario manipulation in Margolis's

terms: It tends to shift subjects toward a relative

frequency interpretation. Tversky and Kahneman (1983) then,

in the only case where they combined two different

manipulations, added a third estimate to the "out of 100

men" health-survey problem. This estimate was of the number

of men over 55. These three estimates now present the joint

event along with both of its constituents and should make

sample space confusions less likely. This manipulation

qualifies as a Level II manipulation and given the Level I

manipulation of the new task instructions should prove to be

effective. In fact, the conjunction error fell to 11% with

these two manipulations, one of the lowest values ever

reported for a conjunction problem.

Of course, the interpretation of these results within

the Level I/Level II perspective is my own. Tversky and

Kahneman (1983) do not have any comprehensive framework

within which they can interpret such findings and instead

focus on the prevalence of the representativeness heuristic

in the remaining 11% of responses.

Fiedler (1988) revised the Linda problem so that

subjects estimated to how many out of 100 such descriptions

the target items would apply. He reports that the

conjunction effect fell to 22%. It would be interesting to

combine this Level I manipulation for the Linda problem with

the Level II manipulation of the "regardless" phrase to see

if the error would be reduced any further.

Morier and Borgida's (1984) findings that the

conjunction error depended on the kind and number of

response options and whether probability estimates or

probability rankings were asked for led them to label the

conjunction fallacy a task-specific phenomenon. The

perceptual metaphor predicts task-specificity of such an


These studies all offer evidence that changes in task

structure can serve as a Level I manipulation, at the same

time that the results of this dissertation call into

question the efficacy of the betting scenario as a Level I

manipulation. Other than the single report (Tversky &

Kahneman, 1983) of an effect of betting scenario, no other

study utilizing the betting scenario has appeared until

recently (Wolford, Taylor, & Beck, 1990). Wolford et al.

compared a version of the Linda problem with a betting

scenario to a version without the betting scenario and found

no significant difference in responses between the two.

Experiment 3 of the present study found no effects of

an unconfounded betting scenario and Experiment 1 found no

effects of Betting. The marginal interaction effects seen

in Experiment 1 may be spurious or they may have resulted

from the confounding of the addition of the betting scenario

with the removal of the word "probable" in the instructions.

Further investigations could address this issue, but,

in any case, the betting scenario in Experiment 1 seemed to

detract from the facilitative effects of Regardless, not

what one would expect if the betting scenario was operating

at Level I.

Margolis (1987) chose the betting scenario as the

manipulation to address the frequency/plausibility ambiguity

in the Linda problem. The empirical results call his choice

into question. However, Margolis (personal communication,

1989) has expressed the belief that sufficient differences

exist between his betting scenario and the one utilized in

this study. He feels his version would nonetheless provide

facilitation. Only further empirical work would establish

his claim.

Margolis's (1987) second suggested manipulation was the

addition of survey information. This short series of

experiments was unable to fully explore the implications of

Experiment 1, which indicated some contribution to correct

responding by survey information. Recall that in that

experiment, in the absence of "regardless," the presence of

survey information helped correct responding by 18% compared

to its absence. As an additional note, the Bill problem, a

conjunction problem similar to the Linda problem, was run in

second and third position behind the Linda problem in

Experiment 3 as a pilot study. The three factors of the

Bill problem were Survey, Betting, and Probable/Likely. The

results should be interpreted cautiously, but there was a

main effect for survey information, with a 24% reduction in

error compared to its absence. How survey information

achieves its facilitating effects is a topic for further

study. A perusal of the cell percentages in Experiment 1

where both Regardless and Survey were factors indicates that

the effects of "regardless" and survey information are not

strictly redundant. Survey information seems to be doing

the same thing as "regardless" but in a slightly different

way. Perhaps a study with Euler circles as in Experiment 4,

but with Survey as a factor in place of Regardless would be

fruitful. Within the Level I/Level II framework, survey

information would be considered as a Level II manipulation.

No conclusive evidence was found in this study for any

role of "likely" versus "probable" in task instructions as a

Level I manipulation, as was somewhat suggested by the

results of Experiment 2. No effects of the Likely/Probable

factor were found in Experiment 3. In addition, a recent

study (Macdonald & Gilhooly, 1990) had one group of subjects

rank the eight statements in the eight-statement version of

the Linda problem by probability, and another group rank by

believability. There were no effects of this manipulation.

The fourth and final empirical contribution of the

present study is evidence that the "regardless" phrase is a

moderate facilitator. The Regardless factor was considered

a Level II manipulation, but it was seen that it could work

alone. It is possible that the "regardless" phrase works

only in those cases where subjects have already brought the

appropriate Level I set to the problem, or it is also

possible that the "regardless" phrase can also work to cue

the appropriate Level I interpretation on its own, in

addition to clarifying sample space. In this study, no

effective Level I manipulations were found in either Betting

or Likely/Probable, but a logical next step is to utilize

the "regardless" phrase in a study where a postulated Level

I manipulation is effective, as in the out-of-a-hundred

estimation version of a conjunction problem studied by

Tversky and Kahneman (1983) and Fiedler (1988).

Experiment 4 provides evidence that the "regardless"

phrase works because it clarifies category relationships.

While Tversky and Kahneman (1982, 1983) say that the

appropriate interpretation of the Linda problem is in terms

of the conjunction rule, it is really a very restricted

application of the conjunction rule which they have in mind.

What the Linda problem really is about is category

membership and the ability to see one category as being

subsumed under another.
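
The class-inclusion point can be made concrete with a small numerical sketch. The counts below are purely hypothetical and are not data from any study; they are chosen only to show that the conjunction can never exceed the inclusive "bank teller" class, while it can easily exceed the exclusive "bank teller and not a feminist" class that subjects may take option "a" to mean.

```python
# Hypothetical counts for 100 women who fit Linda's description.
# These numbers are illustrative assumptions, not data from any study.
counts = {
    ("teller", "feminist"): 4,
    ("teller", "not feminist"): 1,
    ("not teller", "feminist"): 80,
    ("not teller", "not feminist"): 15,
}
total = sum(counts.values())

p_teller_and_feminist = counts[("teller", "feminist")] / total
p_teller_not_feminist = counts[("teller", "not feminist")] / total
# Inclusive reading of option "a": all tellers, feminist or not.
p_teller = p_teller_and_feminist + p_teller_not_feminist

# Conjunction rule: a subclass can never be more probable than its class.
print(p_teller_and_feminist <= p_teller)
# Exclusive misreading of option "a": here the conjunction is the larger set.
print(p_teller_and_feminist > p_teller_not_feminist)
```

Under the inclusive reading option "a" is necessarily at least as probable as option "b"; under the exclusive reading, choosing option "b" need not violate any rule of probability.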

A significant part of the poor performance usually seen

with the Linda problem can be accounted for by a

misinterpretation on the part of the subject to which he is

led by the conventions of natural language. To this degree

Tversky and Kahneman have exploited the reasonableness of

human reasoning. The reduction of the Linda problem from

its place in an argument crafted to attack the rationality

of human thought to an insipid fragment of verbal trickery

has its counterpart in another similar area of human

reasoning research, that of research into logical thought.

Inhelder and Piaget (1964) showed children an array of

flowers, consisting of more tulips than roses, then asked if

there were more flowers or more tulips. Children younger

than five answered that there were more tulips, which

Inhelder and Piaget took to mean that the children were

incapable of logically comparing a category to a

subcategory. Gardner (1985) discusses how children may

merely be making the reasonable interpretation that a

comparison between the two kinds of flowers is being asked

for. When the question is more appropriately phrased, or

when class/subclass comparisons are appropriate, children

are able to reason competently about class inclusions.

The parallel with the Linda problem is striking.

Subjects are really being asked to make an inference about

class inclusion when such an inference is not at all

relevant to what they think they are being asked to do. In

the case of the Linda problem, however, even when the idea

of category membership is communicated to the subject,

exactly what the subclasses are can be misconstrued. This,

clearly, is the import of Experiment 4. The "regardless"

phrase, if it is appropriately incorporated by the subject

into the correct Level I interpretation, helps subjects to

see what classes and subclasses they are being asked to

compare. The "regardless" phrase allows them to form the

appropriate sample space. It is quite interesting, though,

that the appropriate interpretation of the "Linda is a bank

teller" category as including those who may also be

feminists does not correlate perfectly with the choice of

that option as the most probable one. It will take a larger

sample size to see how common the phenomenon is, but

Experiment 4 at least indicates that subjects can achieve

the correct interpretation of Linda as bank teller and

possibly feminist for option "a" and still choose option "b"

as being more probable. Is it possible that they are able to

correctly assess the sample space with the additional

"regardless" phrase and still fall prey to the pragmatic

pull of the problem towards a prototype matching task?

Perhaps the addition of the Euler circles merely splits the

problem into two tasks, for one of which the "regardless"

phrase creates the right sample space, but the other of

which is still driven by the basic overall mood of the problem.


Experiment 4 also indicates that a subject can assess

"Linda is a bank teller, regardless of whether or not she is

also a feminist" as a category that excludes feminists and

still choose that option as more probable. Does this mean

that the "regardless" phrase is working at some level other

than in the resolution of an appropriate sample space?

Although Experiment 4 indicates that the "regardless"

phrase plays a definite role in both correct sample space

interpretation of "Linda is a bank teller" and also in

seeing that that option is the more likely one, it is

important that this finding be replicated. Perhaps further

research can also clarify why correct sample space

interpretation and correct responding as to the most likely

option are not more closely correlated.

The efficacy of the "regardless" phrase in causing a

moderate reduction in the conjunction fallacy and the Euler

circle representations that subjects choose when

"regardless" is not present indicate that subjects do take

a noninclusive interpretation of the two response options.

A recent study (Agnoli & Krantz, 1989) that became available

during the course of the present study is relevant to this

discussion. Agnoli and Krantz (1989) hypothesized that

people lack a useful design for mapping extensional

principles onto conjunction problems. They state that, in

everyday life, one does not usually find inclusive

relationships among alternatives, and therefore, people do

not have the problem solving design to search for

inclusions. In order to train people in the search for

inclusion relations, Agnoli and Krantz utilized a training

module that portrayed inclusion relations by means of Euler

circles. Here, Euler circles were used in training, rather

than as a measure of interpretation, as they were in

Experiment 4 of this dissertation. Use of the training

module resulted in about a 30% reduction in conjunction

errors on the eight-statement version of the Linda problem.

Agnoli and Krantz (1989) concluded that the training module

"affects the probability of errors by changing the

competitive balance between extensional and intentional

strategies, rather than by changing how each kind of

strategy operates once it is engaged" (p. 536).

Agnoli and Krantz also found that changing the "X is a

B" statement to "X is a B and may or may not be an A"

resulted in a small reduction in conjunction error in seven

out of eight eight-statement conjunction problems.

Curiously, the interaction between training and the "may or

may not be" phrase in these problems was negative, with the

change in wording reducing the magnitude of the training

effects. Nevertheless, Agnoli and Krantz have shown that

training subjects in the inclusion properties of Euler

circles had significant effects. Such training could have

only made the appropriate Level I interpretation more


In addition to the above study, two other studies

relevant to this dissertation have recently come into print.

Both are important to the arguments of this dissertation

because they explicitly explore the role of context in the

Linda problem.

Wolford, Taylor, and Beck (1990) provide evidence that

the degree of conjunction fallacy depends on the context of

the conjunction problem. Specifically, it is important

whether it is perceived that the event had already occurred

or has yet to occur. The authors suggest that it is only in