Citation
Variability and utilization deficiencies in children's memory strategies : a developmental study

Material Information

Title:
Variability and utilization deficiencies in children's memory strategies : a developmental study
Creator:
Coyle, Thomas, 1968-
Publication Date:
Language:
English
Physical Description:
xiii, 108 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Age groups ( jstor )
Child development ( jstor )
Child psychology ( jstor )
Correlations ( jstor )
Experimentation ( jstor )
Memory ( jstor )
Philosophical psychology ( jstor )
Rehearsal ( jstor )
Rehearsal techniques ( jstor )
Trials ( jstor )
Dissertations, Academic -- Psychology -- UF ( lcsh )
Psychology thesis, Ph. D ( lcsh )
City of Boca Raton ( local )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1997.
Bibliography:
Includes bibliographical references (leaves 104-107).
Additional Physical Form:
Also available online.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Thomas R. Coyle.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
0028009852 ( ALEPH )
37849728 ( OCLC )

VARIABILITY AND UTILIZATION DEFICIENCIES IN CHILDREN'S MEMORY
STRATEGIES: A DEVELOPMENTAL STUDY
















By

THOMAS R. COYLE
















A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1997



































For my parents,

Oceania and Roger Coyle















ACKNOWLEDGMENTS

Dissertation acknowledgments typically say little about how those acknowledged contributed to the student's academic development. Perhaps this is how it should be, for the main purpose of a dissertation is to present a student's original contribution to a recognized body of knowledge. I will adhere to the academic tradition of brevity in these Acknowledgments, but do so in a way that allows me to recognize specific contributions of individuals who helped and supported me in completing this dissertation. I first acknowledge faculty, and then acknowledge my family.

I wish to thank Dr. James Algina for his contribution in teaching me how to do some of the statistics in this dissertation, particularly the analysis in which measures of variability were converted to z-scores and analyzed simultaneously. I also thank Dr. Algina for discussions, which I initiated, on issues pertaining to tenure in the university and E. D. Hirsch's notion of cultural literacy.

I wish to thank Dr. David Bjorklund for convincing me to devote my life to studying developmental psychology. Dr. Bjorklund was instrumental in my training during the early part of my career, and he deserves much credit for my achievements. Dr. Bjorklund has shown me that the most exciting aspect of science is discovery, that description is a reasonable goal for science, that the best research questions are those that can account for the most data, and that Peter Kapitsa knew what he was talking about when he said, "Theory is a good thing but a good experiment lasts forever."

I wish to thank Dr. Shari Ellis for suggesting that I examine patterns of variability within individual subjects and the effectiveness of individual strategy combinations. Dr. Ellis's suggestions were incorporated into this dissertation and into Coyle and Bjorklund (1997). I also thank Dr. Ellis for discussions, which I initiated, on funding for education and on cross-cultural research.

I wish to thank Dr. Ira Fischler for bringing to my attention several articles in the adult literature that utilize procedures for assessing intentionality in cognition, notably Jacoby's (1991) process-dissociation approach. The intentionality issue is often neglected in strategy research, even though some researchers have made intentionality the sine qua non of strategy use.

I wish to thank Dr. Patricia Miller for emphasizing the continuous nature of strategy classifications. Her contribution is acknowledged in Bjorklund and Coyle (1995, p. 166), and can be identified in the analyses presented in this dissertation. I also thank Dr. Miller for suggesting that I analyze qualitative differences in strategy use. Such an analysis was performed for this dissertation, and it yielded some interesting results.

I wish to thank Dr. Scott Miller for suggesting that I think carefully about defining and measuring cognitive strategies. It is interesting that defining cognitive strategies never has been a favorite pastime of the researchers who study them. I also thank Dr. Miller for his careful and timely reviews of my manuscripts, including my dissertation. I have yet to find anyone whose knowledge of APA guidelines is as expansive as Dr. Miller's, and I probably never will. I also wish to thank Dr. Miller for his contribution to the Developmental area while he was on sabbatical.

I wish to thank Jennifer L. Slawinski for her suggestions regarding the design of my dissertation. I also thank Miss Slawinski for the clever idea of applying a sequential design in the context of a microgenetic experiment. Finally, I thank Miss Slawinski for her incisive comments on examining gender effects and on reanalyzing archival data.

I wish to thank the research assistants who helped with data collection, analyses, and interpretation. These include Joshua List, Chad Colbert, Victoria Otero, and Jerusha Azel. I suspect I learned as much from them as they learned from me.

I wish to thank my mother and father, Oceania and Roger Coyle, for their enduring support during my academic career. My mother and father have taught me that hard work and perseverance will in the end always pay off. Most important, my mother and father have taught me that the most important thing in secular life is family. Their marriage of 35 years (and counting) is why I have acknowledged them together. This dissertation is a testament to their love and support throughout the years.

I wish to thank my brother, James Coyle, for his interest in my work and his continued support of my goals, including the completion of this dissertation. James is an exceptional guitar player, partly because of exceptional talent, and partly because of exceptional practice. I have learned much from observing his work ethic and dedication to the instrument he loves so much.

I wish to thank my cousin, Annette Fields, for providing me with support and guidance throughout the years. Annette is an accomplished lawyer and she has taught me by example the rules and standards of good argumentation. She has shown me that anyone can rise to the top with lots of hard work and discipline. Annette's best friend and confidant, Ellen Ross, always has believed in me and my talents, and her support is appreciated.

I wish to thank Deborah Hooks for loving me for what I am and, more importantly, for what I can become. Deborah entered nearly all of the data for this dissertation, and she provided numerous useful suggestions about possible analyses. One of her suggestions, to examine intrusions in children's recall protocols, turned out to be very promising and provides a possible basis for a new view of strategy development that includes developmental differences in resistance to interference. Deborah has taught me that love is the best part of life, and without it, you really don't have much of a life at all.

To all the members of my family, I love you all.





















TABLE OF CONTENTS


ACKNOWLEDGMENTS

LIST OF TABLES

ABSTRACT

INTRODUCTION

  Memory Strategies Enhance Performance
  Memory Strategy Development is Stagelike
  Evaluation of Research on Variability and Utilization Deficiencies
  The Current Study
  Goals of the Current Study

METHOD

  Participants
  Stimuli and Design
  Procedure
  Coding

RESULTS

  Preliminary Analyses
    Off-Task Behavior and Examination
    Recall
    Strategy Use
  Variability in Strategy Use
    Multiple-Strategy Use
    Strategy Change
  Relation Between Strategy Use and Recall
    Utilization Deficiencies
    Strategy Change and Recall

DISCUSSION

  Variability in Strategy Use
    Multiple-Strategy Use
    Strategy Changes
  Relation Between Multiple-Strategy Use and Recall
  Relation Between Strategy Changes and Recall
  Conclusions

REFERENCES

BIOGRAPHICAL SKETCH






















































LIST OF TABLES

1. Word lists by category membership

2. Mean proportion recall by condition, grade, and trial, and by condition and trial (i.e., collapsed across grade), and grade differences in recall at each trial by condition

3. Percentage (and number) of trials on which each strategy was used, by condition and grade

4. Mean number of strategies used, by grade and trial, and by condition, grade, and trial

5. Mean number of trial-by-trial strategy changes, by grade and trial transition, and by condition, grade, and trial transition

6. Mean z-scores and raw scores for unique combinations, trials with changes, and total changes, by grade (standard deviations in parentheses)

7. Percentage (and number) of children classified as stable or unstable across all trials, on early trials, and on later trials, by grade

8. Percentage (and number) of children changing or not changing their stability classification across trial blocks, by grade

9. Correlations between number of words recalled and number of strategies used, by condition, grade, and trial

10. Mean proportion recall when strategy use was perfect, by grade and type of strategy used

11. Percentage (and number) of trials on which each strategy combination was used, and mean proportion recall (and standard deviations) for each combination, by grade (Codes for strategies: S, sorting; R, rehearsal; C, clustering; N, category naming)

12. Correlations between measures of strategy change and recall, by condition and grade

13. Correlations among measures of strategy change, by condition and grade

14. Mean proportion recall (and standard deviations) for children classified as stable or unstable across all trials, on early trials, and on later trials, by grade

15. Percentage (and number) of trials on which strategy changes did and did not occur immediately after recall was perfect or not perfect

16. Percentage (and number) of trials on which strategy changes did and did not occur immediately after recall was perfect or not perfect, by condition






































Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

VARIABILITY AND UTILIZATION DEFICIENCIES IN CHILDREN'S MEMORY STRATEGIES: A DEVELOPMENTAL STUDY

By

Thomas R. Coyle

August, 1997

Chair: Patricia H. Miller
Cochair: Shari A. Ellis
Major Department: Psychology

The goal of this study was to examine variability in memory strategy use, and the relation between such variability and recall, as a function of age and a measure of task difficulty (number of words to remember). Second and fourth graders received seven sort-recall trials of different categorizable words (e.g., nurse, lawyer, wall, roof, rose, lily). The number of words presented varied across trials, whereas the number of categories represented in all word lists remained constant. Variability in strategy use was measured in terms of multiple-strategy use (e.g., number of strategies used across trials) and strategy changes (e.g., number of trial-by-trial changes in the types of strategies used). Consistent with previous research, (a) older children used more strategies and made fewer trial-by-trial changes than younger children; (b) older children recalled more than comparably strategic younger children, indicating a utilization deficiency for the younger children; and (c) older children showed significant and positive relations between stable-strategy use (i.e., few trial-by-trial changes) and recall, whereas younger children showed no reliable relation between stability and recall. This study extended previous research by showing that (a) stable-strategy use emerges with experience (i.e., over trials) for older children but not younger children, (b) utilization deficiencies occur for some but not all instances of perfect strategy use, and (c) memory benefits from stability occur on early trials (i.e., Trials 1 to 4) for older children but not for younger children, who show memory benefits from stability on later trials (i.e., Trials 4 to 7) only. Surprisingly, the results revealed few significant effects related to changes in the measure of task difficulty (i.e., number of words to remember). The findings are discussed in terms of how they advance our knowledge and understanding of utilization deficiencies and variability in strategy use.



























Nature is not economical of structures--only of principles
Abdus Salam


























INTRODUCTION

All scientific disciplines are based on a set of core assumptions (Gholson & Barker, 1985). These assumptions are rarely stated explicitly and rarely questioned. They serve to direct a researcher's choice of research questions, data collection procedures, statistical analyses, and interpretation of research findings.

Two such assumptions were central in early research on children's memory strategies. The first was that memory strategies usually enhance memory performance. The second was that memory strategy development proceeds through a series of stages in which a unique strategy is used fairly consistently in each stage. These assumptions were implicit in much of the memory strategy research conducted throughout the 1960s and 1970s.

There are now a number of studies demonstrating that these assumptions are at best misleading, and at worst, empirically inaccurate. The next two sections provide a brief history of events that led to the rise and fall of the view that memory strategies generally enhance performance and that memory strategy development is stagelike. The discussion will focus on two concepts central to this dissertation. The first is the concept of utilization deficiency, which refers to strategy use with no performance benefit. The second is the concept of variability in strategy use, which refers to the use of not one but several different approaches.


Memory Strategies Enhance Performance

The origin of the assumption that memory strategies usually facilitate memory performance can be traced to strategy training studies (for a review, see Flavell, 1970). In a typical training study, children who did not spontaneously use a strategy (e.g., rehearsal) were trained to do so, frequently showing marked improvements in memory performance. Such children were said to be production deficient because they did not spontaneously produce a strategy, even though they could do so and show memory benefits when instructed.

The discovery of production deficiencies was followed by a number of studies that examined the effectiveness of strategy training. In general, these studies, like the earlier ones, demonstrated that children who do not produce a strategy initially can be trained to do so and show corresponding memory improvements. These findings led to the assumption that memory strategies typically improve performance and that the failure to use memory strategies is associated with relatively low levels of performance (for examples, see Flavell, 1970). This view was not limited to memory strategies but was implicit in the descriptions of other cognitive strategies, including those used in analogical reasoning, arithmetic, and reading (for a historical review, see Bjorklund, 1992).

The view that memory strategies generally improve performance began to be questioned in the middle to late 1980s. A number of developmental studies during this period examined the effectiveness of various mnemonics. Several of these studies showed that memory strategies sometimes resulted in no or little benefit to memory, particularly for younger, less practiced strategy users. For example, research by Miller and her colleagues (reviewed in Miller & Seier, 1994) demonstrated that young children using a selective attention strategy had lower levels of recall than equally strategic older children. Such findings were not limited to selective attention strategies. Similar patterns were found in tasks assessing organizational and elaboration strategies (Bjorklund & Harnishfeger, 1987; Kee & Davies, 1990).

Why did these studies show ineffective strategy use when earlier research on production deficiencies showed effective strategy use? The answer may have to do with how strategy use was measured. Production deficiency research inferred strategy use from patterns of recall following training. Improvements in recall were interpreted as indicating the effective use of a mnemonic. No improvements in recall were interpreted as indicating ineffective strategy training. This latter pattern of data is ambiguous, however. No improvement in recall may indicate ineffective training, but it may also indicate ineffective strategy use. The only sure way to discriminate between these alternatives is to assess recall and strategy use independently. Later research on strategy effectiveness did assess recall and strategy use independently. As mentioned above, these studies showed that increases in strategy use do not always result in benefits to memory.

In 1990, Miller formally identified evidence of strategy use with no recall benefits, and labeled such evidence a utilization deficiency (Miller, 1990). According to Miller, utilization deficiencies occur when a child produces an appropriate strategy but does not benefit from it in terms of recall, or benefits less than an equally strategic older child. Utilization deficiencies are inferred empirically when (a) the correlation between strategy use and recall is nonsignificant for younger children but significant for older children, (b) young strategy users recall no more than their nonstrategic peers, (c) older children recall more than comparably strategic younger children, and (d) strategy use increases over trials with no corresponding improvements in memory performance (for additional examples of utilization deficiencies, see Miller & Seier, 1994). Evidence for utilization deficiencies has been found in studies using a variety of memory paradigms and involving participants ranging in age from preschool to late adolescence (for reviews, see Bjorklund & Coyle, 1995; Miller & Seier, 1994).
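As an illustration only, criteria (a) and (c) above can be sketched in Python. The data, the function names, and the `min_strategy` cutoff are hypothetical and are not taken from any study cited here; the sketch computes descriptive values and does not perform the significance tests that criterion (a) requires.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_recall_gap(younger, older, min_strategy):
    """Criterion (c): mean recall of older minus younger children, restricted
    to children whose strategy score is at least min_strategy (a hypothetical
    way of selecting 'comparably strategic' children)."""
    def mean_recall(group):
        recalls = [recall for strategy, recall in group if strategy >= min_strategy]
        return sum(recalls) / len(recalls)
    return mean_recall(older) - mean_recall(younger)


# Invented (strategy score, proportion recalled) pairs, one pair per child.
younger = [(0.2, 0.40), (0.5, 0.35), (0.8, 0.42), (0.9, 0.38)]
older = [(0.2, 0.40), (0.5, 0.55), (0.8, 0.70), (0.9, 0.80)]

# Criterion (a): strategy-recall correlation within each age group.
r_younger = pearson_r(*zip(*younger))
r_older = pearson_r(*zip(*older))

# Criterion (c): recall difference among highly strategic children.
gap = mean_recall_gap(younger, older, min_strategy=0.8)
```

With these invented values the correlation is weak for the younger group and strong for the older group, and the older children recall more at comparable levels of strategy use, which is the pattern the criteria describe.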

Recent reviews of memory development research have revealed the ubiquity of utilization deficiencies (Bjorklund & Coyle, 1995; Miller & Seier, 1994). One such review was conducted by Miller and Seier (1994), who examined the memory development literature from 1974 through mid-1992 for evidence of utilization deficiencies in normal populations (e.g., greater recall for older than comparably strategic younger children). Miller and Seier used three criteria to select studies appropriate for the examination of utilization deficiencies: (a) independent measures of strategy use and recall, (b) spontaneous strategy production (i.e., training studies were excluded), and (c) analyses examining age differences in strategy use and performance. Of the 59 studies they evaluated, 56 (95%) provided evidence for a utilization deficiency.

Although Miller and Seier limited their review to spontaneous strategy use, a more recent review by Bjorklund, Miller, Coyle, and Slawinski (in press) examined utilization deficiencies (e.g., increases in strategy use but not recall following training) in memory strategy training studies published between 1968 and 1994. Like Miller and Seier, Bjorklund et al. selected only studies that included children from normal populations and reported independent measures of strategy use and recall. Because studies with multiple-training conditions could provide multiple cases of evidence of utilization deficiencies, training conditions within studies (rather than the studies themselves) served as the units of analysis. Of the 76 relevant training conditions identified, 39 (51%) showed evidence for utilization deficiencies.

Why did it take the field so long to identify utilization deficiencies? The most parsimonious explanation, I believe, is that utilization deficiencies did not make any sense given the assumption prevalent in much of the early research on production deficiencies (i.e., strategies help performance). Consequently, evidence for a utilization deficiency was often ignored or overlooked. A conceptual shift occurred in the mid-1980s when a number of studies examined independent measures of strategy use and recall (for a review of these studies, see Miller & Seier, 1994). Several studies during this period reported that strategy use resulted in no or little recall benefit. The accumulation of such evidence made it difficult to ignore data indicating that strategies did not always enhance memory performance. A new view of strategy use emerged, one that considered the possibility of ineffective strategy use. This view made possible the discovery of utilization deficiencies in research on memory strategies.








Contemporary research has investigated the causes and consequences of utilization deficiencies. Possible causes of utilization deficiencies include inadequate capacity for both strategy production and effective encoding, limited knowledge of stimulus items or task requirements, and insufficient metamnemonic knowledge of when and how to use strategies (Miller & Seier, 1994). Empirical support for these causes has been demonstrated in studies showing that utilization deficiencies are reduced or eliminated when (a) the capacity required for accessing or executing a strategy is eliminated by having an experimenter carry out the strategy (Miller, Woody-Ramsey, & Aloise, 1991); (b) the stimulus items or task requirements are highly familiar and embedded in a meaningful context (Miller, Seier, Barron, & Probert, 1994); and (c) metamnemonic instruction is provided regarding the cause-and-effect relation between strategy use and recall (Ringel & Springer, 1980). Of the three causes mentioned, inadequate mental capacity and limited knowledge base have received the most empirical support. Other potential causes of utilization deficiencies, including inadequate strategic monitoring, failure to link one strategy with another, and failure to inhibit an earlier, ineffective strategy (see Bjorklund & Coyle, 1995; Miller & Seier, 1994), have received little attention.

Contemporary research has also examined the development of utilization deficiencies. The general finding is that young children are more apt to show a utilization deficiency than older children. Compared to older children, young children (a) recall less when using the same strategies, (b) show lower correlations between strategy use and recall, and (c) show more instances of strategy increases over trials with no corresponding increases in recall (see Miller & Seier, 1994). These findings demonstrate that young children are less likely to benefit from strategy use than older children, which is evidence of a utilization deficiency for young children.

A recent study by Coyle and Bjorklund (1996) showed that utilization deficiencies have different developmental consequences for memory depending on when they occur. In this study, children in second through fourth grade received a multitrial sort-recall task, with different sets of categorizable words on each trial. Children were classified as utilizationally deficient or not based on their pattern of clustering and recall over trials. Children were classified as utilizationally deficient if they showed increases in clustering over trials with no corresponding increases in recall. All other children were classified as nonutilizationally deficient. Mean recall varied as a function of grade and utilization deficiency classification. Second- and third-grade utilizationally deficient children recalled more on average than their nonutilizationally deficient agemates, most of whom used no strategy at all. Conversely, fourth-grade utilizationally deficient children recalled less than their nonutilizationally deficient agemates, most of whom were using strategies effectively.
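The classification rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by Coyle and Bjorklund: the text does not say how an "increase over trials" was quantified, so the least-squares slope, the `tolerance` parameter, and the example scores are all hypothetical.

```python
def slope(values):
    """Least-squares slope of values against trial number (1, 2, ...)."""
    n = len(values)
    trials = range(1, n + 1)
    mean_t = sum(trials) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(trials, values))
    den = sum((t - mean_t) ** 2 for t in trials)
    return num / den


def is_utilizationally_deficient(clustering, recall, tolerance=0.0):
    """True when clustering rises over trials but recall does not.

    Assumption: 'rises' means a slope greater than tolerance. Any other
    pattern, including no strategy use at all, returns False.
    """
    return slope(clustering) > tolerance and slope(recall) <= tolerance


# Invented per-trial scores for three hypothetical children.
deficient = is_utilizationally_deficient(
    clustering=[0.1, 0.3, 0.5, 0.7], recall=[0.5, 0.5, 0.4, 0.5])
effective = is_utilizationally_deficient(
    clustering=[0.1, 0.3, 0.5, 0.7], recall=[0.4, 0.5, 0.6, 0.7])
nonstrategic = is_utilizationally_deficient(
    clustering=[0.0, 0.0, 0.0, 0.0], recall=[0.4, 0.4, 0.4, 0.4])
```

Note that the "nonutilizationally deficient" category produced by such a rule is heterogeneous: it includes both the effective strategy user and the child who uses no strategy, which is why the comparison group differs by grade in the findings above.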

The Coyle and Bjorklund findings demonstrate that utilization deficiencies have different memory consequences depending on when they occur in development. Utilization deficiencies that occur early in development are associated with relatively high levels of memory performance, because the dominant alternative pattern is no strategy use and low levels of recall. Conversely, utilization deficiencies that occur later in development are associated with relatively poor recall, because the dominant alternative pattern is effective strategy use and high levels of recall.

Memory Strategy Development is Stagelike

The origin of the assumption that memory strategy development is stagelike can be traced to research on spontaneous (i.e., uninstructed) strategy use in the late 1960s and 1970s. One goal of this research was to identify the types of strategies used by different age groups. To do this, children of different ages were presented with a memory task and their mnemonic behaviors were recorded and compared. The general finding was that children in each age group typically used a different and unique strategy for remembering. This finding was remarkably consistent across a variety of research paradigms. Serial recall studies showed that young children often use no rehearsal strategy, older children often use single-word rehearsal, and still older children use cumulative rehearsal (Flavell, Beach, & Chinsky, 1966; Ornstein, Naus, & Liberty, 1975). Organizational memory tasks showed that young children often organize words along thematic dimensions whereas older children often organize words along taxonomic dimensions (Ceci & Howe, 1978). Paired-associate learning tasks showed that young children often form arbitrary links between word pairs whereas older children often form relational links between word pairs (for a review, see Kee, 1994).

These early findings depicted memory strategy development as a stagelike progression (Siegler, 1995). Stage descriptions were not limited to memory strategies but included strategies in such diverse domains as arithmetic, number conservation, and scientific reasoning (Siegler, 1996). Young children were described as using one approach, older children as using a different approach, and still older children as using yet another approach. At each age children were described as using a single and unique strategy. Strategy development consisted of one strategy being replaced by another more advanced strategy.

Although a stagelike pattern of strategy development appeared to describe well the pattern of data in early studies, evidence inconsistent with a stagelike progression was reported in the mid- to late-1980s. Several studies during this period demonstrated that children of a particular age used not one but several strategies. Such variability was found across a variety of tasks, including ones assessing memory strategies. For example, children asked to remember a series of digits sometimes used no rehearsal strategy, sometimes rehearsed only one digit at a time, and sometimes rehearsed all digits together (McGilly & Siegler, 1989). Children asked to remember the location of a hidden object sometimes talked about where the object was hidden, sometimes stayed near the hiding place, and sometimes pointed to the hiding location (DeLoache, 1984). Children presented with a paired-associate learning task sometimes repeated the names of the items and sometimes formed a sentence or image linking the word pairs (reviewed in Kee, 1994). Children asked to remember a series of objects sometimes visually inspected the objects, sometimes named the objects, and sometimes physically manipulated the objects (Baker-Ward, Ornstein, & Holden, 1984; Lange, MacKinnon, & Nida, 1989).








Why was variability reported in these studies when early research on strategy development had reported a stagelike progression? As with utilization deficiencies, the answer has to do with how strategies were assessed. Early research on memory strategy development typically classified children as using a single strategy only. Children might be identified as rehearsing, sorting, or elaborating, but no child was identified as using more than one strategy. Although variability was present across individuals, data were often presented in terms of the dominant strategy used at each age (Flavell, Beach, & Chinsky, 1966). This type of data presentation, along with the strategy assessment procedures, depicted memory strategy development as a series of stages. Later research on memory strategy development assessed the possibility of intraindividual variability in strategy use (i.e., multiple strategies being used by a particular individual). This research assessed several strategies on a particular trial or different strategies across trials. Under these conditions, children showed considerable variability in strategy use, often using a variety of approaches within and across trials.
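To make the idea of intraindividual variability concrete, the sketch below scores one hypothetical child's strategy use within and across trials. The letter codes follow those given in the List of Tables (S, sorting; R, rehearsal; C, clustering; N, category naming); the trial data and the three function names are invented for illustration and are not claimed to match the coding scheme of any study cited here.

```python
def strategies_per_trial(trials):
    """Number of distinct strategies observed on each trial."""
    return [len(strategies) for strategies in trials]


def count_changes(trials):
    """Number of adjacent trial pairs on which the set of strategies differs."""
    return sum(1 for prev, curr in zip(trials, trials[1:]) if prev != curr)


def unique_combinations(trials):
    """Number of different strategy combinations used across all trials."""
    return len({frozenset(strategies) for strategies in trials})


# One hypothetical child observed on seven trials; each set holds the
# strategies coded on that trial.
child = [
    {"S"},
    {"S", "R"},
    {"S", "R"},
    {"R"},
    {"S", "R", "C"},
    {"S", "R", "C"},
    {"S", "R", "C"},
]

per_trial = strategies_per_trial(child)
changes = count_changes(child)
combos = unique_combinations(child)
```

A single-strategy classification would reduce this child to one label, whereas the per-trial and across-trial scores preserve the variability that the later research was designed to detect.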

Evidence for variability in strategy use led to a new view of strategy development championed by Robert Siegler (1996). Siegler argued that strategy development does not involve the replacement of different strategies, as implied by stage theories. Instead, he argued that strategy development involves changes over time in the frequency of occurrence of several strategic approaches. According to Siegler, at any given age children use not one but a variety of strategies. Strategy development consists of changes in the frequency of use of each individual strategy, with some strategies being used more than others. At no point in development is a single strategy used exclusively. Rather, multiple strategies are used throughout development.

Considerable evidence supports the view of strategy development championed by Siegler. Variability in strategy use has been found for children differing in race, nationality, and intelligence; for problem domains including arithmetic, serial recall, scientific reasoning, reading, spelling, and tic-tac-toe; for participants ranging in age from one year to adulthood; and for analyses examining both group and individual subject data (Siegler, 1996). In sum, variability in strategy development appears to be the rule in development, not the exception.

Contemporary research has investigated possible correlates of variability in memory strategy use. Possible correlates include knowledge of the stimulus items and task, psychometrically measured intelligence, and the history of effectiveness of a particular strategy. Empirical support for these correlates has been demonstrated in studies showing that variability in strategy use is reduced when (a) highly familiar stimulus items are used (Bjorklund & Bernholtz, 1986; Frankel & Rollins, 1985), (b) children have very high IQs (Coyle, Colbert, & Read, 1997), and (c) a strategy yields perfect memory performance (McGilly & Siegler, 1989).

Contemporary research also has examined the developmental course of variability in strategy use. Siegler has shown that the number of strategies used depends on amount of experience on a task (Siegler, 1996). In general, few strategies are used when task experience is limited, several different strategies are used when experience is moderate, and few strategies are again used when experience is extensive. Thus, the number of strategies used, plotted as a function of task experience, produces an inverted-U shaped pattern. This pattern has been found across a variety of strategic tasks.

Finally, contemporary research has shown that initial levels of variability have implications for subsequent learning. For example, Goldin-Meadow and her colleagues (Goldin-Meadow, Alibali, & Church, 1993) have shown that children who displayed high levels of variability on a conceptual learning task showed increases in task performance following instruction or practice. In contrast, children who displayed low levels of variability typically showed no or relatively little improvement in performance. Similarly, Siegler (1995) has shown that children who used several strategies on a number conservation task showed increases in subsequent learning. In contrast, children who used few strategies on a number conservation task showed relatively little change in subsequent learning. These findings raise the intriguing possibility that variability may provide an index of when change is likely to occur and when change can be induced to occur (cf. Thelen & Smith, 1994).

Evaluation of Research on Variability and Utilization Deficiencies

The discovery of utilization deficiencies and variability has had two important consequences in strategy development research. First, several models of memory strategy development now explicitly account for utilization deficiencies and variability. These models assume that variability is present at all points in development (Siegler, 1996; Thelen & Smith, 1994), and that memory strategies have costs as well as benefits (Bjorklund & Coyle, 1995; Miller & Seier, 1994). Second, several studies have been designed with the explicit intent of assessing utilization deficiencies and variability (Bjorklund, Coyle, & Gaultney, 1992; Coyle & Bjorklund, 1996; Miller, Seier, Barron, & Probert, 1994; Siegler & Jenkins, 1989). These studies do not view utilization deficiencies and variability as anomalies to be discounted, but consider them as being worthy of study in their own right and deserving of explanation.

Although utilization deficiencies and variability have received considerable attention in contemporary strategy research, current research investigating these phenomena is limited in at least three ways. First, utilization deficiencies have been described almost exclusively on tasks in which only a single strategy is assessed on all trials. In most studies, children are said to be utilizationally deficient when they use a particular strategy (e.g., rehearsal or elaboration) and show no or little memory benefit or less benefit than that shown by more experienced strategy users. Because only a single strategy is assessed, the possibility of utilization deficiencies in multiple-strategy use cannot be examined. Instead, the focus is on the ineffective use of a particular strategy.

Second, variability generally has been assessed on multitrial tasks in which only one strategy per problem solving trial is assessed. In most studies, children are credited with using a single strategy each time they are presented with a problem, although they can (and usually do) use a variety of strategies across different problems. Because only one strategy per trial is assessed, the possibility of variability within a particular trial (i.e., intratrial variability) cannot be examined. Instead, the focus is on variability across trials (i.e., intertrial variability).

Third, variability has been measured almost exclusively in terms of the number of strategies used. Although the number of strategies used is one measure of variability, it is not the only one. A handful of other studies have shown that variability can be measured in other ways, including the number of trial-by-trial changes in strategy use, the degree of stability in the sequence of strategy production across several trials, and the number of instances when one strategy is expressed in gesture and a different one in speech (Coyle & Bjorklund, 1997; Coyle, Colbert, & Read, 1997; Goldin-Meadow et al., 1993). These studies demonstrate that variability can be measured in not one but several ways. A single measure, such as the number of strategies used, does not capture all possible patterns of variability, and different kinds of variability may have different causes and consequences.

A recent study by Coyle and Bjorklund (1997) addressed these limitations. Some time will be spent describing this study because its design and findings figure prominently in the study developed for this dissertation. Children in second through fourth grade received five sort-recall trials of categorizable words. Unlike other multitrial experiments (e.g., Bjorklund, 1988), different items and categories were used on each trial, so that any increases in strategy use could not be attributed to increased familiarity with a particular set of stimulus items.









Multiple strategies were assessed on each trial. This permitted assessment of variability within trials, as well as variability across trials. It also permitted assessment of utilization deficiencies in multiple-strategy use. The four strategies assessed on each trial were sorting, physically moving or arranging the words into groups; rehearsing, saying out loud or mouthing the items; category naming, saying the category name of a group of words; and clustering, recalling the words by categories. Each strategy was coded as occurring or not on each trial. The measure of performance was the number of words recalled on each trial.

Unlike previous studies, variability was measured in not one but several ways. Variability was measured in terms of (a) average number of strategies used across trials, (b) number of trials on which multiple strategies were used, (c) number of trials on which the combination of strategies differed from the preceding trial, and (d) total number of strategy changes on consecutive trials, counting both strategy additions and deletions as changes. The first two measures (average number of strategies and number of trials with multiple strategies) are examples of multiple-strategy use. These are the most frequently reported measures of variability. The last two measures (trials with changes and total number of changes) are examples of strategy change. These measures assess changes over time and are reported less frequently.
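The four measures can be illustrated with a short computational sketch. This is not the scoring procedure used by Coyle and Bjorklund; it is a minimal reading of measures (a) through (d) as described above, assuming that each trial is represented as the set of strategies coded as occurring on that trial. The function name, the dictionary keys, and the five-trial record for one child are hypothetical.

```python
from typing import Dict, FrozenSet, List

Trial = FrozenSet[str]  # strategies coded as occurring on one trial


def variability_measures(trials: List[Trial]) -> Dict[str, float]:
    """Compute measures (a)-(d) for one child's sequence of trials."""
    consecutive = list(zip(trials, trials[1:]))
    return {
        # (a) average number of strategies used across trials
        "mean_strategies": sum(len(t) for t in trials) / len(trials),
        # (b) number of trials on which more than one strategy was used
        "multiple_strategy_trials": sum(1 for t in trials if len(t) > 1),
        # (c) number of trials whose combination differs from the preceding trial
        "trials_with_change": sum(1 for prev, cur in consecutive if prev != cur),
        # (d) total additions plus deletions across consecutive trials
        "total_changes": sum(len(prev ^ cur) for prev, cur in consecutive),
    }


# Hypothetical record for one child across five trials.
example_trials = [
    frozenset({"rehearsing"}),
    frozenset({"rehearsing", "sorting"}),
    frozenset({"rehearsing", "sorting"}),
    frozenset({"sorting", "clustering"}),
    frozenset({"sorting", "clustering"}),
]
example_measures = variability_measures(example_trials)
print(example_measures)
```

In this hypothetical record the move from Trial 3 to Trial 4 counts as one trial with a change under measure (c) but as two changes under measure (d), because one strategy was dropped and another was added.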

Coyle and Bjorklund predicted age differences in variability. Multiple-strategy use (e.g., number of strategies used) was predicted to increase with age. The basis for this prediction was that strategy use would be less effortful for older than for younger children, and so older children would have the capacity to produce additional strategies. Strategy changes (e.g., trial-by-trial changes in strategy use) were predicted to be high and comparable for both age groups. This prediction was based on research showing that strategy changes occur frequently in development, across a wide range of ages and on a variety of tasks (Siegler, 1995, 1996).

Coyle and Bjorklund also predicted age differences in the relation between variability and recall. Multiple-strategy use was predicted to correlate with recall for older but not younger children. This prediction was based on the assumption that older children would have the mental capacity to produce and use effectively multiple strategies. In contrast, multiple-strategy use was expected to consume so much of young children's limited mental capacity that little would remain for recall, resulting in a utilization deficiency. Strategy change, in particular stable-strategy use (i.e., few trial-by-trial changes in strategy use), was predicted to correlate with recall for older but not younger children. This prediction was based on research showing that older children are likely to stick with a single approach that yields optimal performance, whereas younger children frequently use a variety of ineffective approaches (Lemaire & Siegler, 1995).

The findings were generally consistent with the predictions. Multiple-strategy use was greater for older than for younger children. Although children of all ages used more than one strategy across trials, older children used more strategies and had more trials with multiple strategies than did younger children. Strategy changes were high and comparable for children in all age groups. Although considerable variability was observed for all age groups, a (nonsignificant) age-related decline in variability was observed. Older children showed fewer changes on consecutive trials and had fewer trials with changes than younger children. These findings were confirmed in an analysis of strategy change within individual subjects. Although Coyle and Bjorklund (1997) paid little attention to the age-related declines in strategy changes, emphasizing instead pervasive variability at all ages, subsequent research has found considerable evidence for age-related declines in variability across a variety of tasks and for children varying widely in age (Coyle, Colbert, & Read, 1997; for a review, see Siegler, 1996). In general, older and more experienced strategy users show fewer strategy changes than younger and less experienced strategy users.

Further analysis revealed relations between variability and memory performance. As predicted, multiple-strategy use was related to recall for older children, who showed significant and positive relations between number of strategies used and recall. Younger children showed no reliable relation between number of strategies used and recall, indicating a utilization deficiency. In addition, stable-strategy use (i.e., few strategy changes across trials) was significantly related to high levels of recall, but only for the older age groups. That is, third- and fourth-grade children who consistently used a particular strategy combination had higher levels of recall than their peers whose strategy use was less consistent. No reliable relation between variability and recall was found for the youngest children.










Taken together, these findings extend current research on utilization deficiency and variability in several ways. Specifically, they provide evidence for (a) utilization deficiencies in multiple-strategy use, (b) several different types of variability, including multiple-strategy use and strategy change, and (c) variability in strategy use within a particular trial, as well as between trials.

The Current Study

The purpose of the current study was to further examine issues concerning utilization deficiencies and variability using the procedures developed by Coyle and Bjorklund (1997). As in Coyle and Bjorklund, children received a multitrial sort-recall task with different words and categories on each trial. Also as before, multiple strategies were assessed on each trial and variability was measured in several ways. The strategies assessed were sorting, rehearsal, clustering, and category naming. The measures of strategy variability were number of strategies used on each trial, number of strategy changes on consecutive trials, number of unique combinations, and number of trials with strategy changes.

The current study differed from the study by Coyle and Bjorklund in two important ways, each of which permitted new research questions concerning utilization deficiencies and variability in strategy use. First, in the current study children received seven sort-recall trials, two more than in Coyle and Bjorklund. The additional trials permitted a more detailed analysis of strategy change during the testing session. It was now possible to assess periods of stability and instability within individual children during early trials and again during later trials. In contrast, the study by Coyle and Bjorklund assessed stability and instability using data on all trials.

Second, in the current study the number of words presented varied across trials from six to fifteen, whereas in Coyle and Bjorklund the number of words presented remained constant across trials at eighteen. Although the number of words varied across trials in the current study, the number of categories represented on each trial remained constant at three. This eliminated the possibility that changes in strategy use and recall would result from changes in the number of categories represented across trials, and ensured that such changes could be attributed to variation in the number of words presented. The design permitted an examination of whether children adapt their strategy use to changes in the number of words on each trial. It was now possible to assess measures of variability, including multiple-strategy use and strategy changes, when children were presented with relatively few words or many words. In contrast, the study by Coyle and Bjorklund assessed variability under conditions in which task demands (i.e., number of words on each trial) remained constant.

Apart from the differences mentioned above, the design of the current study was very similar to the one used by Coyle and Bjorklund. Second- and fourth-grade children were given seven sort-recall trials of categorizable words. As in Coyle and Bjorklund, different words and categories were used on each trial to minimize the likelihood that increases in strategy use would result from practice with a particular set of categorizable items. Also as in Coyle and Bjorklund, category items were chosen to avoid high associations between words, thus minimizing the likelihood of clustering as a result of the automatic activation of semantic memory relations.

Approximately half the children in each grade were assigned to one of two conditions, labeled ascending/descending and descending/ascending. In the ascending/descending condition, children received an increasing number of words on each successive trial until Trial 4, and then received a decreasing number on each successive trial (number of words on Trials 1-7, respectively, was 6, 9, 12, 15, 12, 9, and 6). The descending/ascending condition was the complement of the ascending/descending condition. In the descending/ascending condition, children received a decreasing number of words on each successive trial until Trial 4, and then received an increasing number of words on each successive trial (number of words on Trials 1-7, respectively, was 15, 12, 9, 6, 9, 12, 15). Each condition had trials with the same number of words, so that effects concerning number of words presented could be teased apart from effects concerning the ascending or descending order in which words in each condition were presented.

These conditions were developed to examine changes in strategy use and performance as a function of the number of words presented on successive trials. Two additional sets of conditions were considered but not selected. The first involved presenting trials in the ascending/descending and descending/ascending conditions randomly, without having a constant rate of increase or decrease across trials. For example, Trials 1 to 7 in the ascending/descending condition might be ordered 12, 15, 9, 9, 6, 12, and 6, respectively, whereas Trials 1 to 7 in the descending/ascending condition might be ordered 9, 15, 15, 6, 12, 9, and 12, respectively. Unlike the conditions in the current study, these presentation orders would vary randomly the amount of increase or decrease on successive trials. Consequently, they would confound changes in the number of words presented on successive trials with the magnitude of such changes. The design of the current study eliminated this confound by holding constant the rate of change at three words.

A second possible set of conditions that were considered included an ascending only series and a descending only series. The idea was to extend the pattern in the early trials of each condition in the current study. Thus, Trials 1 to 7 in the ascending series would have 6, 9, 12, 15, 18, 21, and 24 words, respectively, whereas Trials 1 to 7 in the descending series would have 24, 21, 18, 15, 12, 9, and 6 words, respectively. Unlike the conditions in the current study, these presentation orders do not reverse the pattern of change in the latter trials. Consequently, effects regarding possible strategic adaptation to reversal of presentation order could not be assessed. Furthermore, it was not clear why differences in strategic adaptation and recall performance would vary beyond 15 words, when the number of words presented would exceed children's memory capacity (Miller, 1956).

Goals of the Current Study

The current study had three goals. The first was to examine differences in measures of strategy variability (e.g., multiple-strategy use and strategy change) as a function of grade and number of words presented on each trial. As in Coyle and Bjorklund (1997), multiple-strategy use (e.g., number of strategies used per trial) was predicted to increase with age. This prediction was based on research showing that strategies are capacity-demanding operations and that strategy production consumes less capacity with age (Kee, 1994). Thus, older children, who use relatively little capacity during strategy production, should produce more capacity-consuming strategies than younger children.

In addition, multiple-strategy use was predicted to be greater on trials with relatively many words (i.e., 12 or 15 words) than on trials with relatively few words (i.e., 6 or 9 words). This prediction was based on the assumption that trials with many words would induce children to use additional memory strategies because recall of all words on these trials is beyond children's memory capacity (Miller, 1956). In contrast, trials with few words should not have this effect because recall of all words is within children's memory capacity. Thus, children are expected to use multiple strategies only when they cannot perform optimally without doing so (cf. McGilly & Siegler, 1989). These predictions may be qualified by age, with older children having greater capacity for using multiple strategies than younger children.

On the basis of the findings in Coyle and Bjorklund (1997) and in other studies (Coyle, Colbert, & Read, 1997; Lemaire & Siegler, 1995), strategy changes (e.g., trial-by-trial changes in strategy use) were predicted to decrease with age. In addition, strategy changes were predicted to decrease over the course of the testing session, especially for older children. This latter prediction was based on models of strategy variability proposing that task-relevant experience is associated with decreases in trial-by-trial changes in strategy use (Siegler, 1996; Thelen & Smith, 1994). Thus, children should show relatively few strategy changes during the later trials of the sort-recall task, when they have had considerable task-related experience.

Strategy changes were predicted to vary according to the number of words presented on each trial. Specifically, strategy changes were predicted to rarely follow trials with relatively few words (i.e., trials with 6 and 9 words), but to frequently follow trials with relatively many words (i.e., trials with 12 and 15 words). These predictions were based on research showing that strategy changes rarely follow perfect performance but frequently follow less than perfect performance (McGilly & Siegler, 1989). Because perfect recall was likely on trials with few words but not on trials with many words, it was predicted that strategy changes would be less frequent on trials with few words compared to trials with many words.

The second goal of the current study was to examine the relation between multiple-strategy use and recall as a function of age and number of words on each trial. A specific aim was to examine data for possible evidence of utilization deficiencies. Utilization deficiencies were predicted to be less frequent for older children than for younger children. This prediction was based on research examining evidence of utilization deficiencies for children of different ages. For example, Miller and Seier (1994) have shown that correlations between strategy use and recall are often positive and significant for older but not younger children, and have interpreted this as evidence of a utilization deficiency for the younger children. Similarly, Coyle and Bjorklund (1996) have shown that younger children recall less than comparably strategic older children and have interpreted this finding as demonstrating a utilization deficiency for younger children. To date, research on utilization deficiencies has examined the effectiveness of a single strategy (e.g., clustering or rehearsal), or, in a few cases, the effectiveness of multiple strategies. The current study examines the effectiveness of both single- and multiple-strategy use in a single paradigm, and compares directly the incidence of utilization deficiency when children use one or several strategies.

Utilization deficiencies were predicted to be less frequent on trials with relatively few words (i.e., 6 or 9 words) than on trials with relatively many words (i.e., 12 or 15 words). This prediction was based on the assumption that trials with few words would consume less of children's limited mental capacity than trials with many words. Thus, additional capacity should be available for efficient strategy utilization on trials with few words. Consequently, utilization deficiencies should be less frequent on trials with few words compared to trials with many words. This prediction may be qualified by age, with older children's superior processing capacity permitting effective strategy use on all trials, irrespective of the number of words presented.

The third and final goal of the current study was to examine the relation between strategy changes and recall as a function of age. On the basis of the findings in Coyle and Bjorklund (1997) and other studies (Lemaire & Siegler, 1995), the relation between strategy change and recall was predicted to be negative and significant for older but not younger children. That is, few strategy changes across trials (i.e., stable-strategy use) were predicted to result in high levels of recall for fourth graders but not second graders. A further prediction was that the relation between stability and recall may be more apt to occur on later trials (Trials 4 to 7) than on early trials (Trials 1 to 4). This prediction was based on research showing that children initially show inconsistent and ineffective strategy use, but later settle into a stable and optimal state of strategic responding (Siegler, 1996; Thelen & Smith, 1994). Because the measures of strategy variability were computed from data aggregated across trials, no predictions concerning the impact of number of words on the relation between strategy change and recall could be made.

A final prediction concerned the conditions under which strategy changes occur. Strategy changes were predicted to occur less frequently when recall was perfect on the immediately preceding trial than when recall was not perfect on the immediately preceding trial. This prediction was based on the findings of a serial-recall study by McGilly and Siegler (1989). In that study, children who had been given a series of serial-recall trials tended to switch strategies when their performance was less than perfect on the preceding trial, but not when their performance was perfect on the preceding trial. That is, children tended to stick with a particular approach when it had yielded optimal performance but switched approaches when the previous one had yielded less than optimal performance. This pattern is known as the win-stay/lose-shift approach in the decision-making literature (Eimas, 1969). Such a pattern may vary with the number of words presented on each trial. Trials with few words should provide greater opportunity for perfect recall, which should result in few strategy changes.
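The win-stay/lose-shift prediction can be made concrete with a small tally. The sketch below is illustrative only and is not the analysis reported in this dissertation. It assumes that a strategy change means any difference in the set of strategies between consecutive trials and that perfect recall means recalling every word presented. The function name and the seven-trial record are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional


def change_rate_by_prior_recall(
    strategies: List[FrozenSet[str]],
    recalled: List[int],
    presented: List[int],
) -> Dict[str, Optional[float]]:
    """Proportion of trials showing a strategy change, split by whether
    recall on the immediately preceding trial was perfect."""
    tallies = {"after_perfect": [0, 0], "after_imperfect": [0, 0]}  # [changes, trials]
    for t in range(1, len(strategies)):
        perfect = recalled[t - 1] == presented[t - 1]
        key = "after_perfect" if perfect else "after_imperfect"
        tallies[key][1] += 1
        if strategies[t] != strategies[t - 1]:
            tallies[key][0] += 1
    # A rate is undefined (None) when no trial falls in that cell.
    return {k: (c / n if n else None) for k, (c, n) in tallies.items()}


# Hypothetical child in the ascending/descending condition.
presented = [6, 9, 12, 15, 12, 9, 6]
recalled = [6, 7, 8, 9, 9, 9, 6]
strategies = [
    frozenset({"rehearsal"}),
    frozenset({"rehearsal"}),
    frozenset({"rehearsal", "sorting"}),
    frozenset({"rehearsal", "sorting"}),
    frozenset({"sorting"}),
    frozenset({"sorting"}),
    frozenset({"sorting"}),
]
rates = change_rate_by_prior_recall(strategies, recalled, presented)
print(rates)
```

A child who follows a win-stay/lose-shift pattern would show a lower change rate after perfect recall than after imperfect recall, as this hypothetical record does.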















METHOD

Participants

Participants were 69 second graders, 36 boys and 33 girls (mean age = 7 years 8 months, SD = 6.42 months), and 51 fourth graders, 21 boys and 30 girls (mean age = 9 years 7 months, SD = 5.00 months). Children were recruited from schools and recreation centers in Gainesville, Florida. The majority of children were White (80%) and came from middle- and upper-middle-income households.

Stimuli and Design

Seven lists of categorically related words were constructed (three categories per list, five words per category; see Table 1). The lists were composed of words reported in three analyses of category norms (Bjorklund, Thompson, & Ornstein, 1983; Posnansky, 1978; Uyeda & Mandler, 1980). Each word was printed on a 3 x 5 in. (7.6 x 12.7 cm) index card. Different words and categories were used on each list. Items in each list varied in category typicality, with most items being in the top-third frequency ranking for a particular category. Highly associated words within a particular category (e.g., dog, cat; salt, pepper) were avoided, thus minimizing the likelihood that clustering would result from the automatic activation of semantic memory relations (Frankel & Rollins, 1985; Schneider, 1986). Previous research has shown that children in the age range tested here had little difficulty defining items like the ones used in the current study (Coyle & Bjorklund, 1997).

Table 1

Word Lists By Category Membership

List 1              List 2              List 3              List 4

Occupations         Trees               Metals              Buildings
Carpenter           Willow              Copper              Tepee
Lawyer              Maple               Brass               Castle
Nurse               Palm                Tin                 Igloo
Dentist             Oak                 Iron                Church
Farmer              Pine                Silver              Barn

Parts of a House    Beverages           Weapons             Reading Material
Window              Tea                 Sword               Book
Roof                Milk                Grenade             Journal
Door                Soda                Cannon              Newspaper
Stairs              Water               Spear               Magazine
Ceiling             Coffee              Knife               Letter

Sports              Jewelry             Vegetables          Birds
Soccer              Earrings            Cabbage             Sparrow
Golf                Necklace            Onion               Eagle
Tennis              Crown               Celery              Parrot
Football            Watch               Peas                Dove
Hockey              Bracelet            Corn                Owl

List 5              List 6              List 7

Flowers             Animals             Weather Phenomenon
Daisy               Horse               Wind
Orchid              Zebra               Snow
Tulip               Pig                 Rain
Lily                Tiger               Fog
Rose                Cat                 Hail

Furniture           Vehicles            Cloth
Couch               Bus                 Cotton
Lamp                Plane               Satin
Chair               Boat                Silk
Bed                 Car                 Wool
Dresser             Motorcycle          Velvet

Musical Instruments Time                Body Parts
Drums               Year                Foot
Tuba                Decade              Elbow
Violin              Month               Neck
Flute               Hour                Mouth
Piano               Century             Hand

Each child received seven sort-recall trials. A different list of words was presented on each trial. Children were assigned to one of two conditions. In both conditions, three categories were represented in the word lists on all trials. However, the number of words in each category varied systematically across trials. In the ascending/descending condition, the number of items in each category was 2, 3, 4, 5, 4, 3, and 2 on trials 1 through 7, respectively. Thus, the total number of items presented on trials 1 through 7 was 6, 9, 12, 15, 12, 9, and 6. In the descending/ascending condition, the number of items in each category was 5, 4, 3, 2, 3, 4, and 5 on trials 1 through 7, respectively. Thus, the total number of items presented on trials 1 through 7 was 15, 12, 9, 6, 9, 12, and 15. The sum of all items in the descending/ascending condition was greater than the sum of all items in the ascending/descending condition. In each condition, the seven lists were presented in 1 of 10 predetermined random orders. Each list was presented on each of the seven trials approximately equally, and all items within a list were used approximately equally. This resulted in a 2 (grade: second vs. fourth) x 2 (condition: ascending/descending vs. descending/ascending) x 7 (trial) design, with repeated measures on the trial factor.
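The arithmetic of the two presentation schedules can be checked with a brief sketch. The per-category item counts and the three categories per trial are taken from the description above; the variable and function names are hypothetical and introduced only for illustration.

```python
CATEGORIES_PER_TRIAL = 3  # held constant on every trial in both conditions

# Number of items per category on Trials 1 through 7, as described in the text.
ITEMS_PER_CATEGORY = {
    "ascending/descending": [2, 3, 4, 5, 4, 3, 2],
    "descending/ascending": [5, 4, 3, 2, 3, 4, 5],
}


def words_per_trial(condition: str) -> list:
    """Total number of words presented on each of the seven trials."""
    return [CATEGORIES_PER_TRIAL * n for n in ITEMS_PER_CATEGORY[condition]]


def step_sizes(totals: list) -> list:
    """Absolute change in the number of words between consecutive trials."""
    return [abs(b - a) for a, b in zip(totals, totals[1:])]


totals = {condition: words_per_trial(condition) for condition in ITEMS_PER_CATEGORY}
for condition, per_trial in totals.items():
    print(condition, per_trial, "sum =", sum(per_trial), "steps =", step_sizes(per_trial))
```

The totals sum to 69 words in the ascending/descending condition and 78 words in the descending/ascending condition, and every consecutive pair of trials differs by exactly three words.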

Procedure

Children were tested by the author of this dissertation and two undergraduate research assistants. Each child was seen individually in a session lasting approximately 30 min. Prior to the presentation of the first list, children were told that they would be presented seven lists of words (each printed on a 3 x 5 in. [7.6 x 12.7 cm] index card) to remember and later recall in any order they wished. They were told that the lists and items would be presented one at a time and that some lists would have a different number of words. They were not told how many words would be presented on each list, nor were they told about the categorical structure of the lists.

The experimenter presented each card (on which a word was printed) to the child at a rate of about one card every 2 s. The experimenter named the item and children repeated the name. Cards were placed in front of children in rows, with the stipulation that no two items from the same category were presented contiguously. Each row contained six cards, unless the number of cards presented was not a multiple of six (i.e., 9 or 15 cards). In this case, the row closest to the child contained three cards. After the cards were presented, children were instructed to "study the words and do whatever you want to remember them later." After 1 min 30 s, the cards were covered with an opaque cloth and then children solved problems on the Matching Familiar Figures Test (Kagan, 1965) for approximately 30 s. Children were then asked to recall as many items as they could in any order they wished. If the child was silent for 10 s, the experimenter asked if there were any more words that he or she could remember. When either another 10 s interval elapsed with no more words recalled or the child stated that he or she could remember no more words, the trial was ended. Trials 2-7 followed immediately after Trial 1, using the same procedure with different sets of items. The experimenter recorded children's sorting patterns on each trial and the entire session was audiotaped.

Coding

During the 1 min 30 s study period on each trial, the experimenter observed the incidence of sorting, rehearsal, category naming, examination, and off-task behavior for each of three separate 30-s intervals. Each type of study behavior was coded as occurring or not during each of the three intervals. Sorting was recorded when children physically moved or arranged cards. Rehearsal was recorded when children verbalized out loud or mouthed the list items (no distinction was made between single-word and cumulative rehearsal). Category naming was recorded when children said the category name of a group of items (e.g., FRUIT for apple, banana, peach). Examination was recorded when children visually scanned the cards. Off-task behavior was recorded when children looked away from the cards and were visually inattentive to the task for a total of 5 consecutive seconds. Clustering during recall was recorded when children recalled words by adult-defined categories.

Following Coyle and Bjorklund (1997), three of the five study behaviors were classified and analyzed as strategies. These were sorting, rehearsal, and category naming. Clustering during recall was classified as a fourth strategy. Examination was not considered a strategy because by itself examination reflects only attention to the target information. Although children may be covertly using a strategy (e.g., rehearsal) while examining the items, this cannot be discerned from their overt behavior. For these reasons, examination was not included as a strategy for purposes of analyses. Unless specified otherwise, strategy data were coded dichotomously, with each strategy being coded as occurring or not occurring on each trial.

The strategies assessed during the study period (i.e., sorting, rehearsal, and category naming) could be observed between zero and three times during the 1 min 30 s study period. A child was credited with using a strategy on a trial if he or she was observed to use that strategy during at least one of the three 30-s intervals. The strategy assessed during recall, clustering, was measured by the adjusted ratio of clustering (ARC) score (Roenker, Thompson, & Brown, 1971). Following Coyle and Bjorklund (1997), a child was credited with using a clustering strategy if his or her ARC score was .50 or greater. This represents a value of slightly more than one standard deviation greater than clustering expected by chance. Children could be classified as using any one of the four strategies or any combination of the four strategies on a particular trial.
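
The ARC score is conventionally computed as (R - E(R)) / (max R - E(R)), where R is the observed number of category repetitions in recall, E(R) is the number expected by chance, and max R is the number obtained under perfect clustering. The following is a minimal sketch of that computation and of the .50 criterion described above. It assumes recall is represented as an ordered list of category labels; the function names and the handling of undefined scores (returned as None) are assumptions made for illustration, not details taken from the study.

```python
def arc_score(recalled_categories):
    """Adjusted ratio of clustering for one recall protocol.

    recalled_categories: category label of each recalled word, in output order.
    Returns None when the score is undefined (denominator of zero), for
    example when all recalled words come from a single category.
    """
    n_total = len(recalled_categories)
    if n_total == 0:
        return None
    # R: adjacent pairs of recalled words drawn from the same category.
    repetitions = sum(
        1 for a, b in zip(recalled_categories, recalled_categories[1:]) if a == b
    )
    counts = {}
    for label in recalled_categories:
        counts[label] = counts.get(label, 0) + 1
    # E(R): category repetitions expected by chance.
    expected = sum(n * n for n in counts.values()) / n_total - 1
    # max R: category repetitions under perfect clustering.
    maximum = n_total - len(counts)
    if maximum == expected:
        return None
    return (repetitions - expected) / (maximum - expected)


def used_clustering(recalled_categories, criterion=0.50):
    """Dichotomous coding: clustering is credited when ARC meets the criterion."""
    score = arc_score(recalled_categories)
    return score is not None and score >= criterion
```

For example, a recall order of two fruits followed by two animals yields an ARC of 1, whereas strictly alternating between the two categories yields an ARC of -1.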

Reliability has been assessed in previous research that examined the same study behaviors and strategies (Coyle & Bjorklund, 1997). This research demonstrated that percentage of agreement for two independent coders coding the study behaviors (i.e., sorting, rehearsal, category naming, examination, and off-task behavior) was very high (92%). Percentage agreement for coding the strategies of sorting, rehearsal, and category naming was even higher (97%). These data, along with data from other studies reporting reliability for similar strategies (Lange, MacKinnon, & Nida, 1989; Wellman, Ritter, & Flavell, 1975), demonstrate high intercoder agreement for the types of strategies coded in the current study.














RESULTS

All analyses are reported at p < .05, with post-hoc tests evaluated with t-tests unless otherwise specified.

Preliminary Analyses

Some of the results were pertinent to general issues in cognition and memory development but not to the focus of the current study. These results are presented here. The next section reports results concerning issues of strategy variability and the relation between variability and recall.

Off-Task Behavior and Examination

Off-task behavior and examination were observed during each of the three 30-s intervals of the study period (range: 0 to 3 per trial). Each type of data was analyzed separately using 2 (grade) x 2 (condition) x 7 (trial) analyses of variance (ANOVAs), with repeated measures on the trial factor. The analysis of off-task behavior revealed no significant main effects or interactions. As in previous research (Coyle & Bjorklund, 1997), off-task behavior was slightly greater for younger than for older children (mean frequency of off-task behavior per trial: .29 and .16 for second and fourth grade, respectively). The analysis of examination revealed significant main effects of condition, F(1, 116) = 5.51 (mean number of intervals of examination per trial: 1.87 and 2.27 for ascending/descending and descending/ascending conditions, respectively), and trial, F(6, 696) = 5.68 (mean number of intervals of examination per trial: 2.28, 2.19, 2.10, 1.98, 2.03, 2.00, 1.85 for Trials 1-7, respectively). These main effects were qualified by a significant Condition x Trial interaction, F(6, 696) = 2.91. Inspection of the significant interaction revealed that ascending/descending versus descending/ascending comparisons were significant at Trial 2 (1.89 versus 2.54), Trial 3 (1.81 versus 2.43), and Trial 4 (1.69 versus 2.30), but not significant at Trial 1 (2.13 versus 2.46), Trial 5 (1.91 versus 2.16), Trial 6 (1.88 versus 2.14), and Trial 7 (1.81 versus 1.89). These data demonstrate that attention to the task materials was somewhat greater on the initial descending/ascending trials than on the corresponding ascending/descending trials.

Recall

Before presenting preliminary analysis of the recall data, data concerning repetitions and intrusions in recall are examined. Repetitions refer to recall of the same word more than once. Intrusions refer to utterances of words not on the target list. The frequency of occurrence of each type of data was analyzed separately using 2 (grade) x 2 (condition) x 7 (trial) ANOVAs, with repeated measures on the trial factor. Analysis of the repetition data revealed no significant main effects or interactions. Repetitions were slightly greater for fourth graders (M = .49) than for second graders (M = .40). Analysis of the intrusion data revealed a significant main effect of grade, F(1, 116) = 5.64, with intrusions being greater for second graders (M = .22) than for fourth graders (M = .05). All other main effects and interactions for the intrusion data were not significant. The significant grade difference in intrusions is consistent with findings demonstrating that younger children have problems inhibiting task-inappropriate responses (Dempster, 1992). The repetition and intrusion data are excluded from all subsequent analyses.

Because possible recall varied trial-by-trial, the number of words recalled on each trial was converted to the proportion of words recalled relative to possible recall. Mean proportion recall on each trial, by grade and condition, is presented in Table 2, which also shows mean proportion recall by condition and trial (i.e., collapsed across grade) and fourth grade minus second grade recall differences by condition and trial. Proportion recall was examined by a 2 (grade) x 2 (condition) x 7 (trial) ANOVA, with repeated measures on the trial factor. The analysis revealed significant main effects of grade, F(1, 116) = 10.08 (mean proportion recall: .54 and .75 for second and fourth grade, respectively), condition, F(1, 116) = 6.83 (mean proportion recall: .65 and .60 for ascending/descending and descending/ascending conditions, respectively), and trial, F(6, 696) = 3.69 (mean proportion recall: .65, .61, .61, .67, .60, .60, and .65 for Trials 1-7, respectively). Also significant were interactions of Grade x Trial, F(6, 696) = 7.01, and Condition x Trial, F(6, 696) = 57.68, both of which were qualified by a significant interaction of Grade x Condition x Trial, F(6, 696) = 2.85.

Table 2

Mean Proportion Recall By Condition, Grade, and Trial, and By Condition and Trial (i.e., Collapsed Across Grade), and Grade Differences in Recall at Each Trial By Condition

                                        Trial
                             1     2     3     4     5     6     7

Ascending/Descending
  Maximum Recall             6     9    12    15    12     9     6
  Grade 2
    M                      .81   .59   .49   .45   .42   .50   .71
    SD                     .17   .22   .16   .18   .19   .28   .22
  Grade 4
    M                      .84   .73   .69   .70   .76   .89   .94
    SD                     .18   .16   .23   .23   .21   .17   .14
  Collapsed Across Grade
    M                      .82   .64   .57   .55   .55   .65   .80
    SD                     .17   .21   .21   .24   .26   .31   .22
  Grade 4 - Grade 2
    Difference             .03   .14   .20   .25   .34   .39   .24

Descending/Ascending
  Maximum Recall            15    12     9     6     9    12    15
  Grade 2
    M                      .38   .48   .53   .76   .56   .41   .36
    SD                     .18   .24   .24   .22   .30   .24   .18
  Grade 4
    M                      .55   .66   .80   .90   .80   .69   .61
    SD                     .19   .21   .21   .22   .22   .24   .24
  Collapsed Across Grade
    M                      .46   .56   .66   .82   .67   .54   .48
    SD                     .20   .25   .26   .23   .29   .28   .24
  Grade 4 - Grade 2
    Difference             .17   .18   .27   .14   .24   .28   .25

Note. Maximum recall indicates the maximum number of words that could be recalled on a particular trial.

The significant three-way interaction was evaluated by comparing grade differences in recall at each trial, separately for each condition. In the ascending/descending condition, significant grade differences in recall were observed on all trials except Trial 1, when only six words were presented. In the descending/ascending condition, grade differences in recall were found on all trials. As shown in Table 2, the magnitude of grade differences in recall was least pronounced on trials with the fewest words presented (Trials 1 and 7 in ascending/descending and Trial 4 in descending/ascending), compared to the data on adjacent trials.

Strategy Use

The percentage and mean number of trials on which children in each grade used each strategy are presented by condition in Table 3. The percentages within each grade do not sum to 100 because multiple strategies were frequently used in combination on a single trial. The number of trials on which each strategy was used (range = 0 to 7) was examined by a 2 (grade) x 2 (condition) x 4 (strategy) ANOVA. The analysis revealed a significant main effect of strategy, F(1, 116) = 73.09, and significant interactions of Grade x Strategy, F(3, 348) = 4.63, and Condition x Strategy, F(3, 348) = 5.72. Inspection of the significant main effect of strategy revealed that sorting, rehearsal, and clustering were used more often than category naming, with all other strategy comparisons being nonsignificant (mean number of trials on which each strategy was used: 3.18, 3.44, 3.23, and .12 for sorting, rehearsal, clustering, and category naming, respectively). The floor levels of category naming are inconsistent with previous research showing that category naming was used relatively frequently by fourth graders who received a sort-recall task similar to the one used here. Although category naming was almost never used in the current study, the near absence of this strategy did not prevent the detection of significant effects concerning measures of strategy variability, as shown in later analyses.

Table 3

Percentage (and Number) of Trials on Which Each Strategy Was Used, By Condition and Grade

                                      Strategy
                                                            Category
                      Sorting     Rehearsal    Clustering   Naming

Ascending/Descending
  Grade 2             36 (2.54)   60 (4.21)    37 (2.56)     2 ( .15)
  Grade 4             58 (4.04)   58 (4.04)    49 (3.40)    <1 ( .04)
Descending/Ascending
  Grade 2             35 (2.47)   38 (2.67)    50 (3.53)     3 ( .20)
  Grade 4             59 (4.15)   37 (2.62)    53 (3.69)    <1 ( .04)

Data relevant for the significant interactions concerning strategy use are presented in Table 3. Inspection of the Grade x Strategy interaction revealed that sorting was used more by fourth graders (M = 4.13) than by second graders (M = 2.51), with grade comparisons for the other strategies being nonsignificant. Evaluation of the Condition x Strategy interaction revealed that rehearsal was used more in the ascending/descending condition (M = 4.13) than in the descending/ascending condition (M = 2.65), with the other strategies being used approximately equally in both conditions.

These strategy data provide information concerning the frequency of occurrence of each individual strategy. Subsequent analyses examine the possibility of several strategies being used in combination on a single trial, and changes in the mixture of strategies used across trials.

Variability in Strategy Use

Two general types of variability were examined: multiple-strategy use and strategy change. Multiple-strategy use refers to the number of strategies used within a given trial. Strategy change refers to the number of different strategies used across trials and trial-by-trial changes in strategy use.

Multiple-Strategy Use

An initial analysis examined the prediction that multiple-strategy use would increase with age and that the number of strategies used would be greatest for trials on which relatively many words were presented (i.e., Trials 3-5 in the ascending/descending series and Trials 1, 2, 6, and 7 in the descending/ascending series). The number of strategies used on each trial was analyzed by a 2 (grade) x 2 (condition) x 7 (trial) ANOVA, with repeated measures on the trial factor. The analysis revealed a marginally significant effect of grade, F(1, 116) = 3.31, p = .07, with fourth graders using more strategies (M = 1.57) than second graders (M = 1.32). Also significant were the main effect of trial, F(6, 696) = 12.33, and the Grade x Trial interaction, F(6, 696) = 3.23. No other significant effects were found. Inspection of the significant main effect of trial revealed that the number of strategies used on Trial 1 (M = 1.14) and Trial 2 (M = 1.18) was significantly less than that used on Trials 3-7 (mean number of strategies used: 1.46, 1.49, 1.58, 1.57, and 1.58 for Trials 3-7, respectively). No other significant comparisons across trials were found.

Data pertaining to the significant Grade x Trial interaction are presented in Table 4, which also shows the number of strategies used for each Condition x Grade x Trial cell. Examination of grade differences in number of strategies used on each trial revealed that fourth graders used more strategies than second graders on Trials 3, 4, and 6, with strategy use being comparable for both grades on all other trials. These data, along with the data presented immediately above, are consistent with the predicted grade differences. In all cases where grade differences were found, fourth graders used more strategies than second graders. The absence of a significant Condition x Trial interaction indicates that strategy use did not vary across trials with different numbers of words presented.








Table 4

Mean Number of Strategies Used, By Grade and Trial, and By Condition, Grade, and Trial

                                  Trial
                       1     2     3     4     5     6     7

Grade 2
  M                 1.22  1.12  1.30  1.32  1.44  1.38  1.48
  SD                 .78   .83   .77   .85   .87  1.01  1.01
Grade 4
  M                 1.04  1.28  1.67  1.73  1.77  1.82  1.71
  SD                 .96   .96  1.07  1.08   .99  1.14  1.17

Ascending/Descending
  Grade 2
    M               1.36  1.15  1.39  1.33  1.41  1.36  1.54
    SD               .81   .81   .78   .84   .79  1.04   .94
  Grade 4
    M               1.00  1.28  1.68  1.96  1.88  2.00  1.76
    SD               .96   .94  1.03  1.10  1.17  1.12  1.20

Descending/Ascending
  Grade 2
    M               1.03  1.07  1.20  1.30  1.47  1.40  1.40
    SD               .72   .87   .76   .88   .97   .97  1.10
  Grade 4
    M               1.08  1.27  1.65  1.50  1.65  1.65  1.65
    SD               .98  1.00  1.13  1.03   .80  1.16  1.16

Strategy Change

Number of strategy changes across trials. Although the analysis above demonstrates that fourth graders used more strategies than second graders, it did not examine possible changes in strategy use across trials (i.e., additions and deletions in strategy use on consecutive trials). For example, a child using two strategies across all trials could be using sorting and rehearsal on all seven trials, sorting and rehearsal on Trials 1-4 and sorting and clustering on Trials 5-7, or sorting and rehearsal on all even trials and sorting and clustering on all odd trials. In each case the child uses two strategies on all trials but shows a different number of strategy changes. A child using sorting and rehearsal on all trials shows no strategy changes; a child using sorting and rehearsal on Trials 1-4 and sorting and clustering on Trials 5-7 shows two strategy changes (i.e., dropping rehearsal and adding clustering from Trial 4 to Trial 5); and a child using sorting and rehearsal on all even trials and sorting and clustering on all odd trials shows 12 changes (i.e., dropping a strategy and adding a strategy on each of the six trial transitions: Trials 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 to 6, and 6 to 7).
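
The counting rule in these examples can be sketched as the number of strategies added plus the number dropped between consecutive trials, that is, the size of the symmetric difference between the two sets of strategies. The representation of each trial as a set of strategy names, and the function name, are assumptions made for illustration.

```python
def count_changes(strategies_by_trial):
    """Additions plus deletions in strategy use on each trial transition."""
    return [
        len(set(previous) ^ set(current))
        for previous, current in zip(strategies_by_trial, strategies_by_trial[1:])
    ]


# The three hypothetical children described in the text.
constant = [{"sorting", "rehearsal"}] * 7
switch = [{"sorting", "rehearsal"}] * 4 + [{"sorting", "clustering"}] * 3
alternating = [
    {"sorting", "rehearsal"} if trial % 2 == 0 else {"sorting", "clustering"}
    for trial in range(1, 8)
]

for child in (constant, switch, alternating):
    print(count_changes(child), "total =", sum(count_changes(child)))
```

The three children yield totals of 0, 2, and 12 changes, matching the worked examples in the text.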

An analysis of strategy change evaluated the prediction that strategy changes would decrease with age and that strategy changes would occur most frequently on transitions to trials with more words (i.e., Trials 2 to 3 and 3 to 4 in the ascending/descending series and Trials 5 to 6 and 6 to 7 in the descending/ascending series). The number of strategy changes on each of the six trial transitions was analyzed by a 2 (grade) x 2 (condition) x 6 (trial transition) ANOVA, with repeated measures on the trial transition factor. The analysis revealed a significant main effect of grade, F(1, 116) = 11.41 (mean number of strategy changes: .78 and .54 for second and fourth grade, respectively), and a significant Grade x Trial Transition interaction, F(5, 580) = 2.67. No other significant effects were found.

Data relevant to the significant Grade x Trial Transition interaction are presented in Table 5, which also shows the number of strategy changes for each Condition x Grade x Trial cell. Inspection of grade differences in strategy changes on each trial transition revealed that fourth graders had significantly fewer changes than second graders on all trial transitions except transitions 2 to 3 and 3 to 4. These data demonstrate that the grade difference mentioned above is primarily a result of fourth graders having fewer strategy changes than second graders on later rather than earlier trials. These findings are consistent with the hypothesis that strategy changes decrease with age. The absence of a significant Condition x Trial interaction indicates that strategy changes did not vary across trials with different numbers of words presented.

Table 5

Mean Number of Trial-by-Trial Strategy Changes, By Grade and Trial Transition, and By Condition, Grade, and Trial Transition

                                  Trial Transition
                    1 to 2  2 to 3  3 to 4  4 to 5  5 to 6  6 to 7

Grade 2
  M                    .77     .68     .73     .93     .87     .70
  SD                   .75     .58     .75     .85     .89     .69
Grade 4
  M                    .49     .80     .57     .47     .45     .43
  SD                   .64     .83     .67     .83     .50     .61

Ascending/Descending
  Grade 2
    M                  .87     .64     .67     .95     .82     .69
    SD                 .83     .63     .74     .79     .82     .69
  Grade 4
    M                  .40     .76     .64     .60     .40     .44
    SD                 .50     .78     .76    1.08     .50     .65

Descending/Ascending
  Grade 2
    M                  .63     .73     .80     .90     .93     .70
    SD                 .62     .52     .76     .92     .98     .70
  Grade 4
    M                  .58     .85     .50     .35     .50     .42
    SD                 .76     .88     .58     .49     .51     .58

Other types of variability. Although number of strategy changes across trials is one measure of strategy change, other measures of strategy change are possible. Two additional measures of strategy change are examined here: number of unique strategy combinations used across all trials (range: 0 to 7), and number of consecutive trials with strategy changes (range: 0 to 6). These measures, along with the average number of strategy changes across trials (an average of the measure analyzed above), were converted to z-scores and entered into a 2 (grade) x 2 (condition) x 3 (strategy change type) ANOVA, with repeated measures on the strategy change type factor. This analysis permitted examination of possible grade and condition differences across the three measures of strategy change.
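
The three change measures and their standardization can be sketched as follows. This is an illustration under stated assumptions rather than the scoring procedure actually used: each trial is represented as a set of strategy names, trials with no strategy are not counted as a combination (so that the measure can range from 0 to 7), and z-scores use the sample standard deviation. All function names are hypothetical.

```python
from statistics import mean, stdev


def change_measures(strategies_by_trial):
    """Three per-child measures of strategy change across the seven trials."""
    combos = [frozenset(s) for s in strategies_by_trial]
    per_transition = [len(a ^ b) for a, b in zip(combos, combos[1:])]
    return {
        # Assumption: a trial with no strategy is not counted as a combination.
        "unique_combinations": len({c for c in combos if c}),
        "trials_with_changes": sum(1 for n in per_transition if n > 0),
        "total_changes": sum(per_transition),
    }


def z_scores(values):
    """Standardize one measure across children (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical child: sorting and rehearsal on Trials 1-4,
# then sorting and clustering on Trials 5-7.
child = [{"sorting", "rehearsal"}] * 4 + [{"sorting", "clustering"}] * 3
print(change_measures(child))
```

Standardizing each measure places the three on a common scale, which is what allows them to be treated as levels of a single repeated-measures factor.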

The analysis revealed a significant main effect of grade, F(1, 116) = 10.65 (mean z-scores averaged across the three measures of strategy change: .22 and -.30 for second and fourth grade, respectively), which was qualified by a significant Grade x Strategy Change Type interaction, F(2, 232) = 3.90. No other significant main effects or interactions were found. Data pertaining to the significant Grade x Strategy Change Type interaction are presented in Table 6. Fourth graders had significantly lower levels of strategy change than second graders for two of the three measures (trials with changes and total changes). The grade difference for unique combinations was in the predicted direction but only approached significance, p < .10. Paired comparisons among the change measures within each grade revealed that fourth graders had significantly fewer total strategy changes than unique combinations. No other significant comparisons among the change measures within each grade were found. These data, along with the data in the preceding section, demonstrate that older children show fewer strategy changes than younger children across a variety of measures of strategy change, with the exception of unique combinations.

Table 6

Mean Z-scores and Raw Scores for Unique Combinations, Trials with Changes, and Total Changes, By Grade (Standard Deviations in Parentheses)

                            Strategy Change Type
                  Unique           Trials           Total
                  Combinations     with Changes     Changes

Grade 2
  Z-Score          .13 (1.00)       .25 (1.00)       .29 (1.04)
  Raw Score       2.64 (1.11)      3.51 (1.56)      4.67 (2.45)
Grade 4
  Z-Score         -.18 ( .99)      -.34 ( .91)      -.39 ( .80)
  Raw Score       2.29 (1.10)      2.59 (1.43)      3.06 (1.87)

Variability within individual children. Although these findings demonstrate that strategy changes decline with age, they are based on analyses of group data, which often mask patterns of individual strategy use. Thus, children were classified as stable or unstable based on their pattern of strategy change across trials. Children were classified as stable if they used the same combination of strategies on at least four pairs of consecutive trials (of a possible six pairs of consecutive trials). Children were classified as unstable if they used the same combination of strategies on fewer than four pairs of consecutive trials. These classifications were based on changes in the mixture (rather than the number) of strategies used over trials.
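
The all-trials classification rule can be sketched as a count of consecutive trial pairs on which the same mixture of strategies was used. The set-based representation and the function names are assumptions for illustration.

```python
def stable_pairs(strategies_by_trial):
    """Number of consecutive trial pairs with an identical strategy combination."""
    combos = [frozenset(s) for s in strategies_by_trial]
    return sum(1 for a, b in zip(combos, combos[1:]) if a == b)


def classify_all_trials(strategies_by_trial, criterion=4):
    """Stable if the same combination recurs on at least four of six pairs."""
    if stable_pairs(strategies_by_trial) >= criterion:
        return "stable"
    return "unstable"


# Hypothetical children.
switch_once = [{"sorting", "rehearsal"}] * 4 + [{"sorting", "clustering"}] * 3
alternating = [
    {"sorting", "rehearsal"} if trial % 2 == 0 else {"sorting", "clustering"}
    for trial in range(1, 8)
]
print(classify_all_trials(switch_once))   # five matching pairs
print(classify_all_trials(alternating))   # no matching pairs
```

Because the comparison is between sets of strategy names, two trials with the same number of strategies but a different mixture count as a change, consistent with the rule stated above.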

The percentage of children in each grade classified as stable or unstable is shown in the first and second columns of Table 7. Fourth graders were significantly more likely to be classified as stable than second graders, who showed considerable variability in strategy use, χ²(1, N = 120) = 7.52. These data are consistent with the findings reported in the previous section. However, the findings in the previous section showed that although both groups tended to show variability on early trials, only the second graders showed variability on later trials. Thus, a second analysis examined the possibility that the observed grade differences in stability classification were primarily attributable to differences in variability on later rather than earlier trials. Children were classified as stable or unstable on early trials (Trials 1 to 4) and separately on later trials (Trials 4 to 7). (Trial 4 is both the last trial in the set of early trials and the first trial in the set of later trials.) For each block of trials, children were classified as stable if they used the same combination of strategies on two or three pairs of consecutive trials (of a possible total of three pairs of consecutive trials). Children were classified as unstable if they used the same combination of strategies on only one of three pairs of consecutive trials.

Table 7

Percentage (and Number) of Children Classified as Stable or Unstable Across All Trials, on Early Trials, and on Later Trials, By Grade

             All Trials            Early Trials          Later Trials
Grade     Stable    Unstable    Stable    Unstable    Stable    Unstable

2         23 (16)   77 (53)     39 (27)   61 (42)     38 (26)   62 (43)
4         47 (24)   53 (27)     43 (22)   57 (29)     63 (32)   37 (19)

The percentage of children in each grade classified as stable and unstable on early trials and separately on later trials is presented in columns three through six in Table 7. For the early trials, no grade difference in the distribution of children classified as stable or unstable was found, χ²(1, N = 120) < 1, with most children showing unstable strategy use. For the later trials, fourth graders were significantly more likely to be classified as stable than second graders, who frequently showed unstable strategy use, χ²(1, N = 120) = 7.38. These findings demonstrate that both groups of children showed considerable variability in strategy use on early trials. In contrast, only second graders showed unstable strategy use on later trials; most fourth graders showed stable strategy use.
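
The chi-square values for the stability classifications can be checked against the counts in Table 7. The sketch below computes the uncorrected Pearson chi-square; the use of the uncorrected statistic is an inference from the fact that it reproduces the reported values to two decimals, and the function name is illustrative.

```python
def chi_square(table):
    """Pearson chi-square (no continuity correction) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: Grade 2, Grade 4. Columns: stable, unstable (counts from Table 7).
all_trials = [[16, 53], [24, 27]]
early_trials = [[27, 42], [22, 29]]
later_trials = [[26, 43], [32, 19]]

for label, table in [("all", all_trials), ("early", early_trials), ("later", later_trials)]:
    print(label, round(chi_square(table), 2))
```

With these counts the statistic is about 7.52 for all trials, below 1 for the early trials, and about 7.38 for the later trials.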

These findings were extended in an analysis that examined changes in stability classification from early to later trials for individual children. Children were classified as showing one of four possible patterns of stability classification from early trials (i.e., Trials 1 to 4) to later trials (Trials 4 to 7): unstable on early trials, unstable on later trials (unstable/unstable); unstable on early trials, stable on later trials (unstable/stable); stable on early trials, stable on later trials (stable/stable); stable on early trials, unstable on later trials (stable/unstable).
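
The four patterns can be derived by applying the block rule (the same combination on at least two of three consecutive pairs) to Trials 1 to 4 and to Trials 4 to 7, with Trial 4 shared by both blocks. A minimal sketch, with hypothetical function names and a set-based representation of each trial:

```python
def block_status(block):
    """Stable if the same combination recurs on at least two of three pairs."""
    same = sum(
        1 for a, b in zip(block, block[1:]) if frozenset(a) == frozenset(b)
    )
    return "stable" if same >= 2 else "unstable"


def stability_pattern(strategies_by_trial):
    """Early/later pattern for one child across the seven trials."""
    early = strategies_by_trial[0:4]  # Trials 1 to 4
    later = strategies_by_trial[3:7]  # Trials 4 to 7 (Trial 4 is in both blocks)
    return block_status(early) + "/" + block_status(later)


# Hypothetical child who varies strategies early and then settles on one mixture.
settles = [
    {"sorting"},
    {"rehearsal"},
    {"sorting", "rehearsal"},
    {"sorting", "clustering"},
    {"sorting", "clustering"},
    {"sorting", "clustering"},
    {"sorting", "clustering"},
]
print(stability_pattern(settles))
```

This hypothetical child is classified as unstable/stable, the pattern of a child who moves from variable to consistent strategy use.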

The percentage of children in each of the four pattern classifications is shown by grade in Table 8. The data are presented in terms of children whose stability classification did or did not change from early to later trials. The distribution of fourth and second graders in each of the four pattern classifications was significantly different, χ²(3, N = 120) = 9.33. Analysis of data for children who did not change pattern classifications revealed that second graders were significantly more likely to show the unstable/unstable pattern than fourth graders, who frequently showed the stable/stable pattern, χ²(1, N = 61) = 4.25. Analysis of data for children who did change pattern classifications revealed that the distribution of second and fourth graders in the unstable/stable and stable/unstable groups did not differ significantly, χ²(1, N = 59) = 2.04. The analysis of children who did not change classifications demonstrates that fourth graders were more likely to maintain an initial pattern of stable-strategy use than second graders, who frequently maintained an initial pattern of unstable-strategy use.

Table 8

Percentage (and Number) of Children Changing or Not Changing Their Stability Classification Across Trial Blocks, By Grade

               No Change                   Change
          Unstable/    Stable/      Unstable/    Stable/
Grade     Unstable     Stable       Stable       Unstable

2         41 (28)      17 (12)      20 (14)      22 (15)
4         18 ( 9)      24 (12)      39 (20)      20 (10)

Analyses were also performed on the distribution of second and fourth graders whose initial classification (on Trials 1-4) was unstable (unstable/unstable and unstable/stable), and separately on the distribution of second and fourth graders whose initial classification was stable (stable/stable and stable/unstable). In the analysis of children whose initial classification was unstable, fourth graders were significantly more likely to show the unstable/stable pattern than second graders, who frequently showed the unstable/unstable pattern, χ²(1, N = 71) = 8.73. The distribution of second and fourth graders whose initial classification was stable (stable/stable and stable/unstable) did not differ significantly, χ²(1, N = 49) < 1. The analysis of children whose initial classification was unstable demonstrates that fourth graders frequently switched from unstable-strategy use on early trials (i.e., Trials 1 to 4) to stable-strategy use on later trials (i.e., Trials 4 to 7). In contrast, second graders who showed unstable-strategy use on early trials frequently also showed unstable-strategy use on later trials.
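
The follow-up analyses of the pattern classifications partition the four-pattern table (Table 8) into two-column subtables. The sketch below shows how the subtables and their sample sizes arise from the reported counts, and checks the uncorrected Pearson chi-square for each. The pattern abbreviations (UU, SS, US, SU) and function names are hypothetical labels introduced here, and the choice of the uncorrected statistic is an inference from its agreement with the reported values.

```python
# Counts from Table 8. UU = unstable/unstable, SS = stable/stable,
# US = unstable/stable, SU = stable/unstable (abbreviations are illustrative).
COUNTS = {
    "Grade 2": {"UU": 28, "SS": 12, "US": 14, "SU": 15},
    "Grade 4": {"UU": 9, "SS": 12, "US": 20, "SU": 10},
}


def subtable(patterns):
    """Grade x pattern table of counts restricted to the given patterns."""
    return [[COUNTS[grade][p] for p in patterns] for grade in COUNTS]


def chi_square(table):
    """Pearson chi-square (no continuity correction) and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = sum(
        (observed - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, observed in enumerate(row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, n


ANALYSES = {
    "all four patterns": ["UU", "SS", "US", "SU"],
    "no change": ["UU", "SS"],
    "change": ["US", "SU"],
    "initially unstable": ["UU", "US"],
    "initially stable": ["SS", "SU"],
}

for label, patterns in ANALYSES.items():
    statistic, df, n = chi_square(subtable(patterns))
    print(f"{label}: chi-square({df}, N = {n}) = {statistic:.2f}")
```

The subtables contain 120, 61, 59, 71, and 49 children, respectively, and the computed statistics agree with the values reported in the text (9.33, 4.25, 2.04, 8.73, and a value below 1).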

Relation Between Strategy Use and Recall

Utilization Deficiencies

Correlations between number of strategies used and recall. Miller and Seier (1994) have argued that significant and positive correlations between strategy use and recall for older but not younger children indicate a utilization deficiency for younger children. In the current study, utilization deficiencies of this type were expected on most trials for second graders. However, second graders were predicted to overcome a utilization deficiency on trials with relatively few words, when capacity requirements for strategy use were presumably minimal.

Utilization deficiencies were evaluated by computing correlations between number of strategies used and percentage of words recalled, separately for each Condition x Grade x Trial cell (see Table 9). The pattern of correlations in the ascending/descending condition showed clear age differences in the significance and magnitude of the relation between strategy use and recall. Fourth graders showed significant and positive correlations on all trials, whereas second graders showed significant and positive correlations on only three of seven trials. The magnitude of correlations for the fourth graders was higher than that for second graders on all trials except Trial 6. These data provide evidence of a utilization deficiency for the youngest children.








Table 9

Correlations Between Number of Words Recalled and Number of Strategies Used, By Condition, Grade, and Trial

                                        Trial
                       1       2       3       4       5       6       7

Ascending/Descending
  Grade 2            .31     .29     .42**   .32*    .16     .50**   .30
  Grade 4            .41*    .46*    .41*    .65**   .73**   .45*    .55**
Descending/Ascending
  Grade 2           -.18     .50**   .43*    .34     .60**   .65**   .57**
  Grade 4            .31     .37     .38     .06     .43*    .46*    .36

*p < .05. **p < .01.

Contrary to predictions, second graders did not overcome a utilization deficiency on three of four trials with nine or fewer words presented (i.e., Trials 1, 2, and 7).
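
The cell-wise correlations can be sketched as a Pearson correlation computed separately within each Condition x Grade x Trial cell. The record layout, function names, and data values below are hypothetical and serve only to illustrate the grouping; they are not data from the study.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation; undefined (division by zero) if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlations_by_cell(records):
    """records: (condition, grade, trial, n_strategies, proportion_recall) tuples."""
    cells = defaultdict(lambda: ([], []))
    for condition, grade, trial, n_strategies, recall in records:
        strategies, recalls = cells[(condition, grade, trial)]
        strategies.append(n_strategies)
        recalls.append(recall)
    return {cell: pearson_r(xs, ys) for cell, (xs, ys) in cells.items()}


# Hypothetical records for a single cell.
example = [
    ("ascending/descending", 2, 1, 0, 0.50),
    ("ascending/descending", 2, 1, 1, 0.67),
    ("ascending/descending", 2, 1, 2, 0.67),
    ("ascending/descending", 2, 1, 3, 0.83),
]
print(correlations_by_cell(example))
```

Computing the correlation within each cell keeps differences in list length across trials and conditions from contributing to the relation between strategy use and recall.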

The pattern of correlations in the descending/ascending condition was nearly opposite to that observed in the ascending/descending condition. Fourth graders now had significant correlations on only two of seven trials. Second graders had significant correlations on five of seven trials, with two of these correlations being found on trials with nine or fewer words presented (i.e., Trials 3 and 5). The magnitude of correlations for second graders was higher than that for fourth graders on all trials except Trial 1. These data do not provide evidence of a utilization deficiency for the younger children.

The failure to find significant correlations for fourth graders in the descending/ascending condition, when such correlations were significant in the ascending/descending condition, cannot be attributed to restricted variance in number of words recalled or number of strategies used. The standard deviations for number of strategies used on Trials 1-7 were very similar for fourth graders in the ascending/descending condition (SDs = .96, .94, 1.03, 1.10, 1.17, 1.12, and 1.20) and in the descending/ascending condition (SDs = .98, 1.01, 1.13, 1.03, .80, 1.16, and 1.16). The standard deviations for recall were also similar for both groups of fourth graders (see Table 2).

Recall for perfectly strategic children. Coyle and Bjorklund (1996), as well as Miller and Seier (1994), have argued that utilization deficiencies can be inferred when grade differences in recall are observed despite comparable strategy use. In the current study, this type of utilization deficiency was evaluated by analyzing mean proportion recall for trials on which children showed perfect sorting only, perfect clustering only, and perfect sorting and clustering. Clustering and sorting data on each trial were measured continuously by ARC scores for this analysis. ARC scores can range from 1 to -1, with 1 indicating perfect sorting or clustering and 0 indicating chance sorting or clustering. Because children rarely showed multiple trials with perfect strategy use (i.e., two or more trials with sorting or clustering scores of 1), repeated measures analysis of recall across trials with perfect strategy use was not performed. Instead, each child received a single score averaging recall across trials with perfect strategy use. Such a recall score was computed separately for trials with perfect clustering only, perfect sorting only, and perfect clustering and sorting.

Mean proportion recall for children in each grade showing each measure of perfect strategy use is presented in Table 10, along with the number of subjects in each grade who had at least one trial of perfect strategy use. Separate 2 (grade) x 2 (condition) ANOVAs were performed on proportion recall for trials with perfect sorting only, perfect clustering only, and perfect sorting and clustering. The analysis of recall on trials with perfect clustering revealed a significant main effect of grade, F(1, 83) = 5.59. No other significant main effects or interactions were found for any measure of perfect strategy use.

These findings demonstrate that second graders who clustered perfectly recalled fewer words than comparably strategic fourth graders, which is evidence for a utilization deficiency for the second graders.

Table 10

Mean Proportion Recall When Strategy Use Was Perfect, By Grade and Type of Strategy Used

                        Strategies Used Perfectly

            Sorting Only    Clustering Only    Sorting and Clustering

Grade 2
   M            .72               .50                   .87
   SD           .19               .17                   .17
   n             15                63                    14

Grade 4
   M            .80               .60                   .89
   SD           .15               .23                   .10
   n             14                24                    28

Note. ns are number of children who showed at least one trial of perfect strategy use.

In contrast, second graders who sorted perfectly or sorted and clustered perfectly recalled just as many words as comparably strategic fourth graders. Thus, utilization deficiencies occurred for some but not all instances of perfect strategy use. The absence of a significant effect of condition demonstrates that recall in each condition did not vary for children who showed comparable and perfect strategy use.

Utilization deficiencies for individual strategies. The findings described above were confirmed and extended in a descriptive analysis of recall for children in each grade who used each of the 15 possible strategy combinations or no strategy (see Table 11). The first part of this analysis examined grade differences in the percentage of trials on which each combination was used. As shown in Table 11, second graders were more likely than fourth graders to use rehearsal only, clustering only, and both rehearsal and clustering. In contrast, fourth graders were more likely than second graders to use no strategy, both sorting and clustering, and sorting, rehearsal, and clustering.
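
The grade comparisons of strategy frequency in Table 11 rest on Yates-corrected chi-squares with one degree of freedom. The sketch below shows one way such a test can be computed for a single strategy combination. It assumes that each comparison is a 2 (grade) x 2 (combination used vs. not used) table of trial counts, and that the grade totals (483 and 357 trials) can be obtained by summing the counts in Table 11; the function name is hypothetical.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Yates-corrected chi-square (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # The continuity correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


CRITICAL_VALUE_05 = 3.841  # chi-square critical value for 1 df at alpha = .05

# Rehearsal only (row R of Table 11): trials with vs. without this combination.
# Totals are assumed to be the column sums of the trial counts in Table 11.
grade2_used, grade2_total = 115, 483
grade4_used, grade4_total = 34, 357

chi2 = yates_chi_square(
    grade2_used, grade2_total - grade2_used,
    grade4_used, grade4_total - grade4_used,
)

if __name__ == "__main__":
    print(f"chi-square = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_VALUE_05}")
```

Note that this treats trials as independent observations, even though each child contributed several trials; the sketch follows the table's counting unit rather than arguing for it.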

Utilization deficiencies were evaluated by analyzing grade differences in recall when children used the same strategies. Analyses were conducted only on the seven strategy combinations for which sufficient data were available for a significance test. (Recall data for no strategy use were not included in this analysis because children who use no strategy cannot be evaluated for a utilization deficiency.) Of these seven comparisons, four showed that fourth graders recalled significantly more than second graders when strategy use was comparable. The remaining three comparisons were not significant but had means in the predicted direction. Consistent with the findings in the previous section, these data demonstrate that fourth graders outperform comparably strategic second graders, which is evidence of a utilization deficiency for the second graders.

Table 11

Percentage (and Number) of Trials on Which Each Strategy Combination Was Used, and Mean Proportion Recall (and Standard Deviations) for Each Combination, By Grade (Codes for Strategies: S, Sorting; R, Rehearsal; C, Clustering; N, Category Naming)

              Percentage of Trials            Mean Recall

Strategy      Grade 2       Grade 4       Grade 2       Grade 4

None          16 ( 78)      22 (77)      .41 (.25)     .65 (.23)
S              8 ( 39)       8 (30)      .50 (.23)     .60 (.21)
R             24 (115)      10 (34)      .53 (.25)     .81 (.19)
C             14 ( 68)       5 (18)      .47 (.19)     .54 (.17)
N              0 (  0)       0 ( 0)
SR             8 ( 37)      10 (34)      .65 (.23)     .76 (.22)
SC            10 ( 50)      17 (61)      .70 (.26)     .80 (.22)
SN            <1 (  2)       0 ( 0)
RC            10 ( 49)       5 (17)      .48 (.20)     .69 (.27)
RN            <1 (  1)       0 ( 0)
CN             0 (  0)       0 ( 0)
SRC            7 ( 35)      23 (83)      .74 (.23)     .90 (.13)
SRN           <1 (  1)      <1 ( 1)
SCN           <1 (  1)      <1 ( 2)
RCN           <1 (  1)       0 ( 0)
SRCN           1 (  6)       0 ( 0)

Note. Boldface denotes significant age differences in recall or percentage of trials on which a particular strategy combination was used, with all significant results reported at p < .05. Recall data are omitted for combinations used on one or zero trials. Grade differences in percentage of trials on which each combination was used are evaluated using Yates corrected chi-squares with one degree of freedom. Grade differences in mean recall for each combination are evaluated using t-tests.

Strategy Change and Recall

Correlations between strategy change and recall. Previous research (Coyle & Bjorklund, 1997) involving procedures and age groups similar to those in the current study has shown that measures of strategy change are significantly and negatively correlated with recall for older but not younger children. That is, older children who showed the fewest strategy changes across trials (i.e., high levels of stability in strategy use) had the highest levels of recall. The current study attempted to replicate this finding with a different sample.

Correlations were computed separately between each measure of strategy change and mean proportion recall across trials. The three measures of strategy change were number of unique strategy combinations, number of consecutive trials with strategy changes, and total number of strategy changes on consecutive trials.
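
The three measures of strategy change can be made concrete with a small sketch. It assumes that each trial is coded as the set of strategies observed (S, R, C, N), that "trials with changes" counts pairs of consecutive trials whose combinations differ, and that "total changes" counts the individual strategies added or dropped between consecutive trials. The last definition in particular is an assumption made for illustration, and the function name is hypothetical.

```python
from typing import Dict, FrozenSet, Sequence

Combination = FrozenSet[str]  # e.g. frozenset("SR") for sorting plus rehearsal


def strategy_change_measures(trials: Sequence[Combination]) -> Dict[str, int]:
    """Compute three strategy-change measures for one child's trial sequence."""
    consecutive = list(zip(trials, trials[1:]))
    return {
        # Distinct combinations used anywhere in the session, adjacent or not.
        "unique_combinations": len(set(trials)),
        # Consecutive trial pairs on which the combination differed.
        "trials_with_changes": sum(1 for prev, cur in consecutive if prev != cur),
        # Assumed definition: strategies added or dropped across consecutive trials.
        "total_changes": sum(len(prev ^ cur) for prev, cur in consecutive),
    }


if __name__ == "__main__":
    child = [frozenset("S"), frozenset("SR"), frozenset("SR"), frozenset("C")]
    print(strategy_change_measures(child))
```

Under these definitions the last two measures share the same underlying index (differences between consecutive trials), whereas the first ignores trial order altogether.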

Correlations computed separately for each grade revealed a pattern very similar to that observed in previous research. Fourth graders showed significant and negative relations between recall and strategy change for two of the three measures (trials with changes, r(51) = -.42, p < .01, and total changes, r(51) = -.36, p < .05), but not for unique combinations, r(51) = -.11, p > .10. Second graders showed no reliable relation between recall and any measure of strategy change (rs(69) = .23, -.15, and -.13 for unique combinations, trials with changes, and total changes, respectively). These findings were qualified by correlations computed separately within each Grade x Condition cell and reported in Table 12. These correlations showed that only fourth graders in the descending/ascending condition showed significant and negative relations between recall and strategy change, with correlations involving all three measures of strategy change being significantly related to recall. These latter findings demonstrate that the grade differences reported above can be attributed to correlational data for fourth graders in the descending/ascending condition. Fourth graders in the ascending/descending condition showed no reliable relation between recall and strategy change.

The data in Table 12 reveal that the difference between correlations involving trials with changes and total changes was always lower than the difference between correlations involving each of these variables and unique combinations. This suggested possible differences in the relations among the various measures of strategy change. To assess this possibility, pairwise correlations among each of the three measures of strategy change were computed, separately within each Grade x Condition cell. These correlations are reported in Table 13.

As shown in Table 13, all correlations among the three measures of strategy change were significant. However, the magnitude of correlations involving unique combinations (i.e., unique combinations and trials with changes; unique combinations and total changes) was lower than the magnitude of correlations not involving unique combinations (i.e., trials with changes and total changes). Correlations between trials with changes and total changes were near perfect for all Grade x Condition cells except one (second graders in the ascending/descending condition).

Table 12

Correlations Between Measures of Strategy Change and Recall, By Condition and Grade

                               Measure of Strategy Change

                          Unique          Trials with      Total
                          Combinations    Changes          Changes

Ascending/Descending
   Grade 2                   .23             -.21            -.06
   Grade 4                   .28             -.16            -.05

Descending/Ascending
   Grade 2                   .22             -.13            -.18
   Grade 4                  -.53**           -.66**          -.66**

**p < .01

Table 13

Correlations Among Measures of Strategy Change, By Condition and Grade

                                          Correlation

                          Combinations and       Combinations and    Trials with Changes
                          Trials with Changes    Total Changes       and Total Changes

Ascending/Descending
   Grade 2                    .33*                  .49**               .77***
   Grade 4                    .60**                 .65***              .93***

Descending/Ascending
   Grade 2                    .66***                .64***              .94***
   Grade 4                    .69***                .68***              .91**

*p < .05, **p < .01, ***p < .001

Relation between strategy change and recall for individual children. The finding that stability was related to high levels of recall only for fourth graders in the descending/ascending condition was only partially confirmed in an analysis of recall for children classified as stable or unstable (see Table 7 for stability classification data). A 2 (grade) x 2 (stability classification) x 2 (condition) x 7 (trial) ANOVA was conducted on mean proportion recall. Because significant main effects and interactions involving the Grade, Condition, and Trial factors have already been reported, only significant main effects and interactions involving the Stability Classification factor are reported here.

The analysis revealed a marginally significant main effect of stability classification, F(1, 112) = 3.02, p = .09 (mean proportion recall: .59 and .70, for unstable and stable, respectively), which was qualified by a significant grade x stability classification interaction, F(1, 112) = 3.78. No other significant effects involving the stability classification factor were found, including effects involving the condition factor. The failure to find a significant Grade x Stability Classification x Condition interaction is inconsistent with the correlational results reported in the previous section. Those results showed that, in the descending/ascending condition, fourth graders showing stable-strategy use had higher recall than fourth graders showing unstable-strategy use. The findings in the current section, along with those of the previous one, demonstrate that findings pertaining to analyses that examine patterns of variability for individual subjects may not always be consistent with those pertaining to analyses that examine patterns of variability in group data.

Data relevant to the significant Grade x Stability Classification interaction are reported in columns one and two of Table 14. Differences in recall between stable and unstable children were analyzed separately within each grade. Second-grade children in each stability classification showed equivalent levels of recall. In contrast, fourth graders classified as stable recalled significantly more than fourth graders classified as unstable. A further analysis examined grade differences in recall separately within each stability group. Fourth graders recalled significantly more than second graders within both stability groups. However, the magnitude of this grade difference in recall was greater for stable children than for unstable children (mean fourth grade minus second grade recall difference: .28 and .16, for stable and unstable children, respectively).

These findings were confirmed and extended in a final set of analyses that examined grade differences in recall for children classified as stable and unstable on early trials (Trials 1-4) and separately on later trials (Trials 4-7). A 2 (grade) x 2 (condition) x 2 (stability classification) ANOVA was performed on mean proportion recall on early trials and separately on later trials. As before, only significant effects involving the stability classification factor are reported. The recall data for these analyses are reported in columns three through six in Table 14.

Table 14

Mean Proportion Recall (and Standard Deviations) for Children Classified as Stable or Unstable Across All Trials, on Early Trials, and on Later Trials, By Grade

              All Trials                Early Trials               Later Trials

Grade     Stable      Unstable      Stable      Unstable      Stable      Unstable

2        .53 (.18)   .54 (.15)     .53 (.15)   .59 (.15)     .59 (.18)   .47 (.15)
4        .81 (.12)   .70 (.16)     .78 (.12)   .70 (.16)     .82 (.16)   .72 (.20)

The analysis involving data on early trials revealed a significant grade x stability classification interaction, F(1, 112) = 5.62. No other significant effects involving stability classification were found. Examination of the significant interaction revealed a pattern of results very similar to that observed in the analysis involving all trials. No differences in recall were found for second graders classified as stable or unstable, whereas recall for fourth graders classified as stable was marginally greater than that for fourth graders classified as unstable, p < .07. Separate grade comparisons within each stability classification revealed that fourth graders recalled significantly more than second graders, although the magnitude of this grade difference was again greater for stable children than for unstable children (mean fourth grade minus second grade recall difference: .25 and .11 for stable and unstable children, respectively).

The comparable analysis involving data on later trials revealed a significant main effect of stability classification, F(1, 112) = 11.10 (mean proportion recall: .72 and .55 for stable and unstable, respectively). No other significant differences involving stability classification were found. These findings, along with those for early trials, demonstrate that stability on early trials is associated with high levels of recall for older but not younger children, whereas stability on later trials is associated with high levels of recall for both age groups.

Conditions of strategy changes. Why do strategy changes occur? McGilly and Siegler (1989) addressed this question by analyzing the number of trials on which children showed strategy changes immediately after serial recall performance that was perfect or less than perfect. They found that children were more likely to show strategy changes when recall was less than perfect than when recall was perfect. That is, children tended to stick with a particular strategy on the next trial when it had yielded perfect performance, but changed strategies on the next trial when it had yielded less than perfect performance. This pattern is consistent with the win-stay/lose-shift approach that has been reported in decision-making literature (Eimas, 1969).

In the current study, evidence for the win-stay/lose-shift approach was examined by classifying each trial as a trial on which recall was perfect or not perfect and on which strategy changes were or were not observed on the next trial. This resulted in four possible classifications: recall perfect/strategy change; recall perfect/no strategy change; recall not perfect/strategy change; recall not perfect/no strategy change. Classifications were performed separately for Trials 1-6, with each child contributing a single data point at each trial. (Trial 7 was omitted from the analysis because a strategy change following Trial 7 is not possible.)
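
The classification scheme can be sketched in a few lines. The sketch assumes that each child's data are available as a list of proportion-recall scores and a parallel list of strategy combinations, one entry per trial; the function names and data layout are hypothetical.

```python
from collections import Counter
from typing import FrozenSet, List, Sequence, Tuple

Label = Tuple[str, str]  # (recall outcome, what happened on the next trial)
Child = Tuple[Sequence[float], Sequence[FrozenSet[str]]]


def classify_child(recall: Sequence[float],
                   strategies: Sequence[FrozenSet[str]]) -> List[Label]:
    """Label every trial except the last by recall outcome and next-trial change."""
    if len(recall) != len(strategies):
        raise ValueError("recall and strategies must cover the same trials")
    labels = []
    # The final trial is skipped because no trial follows it.
    for t in range(len(recall) - 1):
        outcome = "perfect" if recall[t] == 1.0 else "not perfect"
        shift = "change" if strategies[t + 1] != strategies[t] else "no change"
        labels.append((outcome, shift))
    return labels


def tally_by_trial(children: Sequence[Child]) -> List[Counter]:
    """Pool children: one Counter of the four classifications per trial."""
    labelled = [classify_child(recall, strategies) for recall, strategies in children]
    return [Counter(trial_labels) for trial_labels in zip(*labelled)]


if __name__ == "__main__":
    child_a = ([1.0, 0.5, 1.0], [frozenset("S"), frozenset("S"), frozenset("SR")])
    child_b = ([0.5, 1.0, 0.8], [frozenset("R"), frozenset("C"), frozenset("C")])
    for number, counts in enumerate(tally_by_trial([child_a, child_b]), start=1):
        print(number, dict(counts))
```

Each per-trial Counter corresponds to one 2 x 2 table of the kind submitted to the chi-square tests, with every child contributing exactly one observation per trial.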

The percentage of trials on which recall was perfect or not and followed by a strategy change or not is shown in Table 15. The classification data on each trial were analyzed separately by 2 (recall perfect vs. recall not perfect) x 2 (strategy change vs. no strategy change) chi-squares. For Trials 1 through 3, perfect recall was followed by strategy changes or no strategy change approximately equally, χ²s(1, N = 120) < 1. For Trials 4 through 6, however, perfect recall was followed by no strategy change more frequently than by strategy changes, χ²s(1, N = 120) > 5.43. These results demonstrate that children were more likely to continue to use a strategy that yielded perfect performance on later but not earlier trials.

Table 15

Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not Occur Immediately After Recall was Perfect or Not Perfect

                                             Trial

                           1         2         3         4         5         6

Recall Perfect
   Strategy Change      42 ( 8)   57 ( 4)   31 ( 4)   30 ( 9)   29 ( 6)   25 ( 5)
   No Strategy Change   58 (11)   43 ( 3)   69 ( 9)   70 (21)   71 (15)   75 (15)

Recall Not Perfect
   Strategy Change      53 (54)   61 (69)   54 (58)   62 (56)   57 (56)   54 (54)
   No Strategy Change   47 (47)   39 (44)   46 (49)   38 (34)   43 (43)   46 (46)

Note. Percentages computed separately at each trial.

Additional analyses examined the prediction that trials with relatively few words (i.e., Trials 1, 2, and 6 in ascending/descending and Trials 3, 4, and 5 in descending/ascending) would provide greater opportunity for perfect recall, and consequently result in relatively few strategy changes. To test this prediction, a series of 2 (recall perfect vs. recall not perfect) x 2 (strategy change vs. no strategy change) chi-squares were performed separately at each trial in each condition (see Table 16). This resulted in a total of 12 individual chi-squares (2 conditions x 6 trials). (Because including the grade factor would have resulted in insufficient data to perform significance tests for several of the grade x condition x trial combinations, the grade factor was excluded from these analyses.) One of the 12 chi-squares (Trial 1 in the descending/ascending condition) did not contain sufficient data for a significance test. Of the remaining 11, only two were significant. As predicted, the pattern of data on Trials 4 and 5 in the descending/ascending condition revealed that perfect recall was followed by no strategy change more frequently than by strategy changes, χ²s(1, N = 56) > 5.18. The four other trials on which this pattern was predicted (i.e., Trials 1, 2, and 6 in ascending/descending and Trial 3 in descending/ascending) showed that perfect recall and strategy changes did not vary as a function of number of words presented. These data provide little evidence for the prediction that trials with relatively few words would provide greater opportunity for perfect recall, and consequently result in few strategy changes.

Table 16

Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not Occur Immediately After Recall was Perfect or Not Perfect, By Condition

                                             Trial

                           1         2         3         4         5         6

Ascending/Descending
   Number of Words
   Presented               6         9        12        15        12         9

   Recall Perfect
      Strategy Change   42 ( 8)   50 ( 3)   33 ( 1)   67 ( 2)   33 ( 2)   29 ( 5)
      No Strategy
      Change            58 (11)   50 ( 3)   67 ( 2)   33 ( 1)   67 ( 4)   71 (12)

   Recall Not Perfect
      Strategy Change   56 (25)   57 (33)   51 (31)   59 (36)   53 (31)   55 (26)
      No Strategy
      Change            44 (20)   43 (25)   49 (30)   41 (25)   47 (27)   45 (21)

Descending/Ascending
   Number of Words
   Presented              15        12         9         6         9        12

   Recall Perfect
      Strategy Change    0 ( 0)  100 ( 1)   30 ( 3)   26 ( 7)   27 ( 4)    0 ( 0)
      No Strategy
      Change             0 ( 0)    0 ( 0)   70 ( 7)   74 (20)   73 (11)  100 ( 3)

   Recall Not Perfect
      Strategy Change   52 (29)   65 (36)   59 (27)   69 (20)   61 (25)   53 (28)
      No Strategy
      Change            48 (27)   35 (19)   41 (19)   31 ( 9)   39 (16)   47 (25)

Note. Percentages computed separately at each trial.















DISCUSSION

The current study examined several measures of variability in strategy use, relating each measure to memory performance. Whereas previous investigations of variability in strategy use have assessed only one strategy on each trial and one type of variability (Siegler, 1996), the current study examined the possibility of multiple strategies on each trial and two different types of variability (multiple-strategy use and strategy changes). The current study was very similar in design to a study by Coyle and Bjorklund (1997). However, it had additional trials with which to evaluate changes in variability over time and included trials varying widely in the number of words to be recalled. Strategy variability was assessed within and across trials and related to mean levels of recall, with analyses focusing on utilization deficiencies and stability-recall relations. The results revealed developmental differences in multiple-strategy use and strategy change, and more important, age-related changes in the relation between measures of variability and recall. Surprisingly, the results revealed few significant effects related to the number of words presented on each trial or the pattern of increases and decreases in the number of words presented across trials.

The goals of the current study were to examine the impact of age and number of words presented on each trial on (a) measures of strategy variability, including multiple-strategy use and strategy changes; (b) the relation between multiple-strategy use and recall, with particular attention to patterns indicative of utilization deficiencies; and (c) the relation between strategy changes and recall, with particular attention to stability-recall relations. The pages that follow are organized around these goals.

Variability in Strategy Use

Multiple-Strategy Use

As predicted, fourth graders tended to use more strategies than second graders, with the number of strategies used increasing from Trial 2 to Trial 3 and remaining stable thereafter. These results were confirmed and extended in the analysis of grade differences in the use of each of the 15 unique strategy combinations (Table 11). In that analysis, second graders used the single strategies of rehearsal and clustering and the two-strategy combination of rehearsal and clustering more often than fourth graders. In contrast, fourth graders used the two-strategy combination of sorting and clustering and the three-strategy combination of sorting, rehearsal, and clustering more often than second graders. These data demonstrate that, when grade differences in strategy use were found, second graders tended to use combinations with the fewest strategies (i.e., single-strategy combinations) whereas fourth graders tended to use combinations with the most strategies (i.e., three-strategy combinations).

The very low frequency of category naming in the current study is inconsistent with the results obtained by Coyle and Bjorklund (1997). Whereas category naming was observed on only 2% of all trials in the current study, it was observed on almost 31% of all trials in Coyle and Bjorklund (1997). The reason for this difference is not clear. Both studies used similar tasks and designs, had very similar testing procedures, and involved children in the same age range. One possible explanation for the disparity is that children in each study attended different types of schools. Whereas children in the current study attended public schools, children in Coyle and Bjorklund attended a university-affiliated laboratory school. The curriculums at public schools and laboratory schools may differ in ways that promote or inhibit organizational strategy use. For example, children who attend the university-affiliated schools may receive explicit instruction in organizing items by taxonomic categories, whereas children who attend public schools may receive such instruction less often, if at all. Such a curriculum difference would affect children's use of organizational strategies, particularly category naming.

The near absence of category naming in the current study resulted in fewer strategies being available for analyses of variability in strategy use. Although a reduction in the total number of strategies available for analyses could affect statistical outcomes, the age-related patterns of variability and performance found in the current study are comparable to those found in the very similar sort-recall study by Coyle and Bjorklund (1997). Older children in both studies used more strategies than younger children. Also, as reported later in the Discussion, older children in both studies showed lower levels of strategy change, and stronger relations between stable-strategy use and recall, than did younger children. These results suggest that age-related patterns in variability are relatively uninfluenced by changes in the total number of strategies being assessed.

Children were expected to show increases in the number of strategies used on trials with more words. The results revealed that multiple-strategy use did not vary as a function of the number of words on each trial, with strategy use being comparable across both the ascending/descending and descending/ascending conditions. The absence of any effects involving trial or condition cannot be attributed to ceiling effects or children not being able to use the target strategies. Children of all ages used an average of fewer than two strategies across trials (of a possible four strategies), leaving ample opportunity for increases in the number of strategies used. Furthermore, children in the age range studied have demonstrated competence in using all strategies assessed.

The absence of any effect of condition and trial on the number of strategies used demonstrates that children did not modify their strategic behavior in response to being presented with different numbers of words. Why did children stick with using a certain number of strategies when presented with varying numbers of words? Perhaps the most parsimonious explanation is that children did not consider altering their strategic behavior on trials with different numbers of words. Although children in the age range tested could use all the strategies assessed in the current study, metacognitive limitations concerning when and how to use strategies may have prevented them from doing so. An implication is that children who do not produce strategies spontaneously might do so if they are instructed to (Ringle & Springer, 1980). In addition to metacognitive limitations, capacity limitations may have prevented the use of additional strategies. Children have limited mental capacity for executing cognitive operations such as strategies, and such capacity constraints may impose limits on the number of strategies that can be used (Guttentag, 1984). Capacity limits can change as a result of task experience or familiarization, as may have occurred when strategy use increased from Trial 2 to 3. However, capacity limits probably place an upper limit on the number of strategies used, resulting in changes that occur in a restricted range.

Strategy Changes

In addition to the observed age differences in multiple-strategy use, the current study also revealed age differences in strategy changes. Fourth graders showed fewer strategy changes than second graders for two of the three measures of strategy change (number of trials with changes and number of trial-by-trial changes). No grade difference was found for the third measure of strategy change, number of unique strategy combinations, although the pattern was in the predicted direction. Although age-related declines in variability have been noted elsewhere (Siegler, 1996), these are the first results to demonstrate empirically that strategy changes decline with age. More importantly, these results, along with the results pertaining to multiple-strategy use described above, demonstrate that different measures of variability show different developmental patterns. Number of strategies used increased with age, number of trials with changes and total strategy changes decreased with age, and number of unique strategy combinations was comparable across age. These results suggest considerable diversity in the developmental pathways of different measures of variability, with no one pattern accounting for all measures of strategy change and multiple-strategy use.

Correlations among the various measures of strategy changes differed in magnitude. Although correlations among all measures of strategy change were significant, correlations between number of trials with changes and total number of changes were consistently higher than correlations between each of these measures and number of unique combinations. Moreover, correlations between trials with changes and total changes were near perfect (rs > .90) for 3 of the possible 4 correlations involving these measures, whereas none of the 8 correlations involving unique combinations was near perfect. These data are the first to my knowledge to show differences in the strength of relations among different measures of strategy change.

Differences in the relations among the various measures of strategy change can be attributed to how each measure was computed. The two most closely related measures, trials with changes and total changes, both were computed based on the number of consecutive trials on which different strategies were used. The third measure, unique combinations, was computed based on the number of different strategy combinations used, irrespective of whether the different strategies were used on consecutive trials. These computational differences resulted in differences in the magnitude of the correlations among the various measures of strategy change, with correlations among measures based on the same underlying index of strategy change being higher than correlations among measures based on different indexes of strategy change.

Mean levels of variability for each strategy change measure in the current study were lower than those observed in the similar study by Coyle and Bjorklund (1997). In the current study, percentage of unique combinations, trials with changes, and total changes across trials (collapsed across grade) was 35, 51, and 55, respectively. In Coyle and Bjorklund, the corresponding percentages were 46, 54, and 62, respectively. This slight disparity in strategy change scores can be attributed to the current study using more trials than the Coyle and Bjorklund study. As shown in Table 5, the additional trials in the current study allowed fourth graders to maintain a pattern of stable-strategy use (i.e., few strategy changes) that began after Trial 4, whereas second graders showed unstable-strategy use across all trials. Consequently, the lower mean level of strategy change averaged across grade can be attributed to fourth graders showing substantially lower strategy change scores on the later trials. Although Coyle and Bjorklund did analyze strategy change patterns across trials, the fewer trials used in that study limited the amount of stability that older children could display and probably contributed to the slight disparity in strategy change scores.

The age-related decline in strategy change was confirmed and extended in analyses of trial-by-trial changes in strategy use for individual children. An initial analysis revealed that fourth graders were more likely to be classified as showing stable-strategy use than second graders, who often switched strategies on adjacent trials. Subsequent analyses revealed that this grade difference in stability classification was attributable to a disproportionate number of fourth graders being classified as stable on later trials (i.e., Trials 4 to 7). The distribution of children in each grade classified as stable and unstable was comparable on early trials (i.e., Trials 1 to 4). These findings were extended in an analysis of changes in individual-subject stability classification across early and later trial blocks for children whose initial stability classification was unstable. In that analysis, fourth graders who showed unstable-strategy use on early trials frequently showed stable-strategy use on later trials. In contrast, second graders who showed unstable-strategy use on early trials often remained unstable on later trials. These findings demonstrate that stable-strategy use emerged during the later trials for fourth graders but not for second graders. The fourth-grade data are consistent with research demonstrating that variability in strategy use declines with experience on a task (Coyle & Bjorklund, 1997; Siegler, 1996). Presumably, the second-grade children eventually would have shown stability in strategy use if they had been given additional practice and experience on the task. Microgenetic studies, assessing children's strategy use over longer periods of time, are needed to evaluate this hypothesis.

Children in all grades were predicted to show relatively few strategy changes following trials with few words (i.e., trials with 6 or 9 words) and more frequent strategy changes following trials with many words (i.e., trials with 12 or 15 words). Contrary to this prediction, the results revealed that strategy changes did not vary across trials with different numbers of words. Children of all ages showed comparable numbers of strategy changes after trials with relatively few words or many words.

Why did children fail to show strategy changes on trials with many words when they were predicted to do so and when such changes may have benefited their performance? The answer to this question may involve the same factors that were reviewed in the section on multiple-strategy use: metacognitive limitations and capacity limitations. Metacognitive limitations may have limited children's ability to monitor changes in the number of words presented on each trial and to alter their strategy use in response to such changes. Capacity limitations may have limited children's ability to add strategies on successive trials even if they had the metacognitive awareness to do so. Future research, providing metacognitive instruction on when and how to use strategies and reducing the capacity demands for strategy production and utilization, is needed to assess these possibilities.

Relation Between Multiple-Strategy Use and Recall

An important purpose of the current study was to investigate the relation between multiple-strategy use and recall, identifying possible evidence for utilization deficiencies. An initial analysis examined age differences in correlations between multiple-strategy use (i.e., number of strategies used) and recall, computed separately in each condition and across trials. The findings in the ascending/descending condition were very similar to those observed in previous research examining age differences in the relation between strategy use and recall (Coyle & Bjorklund, 1996, 1997). Fourth graders showed significant correlations on all trials, whereas second graders showed significant correlations on only three of the seven trials (Trials 3, 4, and 6). The pattern of correlations for the fourth graders indicated that they were able to benefit from using multiple strategies from the beginning of the task. The second graders' pattern indicated that strategy use was rarely linked to recall performance, which is evidence of a utilization deficiency.

The findings in the descending/ascending condition revealed a pattern opposite to that found in the ascending/descending condition. Fourth graders now showed significant correlations on only two of the seven trials (Trials 6 and 7), whereas second graders showed significant correlations on five of the seven trials (Trials 2, 3, 5, 6, and 7). The pattern of correlations for the second graders indicated that they were using multiple strategies effectively. The fourth graders' pattern was more difficult to interpret. Although it could be argued that fourth graders were utilizationally deficient, such an interpretation is probably incorrect because, with few exceptions, mean recall and strategy use were higher for fourth graders than for second graders (see Tables 2 and 4). Thus, fourth graders were probably not using strategies ineffectively but likely using other means to recall the list items, perhaps relying on nonstrategic factors (e.g., capacity, speed of processing).
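
The trial-by-trial correlations summarized above can be illustrated with a minimal sketch. The Python function below computes a Pearson correlation between number of strategies used and number of words recalled on a single trial, assuming one pair of scores per child. The function name and the scores are hypothetical and are not taken from the study; they show only the form of the computation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five children on one trial (NOT study data).
strategies_used = [0, 1, 2, 3, 4]
words_recalled = [3, 5, 4, 8, 9]

r = pearson_r(strategies_used, words_recalled)
print(round(r, 2))
```

In an analysis of this kind, the coefficient would be computed separately for each grade, condition, and trial. A coefficient that is reliably positive for older children but near zero for younger children on the same trial would be the pattern interpreted here as a utilization deficiency.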

A comparison of the correlational data in each condition demonstrates different patterns of strategy-recall relations across trials for each age group. Fourth graders tended to use multiple strategies effectively when an increasing number of words was presented




trials. In contrast, the study by Coyle and Bjorklund assessed
stability and instability using data on all trials.
Second, in the current study the number of words presented varied
across trials from six to fifteen, whereas in Coyle and Bjorklund the
number of words presented remained constant across trials at eighteen.
Although the number of words varied across trials in the current study,
the number of categories represented on each trial remained constant at
three. This eliminated the possibility that changes in strategy use and
recall would result from changes in the number of categories represented
across trials, and ensured that such changes could be attributed to
variation in the number of words presented. The design permitted an
examination of whether children adapt their strategy use to changes in
the number of words on each trial. It was now possible to assess
measures of variability, including multiple-strategy use and strategy
changes, when children were presented with relatively few words or many
words. In contrast, the study by Coyle and Bjorklund assessed
variability under conditions in which task demands (i.e., number of
words on each trial) remained constant.
Apart from the differences mentioned above, the design of the
current study was very similar to the one used by Coyle and Bjorklund.
Second- and fourth-grade children were given seven sort-recall trials of
categorizable words. As in Coyle and Bjorklund, different words and
categories were used on each trial to minimize the likelihood that
increases in strategy use would result from practice with a particular
set of categorizable items. Also as in Coyle and Bjorklund, category
items were chosen to avoid high associations between words, thus


from early to later trials. The distribution of fourth and second
graders in each of the four pattern classifications was significantly
different, χ²(3, N = 120) = 9.33. Analysis of data for children who did
not change pattern classifications revealed that second graders were
significantly more likely to show the unstable/unstable pattern than
fourth graders, who frequently showed the stable/stable pattern, χ²(1, N
= 61) = 4.25. Analysis of data for children who did change pattern
classifications revealed that the distribution of second and fourth
graders in the unstable/stable and stable/unstable groups did not differ
significantly, χ²(1, N = 59) = 2.04. The analysis of children who did not
change classifications demonstrates that fourth graders were more likely
to maintain an initial pattern of stable-strategy use than second
graders, who frequently maintained an initial pattern of unstable-
strategy use.
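
The chi-square comparisons reported above can be sketched as follows. This is a minimal illustration of a Pearson chi-square test on a grade-by-pattern table of counts, not the analysis code used in the study. The function name is hypothetical, the counts are invented for illustration, and the sketch assumes no continuity correction (the text does not say whether one was applied).

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    Assumption of this sketch: no continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of grade and pattern.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts (NOT the study's data).
# Rows: second graders, fourth graders.
# Columns: unstable/unstable pattern, stable/stable pattern.
hypothetical_counts = [
    [20, 10],
    [10, 20],
]

chi2, df = pearson_chi_square(hypothetical_counts)
print(round(chi2, 2), df)
```

With two grades and four pattern classifications, the same function yields (2 - 1)(4 - 1) = 3 degrees of freedom, which matches the degrees of freedom reported for the overall test of the four patterns.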
Analyses were also performed on the distribution of second and
fourth graders whose initial classification (on Trials 1-4) was unstable
(unstable/unstable and unstable/stable), and separately on the
distribution of second and fourth graders whose initial classification
was stable (stable/stable and stable/unstable). In the analysis of
children whose initial classification was unstable, fourth graders were
significantly more likely to show the unstable/stable pattern than
second graders, who frequently showed the unstable/unstable pattern,
χ²(1, N = 71) = 8.73. The distribution of second and fourth graders
whose initial classification was stable (stable/stable and
stable/unstable) did not differ significantly, χ²(1, N = 49) < 1. The analysis
of children whose initial classification was unstable demonstrates that


In contrast, second graders who sorted perfectly or sorted and clustered
perfectly recalled just as many words as comparably strategic fourth
graders. Thus, utilization deficiencies occurred for some but not all
instances of perfect strategy use. The absence of a significant effect
of condition demonstrates that recall in each condition did not vary for
children who showed comparable and perfect strategy use.
Utilization deficiencies for individual strategies. The findings
described above were confirmed and extended in a descriptive analysis of
recall for children in each grade who used each of the 15 possible
strategy combinations or no strategy (see Table 11). The first part of
this analysis examined grade differences in the percentage of trials on
which each combination was used. As shown in Table 11, second graders
were more likely than fourth graders to use rehearsal only, clustering
only, and both rehearsal and clustering. In contrast, fourth graders
were more likely than second graders to use no strategy, both sorting
and clustering, and sorting, rehearsal, and clustering.
Utilization deficiencies were evaluated by analyzing grade
differences in recall when children used the same strategies. Analyses
were conducted only on the seven strategy combinations for which
sufficient data were available for a significance test. (Recall data
for no strategy use were not included in this analysis because children
who use no strategy cannot be evaluated for a utilization deficiency.)
Of these seven comparisons, four showed that fourth graders recalled
significantly more than second graders when strategy use was comparable.
The remaining three comparisons were not significant but had means in
the predicted direction. Consistent with the findings in the previous


Table 4
Mean number of strategies used, by grade and trial, and by condition, grade,
and trial

                                   Trial
                  1      2      3      4      5      6      7

Grade 2
   M            1.22   1.12   1.30   1.32   1.44   1.38   1.48
   SD            .78    .83    .77    .85    .87   1.01   1.01
Grade 4
   M            1.04   1.28   1.67   1.73   1.77   1.82   1.71
   SD            .96    .96   1.07   1.08    .99   1.14   1.17

Ascending/Descending
Grade 2
   M            1.36   1.15   1.39   1.33   1.41   1.36   1.54
   SD            .81    .81    .78    .84    .79   1.04    .94
Grade 4
   M            1.00   1.28   1.68   1.96   1.88   2.00   1.76
   SD            .96    .94   1.03   1.10   1.17   1.12   1.20

Descending/Ascending
Grade 2
   M            1.03   1.07   1.20   1.30   1.47   1.40   1.40
   SD            .72    .87    .76    .88    .97    .97   1.10


Multiple strategies were assessed on each trial. This permitted
assessment of variability within trials, as well as variability across
trials. It also permitted assessment of utilization deficiencies in
multiple-strategy use. The four strategies assessed on each trial were
sorting, physically moving or arranging the words into groups;
rehearsing, saying out loud or mouthing the items; category naming,
saying the category name of a group of words; and clustering, recalling
the words by categories. Each strategy was coded as occurring or not on
each trial. The measure of performance was the number of words recalled
on each trial.
Unlike previous studies, variability was measured in not one but
several ways. Variability was measured in terms of (a) average number
of strategies used across trials, (b) number of trials on which multiple
strategies were used, (c) number of trials on which the combination of
strategies differed from the preceding trial, and (d) total number of
strategy changes on consecutive trials, counting both strategy additions
and deletions as changes. The first two measures (average number of
strategies and number of trials with multiple strategies) are examples
of multiple-strategy use. These are the most frequently reported
measures of variability. The last two measures (trials with changes and
total number of changes) are examples of strategy change. These
measures assess changes over time and are reported less frequently.
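
The four measures described above can be made concrete with a minimal sketch. It assumes that each trial is represented as the set of strategies coded as occurring on that trial, using the codes S (sorting), R (rehearsal), C (clustering), and N (category naming). The function name, the dictionary keys, and the example child are hypothetical and are not study data.

```python
def variability_measures(trials):
    """Compute four variability measures from one child's trials.

    `trials` is a list of frozensets; each frozenset holds the strategy
    codes observed on that trial (an assumption of this sketch).
    """
    adjacent = list(zip(trials, trials[1:]))
    return {
        # (a) average number of strategies used across trials
        "mean_strategies": sum(len(t) for t in trials) / len(trials),
        # (b) number of trials on which two or more strategies were used
        "multiple_strategy_trials": sum(1 for t in trials if len(t) >= 2),
        # (c) number of trials whose combination differs from the preceding trial
        "trials_with_changes": sum(1 for prev, cur in adjacent if prev != cur),
        # (d) total additions plus deletions across consecutive trials
        "total_changes": sum(len(prev ^ cur) for prev, cur in adjacent),
    }


# Hypothetical child observed over seven trials (NOT study data).
child = [
    frozenset("S"), frozenset("SR"), frozenset("SR"), frozenset("SRC"),
    frozenset("SC"), frozenset("SC"), frozenset("RN"),
]

measures = variability_measures(child)
print(measures)
```

The symmetric difference (`prev ^ cur`) counts every strategy that was added or dropped between two consecutive trials, so a single trial transition can contribute more than one change to measure (d) while counting only once toward measure (c).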
Coyle and Bjorklund predicted age differences in variability.
Multiple-strategy use (e.g., number of strategies used) was predicted to
increase with age. The basis for this prediction was that strategy use
would be less effortful for older than for younger children, and so


individual strategy, with some strategies being used more than others.
At no point in development is a single strategy used exclusively.
Rather, multiple strategies are used throughout development.
Considerable evidence supports the view of strategy development
championed by Siegler. Variability in strategy use has been found for
children differing in race, nationality, and intelligence; for problem
domains including arithmetic, serial recall, scientific reasoning,
reading, spelling, and tic-tac-toe; for participants ranging in age from
one year to adulthood; and for analyses examining both group and
individual subject data (Siegler, 1996). In sum, variability in
strategy development appears to be the rule in development, not the
exception.
Contemporary research has investigated possible correlates of
variability in memory strategy use. Possible correlates include
knowledge of the stimulus items and task, psychometrically-measured
intelligence, and the history of effectiveness of a particular strategy.
Empirical support for these correlates has been demonstrated in studies
showing that variability in strategy use is reduced when (a) highly
familiar stimulus items are used (Bjorklund & Bernholtz, 1986; Frankel &
Rollins, 1985), (b) children have very high IQs (Coyle, Colbert, & Read,
1997), and (c) a strategy yields perfect memory performance (McGilly &
Siegler, 1989).
Contemporary research also has examined the developmental course
of variability in strategy use. Siegler has shown that the number of
strategies used depends on amount of experience on a task (Siegler,
1996). In general, few strategies are used when task experience is


The analysis involving data on early trials revealed a significant
grade × stability classification interaction, F(1, 112) = 5.62. No
other significant effects involving stability classification were found.
Examination of the significant interaction revealed a pattern of results
very similar to that observed in the analysis involving all trials. No
differences in recall were found for second graders classified as stable
or unstable, whereas recall for fourth graders classified as stable was
marginally greater than that for fourth graders classified as unstable,
p < .07. Separate grade comparisons within each stability
classification revealed that fourth graders recalled significantly more
than second graders, although the magnitude of this grade difference was
again greater for stable children than for unstable children (mean
fourth grade minus second grade recall difference: .25 and .11 for
stable and unstable children, respectively).
The comparable analysis involving data on later trials revealed a
significant main effect of stability classification, F(1, 112) = 11.10
(mean proportion recall: .72 and .55 for stable and unstable,
respectively). No other significant differences involving stability
classification were found. These findings, along with those for early
trials, demonstrate that stability on early trials is associated with
high levels of recall for older but not younger children, whereas
stability on later trials is associated with high levels of recall for
both age groups.
Conditions of strategy changes. Why do strategy changes occur?
McGilly and Siegler (1989) addressed this question by analyzing the
number of trials on which children showed strategy changes immediately


children may not have the cognitive capacity to execute the strategy
effectively, and so no gain in performance would be realized (Miller,
Woody-Ramsey, & Aloise, 1991). The failure to find consistent effects
associated with the number of words manipulation can be attributed to
any one, or combination of, the above possibilities.
The current study has several implications for conceptualizing and
studying variability and the relation between variability and recall.
First, the current study demonstrates that variability is a multifaceted
phenomenon that can be assessed in various ways. Because no single
measure of variability is likely to describe fully the diversity of
variability in strategy use, future research on variability should
assess multiple measures of variability, or risk ignoring some
potentially important types of variability. Second, the current study
demonstrates that some measures of variability are highly related
whereas others are less strongly related. For example, although unique
combinations was significantly related to other measures of variability,
the magnitude of correlations involving unique combinations was lower
than the magnitude of the correlations involving the other measures of
variability (i.e., number of trials with changes and number of
trial-by-trial changes). These findings demonstrate that different measures of
variability are empirically, and perhaps conceptually, distinct.
Consequently, future research should define precisely what type of
variability is being measured and limit conclusions to that type of
variability. Third, the current study demonstrates that utilization
deficiencies are present for some but not all measures of strategy use.
For example, in the analysis of recall for children showing perfect


VARIABILITY AND UTILIZATION DEFICIENCIES IN CHILDREN'S MEMORY
STRATEGIES: A DEVELOPMENTAL STUDY
By
THOMAS R. COYLE
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1997

For my parents,
Oceania and Roger Coyle

ACKNOWLEDGMENTS
Dissertation acknowledgments typically say little about how those
acknowledged contributed to the student's academic development. Perhaps
this is how it should be, for the main purpose of a dissertation is to
present a student's original contribution to a recognized body of
knowledge. I will adhere to the academic tradition of brevity in these
Acknowledgments, but do so in a way that allows me to recognize specific
contributions of individuals who helped and supported me in completing
this dissertation. I first acknowledge faculty, and then acknowledge my
family.
I wish to thank Dr. James Algina for his contribution in teaching
me how to do some of the statistics in this dissertation, particularly
the analysis in which measures of variability were converted to z-scores
and analyzed simultaneously. I also thank Dr. Algina for discussions,
which I initiated, on issues pertaining to tenure in the university and
E. D. Hirsch's notion of cultural literacy.
I wish to thank Dr. David Bjorklund for convincing me to devote my
life to studying developmental psychology. Dr. Bjorklund was
instrumental in my training during the early part of my career, and he
deserves much credit for my achievements. Dr. Bjorklund has shown me
that the most exciting aspect of science is discovery, that description
is a reasonable goal for science, that the best research questions are
those that can account for the most data, and that Peter Kapitsa knew
what he was talking about when he said, "Theory is a good thing but a
good experiment lasts forever."
I wish to thank Dr. Shari Ellis for suggesting that I examine
patterns of variability within individual subjects and the effectiveness
of individual strategy combinations. Dr. Ellis's suggestions were
incorporated into this dissertation and into Coyle and Bjorklund (1997).
I also thank Dr. Ellis for discussions, which I initiated, on funding
for education and on cross-cultural research.
I wish to thank Dr. Ira Fischler for bringing to my attention
several articles in the adult literature that utilize procedures for
assessing intentionality in cognition, notably Jacoby's (1991)
process-dissociation approach. The intentionality issue is often neglected in
strategy research, even though some researchers have made intentionality
the sine qua non of strategy use.
I wish to thank Dr. Patricia Miller for emphasizing the continuous
nature of strategy classifications. Her contribution is acknowledged in
Bjorklund and Coyle (1995, p. 166), and can be identified in the
analyses presented in this dissertation. I also thank Dr. Miller for
suggesting that I analyze qualitative differences in strategy use. Such
an analysis was performed for this dissertation, and it yielded some
interesting results.
I wish to thank Dr. Scott Miller for suggesting that I think
carefully about defining and measuring cognitive strategies. It is
interesting that defining cognitive strategies never has been a
favorite pastime of strategy researchers who study them. I also thank
Dr. Miller for his careful and timely reviews of my manuscripts,
including my dissertation. I have yet to find anyone whose knowledge of
APA guidelines is as expansive as Dr. Miller's, and I probably never
will. I also wish to thank Dr. Miller for his contribution to the
Developmental area while he was on sabbatical.
I wish to thank Jennifer L. Slawinski for her suggestions
regarding the design of my dissertation. I also thank Miss Slawinski
for the clever idea of applying a sequential design in the context of a
microgenetic experiment. Finally, I thank Miss Slawinski for her
incisive comments on examining gender effects and on reanalyzing
archival data.
I wish to thank the research assistants who helped with data
collection, analyses, and interpretation. These include Joshua List,
Chad Colbert, Victoria Otero, and Jerusha Azel. I suspect I learned as
much from them as they learned from me.
I wish to thank my mother and father, Oceania and Roger Coyle, for
their enduring support during my academic career. My mother and father
have taught me that hard work and perseverance will in the end always
pay off. Most important, my mother and father have taught me that the
most important thing in secular life is family. Their marriage of 35
years (and counting) is why I have acknowledged them together. This
dissertation is a testament to their love and support throughout the
years.
I wish to thank my brother, James Coyle, for his interest in my
work and his continued support of my goals, including the completion of
this dissertation. James is an exceptional guitar player, partly
because of exceptional talent, and partly because of exceptional
practice. I have learned much from observing his work ethic and
dedication to the instrument he loves so much.

I wish to thank my cousin, Annette Fields, for providing me with
support and guidance throughout the years. Annette is an accomplished
lawyer and she has taught me by example the rules and standards of good
argumentation. She has shown me that anyone can rise to the top with
lots of hard work and discipline. Annette's best friend and confidant,
Ellen Ross, always has believed in me and my talents, and her support is
appreciated.
I wish to thank Deborah Hooks for loving me for what I am and,
more importantly, for what I can become. Deborah entered nearly all of
the data for this dissertation, and she provided numerous useful
suggestions about possible analyses. One of her suggestions, to examine
intrusions in children's recall protocols, turned out to be very
promising and provides a possible basis for a new view of strategy
development that includes developmental differences in resistance to
interference. Deborah has taught me that love is the best part of life,
and without it, you really don't have much of a life at all.
To all the members of my family, I love you all.

TABLE OF CONTENTS
Page
ACKNOWLEDGMENTS iii
LIST OF TABLES ix
ABSTRACT xi
INTRODUCTION 1
Memory Strategies Enhance Performance 2
Memory Strategy Development is Stagelike 8
Evaluation of Research on Variability and Utilization
Deficiencies 12
The Current Study 18
Goals of the Current Study 21
METHOD 26
Participants 26
Stimuli and Design 26
Procedure 29
Coding 31
RESULTS 34
Preliminary Analyses 34
Off-Task Behavior and Examination 34
Recall 35
Strategy Use 39
Variability in Strategy Use 41
Multiple-Strategy Use 41
Strategy Change 45
Relation Between Strategy Use and Recall 56
Utilization Deficiencies 56
Strategy Change and Recall 64
DISCUSSION 78
Variability in Strategy Use 79
Multiple-Strategy Use 79
Strategy Changes 82
Relation Between Multiple-Strategy Use and Recall 86
Relation Between Strategy Changes and Recall 93
Conclusions 98
REFERENCES 104
BIOGRAPHICAL SKETCH 108

LIST OF TABLES
Table page
1. Word lists by category membership 27
2. Mean proportion recall by condition, grade, and trial, and by
condition and trial (i.e., collapsed across grade), and
grade differences in recall at each trial by condition 37
3. Percentage (and number) of trials on which each strategy was
used, by condition and grade 40
4. Mean number of strategies used, by grade and trial, and by
condition, grade, and trial 43
5. Mean number of trial-by-trial strategy changes, by grade and
trial transition, and by condition, grade, and trial
transition 47
6. Mean z-scores and raw scores for unique combinations, trials
with changes, and total changes, by grade (standard
deviations in parentheses) 50
7. Percentage (and number) of children classified as stable or
unstable across all trials, on early trials, and on later
trials, by grade 52
8. Percentage (and number) of children changing or not changing
their stability classification across trial blocks, by
grade 54
9. Correlations between number of words recalled and number of
strategies used, by condition, grade, and trial 57
10. Mean proportion recall when strategy use was perfect, by grade
and type of strategy used 60
11. Percentage (and number) of trials on which each strategy
combination was used, and mean proportion recall (and
standard deviations) for each combination, by grade (Codes
for strategies: S, sorting; R, rehearsal; C, clustering; N,
category naming) 62
12. Correlations between measures of strategy change and recall,
by condition and grade 66
13. Correlations among measures of strategy change, by condition
and grade 67
14. Mean proportion recall (and standard deviations) for children
classified as stable or unstable across all trials, on early
trials, and on later trials, by grade 70
15. Percentage (and number) of trials on which strategy changes
did and did not occur immediately after recall was perfect
or not perfect 73
16. Percentage (and number) of trials on which strategy changes
did and did not occur immediately after recall was perfect
or not perfect, by condition 75

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
VARIABILITY AND UTILIZATION DEFICIENCIES IN CHILDREN'S MEMORY
STRATEGIES: A DEVELOPMENTAL STUDY
By
Thomas R. Coyle
August, 1997
Chair: Patricia H. Miller
Cochair: Shari A. Ellis
Major Department: Psychology
The goal of this study was to examine variability in memory
strategy use, and the relation between such variability and recall, as a
function of age and a measure of task difficulty (number of words to
remember). Second and fourth graders received seven sort-recall trials
of different categorizable words (e.g., nurse, lawyer, wall, roof, rose,
lily). The number of words presented varied across trials, whereas the
number of categories represented in all word lists remained constant.
Variability in strategy use was measured in terms of multiple-strategy
use (e.g., number of strategies used across trials) and strategy changes
(e.g., number of trial-by-trial changes in the types of strategies
used). Consistent with previous research, (a) older children used more
strategies and made fewer trial-by-trial changes than younger children;
(b) older children recalled more than comparably strategic younger
children, indicating a utilization deficiency for the younger children;
and (c) older children showed significant and positive relations between
stable-strategy use (i.e., few trial-by-trial changes) and recall,
whereas younger children showed no reliable relation between stability
and recall. This study extended previous research by showing that (a)
stable-strategy use emerges with experience (i.e., over trials) for
older children but not younger children, (b) utilization deficiencies
occur for some but not all instances of perfect strategy use, and (c)
memory benefits from stability occur on early trials (i.e., Trials 1 to
4) for older children but not for younger children, who show memory
benefits from stability on later trials (i.e., Trials 4 to 7) only.
Surprisingly, the results revealed few significant effects related to
changes in the measure of task difficulty (i.e., number of words to
remember). The findings are discussed in terms of how they advance our
knowledge and understanding of utilization deficiencies and variability
in strategy use.

Nature is not economical of structures--only of principles
Abdus Salam

INTRODUCTION
All scientific disciplines are based on a set of core assumptions
(Gholson & Barker, 1985). These assumptions are rarely stated
explicitly and rarely questioned. They serve to direct a researcher's
choice of research questions, data collection procedures, statistical
analyses, and interpretation of research findings.
Two such assumptions were central in early research on children's
memory strategies. The first was that memory strategies usually enhance
memory performance. The second was that memory strategy development
proceeds through a series of stages in which a unique strategy is used
fairly consistently in each stage. These assumptions were implicit in
much of the memory strategy research conducted throughout the 1960s and
1970s.
There are now a number of studies demonstrating that these
assumptions are at best misleading, and at worst, empirically
inaccurate. The next two sections provide a brief history of events
that led to the rise and fall of the view that memory strategies
generally enhance performance and that memory strategy development is
stagelike. The discussion will focus on two concepts central to this
dissertation. The first is the concept of utilization deficiency, which
refers to strategy use with no performance benefit. The second is the
concept of variability in strategy use, which refers to the use of not
one but several different approaches.

Memory Strategies Enhance Performance
The origin of the assumption that memory strategies usually
facilitate memory performance can be traced to strategy training studies
(for a review, see Flavell, 1970). In a typical training study,
children who did not spontaneously use a strategy (e.g., rehearsal) were
trained to do so, frequently showing marked improvements in memory
performance. Such children were said to be production deficient because
they were unable to spontaneously produce a strategy, even though they
could do so and show memory benefits when instructed.
The discovery of production deficiencies was followed by a number
of studies that examined the effectiveness of strategy training. In
general, these studies, like the earlier ones, demonstrated that
children who do not produce a strategy initially can be trained to do so
and show corresponding memory improvements. These findings led to the
assumption that memory strategies typically improve performance and that
the failure to use memory strategies is associated with relatively low
levels of performance (for examples, see Flavell, 1970). This view was
not limited to memory strategies but was implicit in the descriptions of
other cognitive strategies, including those used in analogical
reasoning, arithmetic, and reading (for a historical review, see
Bjorklund, 1992).
The view that memory strategies generally improve performance
began to be questioned in the middle to late 1980s. A number of
developmental studies during this period examined the effectiveness of
various mnemonics. Several of these studies showed that memory
strategies sometimes resulted in little or no benefit to memory,
particularly for younger, less practiced strategy users. For example,
research by Miller and her colleagues (reviewed in Miller & Seier, 1994)
demonstrated that young children using a selective attention strategy
had lower levels of recall than equally strategic older children. Such
findings were not limited to selective attention strategies. Similar
patterns were found in tasks assessing organizational and elaboration
strategies (Bjorklund & Harnishfeger, 1987; Kee & Davies, 1990).
Why did these studies show ineffective strategy use when earlier
research on production deficiencies showed effective strategy use? The
answer may have to do with how strategy use was measured. Production
deficiency research inferred strategy use from patterns of recall
following training. Improvements in recall were interpreted as
indicating the effective use of a mnemonic. No improvements in recall
were interpreted as indicating ineffective strategy training. This
latter pattern of data is ambiguous, however. No improvement in recall
may indicate ineffective training, but it may also indicate ineffective
strategy use. The only sure way to discriminate between these
alternatives is to assess recall and strategy use independently. Later
research on strategy effectiveness did assess recall and strategy use
independently. As mentioned above, these studies showed that increases
in strategy use do not always result in benefits to memory.
In 1990, Miller formally identified evidence of strategy use with
no recall benefits, and labeled such evidence a utilization deficiency
(Miller, 1990). According to Miller, utilization deficiencies occur
when a child produces an appropriate strategy but does not benefit from
it in terms of recall, or benefits less than an equally strategic older
child. Utilization deficiencies are inferred empirically when (a) the
correlation between strategy use and recall is nonsignificant for
younger children but significant for older children, (b) young strategy
users recall no more than their nonstrategic peers, (c) older children
recall more than comparably strategic younger children, and (d) strategy
use increases over trials with no corresponding improvements in memory
performance (for additional examples of utilization deficiencies, see
Miller & Seier, 1994). Evidence for utilization deficiencies has been
found in studies using a variety of memory paradigms and involving
participants ranging in age from preschool to late adolescence (for
reviews, see Bjorklund & Coyle, 1995; Miller & Seier, 1994).
Recent reviews of memory development research have revealed the
ubiquity of utilization deficiencies (Bjorklund & Coyle, 1995; Miller &
Seier, 1994). One such review was conducted by Miller and Seier (1994),
who examined the memory development literature from 1974 through mid-
1992 for evidence of utilization deficiencies in normal populations
(e.g., greater recall for older than comparably strategic younger
children). Miller and Seier used three criteria to select studies
appropriate for the examination of utilization deficiencies: (a)
independent measures of strategy use and recall, (b) spontaneous
strategy production (i.e., training studies were excluded), and (c)
analyses examining age differences in strategy use and performance. Of
the 59 studies they evaluated, 56 (95%) provided evidence for a
utilization deficiency.
Although Miller and Seier limited their review to spontaneous
strategy use, a more recent review by Bjorklund, Miller, Coyle, and
Slawinski (in press) examined utilization deficiencies (e.g., increases
in strategy use but not recall following training) in memory strategy
training studies published between 1968 and 1994. Like Miller and
Seier, Bjorklund et al. selected only studies that included children
from normal populations and reported independent measures of strategy
use and recall. Because studies with multiple-training conditions could
provide multiple cases of evidence of utilization deficiencies, training
conditions within studies (rather than the studies themselves) served as
the units of analysis. Of the 76 relevant training conditions
identified, 39 (51%) showed evidence for utilization deficiencies.
Why did it take the field so long to identify utilization
deficiencies? The most parsimonious explanation, I believe, is that
utilization deficiencies made little sense given the assumption that
dominated much of the early research on production
deficiencies (i.e., that strategies help performance). Consequently,
evidence for a utilization deficiency was often ignored or overlooked.
A conceptual shift occurred in the mid-1980s when a number of studies
examined independent measures of strategy use and recall (for a review
of these studies, see Miller & Seier, 1994). Several studies during
this period reported that strategy use resulted in no or little recall
benefit. The accumulation of such evidence made it difficult to ignore
data indicating that strategies did not always enhance memory
performance. A new view of strategy use emerged, one that considered
the possibility of ineffective strategy use. This view made possible
the discovery of utilization deficiencies in research on memory
strategies.
Contemporary research has investigated the causes and consequences
of utilization deficiencies. Possible causes of utilization
deficiencies include inadequate capacity for both strategy production
and effective encoding, limited knowledge of stimulus items or task
requirements, and insufficient metamnemonic knowledge of when and how to
use strategies (Miller & Seier, 1994). Empirical support for these
causes has been demonstrated in studies showing that utilization
deficiencies are reduced or eliminated when (a) the capacity required
for accessing or executing a strategy is eliminated by having an
experimenter carry out the strategy (Miller, Woody-Ramsey, & Aloise,
1991); (b) the stimulus items or task requirements are highly familiar
and embedded in a meaningful context (Miller, Seier, Barron, & Probert,
1994); and (c) metamnemonic instruction is provided regarding the
cause-and-effect relation between strategy use and recall (Ringel &
Springer, 1980). Of the three causes mentioned, inadequate mental
capacity and limited knowledge base have received the most empirical
support. Other potential causes of utilization deficiencies, including
inadequate strategic monitoring, failure to link one strategy with
another, and failure to inhibit an earlier, ineffective strategy (see
Bjorklund & Coyle, 1995; Miller & Seier, 1994), have received little
attention.
Contemporary research has also examined the development of
utilization deficiencies. The general finding is that young children
are more apt to show a utilization deficiency than older children.
Compared to older children, young children (a) recall less when using
the same strategies, (b) show lower correlations between strategy use
and recall, and (c) show more instances of strategy increases over
trials with no corresponding increases in recall (see Miller & Seier,
1994). These findings demonstrate that young children are less likely
to benefit from strategy use than older children, which is evidence of a
utilization deficiency for young children.
A recent study by Coyle and Bjorklund (1996) showed that
utilization deficiencies have different developmental consequences for
memory depending on when they occur. In this study, children in second
through fourth grade received a multitrial sort-recall task, with
different sets of categorizable words on each trial. Children were
classified as utilizationally deficient or not based on their pattern of
clustering and recall over trials. Children were classified as
utilizationally deficient if they showed increases in clustering over
trials with no corresponding increases in recall. All other children
were classified as nonutilizationally deficient. Mean recall varied as
a function of grade and utilization deficiency classification. Second-
and third-grade utilizationally deficient children recalled more on
average than their nonutilizationally deficient agemates, most of whom
used no strategy at all. Conversely, fourth-grade utilizationally
deficient children recalled less than their nonutilizationally deficient
agemates, most of whom were using strategies effectively.
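The classification rule described in this paragraph can be made concrete with a short sketch. It is only an illustration: the text does not state how an "increase over trials" was scored, so the least-squares slope and the zero threshold used here are assumptions, as are the function names and the sample scores.

```python
from typing import Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against trial index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two trials")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def is_utilization_deficient(clustering: Sequence[float],
                             recall: Sequence[float]) -> bool:
    """Hypothetical rule (an assumption, not the published scoring):
    clustering rises over trials while recall does not."""
    return slope(clustering) > 0 and slope(recall) <= 0


# Invented scores for one child over five trials.
clustering_scores = [0.1, 0.3, 0.5, 0.6, 0.8]  # clustering increases
recall_scores = [8, 8, 7, 8, 7]                # recall does not increase
flagged = is_utilization_deficient(clustering_scores, recall_scores)
```

Under this assumed rule, every child who is not flagged would fall into the nonutilizationally deficient group, which matches the two-way classification described in the text.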
The Coyle and Bjorklund findings demonstrate that utilization
deficiencies have different memory consequences depending on when they
occur in development. Utilization deficiencies that occur early in
development are associated with relatively high levels of memory
performance, because the dominant alternative pattern is no strategy use
and low levels of recall. Conversely, utilization deficiencies that
occur later in development are associated with relatively poor recall,
because the dominant alternative pattern is effective strategy use and
high levels of recall.
Memory Strategy Development is Stagelike
The origin of the assumption that memory strategy development is
stagelike can be traced to research on spontaneous (i.e., uninstructed)
strategy use in the late 1960s and 1970s. One goal of this research was
to identify the types of strategies used by different age groups. To do
this, children of different ages were presented with a memory task and
their mnemonic behaviors were recorded and compared. The general
finding was that children in each age group typically used a different
and unique strategy for remembering. This finding was remarkably
consistent across a variety of research paradigms. Serial recall
studies showed that young children often use no rehearsal strategy,
older children often use single-word rehearsal, and still older children
use cumulative rehearsal (Flavell, Beach, & Chinsky, 1966; Ornstein,
Naus, & Liberty, 1975). Organizational memory tasks showed that young
children often organize words along thematic dimensions whereas older
children often organize words along taxonomic dimensions (Ceci & Howe,
1978). Paired-associate learning tasks showed that young children often
form arbitrary links between word pairs whereas older children often
form relational links between word pairs (for a review, see Kee, 1994).
These early findings depicted memory strategy development as a
stagelike progression (Siegler, 1995). Stage descriptions were not
limited to memory strategies but included strategies in such diverse
domains as arithmetic, number conservation, and scientific reasoning
(Siegler, 1996). Young children were described as using one approach,
older children as using a different approach, and still older children
as using yet another approach. At each age children were described as
using a single and unique strategy. Strategy development consisted of
one strategy being replaced by another more advanced strategy.
Although a stagelike pattern of strategy development appeared to
describe well the pattern of data in early studies, evidence
inconsistent with a stagelike progression was reported in the mid- to
late-1980s. Several studies during this period demonstrated that
children of a particular age used not one but several strategies. Such
variability was found across a variety of tasks, including ones
assessing memory strategies. For example, children asked to remember a
series of digits sometimes used no rehearsal strategy, sometimes
rehearsed only one digit at a time, and sometimes rehearsed all digits
together (McGilly & Siegler, 1989). Children asked to remember the
location of a hidden object sometimes talked about where the object was
hidden, sometimes stayed near the hiding place, and sometimes pointed to
the hiding location (DeLoache, 1984). Children presented with a paired-
associate learning task sometimes repeated the names of the items and
sometimes formed a sentence or image linking the word pairs (reviewed in
Kee, 1994). Children asked to remember a series of objects sometimes
visually inspected the objects, sometimes named the objects, and
sometimes physically manipulated the objects (Baker-Ward, Ornstein, &
Holden, 1984; Lange, MacKinnon, & Nida, 1989).
Why was variability reported in these studies when early research
on strategy development had reported a stagelike progression? As with
utilization deficiencies, the answer has to do with how strategies were
assessed. Early research on memory strategy development typically
classified children as using a single strategy only. Children might be
identified as rehearsing, sorting, or elaborating, but no child was
identified as using more than one strategy. Although variability was
present across individuals, data were often presented in terms of the
dominant strategy used at each age (Flavell, Beach, & Chinsky, 1966).
This type of data presentation, along with the strategy assessment
procedures, depicted memory strategy development as a series of stages.
Later research on memory strategy development assessed the possibility
of intraindividual variability in strategy use (i.e., multiple
strategies being used by a particular individual). This research
assessed several strategies on a particular trial or different
strategies across trials. Under these conditions, children showed
considerable variability in strategy use, often using a variety of
approaches within and across trials.
Evidence for variability in strategy use led to a new view of
strategy development championed by Robert Siegler (1996). Siegler
argued that strategy development does not involve the replacement of
one strategy by another, as implied by stage theories. Instead, he argued
that strategy development involves changes over time in the frequency of
occurrence of several strategic approaches. According to Siegler, at
any given age children use not one but a variety of strategies.
Strategy development consists of changes in the frequency of use of each
individual strategy, with some strategies being used more than others.
At no point in development is a single strategy used exclusively.
Rather, multiple strategies are used throughout development.
Considerable evidence supports the view of strategy development
championed by Siegler. Variability in strategy use has been found for
children differing in race, nationality, and intelligence; for problem
domains including arithmetic, serial recall, scientific reasoning,
reading, spelling, and tic-tac-toe; for participants ranging in age from
one year to adulthood; and for analyses examining both group and
individual subject data (Siegler, 1996). In sum, variability in
strategy use appears to be the rule in development, not the
exception.
Contemporary research has investigated possible correlates of
variability in memory strategy use. Possible correlates include
knowledge of the stimulus items and task, psychometrically measured
intelligence, and the history of effectiveness of a particular strategy.
Empirical support for these correlates has been demonstrated in studies
showing that variability in strategy use is reduced when (a) highly
familiar stimulus items are used (Bjorklund & Bernholtz, 1986; Frankel &
Rollins, 1985), (b) children have very high IQs (Coyle, Colbert, & Read,
1997), and (c) a strategy yields perfect memory performance (McGilly &
Siegler, 1989).
Contemporary research also has examined the developmental course
of variability in strategy use. Siegler has shown that the number of
strategies used depends on amount of experience on a task (Siegler,
1996). In general, few strategies are used when task experience is
limited, several different strategies are used when experience is
moderate, and few strategies are again used when experience is
extensive. Thus, the number of strategies used, plotted as a function
of task experience, produces an inverted-U shaped pattern. This pattern
has been found across a variety of strategic tasks.
Finally, contemporary research has shown that initial levels of
variability have implications for subsequent learning. For example,
Goldin-Meadow and her colleagues (Goldin-Meadow, Alibali, & Church,
1993) have shown that children who displayed high levels of variability
on a conceptual learning task showed increases in task performance
following instruction or practice. In contrast, children who displayed
low levels of variability typically showed no or relatively little
improvement in performance. Similarly, Siegler (1995) has shown that
children who used several strategies on a number conservation task
showed increases in subsequent learning. In contrast, children who used
few strategies on a number conservation task showed relatively little
change in subsequent learning. These findings raise the intriguing
possibility that variability may provide an index of when change is
likely to occur and when change can be induced to occur (cf. Thelen &
Smith, 1994).
Evaluation of Research on Variability and Utilization Deficiencies
The discovery of utilization deficiencies and variability has had
two important consequences in strategy development research. First,
several models of memory strategy development now explicitly account for
utilization deficiencies and variability. These models assume that
variability is present at all points in development (Siegler, 1996;
Thelen & Smith, 1994), and that memory strategies have costs as well as
benefits (Bjorklund & Coyle, 1995; Miller & Seier, 1994). Second,
several studies have been designed with the explicit intent of assessing
utilization deficiencies and variability (Bjorklund, Coyle, & Gaultney,
1992; Coyle & Bjorklund, 1996; Miller, Seier, Barron, & Probert, 1994;
Siegler & Jenkins, 1989). These studies do not view utilization
deficiencies and variability as anomalies to be discounted, but consider
them as being worthy of study in their own right and deserving of
explanation.
Although utilization deficiencies and variability have received
considerable attention in contemporary strategy research, current
research investigating these phenomena is limited in at least three
ways. First, utilization deficiencies have been described almost
exclusively on tasks in which only a single strategy is assessed on all
trials. In most studies, children are said to be utilizationally
deficient when they use a particular strategy (e.g., rehearsal or
elaboration) and show no or little memory benefit or less benefit than
that shown by more experienced strategy users. Because only a single
strategy is assessed, the possibility of utilization deficiencies in
multiple-strategy use cannot be examined. Instead, the focus is on the
ineffective use of a particular strategy.
Second, variability generally has been assessed on multitrial
tasks in which only one strategy per problem solving trial is assessed.
In most studies, children are credited with using a single strategy each
time they are presented with a problem, although they can (and usually
do) use a variety of strategies across different problems. Because only
one strategy per trial is assessed, the possibility of variability
within a particular trial (i.e., intratrial variability) cannot be
examined. Instead, the focus is on variability across trials (i.e.,
intertrial variability).
Third, variability has been measured almost exclusively in terms
of the number of strategies used. Although the number of strategies
used is one measure of variability, it is not the only one. A handful
of other studies have shown that variability can be measured in other
ways, including the number of trial-by-trial changes in strategy use,
the degree of stability in the sequence of strategy production across
several trials, and the number of instances when one strategy is
expressed in gesture and a different one in speech (Coyle & Bjorklund,
1997; Coyle, Colbert, & Read, 1997; Goldin-Meadow et al., 1993). These
studies demonstrate that variability can be measured in not one but
several ways. A single measure, such as the number of strategies used,
does not capture all possible patterns of variability, and different
kinds of variability may have different causes and consequences.
A recent study by Coyle and Bjorklund (1997) addressed these
limitations. Some time will be spent describing this study because its
design and findings figure prominently in the study developed for this
dissertation. Children in second through fourth grade received five
sort-recall trials of categorizable words. Unlike other multitrial
experiments (e.g., Bjorklund, 1988), different items and categories were
used on each trial, so that any increases in strategy use could not be
attributed to increased familiarity with a particular set of stimulus
items.
Multiple strategies were assessed on each trial. This permitted
assessment of variability within trials, as well as variability across
trials. It also permitted assessment of utilization deficiencies in
multiple-strategy use. The four strategies assessed on each trial were
sorting, physically moving or arranging the words into groups;
rehearsing, saying out loud or mouthing the items; category naming,
saying the category name of a group of words; and clustering, recalling
the words by categories. Each strategy was coded as occurring or not on
each trial. The measure of performance was the number of words recalled
on each trial.
Unlike previous studies, variability was measured in not one but
several ways. Variability was measured in terms of (a) average number
of strategies used across trials, (b) number of trials on which multiple
strategies were used, (c) number of trials on which the combination of
strategies differed from the preceding trial, and (d) total number of
strategy changes on consecutive trials, counting both strategy additions
and deletions as changes. The first two measures (average number of
strategies and number of trials with multiple strategies) are examples
of multiple-strategy use. These are the most frequently reported
measures of variability. The last two measures (trials with changes and
total number of changes) are examples of strategy change. These
measures assess changes over time and are reported less frequently.
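The four measures listed above can be illustrated with a minimal sketch. It assumes each trial is coded as the set of strategies observed on that trial. The function name, the dictionary keys, and the sample child are hypothetical and are not taken from Coyle and Bjorklund (1997).

```python
from typing import Dict, List, Set


def variability_measures(trials: List[Set[str]]) -> Dict[str, float]:
    """trials[i] is the set of strategies coded as occurring on trial i."""
    if not trials:
        raise ValueError("need at least one trial")
    # Consecutive (preceding, current) trial pairs.
    pairs = list(zip(trials, trials[1:]))
    return {
        # (a) average number of strategies used across trials
        "mean_strategies": sum(len(t) for t in trials) / len(trials),
        # (b) number of trials on which multiple strategies were used
        "multi_strategy_trials": sum(1 for t in trials if len(t) > 1),
        # (c) number of trials whose combination differs from the preceding one
        "trials_with_change": sum(1 for prev, cur in pairs if prev != cur),
        # (d) total additions plus deletions (size of the symmetric difference)
        "total_changes": sum(len(prev ^ cur) for prev, cur in pairs),
    }


# Invented coding for one child over five trials.
child = [
    {"sorting"},
    {"sorting", "rehearsal"},
    {"sorting", "rehearsal"},
    {"clustering"},
    {"sorting", "clustering"},
]
measures = variability_measures(child)
```

For this invented child, measures (a) and (b) index multiple-strategy use, and measures (c) and (d) index strategy change. Moving from the third to the fourth trial counts as three changes under (d): two deletions and one addition.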
Coyle and Bjorklund predicted age differences in variability.
Multiple-strategy use (e.g., number of strategies used) was predicted to
increase with age. The basis for this prediction was that strategy use
would be less effortful for older than for younger children, and so
older children would have the capacity to produce additional strategies.
Strategy changes (e.g., trial-by-trial changes in strategy use) were
predicted to be high and comparable for both age groups. This
prediction was based on research showing that strategy changes occur
frequently in development, across a wide range of ages and on a variety
of tasks (Siegler, 1995, 1996).
Coyle and Bjorklund also predicted age differences in the relation
between variability and recall. Multiple-strategy use was predicted to
correlate with recall for older but not younger children. This
prediction was based on the assumption that older children would have
the mental capacity to produce and effectively use multiple strategies.
In contrast, multiple-strategy use was expected to consume so much of
young children's limited mental capacity that little would remain for
recall, resulting in a utilization deficiency. Strategy change, in
particular stable-strategy use (i.e., few trial-by-trial changes in
strategy use), was predicted to correlate with recall for older but not
younger children. This prediction was based on research showing that
older children are likely to stick with a single approach that yields
optimal performance, whereas younger children frequently use a variety
of ineffective approaches (Lemaire & Siegler, 1995).
The findings were generally consistent with the predictions.
Multiple-strategy use was greater for older than for younger children.
Although children of all ages used more than one strategy across trials,
older children used more strategies and had more trials with multiple
strategies than did younger children. Strategy changes were high and
comparable for children in all age groups. Although considerable
variability was observed for all age groups, a (nonsignificant) age-
related decline in variability was observed. Older children showed
fewer changes on consecutive trials and had fewer trials with changes
than younger children. These findings were confirmed in an analysis of
strategy change within individual subjects. Although Coyle and
Bjorklund (1997) paid little attention to the age-related declines in
strategy changes, emphasizing instead pervasive variability at all ages,
subsequent research has found considerable evidence for age-related
declines in variability across a variety of tasks and for children
varying widely in age (Coyle, Colbert, & Read, 1997; for a review, see
Siegler, 1996). In general, older and more experienced strategy users
show fewer strategy changes than younger and less experienced strategy
users.
Further analysis revealed relations between variability and memory
performance. As predicted, multiple-strategy use was related to recall
for older children, who showed significant and positive relations
between number of strategies used and recall. Younger children showed
no reliable relation between number of strategies used and recall,
indicating a utilization deficiency. In addition, stable-strategy use
(i.e., few strategy changes across trials) was significantly related to
high levels of recall, but only for the older age groups. That is,
third- and fourth-grade children who consistently used a particular
strategy combination had higher levels of recall than their peers whose
strategy use was less consistent. No reliable relation between
variability and recall was found for the youngest children.
Taken together, these findings extend current research on
utilization deficiency and variability in several ways. Specifically,
they provide evidence for (a) utilization deficiencies in multiple-
strategy use, (b) several different types of variability, including
multiple-strategy use and strategy change, and (c) variability in
strategy use within a particular trial, as well as between trials.
The Current Study
The purpose of the current study was to further examine issues
concerning utilization deficiencies and variability using the procedures
developed by Coyle and Bjorklund (1997). As in Coyle and Bjorklund,
children received a multitrial sort-recall task with different words and
categories on each trial. Also as before, multiple strategies were
assessed on each trial and variability was measured in several ways.
The strategies assessed were sorting, rehearsal, clustering, and
category naming. The measures of strategy variability were number of
strategies used on each trial, number of strategy changes on consecutive
trials, number of unique combinations, and number of trials with
strategy changes.
The current study differed from the study by Coyle and Bjorklund
in two important ways, each of which permitted new research questions
concerning utilization deficiencies and variability in strategy use.
First, in the current study children received seven sort-recall trials,
two more than in Coyle and Bjorklund. The additional trials permitted a
more detailed analysis of strategy change during the testing session.
It was now possible to assess periods of stability and instability
within individual children during early trials and again during later
trials. In contrast, the study by Coyle and Bjorklund assessed
stability and instability using data on all trials.
Second, in the current study the number of words presented varied
across trials from six to fifteen, whereas in Coyle and Bjorklund the
number of words presented remained constant across trials at eighteen.
Although the number of words varied across trials in the current study,
the number of categories represented on each trial remained constant at
three. This eliminated the possibility that changes in strategy use and
recall would result from changes in the number of categories represented
across trials, and ensured that such changes could be attributed to
variation in the number of words presented. The design permitted an
examination of whether children adapt their strategy use to changes in
the number of words on each trial. It was now possible to assess
measures of variability, including multiple-strategy use and strategy
changes, when children were presented with relatively few words or many
words. In contrast, the study by Coyle and Bjorklund assessed
variability under conditions in which task demands (i.e., number of
words on each trial) remained constant.
Apart from the differences mentioned above, the design of the
current study was very similar to the one used by Coyle and Bjorklund.
Second- and fourth-grade children were given seven sort-recall trials of
categorizable words. As in Coyle and Bjorklund, different words and
categories were used on each trial to minimize the likelihood that
increases in strategy use would result from practice with a particular
set of categorizable items. Also as in Coyle and Bjorklund, category
items were chosen to avoid high associations between words, thus
minimizing the likelihood of clustering as a result of the automatic
activation of semantic memory relations.
Approximately half the children in each grade were assigned to each
of two conditions, labeled ascending/descending and
descending/ascending. In the ascending/descending condition, children
received an increasing number of words on each successive trial until
Trial 4, and then received a decreasing number on each successive trial
(number of words on Trials 1-7, respectively, was 6, 9, 12, 15, 12, 9,
and 6). The descending/ascending condition was the complement of the
ascending/descending condition. In the descending/ascending condition,
children received a decreasing number of words on each successive trial
until Trial 4, and then received an increasing number of words on each
successive trial (number of words on Trials 1-7, respectively, was 15,
12, 9, 6, 9, 12, 15). Each condition had trials with the same number of
words, so that effects concerning number of words presented could be
teased apart from effects concerning the ascending or descending order
in which words in each condition were presented.
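The two presentation orders can be reproduced with a short sketch. The word counts (6 to 15 in steps of 3 over seven trials) come from the text. The function name and its parameters are hypothetical.

```python
from typing import List


def word_counts(condition: str, low: int = 6, high: int = 15,
                step: int = 3) -> List[int]:
    """Number of words on Trials 1-7 for the named condition."""
    ascending = list(range(low, high + 1, step))  # 6, 9, 12, 15
    if condition == "ascending/descending":
        # Rise to the Trial 4 peak, then mirror back down.
        return ascending + ascending[-2::-1]
    if condition == "descending/ascending":
        # Fall to the Trial 4 trough, then mirror back up.
        descending = ascending[::-1]
        return descending + descending[-2::-1]
    raise ValueError(f"unknown condition: {condition}")


asc_desc = word_counts("ascending/descending")
desc_asc = word_counts("descending/ascending")

# Size of the change between each pair of consecutive trials.
step_sizes = {abs(b - a) for a, b in zip(asc_desc, asc_desc[1:])}
```

Both orders use the same four list lengths and change by a constant amount on each successive trial, so effects of list length can be separated from effects of the order of presentation.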
These conditions were developed to examine changes in strategy use
and performance as a function of the number of words presented on
successive trials. Two additional sets of conditions were considered
but not selected. The first involved presenting trials in the
ascending/descending and descending/ascending conditions in random order,
without having a constant rate of increase or decrease across trials.
For example, Trials 1 to 7 in the ascending/descending condition might
be ordered 12, 15, 9, 9, 6, 12, and 6, respectively, whereas Trials 1 to
7 in the descending/ascending condition might be ordered 9, 15, 15, 6,
12, 9, and 12, respectively. Unlike the conditions in the current
study, these presentation orders would randomly vary the amount of
increase or decrease on successive trials. Consequently, they would
confound changes in the number of words presented on successive trials
with the magnitude of such changes. The design of the current study
eliminated this confound by holding constant the rate of change at three
words.
A second set of conditions that was considered included
an ascending-only series and a descending-only series. The idea was to
extend the pattern in the early trials of each condition in the current
study. Thus, Trials 1 to 7 in the ascending series would have 6, 9, 12,
15, 18, 21, and 24 words, respectively, whereas Trials 1 to 7 in the
descending series would have 24, 21, 18, 15, 12, 9, and 6 words,
respectively. Unlike the conditions in the current study, these
presentation orders do not reverse the pattern of change in the latter
trials. Consequently, effects regarding possible strategic adaptation
to reversal of presentation order could not be assessed. Furthermore,
it was not clear why strategic adaptation and recall performance
would vary beyond 15 words, when the number of words
presented would exceed children's memory capacity (Miller, 1956).
Goals of the Current Study
The current study had three goals. The first was to examine
differences in measures of strategy variability (e.g., multiple-strategy
use and strategy change) as a function of grade and number of words
presented on each trial. As in Coyle and Bjorklund (1997), multiple-
strategy use (e.g., number of strategies used per trial) was predicted
to increase with age. This prediction was based on research showing
that strategies are capacity-demanding operations and that strategy
production consumes less capacity with age (Kee, 1994). Thus, older
children, who use relatively little capacity during strategy production,
should produce more capacity-consuming strategies than younger children.
In addition, multiple-strategy use was predicted to be greater on
trials with relatively many words (i.e., 12 or 15 words) than on trials
with relatively few words (i.e., 6 or 9). This prediction was based on
the assumption that trials with many words would induce children to use
additional memory strategies because recall of all words on these trials
is beyond children's memory capacity (Miller, 1956). In contrast,
trials with few words should not have this effect because recall of all
words is within children's memory capacity. Thus, children are expected
to use multiple strategies only when they cannot perform optimally
without doing so (cf. McGilly & Siegler, 1989). These predictions may
be qualified by age, with older children having greater capacity for
using multiple strategies than younger children.
On the basis of the findings in Coyle and Bjorklund (1997) and in
other studies (Coyle, Colbert, & Read, 1997; Lemaire & Siegler, 1995),
strategy changes (e.g., trial-by-trial changes in strategy use) were
predicted to decrease with age. In addition, strategy changes were
predicted to decrease over the course of the testing session, especially
for older children. This latter prediction was based on models of
strategy variability proposing that task-relevant experience is
associated with decreases in trial-by-trial changes in strategy use
(Siegler, 1996; Thelen & Smith, 1994). Thus, children should show
relatively few strategy changes during the later trials of the sort-
recall task, when they have had considerable task-related experience.
Strategy changes were predicted to vary according to the number of
words presented on each trial. Specifically, strategy changes were
predicted to rarely follow trials with relatively few words (i.e.,
trials with 6 and 9 words), but to frequently follow trials with
relatively many words (i.e., trials with 12 and 15 words). These
predictions were based on research showing that strategy changes rarely
follow perfect performance but frequently follow less than perfect
performance (McGilly & Siegler, 1989). Because perfect recall was
likely on trials with few words but not on trials with many words, it
was predicted that strategy changes would be less frequent on trials
with few words compared to trials with many words.
The second goal of the current study was to examine the relation
between multiple-strategy use and recall as a function of age and number
of words on each trial. A specific aim was to examine data for possible
evidence of utilization deficiencies. Utilization deficiencies were
predicted to be less frequent for older children than for younger
children. This prediction was based on research examining evidence of
utilization deficiencies for children of different ages. For example,
Miller and Seier (1994) have shown that correlations between strategy
use and recall are often positive and significant for older but not
younger children, and have interpreted this as evidence of a utilization
deficiency for the younger children. Similarly, Coyle and Bjorklund
(1996) have shown that younger children recall less than comparably
strategic older children and have interpreted this finding as
demonstrating a utilization deficiency for younger children. To date,
research on utilization deficiencies has examined the effectiveness of a
single strategy (e.g., clustering or rehearsal), or, in a few cases, the
effectiveness of multiple strategies. The current study examines the
effectiveness of both single- and multiple-strategy use in a single
paradigm, and compares directly the incidence of utilization deficiency
when children use one or several strategies.
Utilization deficiencies were predicted to be less frequent on
trials with relatively few words (i.e., 6 or 9 words) than on trials
with relatively many words (i.e., 12 or 15 words). This prediction was
based on the assumption that trials with few words would consume less of
children's limited mental capacity than trials with many words. Thus,
additional capacity should be available for efficient strategy
utilization on trials with few words. Consequently, utilization
deficiencies should be less frequent on trials with few words compared
to trials with many words. This prediction may be qualified by age, with
older children's superior processing capacity permitting effective
strategy use on all trials, irrespective of the number of words
presented.
The third and final goal of the current study was to examine the
relation between strategy changes and recall as a function of age. On
the basis of the findings in Coyle and Bjorklund (1997) and other
studies (Lemaire & Siegler, 1995), the relation between strategy change
and recall was predicted to be negative and significant for older but
not younger children. That is, few strategy changes across trials
(i.e., stable-strategy use) were predicted to result in high levels of
recall for fourth graders but not second graders. A further prediction
was that the relation between stability and recall may be more apt to
occur on later trials (Trials 4 to 7) than on early trials (Trials 1 to
4). This prediction was based on research showing that children
initially show inconsistent and ineffective strategy use, but later
settle into a stable and optimal state of strategic responding (Siegler,
1996; Thelen & Smith, 1994). Because the measures of strategy
variability were computed from data aggregated across trials, no
predictions concerning the impact of number of words on the relation
between strategy change and recall could be made.
A final prediction concerned the conditions under which strategy
changes occur. Strategy changes were predicted to occur less frequently
when recall was perfect on the immediately preceding trial than when
recall was not perfect on the immediately preceding trial. This
prediction was based on the findings of a serial-recall study by McGilly
and Siegler (1989). In that study, children who had been given a series
of serial-recall trials tended to switch strategies when their
performance was less than perfect on the preceding trial, but not when
their performance was perfect on the preceding trial. That is, children
tended to stick with a particular approach when it had yielded optimal
performance but switched approaches when the previous one had yielded
less than optimal performance. This pattern is known as the win-
stay/lose-shift approach in the decision-making literature (Eimas,
1969). Such a pattern may vary with the number of words presented on
each trial. Trials with few words should provide greater opportunity
for perfect recall, which should result in few strategy changes.

METHOD
Participants
Participants were 69 second graders, 36 boys and 33 girls (mean
age = 7 years 8 months, SD = 6.42 months), and 51 fourth graders, 21
boys and 30 girls (mean age = 9 years 7 months, SD = 5.00 months).
Children were recruited from schools and recreation centers in
Gainesville, Florida. The majority of children were White (80%) and
came from middle- and upper-middle-income households.
Stimuli and Design
Seven lists of categorically related words were constructed (three
categories per list, five words per category; see Table 1). The lists
were composed of words reported in three analyses of category norms
(Bjorklund, Thompson, & Ornstein, 1983; Posnansky, 1978; Uyeda &
Mandler, 1980). Each word was printed on a 3 x 5 in. (7.6 x 12.7 cm)
index card. Different words and categories were used on each list.
Items in each list varied in category typicality, with most items being
in the top-third frequency ranking for a particular category. Highly
associated words within a particular category (e.g., dog, cat; salt,
pepper) were avoided, thus minimizing the likelihood that clustering
would result from the automatic activation of semantic memory relations
(Frankel & Rollins, 1985; Schneider, 1986). Previous research has shown
that children in the age range tested here had little difficulty
defining items like the ones used in the current study (Coyle &
Bjorklund, 1997).

Table 1
Word Lists By Category Membership

List 1                List 2          List 3          List 4

Occupations           Trees           Metals          Buildings
  Carpenter             Willow          Copper          Tepee
  Lawyer                Maple           Brass           Castle
  Nurse                 Palm            Tin             Igloo
  Dentist               Oak             Iron            Church
  Farmer                Pine            Silver          Barn

Parts of a House      Beverages       Weapons         Reading Material
  Window                Tea             Sword           Book
  Roof                  Milk            Grenade         Journal
  Door                  Soda            Cannon          Newspaper
  Stairs                Water           Spear           Magazine
  Ceiling               Coffee          Knife           Letter

Sports                Jewelry         Vegetables      Birds
  Soccer                Earrings        Cabbage         Sparrow
  Golf                  Necklace        Onion           Eagle
  Tennis                Crown           Celery          Parrot
  Football              Watch           Peas            Dove
  Hockey                Bracelet        Corn            Owl

List 5                List 6          List 7

Weather Phenomenon    Flowers         Animals
  Wind                  Daisy           Horse
  Snow                  Orchid          Zebra
  Rain                  Tulip           Pig
  Fog                   Lily            Tiger
  Hail                  Rose            Cat

Furniture             Vehicles        Cloth
  Couch                 Bus             Cotton
  Lamp                  Plane           Satin
  Chair                 Boat            Silk
  Bed                   Car             Wool
  Dresser               Motorcycle      Velvet

Musical Instruments   Time            Body Parts
  Drums                 Year            Foot
  Tuba                  Decade          Elbow
  Violin                Month           Neck
  Flute                 Hour            Mouth
  Piano                 Century         Hand

Each child received seven sort-recall trials. A different list of
words was presented on each trial. Children were assigned to one of two
conditions. In both conditions, three categories were represented in
the word lists on all trials. However, the number of words in each
category varied systematically across trials. In the
ascending/descending condition, the number of items in each category was
2, 3, 4, 5, 4, 3, and 2 on trials 1 through 7, respectively. Thus, the
total number of items presented on trials 1 through 7 was 6, 9, 12, 15,
12, 9, and 6. In the descending/ascending condition, the number of
items in each category was 5, 4, 3, 2, 3, 4, and 5 on trials 1 through
7, respectively. Thus, the total number of items presented on trials 1
through 7 was 15, 12, 9, 6, 9, 12, and 15. The sum of all items in the
descending/ascending condition (78) was greater than the sum of all items in
the ascending/descending condition (69). In each condition, the seven lists
were presented in 1 of 10 predetermined random orders. Each list was
presented on each of the seven trials approximately equally, and all
items within a list were used approximately equally. This resulted in
a 2 (grade: second vs. fourth) x 2 (condition: ascending/descending vs.
descending/ascending) x 7 (trial) design, with repeated measures on the
trial factor.
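
The list lengths described above can be illustrated with a short sketch. It is offered only as an illustration and was not part of the study materials. The per-category counts are taken from the text; the names CATEGORIES_PER_LIST, PER_CATEGORY, words_per_trial, and step_sizes are hypothetical and chosen here for readability.

```python
# Illustrative sketch only. The per-category counts come from the text;
# all identifiers are hypothetical names introduced for this example.
CATEGORIES_PER_LIST = 3

# Number of words per category on Trials 1-7 in each condition.
PER_CATEGORY = {
    "ascending/descending": [2, 3, 4, 5, 4, 3, 2],
    "descending/ascending": [5, 4, 3, 2, 3, 4, 5],
}


def words_per_trial(condition):
    """Return the total number of words presented on each of the seven trials."""
    return [CATEGORIES_PER_LIST * n for n in PER_CATEGORY[condition]]


def step_sizes(totals):
    """Return the absolute change in list length between successive trials."""
    return [abs(b - a) for a, b in zip(totals, totals[1:])]


if __name__ == "__main__":
    for condition in PER_CATEGORY:
        totals = words_per_trial(condition)
        print(condition, totals, "sum =", sum(totals), "steps =", step_sizes(totals))
```

In both conditions every step between successive trials is three words, which is the constant rate of change the design was intended to hold fixed.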
Procedure
Children were tested by the author of this dissertation and two
undergraduate research assistants. Each child was seen individually in
a session lasting approximately 30 min. Prior to the presentation of
the first list, children were told that they would be presented seven
lists of words (each printed on a 3 x 5 in. [7.6 x 12.7 cm] index card)
to remember and later recall in any order they wished. They were told
that the lists and items would be presented one at a time and that some
lists would have a different number of words. They were not told how
many words would be presented on each list, nor were they told about the
categorical structure of the lists.
The experimenter presented each card (on which a word was printed)
to the child at a rate of about one card every 2 s. The experimenter
named the item and children repeated the name. Cards were placed in
front of children in rows, with the stipulation that no two items from
the same category were presented contiguously. Each row contained six
cards, unless the number of cards presented was not a multiple of six
(i.e., 9 or 15 cards). In this case, the row closest to the child
contained three cards. After the cards were presented, children were
instructed to "study the words and do whatever you want to remember them
later." After 1 min 30 s, the cards were covered with an opaque cloth
and then children solved problems on the Matching Familiar Figures Test
(Kagan, 1965) for approximately 30 s. Children were then asked to
recall as many items as they could in any order they wished. If the
child was silent for 10 s, the experimenter asked if there were any more
words that he or she could remember. When either another 10 s interval
elapsed with no more words recalled or the child stated that he or she
could remember no more words, the trial was ended. Trials 2-7 followed
immediately after Trial 1, using the same procedure with different sets
of items. The experimenter recorded children's sorting patterns on each
trial and the entire session was audiotaped.
Coding
During the 1 min 30 s study period on each trial, the experimenter
observed the incidence of sorting, rehearsal, category naming,
examination, and off-task behavior for each of three separate 30-s
intervals. Each type of study behavior was coded as occurring or not
during each of the three intervals. Sorting was recorded when children
physically moved or arranged cards. Rehearsal was recorded when
children verbalized out loud or mouthed the list items (no distinction
was made between single-word and cumulative rehearsal). Category naming
was recorded when children said the category name of a group of items
(e.g., FRUIT for apple, banana, peach). Examination was recorded when
children visually scanned the cards. Off-task behavior was recorded
when children looked away from the cards and were visually inattentive
to the task for a total of 5 consecutive seconds. Clustering during
recall was recorded when children recalled words by adult-defined
categories.
Following Coyle and Bjorklund (1997), three of the five study
behaviors were classified and analyzed as strategies. These were
sorting, rehearsal, and category naming. Clustering during recall was
classified as a fourth strategy. Examination was not considered a
strategy because by itself examination reflects only attention to the
target information. Although children may be covertly using a strategy
(e.g., rehearsal) while examining the items, this cannot be discerned
from their overt behavior. For these reasons, examination was not
included as a strategy for purposes of analyses. Unless specified
otherwise, strategy data were coded dichotomously, with each strategy
being coded as occurring or not occurring on each trial.
The strategies assessed during the study period (i.e., sorting,
rehearsal, and category naming) could be observed between zero and three
times during the 1 min 30 s study period. A child was credited with
using a strategy on a trial if he or she was observed to use that
strategy during at least one of the three 30-s intervals. The strategy
assessed during recall, clustering, was measured by the adjusted ratio
of clustering (ARC) score (Roenker, Thompson, & Brown, 1971). Following
Coyle and Bjorklund (1997), a child was credited with using a clustering
strategy if his or her ARC score was .50 or greater. This represents a
value of slightly more than one standard deviation greater than
clustering expected by chance. Children could be classified as using
any one of the four strategies or any combination of the four
strategies on a particular trial.
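
The dichotomous coding rules above can be sketched in code. The sketch is illustrative and is not the scoring program used in the study. It assumes the ARC formula as it is commonly stated, ARC = (R - E(R)) / (maxR - E(R)), where R is the observed number of category repetitions in recall, E(R) = (sum of squared category counts / N) - 1, and maxR = N - k, for N recalled items drawn from k categories. The function names, and the choice to treat an undefined ARC score as no clustering, are assumptions made for this example.

```python
# Illustrative sketch only; function names and the handling of undefined
# ARC scores are assumptions, not details reported in the text.
ARC_THRESHOLD = 0.50  # criterion stated in the text


def arc_score(recalled_categories):
    """Adjusted ratio of clustering for one trial.

    recalled_categories: category label of each recalled word, in output order.
    Returns None when the score is undefined (nothing recalled, or the
    maximum and chance-expected repetitions coincide).
    """
    n_total = len(recalled_categories)
    if n_total == 0:
        return None
    counts = {}
    for label in recalled_categories:
        counts[label] = counts.get(label, 0) + 1
    # Observed repetitions: adjacent recalled words from the same category.
    observed = sum(
        1 for a, b in zip(recalled_categories, recalled_categories[1:]) if a == b
    )
    expected = sum(n * n for n in counts.values()) / n_total - 1
    maximum = n_total - len(counts)
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


def used_clustering(recalled_categories):
    """Credit clustering when the ARC score is .50 or greater."""
    score = arc_score(recalled_categories)
    return score is not None and score >= ARC_THRESHOLD


def used_study_strategy(interval_observations):
    """Credit a study-period strategy seen in at least one 30-s interval."""
    return any(interval_observations)
```

For example, recall ordered A, A, A, B, B, B yields an ARC score of 1.0, whereas the alternating order A, B, A, B, A, B yields -1.0.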
Reliability has been assessed in previous research that examined
the same study behaviors and strategies (Coyle & Bjorklund, 1997). This
research demonstrated that percentage of agreement for two independent
coders coding the study behaviors (i.e., sorting, rehearsal, category
naming, examination, and off-task behavior) was very high (92%).
Percentage agreement for coding the strategies of sorting, rehearsal,
and category naming was even higher (97%). These data, along with data
from other studies reporting reliability for similar strategies (Lange,
MacKinnon, & Nida, 1989; Wellman, Ritter, & Flavell, 1975), demonstrate
high intercoder agreement for the types of strategies coded in the
current study.

RESULTS
All analyses are reported at p < .05, with post-hoc tests
evaluated with t-tests unless otherwise specified.
Preliminary Analyses
Some of the results were pertinent to general issues in cognition
and memory development but not to the focus of the current study. These
results are presented here. The next section reports results concerning
issues of strategy variability and the relation between variability and
recall.
Off-Task Behavior and Examination
Off-task behavior and examination were observed during each of the
three 30-s intervals of the study period (range: 0 to 3 per trial).
Each type of data was analyzed separately using 2 (grade) x 2
(condition) x 7 (trial) analyses of variance (ANOVAs), with repeated
measures on the trial factor. The analysis of off-task behavior
revealed no significant main effects or interactions. As shown in
previous research (Coyle & Bjorklund, 1997), off-task behavior was
slightly greater for younger than for older children (mean frequency of
off-task behavior per trial: .29 and .16 for second and fourth grade,
respectively). The analysis of examination revealed significant main
effects of condition, F(1, 116) = 5.51 (mean number of intervals of
examination per trial: 1.87 and 2.27 for ascending/descending and
descending/ascending conditions, respectively), and trial, F(6, 696) =
5.68 (mean number of intervals of examination per trial: 2.28, 2.19,
2.10, 1.98, 2.03, 2.00, 1.85 for Trials 1-7, respectively). These main
effects were qualified by a significant Condition x Trial interaction,
F(6, 696) = 2.91. Inspection of the significant interaction revealed
that ascending/descending versus descending/ascending comparisons were
significant at Trial 2 (1.89 versus 2.54), Trial 3 (1.81 versus 2.43),
and Trial 4 (1.69 versus 2.30), but not significant at Trial 1 (2.13
versus 2.46), Trial 5 (1.91 versus 2.16), Trial 6 (1.88 versus 2.14),
and Trial 7 (1.81 versus 1.89). These data demonstrate that attention
to the task materials was somewhat greater on the initial
descending/ascending trials than on the corresponding
ascending/descending trials.
Recall
Before presenting preliminary analysis of the recall data, data
concerning repetitions and intrusions in recall are examined.
Repetitions refer to recall of the same word more than once. Intrusions
refer to utterances of words not on the target list. The frequency of
occurrence of each type of data was analyzed separately using 2 (grade)
x 2 (condition) x 7 (trial) ANOVAs, with repeated measures on the trial
factor. Analysis of the repetition data revealed no significant main
effects or interactions. Repetitions were slightly greater for fourth
graders (M = .49) than for second graders (M = .40). Analysis of the
intrusion data revealed a significant main effect of grade, F(1, 116) =
5.64, with intrusions being greater for second graders (M = .22) than
for fourth graders (M = .05). All other main effects and interactions
for the intrusion data were not significant. The significant grade
difference in intrusions is consistent with findings demonstrating that
younger children have problems inhibiting task-inappropriate responses
(Dempster, 1992). The repetition and intrusion data are excluded from
all subsequent analyses.
Because possible recall varied trial-by-trial, the number of words
recalled on each trial was converted to the proportion of words recalled
relative to possible recall. Mean proportion recall on each trial, by
grade and condition, is presented in Table 2, which also shows mean
proportion recall by condition and trial (i.e., collapsed across grade)
and fourth grade minus second grade recall differences by condition and
trial. Proportion recall was examined by a 2 (grade) x 2 (condition) x
7 (trial) ANOVA, with repeated measures on the trial factor. The
analysis revealed significant main effects of grade, F(1, 116) = 10.08
(mean proportion recall: .54 and .75 for second and fourth grade,
respectively), condition, F(1, 116) = 6.83 (mean proportion recall: .65
and .60 for ascending/descending and descending/ascending conditions,
respectively), and trial, F(6, 696) = 3.69 (mean proportion recall: .65,
.61, .61, .67, .60, .60, and .65 for trials 1-7, respectively). Also
significant were interactions of grade x trial, F(6, 696) = 7.01, and
condition x trial, F(6, 696) = 57.68, both of which were qualified by a
significant interaction of grade x condition x trial, F(6, 696) = 2.85.
The significant three-way interaction was evaluated by comparing
grade differences in recall at each trial, separately for each
condition. In the ascending/descending condition, significant grade
differences in recall were observed on all trials except Trial 1, when
only six words were presented. In the descending/ascending condition,
grade differences in recall were found on all trials. As shown in Table
2, the magnitude of grade differences in recall was least pronounced on
trials with the fewest words presented (Trials 1 and 7 in ascending/
descending and Trial 4 in descending/ascending), compared to the data on
adjacent trials.

Table 2
Mean Proportion Recall By Grade, Condition, and Trial, By Condition and
Trial (i.e., Collapsed Across Grade), and Grade Differences in Recall at
Each Trial By Condition

                                              Trial
                            1      2      3      4      5      6      7

Ascending/Descending
  Maximum Recall            6      9     12     15     12      9      6
  Grade 2
    M                     .81    .59    .49    .45    .42    .50    .71
    SD                    .17    .22    .16    .18    .19    .28    .22
  Grade 4
    M                     .84    .73    .69    .70    .76    .89    .94
    SD                    .18    .16    .23    .23    .21    .17    .14
  Collapsed Across Grade
    M                     .82    .64    .57    .55    .55    .65    .80
    SD                    .17    .21    .21    .24    .26    .31    .22
  Grade 4 - Grade 2
    Difference            .03    .14    .20    .25    .34    .39    .24

Descending/Ascending
  Maximum Recall           15     12      9      6      9     12     15
  Grade 2
    M                     .38    .48    .53    .76    .56    .41    .36
    SD                    .18    .24    .24    .22    .30    .24    .18
  Grade 4
    M                     .55    .66    .80    .90    .80    .69    .61
    SD                    .19    .21    .21    .22    .22    .24    .24
  Collapsed Across Grade
    M                     .46    .56    .66    .82    .67    .54    .48
    SD                    .20    .25    .26    .23    .29    .28    .24
  Grade 4 - Grade 2
    Difference            .17    .18    .27    .14    .24    .28    .25

Note. Maximum recall indicates the maximum number of words that could be
recalled on a particular trial.

Strategy Use
The percentage and mean number of trials on which children in each
grade used each strategy is presented by condition in Table 3. The
percentages within each grade do not sum to 100 because multiple
strategies were frequently used in combination on a single trial. The
number of trials on which each strategy was used (range = 0 to 7) was
examined by a 2 (grade) x 2 (condition) x 4 (strategy) ANOVA. The
analysis revealed a significant main effect of strategy, F(1, 116) =
73.09, and significant interactions of grade x strategy, F(3, 348) =
4.63, and condition x strategy, F(3, 348) = 5.72. Inspection of the
significant main effect of strategy revealed that sorting, rehearsal,
and clustering were used more often than category naming, with all other
strategy comparisons being nonsignificant (mean number of trials on
which each strategy was used: 3.18, 3.44, 3.23, and .12 for sorting,
rehearsal, clustering, and category naming, respectively). The floor
levels of category naming are inconsistent with previous research
showing that category naming was used relatively frequently by fourth
graders who received a sort-recall task similar to the one used here.
Although category naming was almost never used in the current study, the
near absence of this strategy did not prevent the detection of
significant effects concerning measures of strategy variability, as
shown in later analyses.

Table 3
Percentage (and Number) of Trials on Which Each Strategy Was Used, By
Condition and Grade

                                        Strategy
                                                             Category
                     Sorting      Rehearsal    Clustering    Naming

Ascending/Descending
  Grade 2            36 (2.54)    60 (4.21)    37 (2.56)      2 (.15)
  Grade 4            58 (4.04)    58 (4.04)    49 (3.40)     <1 (.04)
Descending/Ascending
  Grade 2            35 (2.47)    38 (2.67)    50 (3.53)      3 (.20)
  Grade 4            59 (4.15)    37 (2.62)    53 (3.69)     <1 (.04)

Data relevant for the significant interactions concerning strategy
use are presented in Table 3. Inspection of the Grade x Strategy
interaction revealed that sorting was used more by fourth graders (M =
4.13) than by second graders (M = 2.51), with grade comparisons for the
other strategies being nonsignificant. Evaluation of the Condition x
Strategy interaction revealed that rehearsal was used more in the
ascending/descending condition (M = 4.13) than in the
descending/ascending condition (M = 2.65), with the other strategies
being used approximately equally in both conditions.
These strategy data provide information concerning the frequency
of occurrence of each individual strategy. Subsequent analyses examine
the possibility of several strategies being used in combination on a
single trial, and changes in the mixture of strategies used across
trials.
Variability in Strategy Use
Two general types of variability were examined: multiple-strategy
use and strategy change. Multiple-strategy use refers to the number of
strategies used within a given trial. Strategy change refers to the
number of different strategies used across trials and trial-by-trial
changes in strategy use.
Multiple-Strategy Use
An initial analysis examined the prediction that multiple-strategy
use would increase with age and that the number of strategies used would
be greatest for trials on which relatively many words were presented
(i.e., Trials 3-5 in the ascending/descending series and Trials 1, 2, 6,
and 7 in the descending/ascending series). The number of strategies
used on each trial was analyzed by a 2 (grade) x 2 (condition) x 7
(trial) ANOVA, with repeated measures on the trial factor. The analysis
revealed a marginally significant effect of grade, F(1, 116) = 3.31, p =
.07, with fourth graders using more strategies (M = 1.57) than second
graders (M = 1.32). Also significant were the main effect of trial, F(6,
696) = 12.33, and the Grade x Trial interaction, F(6, 696) = 3.23. No
other significant effects were found. Inspection of the significant
main effect of trial revealed that the number of strategies used on
Trial 1 (M = 1.14) and Trial 2 (M = 1.18) was significantly less than
that used on Trials 3-7 (mean number of strategies used: 1.46, 1.49,
1.58, 1.57, and 1.58 for Trials 3-7, respectively). No other
significant comparisons across trials were found.
Data pertaining to the significant Grade x Trial interaction are
presented in Table 4, which also shows the number of strategies used for
each Condition x Grade x Trial cell. Examination of grade differences
in number of strategies used on each trial revealed that fourth graders
used more strategies than second graders on Trials 3, 4, and 6, with
strategy use being comparable for both grades on all other trials.
These data, along with the data presented immediately above, are
consistent with the predicted grade differences. In all cases where
grade differences were found, fourth graders used more strategies than
second graders. The absence of a significant Condition x Trial
interaction indicates that strategy use did not vary across trials with
different numbers of words presented.

Table 4
Mean Number of Strategies Used, By Grade and Trial and By Condition, Grade,
and Trial

                                              Trial
                            1      2      3      4      5      6      7

Grade 2
  M                      1.22   1.12   1.30   1.32   1.44   1.38   1.48
  SD                      .78    .83    .77    .85    .87   1.01   1.01
Grade 4
  M                      1.04   1.28   1.67   1.73   1.77   1.82   1.71
  SD                      .96    .96   1.07   1.08    .99   1.14   1.17

Ascending/Descending
  Grade 2
    M                    1.36   1.15   1.39   1.33   1.41   1.36   1.54
    SD                    .81    .81    .78    .84    .79   1.04    .94
  Grade 4
    M                    1.00   1.28   1.68   1.96   1.88   2.00   1.76
    SD                    .96    .94   1.03   1.10   1.17   1.12   1.20

Descending/Ascending
  Grade 2
    M                    1.03   1.07   1.20   1.30   1.47   1.40   1.40
    SD                    .72    .87    .76    .88    .97    .97   1.10
  Grade 4
    M                    1.08   1.27   1.65   1.50   1.65   1.65   1.65
    SD                    .98   1.00   1.13   1.03    .80   1.16   1.16

Strategy Change
Number of strategy changes across trials. Although the analysis
above demonstrates that fourth graders used more strategies than second
graders, it did not examine possible changes in strategy use across
trials (i.e., additions and deletions in strategy use on consecutive
trials). For example, a child using two strategies across all trials
could be using sorting and rehearsal on all seven trials, sorting and
rehearsal on Trials 1-4 and sorting and clustering on Trials 5-7, or
sorting and rehearsal on all even trials and sorting and clustering on
all odd trials. In each case the child uses two strategies on all
trials but shows a different number of strategy changes. A child using
sorting and rehearsal on all trials shows no strategy changes; a child
using sorting and rehearsal on Trials 1-4 and sorting and clustering on
Trials 5-7 shows two strategy changes (i.e., dropping rehearsal and
adding clustering from Trial 4 to Trial 5); and a child using sorting
and rehearsal on all even trials and sorting and clustering on all odd
trials shows 12 changes (i.e., dropping a strategy and adding a
strategy on each of the six trial transitions: Trials 1 to 2, 2 to 3, 3
to 4, 4 to 5, 5 to 6, and 6 to 7).
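
The counting rule in these examples can be sketched as follows. The sketch is illustrative only: it assumes that each trial is represented as the set of strategies credited on that trial, and that every strategy added or dropped between consecutive trials counts as one change. The function name count_strategy_changes is hypothetical.

```python
# Illustrative sketch only; the set representation and the function name
# are assumptions introduced for this example.
def count_strategy_changes(trials):
    """Count strategies added or dropped across consecutive trials.

    trials: list of sets of strategy names, one set per trial.
    The symmetric difference of two consecutive sets contains exactly the
    strategies that were added or dropped on that transition.
    """
    return sum(len(prev ^ curr) for prev, curr in zip(trials, trials[1:]))


constant = [{"sorting", "rehearsal"}] * 7
switch_once = [{"sorting", "rehearsal"}] * 4 + [{"sorting", "clustering"}] * 3
alternating = [
    {"sorting", "clustering"} if trial % 2 == 1 else {"sorting", "rehearsal"}
    for trial in range(1, 8)
]

if __name__ == "__main__":
    print(count_strategy_changes(constant))     # no changes
    print(count_strategy_changes(switch_once))  # one drop and one addition
    print(count_strategy_changes(alternating))  # two changes on each transition
```

Applied to the three hypothetical children described above, this rule gives 0, 2, and 12 changes, respectively.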
An analysis of strategy change evaluated the prediction that
strategy changes would decrease with age and that strategy changes would
occur most frequently on transitions to trials with more words (i.e.,
Trials 2 to 3 and 3 to 4 in the ascending/descending series and Trials 5
to 6 and 6 to 7 in the descending/ascending series). The number of
strategy changes on each of the six trial transitions was analyzed by a
2 (grade) x 2 (condition) x 6 (trial transition) ANOVA, with repeated
measures on the trial transition factor. The analysis revealed a
significant main effect of grade, F(1, 116) = 11.41 (mean number of
strategy changes: .78 and .54 for second and fourth grade,
respectively), and a significant Grade x Trial Transition interaction,
F(5, 580) = 2.67. No other significant effects were found.
Data relevant to the significant Grade x Trial Transition
interaction are presented in Table 5, which also shows the number of
strategy changes for each Condition x Grade x Trial cell. Inspection of
grade differences in strategy changes on each trial transition revealed
that fourth graders had significantly fewer changes than second graders
on all trial transitions except transitions 2 to 3 and 3 to 4. These
data demonstrate that the grade difference mentioned above is primarily
a result of fourth graders having fewer strategy changes than second
graders on later rather than earlier trials. These findings are
consistent with the hypothesis that strategy changes decrease with age.
The absence of a significant Condition x Trial interaction indicates
that strategy changes did not vary across trials with different numbers
of words presented.

Table 5
Mean Number of Trial-by-Trial Strategy Changes, By Grade and Trial Transition,
and By Condition, Grade, and Trial Transition

                                        Trial Transition
                         1 to 2   2 to 3   3 to 4   4 to 5   5 to 6   6 to 7

Grade 2
  M                         .77      [?]      .73      .93      [?]      .70
  SD                        .75      .58      .75      .85      .89      .69
Grade 4
  M                         .49      .80      .57      .47      .45      .43
  SD                        .64      .83      .67      .83      .50      .61

Ascending/Descending
  Grade 2
    M                       [?]      .64      .67      .95      .82      .69
    SD                      .83      .63      .74      .79      .82      .69
  Grade 4
    M                       .40      .76      .64      .60      .40      .44
    SD                      .50      .78      .76     1.08      .50      .65

Descending/Ascending
  Grade 2
    M                       .63      .73      [?]      .90      .93      .70
    SD                      .62      .52      .76      .92      [?]      .70
  Grade 4
    M                       .58      .85      .50      .35      .50      .42
    SD                      .76      [?]      .58      .49      .51      .58

Other types of variability. Although number of strategy changes
across trials is one measure of strategy change, other measures of
strategy change are possible. Two additional measures of strategy
change are examined here: number of unique strategy combinations used
across all trials (range: 0 to 7), and number of consecutive trials
with strategy changes (range: 0 to 6). These measures, along with the
average number of strategy changes across trials (an average of the
measure analyzed above), were converted to z-scores and entered into a 2
(grade) x 2 (condition) x 3 (strategy change type) ANOVA, with repeated
measures on the strategy change type factor. This analysis permitted
examination of possible grade and condition differences across the three
measures of strategy change.
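
The two additional measures, and the conversion to z-scores, can be sketched as follows. The sketch is illustrative and rests on stated assumptions: each trial is represented as the set of strategies credited on that trial; "consecutive trials with strategy changes" is read as the number of trial transitions on which at least one strategy was added or dropped; and z-scores are computed with the sample standard deviation, because the text does not say which form was used. All function names are hypothetical.

```python
# Illustrative sketch only; function names, the set representation, and the
# use of the sample standard deviation are assumptions for this example.
from statistics import mean, stdev


def unique_combinations(trials):
    """Number of distinct strategy combinations used across all trials."""
    return len({frozenset(trial) for trial in trials})


def transitions_with_changes(trials):
    """Number of consecutive-trial pairs on which the combination differed."""
    return sum(1 for prev, curr in zip(trials, trials[1:]) if prev != curr)


def z_scores(values):
    """Standardize one measure across children (sample standard deviation)."""
    center = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(value - center) / spread for value in values]
```

Standardizing each measure separately places the three measures on a common scale, which is what allows them to be compared within a single analysis.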
The analysis revealed a significant main effect of grade, F(1,
116) = 10.65 (mean z-scores averaged across the three measures of strategy
change: .22 and -.30 for second and fourth grade, respectively), which
was qualified by a significant Grade x Strategy Change Type interaction,
F(2, 232) = 3.90. No other significant main effects or interactions
were found. Data pertaining to the significant Grade x Strategy Change
Type interaction are presented in Table 6. Fourth graders had
significantly lower levels of strategy change than second graders for
two of the three measures (trials with changes and total changes). The
grade difference for unique combinations was in the predicted direction
but only approached significance, p < .10. Paired comparisons among the
change measures within each grade revealed that fourth graders had
significantly fewer total strategy changes than unique combinations. No
other comparisons among the change measures within each grade were
significant. These data, along with the data in the preceding section,
demonstrate that older children show fewer strategy changes than younger
children across a variety of measures of strategy change, with the
exception of unique combinations.
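
The three measures of strategy change compared above can be illustrated with a minimal sketch. The function names, the coding of each trial as a set of strategy letters, and the sample child are hypothetical. Two scoring rules are assumptions rather than details stated in the text: that a trial with no strategy does not count as a combination, and that each strategy added or dropped between consecutive trials counts as one change. The use of the sample standard deviation for the z-scores is likewise assumed.

```python
from statistics import mean, stdev


def change_measures(trials):
    """Return (unique combinations, trials with changes, total changes).

    trials: list of frozensets, one per trial, holding the strategy codes
    observed on that trial (an empty frozenset means no strategy).
    """
    pairs = list(zip(trials, trials[1:]))
    # Assumption: "no strategy" is not counted as a combination.
    unique_combinations = len({t for t in trials if t})
    # Consecutive-trial pairs on which the combination differed.
    trials_with_changes = sum(a != b for a, b in pairs)
    # Assumption: each strategy added or dropped counts as one change.
    total_changes = sum(len(a ^ b) for a, b in pairs)
    return unique_combinations, trials_with_changes, total_changes


def z_scores(values):
    """Standardize raw scores (assumes sample SD and non-constant values)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical child: seven trials coded with S, R, C, and N.
child = [frozenset(s) for s in ["R", "R", "RC", "SRC", "SRC", "SC", "SC"]]
print(change_measures(child))  # (4, 3, 3)
```

Standardizing each measure across children in this way would place the three measures on a common scale before they are compared within a single analysis.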

Table 6
Mean Z-scores and Raw Scores for Unique Combinations, Trials with Changes, and
Total Changes, By Grade (Standard Deviations in Parentheses)

                            Strategy Change Type
               Unique            Trials            Total
               Combinations      with Changes      Changes
Grade 2
  Z-Score       .13 (1.00)        .25 (1.00)        .29 (1.04)
  Raw Score    2.64 (1.11)       3.51 (1.56)       4.67 (2.45)
Grade 4
  Z-Score      -.18 ( .99)       -.34 ( .91)       -.39 ( .80)
  Raw Score    2.29 (1.10)       2.59 (1.43)       3.06 (1.87)

Variability within individual children. Although these findings
demonstrate that strategy changes decline with age, they are based on
analyses of group data, which often mask patterns of individual strategy
use. Thus, children were classified as stable or unstable based on
their pattern of strategy change across trials. Children were
classified as stable if they used the same combination of strategies on
at least four pairs of consecutive trials (of a possible six pairs of
consecutive trials). Children were classified as unstable if they used
the same combination of strategies on fewer than four pairs of
consecutive trials. These classifications were based on changes in the
mixture (rather than the number) of strategies used over trials.
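
The classification rule just described can be sketched as follows. The function name and the representation of each trial's strategy combination as a short string are hypothetical conveniences; only the criterion of four or more matching pairs out of six comes from the text.

```python
def classify_stability(trials, min_same_pairs=4):
    """Classify one child from the combinations used on seven trials.

    trials: one label per trial naming the mixture of strategies used,
    e.g. "SRC" for sorting, rehearsal, and clustering ("" for none).
    """
    # Count pairs of consecutive trials with an identical combination.
    same_pairs = sum(a == b for a, b in zip(trials, trials[1:]))
    return "stable" if same_pairs >= min_same_pairs else "unstable"


# Hypothetical children.
print(classify_stability(["R", "SRC", "SRC", "SRC", "SRC", "SRC", "SC"]))  # stable
print(classify_stability(["R", "R", "RC", "SRC", "SRC", "SC", "SC"]))      # unstable
```

Because the comparison is on the whole label, a child who uses the same number of strategies but a different mixture on the next trial is counted as having changed.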
The percentage of children in each grade classified as stable or
unstable is shown in the first and second columns of Table 7. Fourth
graders were significantly more likely to be classified as stable than
second graders, who showed considerable variability in strategy use,
χ²(1, N = 120) = 7.52. These data are consistent with the findings
reported in the previous section. However, the findings in the previous
section showed that although both groups tended to show variability on
early trials, only the second graders showed variability on later
trials. Thus, a second analysis examined the possibility that the
observed grade differences in stability classification were primarily
attributed to differences in variability on later rather than earlier
trials. Children were classified as stable or unstable on early trials
(Trials 1 to 4) and separately on later trials (Trials 4 to 7). (Trial
4 is both the last trial in the set of early trials and the first trial
in the set of later trials.) For each block of trials, children were
classified as stable if they used the same combination of strategies on
two or three pairs of consecutive trials (of a possible total of three
pairs of consecutive trials). Children were classified as unstable if
they used the same combination of strategies on only one of three pairs
of consecutive trials.

Table 7
Percentage (and Number) of Children Classified as Stable or Unstable Across
All Trials, on Early Trials, and on Later Trials, By Grade

              All Trials            Early Trials          Later Trials
Grade     Stable    Unstable     Stable    Unstable     Stable    Unstable
2         23 (16)   77 (53)      39 (27)   61 (42)      38 (26)   62 (43)
4         47 (24)   53 (27)      43 (22)   57 (29)      63 (32)   37 (19)

The percentage of children in each grade classified as stable and
unstable on early trials and separately on later trials is presented in
columns three through six in Table 7. For the early trials, no grade
difference in the distribution of children classified as stable or
unstable was found, χ²(1, N = 120) < 1, with most children showing
unstable strategy use. For the later trials, fourth graders were
significantly more likely to be classified as stable than second
graders, who frequently showed unstable strategy use, χ²(1, N = 120) =
7.38. These findings demonstrate that both groups of children showed
considerable variability in strategy use on early trials. In contrast,
only second graders showed unstable strategy use on later trials; most
fourth graders showed stable strategy use.
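
As an arithmetic check, the chi-square values reported for the stability classifications can be recomputed from the counts in Table 7. The sketch below assumes an uncorrected Pearson chi-square; the text does not state whether a continuity correction was applied to these tests, but the uncorrected statistic agrees with the reported values. The function name is hypothetical.

```python
def pearson_chi_square(table):
    """Uncorrected Pearson chi-square for a table of observed counts.

    table: list of rows, each a list of counts. Returns (chi2, df).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of grade and classification.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: Grade 2, Grade 4. Columns: stable, unstable (counts from Table 7).
all_trials = [[16, 53], [24, 27]]
early_trials = [[27, 42], [22, 29]]
later_trials = [[26, 43], [32, 19]]

for label, counts in [("all", all_trials), ("early", early_trials), ("later", later_trials)]:
    chi2, df = pearson_chi_square(counts)
    print(label, df, round(chi2, 2))
```

With these counts the statistic is about 7.52 for all trials, below 1 for early trials, and about 7.38 for later trials, each with one degree of freedom.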
These findings were extended in an analysis that examined changes
in stability classification from early to later trials for individual
children. Children were classified as showing one of four possible
patterns of stability classification from early trials (i.e., Trials 1
to 4) to later trials (Trials 4 to 7): unstable on early trials,
unstable on later trials (unstable/unstable); unstable on early trials,
stable on later trials (unstable/stable); stable on early trials, stable
on later trials (stable/stable); stable on early trials, unstable on
later trials (stable/unstable).
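
The four patterns can be derived mechanically from a child's trial-by-trial record. The sketch below is illustrative only: the function names and string labels are hypothetical, and the block criterion (the same combination on two or three of the three consecutive pairs in a block) follows the definition given earlier for early and later trials.

```python
def block_stability(block):
    """Classify a block of four consecutive trials (three pairs)."""
    same_pairs = sum(a == b for a, b in zip(block, block[1:]))
    return "stable" if same_pairs >= 2 else "unstable"


def stability_pattern(trials):
    """Return the early/later pattern for a seven-trial record."""
    early = block_stability(trials[0:4])  # Trials 1 to 4
    later = block_stability(trials[3:7])  # Trials 4 to 7 (Trial 4 is shared)
    return f"{early}/{later}"


# Hypothetical child who settles on one combination after Trial 4.
print(stability_pattern(["R", "C", "RC", "SRC", "SRC", "SRC", "SC"]))  # unstable/stable
```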

Table 8
[...] Classification Across Trial Blocks, By Grade

                 No Change                  Change
          Unstable/    Stable/      Unstable/    Stable/
Grade     Unstable     Stable       Stable       Unstable
2         41 (28)      17 (12)      20 (14)      22 (15)
4         18 ( 9)      24 (12)      39 (20)      20 (10)

The percentage of children in each of the four pattern
classifications is shown by grade in Table 8. The data are presented in
terms of children whose stability classification did or did not change
from early to later trials. The distribution of fourth and second
graders in each of the four pattern classifications was significantly
different, χ²(3, N = 120) = 9.33. Analysis of data for children who did
not change pattern classifications revealed that second graders were
significantly more likely to show the unstable/unstable pattern than
fourth graders, who frequently showed the stable/stable pattern, χ²(1, N
= 61) = 4.25. Analysis of data for children who did change pattern
classifications revealed that the distribution of second and fourth
graders in the unstable/stable and stable/unstable groups was not
significant, χ²(1, N = 59) = 2.04. The analysis of children who did not
change classifications demonstrates that fourth graders were more likely
to maintain an initial pattern of stable-strategy use than second
graders, who frequently maintained an initial pattern of unstable-
strategy use.
Analyses were also performed on the distribution of second and
fourth graders whose initial classification (on Trials 1-4) was unstable
(unstable/unstable and unstable/stable), and separately on the
distribution of second and fourth graders whose initial classification
was stable (stable/stable and stable/unstable). In the analysis of
children whose initial classification was unstable, fourth graders were
significantly more likely to show the unstable/stable pattern than
second graders, who frequently showed the unstable/unstable pattern,
χ²(1, N = 71) = 8.73. The distribution of second and fourth graders
whose initial classification was stable (stable/stable and
stable/unstable) was not significant, χ²(1, N = 49) < 1. The analysis
of children whose initial classification was unstable demonstrates that
fourth graders frequently switched from unstable-strategy use on early
trials (i.e., Trials 1 to 4) to stable-strategy use on later trials
(i.e., Trials 4 to 7). In contrast, second graders who showed unstable-
strategy use on early trials frequently also showed unstable-strategy
use on later trials.
Relation Between Strategy Use and Recall
Utilization Deficiencies
Correlations between number of strategies used and recall. Miller
and Seier (1994) have argued that significant and positive correlations
between strategy use and recall for older but not younger children
indicate a utilization deficiency for younger children. In the current
study, utilization deficiencies of this type were expected on most
trials for second graders. However, second graders were predicted to
overcome a utilization deficiency on trials with relatively few words,
when capacity requirements for strategy use were presumably minimal.
Utilization deficiencies were evaluated by computing correlations
between number of strategies used and percentage of words recalled,
separately for each Condition x Grade x Trial cell (see Table 9). The
pattern of correlations in the ascending/descending condition showed
clear age differences in the significance and magnitude of the relation
between strategy use and recall. Fourth graders showed significant and
positive correlations on all trials, whereas second graders showed
significant and positive correlations on only three of seven trials.
The magnitude of correlations for the fourth graders was higher than
that for second graders on all trials except Trial 6. These data
provide evidence of utilization deficiency for the youngest children.
Contrary to predictions, second graders did not overcome a utilization
deficiency on three of four trials with nine or fewer words presented
(i.e., Trials 1, 2, and 7).

Table 9
[Correlations Between Number of Strategies Used and Percentage of Words
Recalled,] By Condition, Grade, and Trial

                                   Trial
             1       2       3       4       5       6       7
Ascending/Descending
  Grade 2    .31     .29     .42**   .32*    .16     .50**   .30
  Grade 4    .41*    .46*    .41*    .65**   .73**   .45*    .55**
Descending/Ascending
  Grade 2   -.18     .50**   .43*    .34     .60**   .65**   .57**
  Grade 4    .31     .37     .38     .06     .43*    .46*    .36

* p < .05, ** p < .01

The pattern of correlations in the descending/ascending condition
was nearly opposite to that observed in the ascending/descending
condition. Fourth graders now had significant correlations on only two
of seven trials. Second graders had significant correlations on five of
seven trials, with two of these correlations being found on trials with
nine or fewer words presented (i.e., Trials 3 and 5). The magnitude of
correlations for second graders was higher than that for fourth graders
on all trials except Trial 1. These data do not provide evidence of a
utilization deficiency for the younger children.
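
For readers who wish to see the form of the computation, the correlation within a single Condition x Grade x Trial cell can be sketched as below. The sketch assumes an ordinary Pearson product-moment correlation, which the text does not name explicitly, and the data shown are invented for illustration; they are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical cell: one entry per child on a single trial.
strategies_used = [0, 1, 1, 2, 3, 3]        # number of strategies used
percent_recalled = [40, 50, 45, 60, 80, 75]  # percentage of words recalled
print(round(pearson_r(strategies_used, percent_recalled), 2))
```

A positive and significant coefficient in a cell indicates that children who used more strategies on that trial also recalled more; its absence for younger children is the pattern taken here as a sign of a utilization deficiency.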
The failure to find significant correlations for fourth graders in
the descending/ascending condition, when such correlations were
significant in the ascending/descending condition, cannot be attributed
to restricted variance in number of words recalled or number of
strategies used. The standard deviations for number of strategies used
on Trials 1-7 were very similar for fourth graders in the
ascending/descending condition (SDs = .96, .94, 1.03, 1.10, 1.17, 1.12,
and 1.20) and in the descending/ascending condition (SDs = .98, 1.01,
1.13, 1.03, .80, 1.16, and 1.16). The standard deviations for recall
were also similar for both groups of fourth graders (see Table 2).
Recall for perfectly strategic children. Coyle and Bjorklund
(1996), as well as Miller and Seier (1994), have argued that utilization
deficiencies can be inferred when grade differences in recall are
observed despite comparable strategy use. In the current study, this
type of utilization deficiency was evaluated by analyzing mean
proportion recall for trials on which children showed perfect sorting
only, perfect clustering only, and perfect sorting and clustering.
Clustering and sorting data on each trial were measured continuously by
ARC scores for this analysis. ARC scores can range from 1 to -1, with 1
indicating perfect sorting or clustering and 0 indicating chance sorting
or clustering. Because children rarely showed multiple trials with
perfect strategy use (i.e., two or more trials with sorting or
clustering scores of 1), repeated measures analysis of recall across
trials with perfect strategy use was not performed. Instead, each child
received a single score averaging recall across trials with perfect
strategy use. Such a recall score was computed separately for trials
with perfect clustering only, perfect sorting only, and perfect
clustering and sorting.
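
The ARC score used above can be illustrated with a short sketch. It assumes the standard adjusted ratio of clustering, in which observed category repetitions are compared with the number expected by chance and the maximum possible; the formula is not restated in this section, so its use here is an assumption. The function name and the example recall orders are hypothetical.

```python
from collections import Counter


def arc_score(categories):
    """Adjusted ratio of clustering for one recall (or sort) sequence.

    categories: the category label of each item, in output order.
    Returns None when the score is undefined (for example, when all
    items come from one category).
    """
    n = len(categories)
    if n == 0:
        return None
    counts = Counter(categories)
    k = len(counts)  # number of categories represented
    # Observed repetitions: adjacent items from the same category.
    observed = sum(a == b for a, b in zip(categories, categories[1:]))
    expected = sum(c * c for c in counts.values()) / n - 1
    maximum = n - k
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


print(arc_score(list("aaabbb")))  # 1.0, perfect clustering
print(arc_score(list("ababab")))  # -1.0, categories strictly alternated
```

A trial counts as perfectly strategic in the analysis above only when this score equals 1.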
Mean proportion recall for children in each grade showing each
measure of perfect strategy use is presented in Table 10, along with the
number of subjects in each grade who had at least one trial of perfect
strategy use. Separate 2 (grade) x 2 (condition) ANOVAs were performed
on proportion recall for trials with perfect sorting only, perfect
clustering only, and perfect sorting and clustering. The analysis of
recall on trials with perfect clustering revealed a significant main
effect of grade, F(1, 83) = 5.59. No other significant main effects or
interactions were found for any measure of perfect strategy use.
These findings demonstrate that second graders who clustered
perfectly recalled fewer words than comparably strategic fourth graders,
which is evidence for a utilization deficiency for the second graders.
In contrast, second graders who sorted perfectly or sorted and clustered
perfectly recalled just as many words as comparably strategic fourth
graders. Thus, utilization deficiencies occurred for some but not all
instances of perfect strategy use. The absence of a significant effect
of condition demonstrates that recall in each condition did not vary for
children who showed comparable and perfect strategy use.

Table 10
Mean Proportion Recall When Strategy Use Was Perfect, By Grade and Type of
Strategy Used

                    Strategies Used Perfectly
             Sorting     Clustering     Sorting and
             Only        Only           Clustering
Grade 2
  M          .72         .50            .87
  SD         .19         .17            .17
  n          15          63             14
Grade 4
  M          .80         .60            .89
  SD         .15         .23            .10
  n          14          24             28

Note. ns are number of children who showed at least one trial of perfect
strategy use.

Utilization deficiencies for individual strategies. The findings
described above were confirmed and extended in a descriptive analysis of
recall for children in each grade who used each of the 15 possible
strategy combinations or no strategy (see Table 11). The first part of
this analysis examined grade differences in the percentage of trials on
which each combination was used. As shown in Table 11, second graders
were more likely than fourth graders to use rehearsal only, clustering
only, and both rehearsal and clustering. In contrast, fourth graders
were more likely than second graders to use no strategy, both sorting
and clustering, and sorting, rehearsal, and clustering.
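
The count of 15 possible combinations follows from taking every non-empty subset of the four strategies coded in Table 11 (S, R, C, and N). A minimal sketch, with hypothetical variable names, enumerates them:

```python
from itertools import combinations

STRATEGIES = "SRCN"  # Sorting, Rehearsal, Clustering, category Naming

# Every non-empty subset of the four strategies, smallest subsets first.
combos = [
    "".join(subset)
    for size in range(1, len(STRATEGIES) + 1)
    for subset in combinations(STRATEGIES, size)
]

print(len(combos))  # 15
print(combos)
```

The enumeration yields four single strategies, six pairs, four triples, and the one four-strategy combination, in the same order as the rows of Table 11.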
Utilization deficiencies were evaluated by analyzing grade
differences in recall when children used the same strategies. Analyses
were conducted only on the seven strategy combinations for which
sufficient data were available for a significance test. (Recall data
for no strategy use were not included in this analysis because children
who use no strategy cannot be evaluated for a utilization deficiency.)
Of these seven comparisons, four showed that fourth graders recalled
significantly more than second graders when strategy use was comparable.
The remaining three comparisons were not significant but had means in
the predicted direction. Consistent with the findings in the previous
section, these data demonstrate that fourth graders outperform
comparably strategic second graders, which is evidence of a utilization
deficiency for the second graders.

Table 11
Percentage (and Number) of Trials on Which Each Strategy Combination Was Used,
and Mean Proportion Recall (and Standard Deviations) for Each Combination, By
Grade (Codes for Strategies: S, Sorting; R, Rehearsal; C, Clustering; N,
Category Naming)

              Percentage of Trials            Mean Recall
Strategy      Grade 2      Grade 4        Grade 2      Grade 4
None          16 ( 78)     22 (77)        .41 (.25)    .65 (.23)
S              8 ( 39)      8 (30)        .50 (.23)    .60 (.21)
R             24 (115)     10 (34)        .53 (.25)    .81 (.19)
C             14 ( 68)      5 (18)        .47 (.19)    .54 (.17)
N              0 (  0)      0 ( 0)
SR             8 ( 37)     10 (34)        .65 (.23)    .76 (.22)
SC            10 ( 50)     17 (61)        .70 (.26)    .80 (.22)
SN            <1 (  2)      0 ( 0)
RC            10 ( 49)      5 (17)        .48 (.20)    .69 (.27)
RN            <1 (  1)      0 ( 0)
CN             0 (  0)      0 ( 0)
SRC            7 ( 35)     23 (83)        .74 (.23)    .90 (.13)
SRN           <1 (  1)     <1 ( 1)
SCN           <1 (  1)     <1 ( 2)
RCN           <1 (  1)      0 ( 0)
SRCN           1 (  6)      0 ( 0)

Note. Boldface denotes significant age differences in recall or percentage of
trials on which a particular strategy combination was used, with all
significant results reported at p < .05. Recall data are omitted for
combinations used on one or zero trials. Grade differences in percentage of
trials on which each combination was used are evaluated using Yates corrected
chi-squares with one degree of freedom. Grade differences in mean recall for
each combination are evaluated using t-tests.

Strategy Change and Recall
Correlations between strategy change and recall. Previous
research (Coyle & Bjorklund, 1997) involving procedures and age groups
similar to those in the current study has shown that measures of
strategy change are significantly and negatively correlated with recall
for older but not younger children. That is, older children who showed
the fewest strategy changes across trials (i.e., high levels of
stability in strategy use) had the highest levels of recall. The
current study attempted to replicate this finding with a different
sample.
Correlations were computed separately between each measure of
strategy change and mean proportion recall across trials. The three
measures of strategy change were number of unique strategy combinations,
number of consecutive trials with strategy changes, and total number of
strategy changes on consecutive trials.
Correlations computed separately for each grade revealed a pattern
very similar to that observed in previous research. Fourth graders
showed significant and negative relations between recall and strategy
change for two of the three measures (trials with changes, r(51) = -.42,
p < .01, and total changes, r(51) = -.36, p < .05), but not for unique
combinations, r(51) = -.11, p > .10. Second graders showed no reliable
relation between recall and any measure of strategy change (rs(69) =
.23, -.15, and -.13 for unique combinations, trials with changes, and
total changes, respectively). These findings were qualified by
correlations computed separately within each Grade x Condition cell and
reported in Table 12. These correlations showed that only fourth
graders in the descending/ascending condition showed significant and
negative relations between recall and strategy change, with correlations
involving all three measures of strategy change being significantly
related to recall. These latter findings demonstrate that the grade
differences reported above can be attributed to correlational data for
fourth graders in the descending/ascending condition. Fourth graders in
the ascending/descending condition showed no reliable relation between
recall and strategy change.
The data in Table 12 reveal that the difference between
correlations involving trials with changes and total changes was always
lower than the difference between correlations involving each of these
variables and unique combinations. This suggested possible differences
in the relations among the various measures of strategy change. To
assess this possibility, pairwise correlations among each of the three
measures of strategy change were computed, separately within each Grade
x Condition cell. These correlations are reported in Table 13.
As shown in Table 13, all correlations among the three measures of
strategy change were significant. However, the magnitude of
correlations involving unique combinations (i.e., unique combinations
and trials with changes; unique combinations and total changes) was
lower than the magnitude of correlations not involving unique
combinations (i.e., trials with changes and total changes).
Correlations between trials with changes and total changes were near
perfect for all Grade x Condition cells except one (second graders in
the ascending/descending condition).

Table 12
[Correlations Between Measures of Strategy Change and Recall, By Condition
and] Grade

                          Measure of Strategy Change
                   Unique            Trials            Total
                   Combinations      with Changes      Changes
Ascending/Descending
  Grade 2           .23              -.21              -.06
  Grade 4           .28              -.16              -.05
Descending/Ascending
  Grade 2           .22              -.13              -.18
  Grade 4          -.53**            -.66**            -.66**

** p < .01

Table 13
Correlations Among Measures of Strategy Change, By Condition and Grade

                                      Correlation
                   Combinations and      Combinations and    Trials with Changes
                   Trials with Changes   Total Changes       and Total Changes
Ascending/Descending
  Grade 2          .33*                  [illegible]         .77***
  Grade 4          .60**                 .65***              .93***
Descending/Ascending
  Grade 2          .66***                .64***              .94***
  Grade 4          .59***                .68***              .91***

* p < .05, ** p < .01, *** p < .001

Relation between strategy change and recall for individual
children. The finding that stability was related to high levels of
recall only for fourth graders in the descending/ascending condition was
only partially confirmed in an analysis of recall for children
classified as stable or unstable (see Table 7 for stability
classification data). A 2 (grade) x 2 (stability classification) x 2
(condition) x 7 (trial) ANOVA was conducted on mean proportion recall.
Because significant main effects and interactions involving the Grade,
Condition, and Trial factors have already been reported, only
significant main effects and interactions involving the Stability
Classification factor are reported here.
The analysis revealed a marginally significant main effect of
stability classification, F(1, 112) = 3.02, p = .09 (mean proportion
recall: .59 and .70, for unstable and stable, respectively), which was
qualified by a significant Grade x Stability Classification interaction,
F(1, 112) = 3.78. No other significant effects involving the stability
classification factor were found, including effects involving the
condition factor. The failure to find a significant Grade x Stability
Classification x Condition interaction is inconsistent with the
correlational results reported in the previous section. Those results
showed that, in the descending/ascending condition, fourth graders
showing stable-strategy use had higher recall than fourth graders
showing unstable-strategy use. The findings in the current section,
along with those of the previous one, demonstrate that findings
pertaining to analyses that examine patterns of variability for
individual subjects may not always be consistent with those pertaining
to analyses that examine patterns of variability in group data.
Data relevant to the significant Grade x Stability Classification
interaction are reported in columns one and two of Table 14.
Differences in recall between stable and unstable children were analyzed
separately within each grade. Second-grade children in each stability
classification showed equivalent levels of recall. In contrast, fourth
graders classified as stable recalled significantly more than fourth
graders classified as unstable. A further analysis examined grade
differences in recall separately within each stability group. Fourth
graders recalled significantly more than second graders within both
stability groups. However, the magnitude of this grade difference in
recall was greater for stable children than for unstable children (mean
fourth grade minus second grade recall difference: .28 and .16, for
stable and unstable children, respectively).
These findings were confirmed and extended in a final set of
analyses that examined grade differences in recall for children
classified as stable and unstable on early trials (Trials 1-4) and
separately on later trials (Trials 4-7). A 2 (grade) x 2 (condition) x
2 (stability classification) ANOVA was performed on mean proportion
recall on early trials and separately on later trials. As before, only
significant effects involving the stability classification factor are
reported. The recall data for these analyses are reported in columns
three through six in Table 14.

Table 14
[Mean Proportion Recall (and Standard Deviations) for Children Classified as]
Stable or Unstable Across All Trials, on Early Trials, and on Later Trials, By
Grade

              All Trials               Early Trials             Later Trials
Grade     Stable     Unstable      Stable     Unstable      Stable     Unstable
2         .53 (.18)  .54 (.15)     .53 (.15)  .59 (.15)     .59 (.18)  .47 (.15)
4         .81 (.12)  .70 (.16)     .78 (.12)  .70 (.16)     .82 (.16)  .72 (.20)

The analysis involving data on early trials revealed a significant
Grade x Stability Classification interaction, F(1, 112) = 5.62. No
other significant effects involving stability classification were found.
Examination of the significant interaction revealed a pattern of results
very similar to that observed in the analysis involving all trials. No
differences in recall were found for second graders classified as stable
or unstable, whereas recall for fourth graders classified as stable was
marginally greater than that for fourth graders classified as unstable,
p < .07. Separate grade comparisons within each stability
classification revealed that fourth graders recalled significantly more
than second graders, although the magnitude of this grade difference was
again greater for stable children than for unstable children (mean
fourth grade minus second grade recall difference: .25 and .11 for
stable and unstable children, respectively).
The comparable analysis involving data on later trials revealed a
significant main effect of stability classification, F(1, 112) = 11.10
(mean proportion recall: .72 and .55 for stable and unstable,
respectively). No other significant differences involving stability
classification were found. These findings, along with those for early
trials, demonstrate that stability on early trials is associated with
high levels of recall for older but not younger children, whereas
stability on later trials is associated with high levels of recall for
both age groups.
Conditions of strategy changes. Why do strategy changes occur?
McGilly and Siegler (1989) addressed this question by analyzing the
number of trials on which children showed strategy changes immediately
after serial recall performance that was perfect or less than perfect.
They found that children were more likely to show strategy changes when
recall was less than perfect than when recall was perfect. That is,
children tended to stick with a particular strategy on the next trial
when it had yielded perfect performance, but changed strategies on the
next trial when it had yielded less than perfect performance. This
pattern is consistent with the win-stay/lose-shift approach that has
been reported in decision-making literature (Eimas, 1969).
In the current study, evidence for the win-stay/lose-shift
approach was examined by classifying each trial as a trial on which
recall was perfect or not perfect and on which strategy changes were or
were not observed on the next trial. This resulted in four possible
classifications: recall perfect/strategy change; recall perfect/no
strategy change; recall not perfect/strategy change; recall not
perfect/no strategy change. Classifications were performed separately
for Trials 1-6, with each child contributing a single data point at each
trial. (Trial 7 was omitted from the analysis because a strategy change
following Trial 7 is not possible.)
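
The four-way classification can be sketched for a single child as follows. The function name, the data layout (one recall flag and one combination label per trial), and the sample child are hypothetical; the rule itself, which pairs recall on Trials 1 through 6 with whether the combination differs on the next trial, is the one described above.

```python
def classify_transitions(recall_perfect, strategies):
    """Classify Trials 1-6 by recall outcome and next-trial strategy change.

    recall_perfect: seven booleans, True where recall was perfect.
    strategies: seven labels naming the combination used on each trial.
    """
    cells = []
    for trial in range(6):  # Trial 7 has no following trial
        outcome = "recall perfect" if recall_perfect[trial] else "recall not perfect"
        changed = strategies[trial] != strategies[trial + 1]
        shift = "strategy change" if changed else "no strategy change"
        cells.append(f"{outcome}/{shift}")
    return cells


# Hypothetical child.
perfect = [False, False, True, True, False, True, True]
used = ["R", "RC", "SRC", "SRC", "SRC", "SC", "SC"]
for number, cell in enumerate(classify_transitions(perfect, used), start=1):
    print(number, cell)
```

Under a win-stay/lose-shift approach, trials with perfect recall should fall mostly in the no-change cell and trials with imperfect recall mostly in the change cell.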
The percentage of trials on which recall was perfect or not and
followed by a strategy change or not is shown in Table 15. The
classification data on each trial were analyzed separately by 2 (recall
perfect vs. recall not perfect) x 2 (strategy change vs. no strategy
change) chi-squares. For Trials 1 through 3, perfect recall was
followed by strategy changes or no strategy change approximately
equally, χ²s(1, N = 120) < 1. For Trials 4 through 6, however, perfect
recall was followed by no strategy change more frequently than by
strategy changes, χ²s(1, N = 120) > 5.43. These results demonstrate
that children were more likely to continue to use a strategy that
yielded perfect performance on later but not earlier trials.

Table 15
Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not
Occur Immediately After Recall Was Perfect or Not Perfect

                                        Trial
                      1         2         3         4         5         6
Recall Perfect
  Strategy Change     42 ( 8)   57 ( 4)   31 ( 4)   30 ( 9)   29 ( 6)   25 ( 5)
  No Strategy Change  58 (11)   43 ( 3)   69 ( 9)   70 (21)   71 (15)   75 (15)
Recall Not Perfect
  Strategy Change     53 (54)   61 (69)   54 (58)   62 (56)   57 (56)   54 (54)
  No Strategy Change  47 (47)   39 (44)   46 (49)   38 (34)   43 (43)   46 (46)

Note. Percentages computed separately at each trial.

Additional analyses examined the prediction that trials with
relatively few words (i.e., Trials 1, 2, and 6 in ascending/descending
and Trials 3, 4, and 5 in descending/ascending) would provide greater
opportunity for perfect recall, and consequently result in relatively
few strategy changes. To test this prediction, a series of 2 (recall
perfect vs. recall not perfect) x 2 (strategy change vs. no strategy
change) chi-squares were performed separately at each trial in each
condition (see Table 16). This resulted in a total of 12 individual
chi-squares (2 conditions x 6 trials). (Because including the grade
factor would have resulted in insufficient data to perform significance
tests for several of the grade x condition x trial combinations, the
grade factor was excluded from these analyses.) One of the 12 chi-
squares (Trial 1 in the descending/ascending condition) did not contain
sufficient data for a significance test. Of the remaining 11, only two
were significant. As predicted, the pattern of data on Trials 4 and 5
in the descending/ascending condition revealed that perfect recall was
followed by no strategy change more frequently than by strategy changes,
χ²s(1, N = 56) > 5.18. The four other trials on which this pattern was
predicted (i.e., Trials 1, 2, and 6 in ascending/descending and Trial 3
in descending/ascending) showed that perfect recall and strategy changes
did not vary as a function of number of words presented. These data
provide little evidence for the prediction that trials with relatively
few words would provide greater opportunity for perfect recall, and
consequently result in few strategy changes.

Table 16
Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not
Occur Immediately After Recall Was Perfect or Not Perfect, By Condition

                                        Trial
                      1         2         3         4         5         6
Ascending/Descending
Number of Words
Presented             6         9         12        15        12        9
Recall Perfect
  Strategy Change     42 ( 8)   50 ( 3)   33 ( 1)   67 ( 2)   33 ( 2)   29 ( 5)
  No Strategy Change  58 (11)   50 ( 3)   67 ( 2)   33 ( 1)   67 ( 4)   71 (12)
Recall Not Perfect
  Strategy Change     56 (25)   57 (33)   51 (31)   59 (36)   53 (31)   55 (26)
  No Strategy Change  44 (20)   43 (25)   49 (30)   41 (25)   47 (27)   45 (21)

Descending/Ascending
Number of Words
Presented             15        12        9         6         9         12
Recall Perfect
  Strategy Change      0 ( 0)  100 ( 1)   30 ( 3)   26 ( 7)   27 ( 4)    0 ( 0)
  No Strategy Change   0 ( 0)    0 ( 0)   70 ( 7)   74 (20)   73 (11)  100 ( 3)
Recall Not Perfect
  Strategy Change     52 (29)   65 (36)   59 (27)   69 (20)   61 (25)   53 (28)
  No Strategy Change  48 (27)   35 (19)   41 (19)   31 ( 9)   39 (16)   47 (25)

Note. Percentages computed separately at each trial.

DISCUSSION
The current study examined several measures of variability in
strategy use, relating each measure to memory performance. Whereas
previous investigations of variability in strategy use have assessed
only one strategy on each trial and one type of variability (Siegler,
1996), the current study examined the possibility of multiple strategies
on each trial and two different types of variability (multiple-strategy
use and strategy changes). The current study was very similar in design
to a study by Coyle and Bjorklund (1997). However, it had additional
trials with which to evaluate changes in variability over time and
included trials varying widely in the number of words to be recalled.
Strategy variability was assessed within and across trials and related
to mean levels of recall, with analyses focusing on utilization
deficiencies and stability-recall relations. The results revealed
developmental differences in multiple-strategy use and strategy change,
and more important, age-related changes in the relation between measures
of variability and recall. Surprisingly, the results revealed few
significant effects related to the number of words presented on each
trial or the pattern of increases and decreases in the number of words
presented across trials.
The goals of the current study were to examine the impact of age
and number of words presented on each trial on (a) measures of strategy
variability, including multiple-strategy use and strategy changes; (b)
the relation between multiple-strategy use and recall, with particular
attention to patterns indicative of utilization deficiencies; and (c)
the relation between strategy changes and recall, with particular
attention to stability-recall relations. The pages that follow are
organized around these goals.
Variability in Strategy Use
Multiple-Strategy Use
As predicted, fourth graders tended to use more strategies than
second graders, with the number of strategies used increasing from Trial
2 to Trial 3 and remaining stable thereafter. These results were
confirmed and extended in the analysis of grade differences in the use
of each of the 15 unique strategy combinations (Table 11). In that
analysis, second graders used rehearsal alone, clustering alone, and the
two-strategy combination of rehearsal and clustering more often than
fourth graders. In contrast, fourth graders used the two-strategy
combination of sorting and clustering and the three-strategy combination
of sorting, rehearsal, and clustering more often than second graders.
These data demonstrate that, when grade differences in strategy use were
found, second graders tended to use combinations with the fewest
strategies (i.e., single-strategy combinations) whereas fourth graders
tended to use combinations with the most strategies (i.e., three-
strategy combinations).
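The figure of 15 unique strategy combinations follows from counting
every non-empty subset of the four strategies assessed (sorting,
rehearsal, category naming, and clustering). A minimal illustrative
sketch in Python is given here; the function and variable names are
hypothetical and are not part of the study's materials.

```python
from itertools import combinations

# The four strategies assessed in the study; the order here is arbitrary.
STRATEGIES = ("sorting", "rehearsal", "category naming", "clustering")

def strategy_combinations(strategies=STRATEGIES):
    """Return every non-empty combination of strategies, smallest first."""
    combos = []
    for size in range(1, len(strategies) + 1):
        combos.extend(combinations(strategies, size))
    return combos

combos = strategy_combinations()
print(len(combos))  # 4 singles + 6 pairs + 4 triples + 1 quadruple = 15
```

Each child's behavior on a trial then corresponds either to one of
these 15 combinations or to no strategy at all.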
The very low frequency of category naming in the current study is
inconsistent with the results obtained by Coyle and Bjorklund (1997).
Whereas category naming was observed on only 2% of all trials in the
current study, it was observed on almost 31% of all trials in Coyle and
Bjorklund (1997). The reason for this difference is not clear. Both
studies used similar tasks and designs, had very similar testing
procedures, and involved children in the same age range. One possible
explanation for the disparity is that children in each study attended
different types of schools. Whereas children in the current study
attended public schools, children in Coyle and Bjorklund attended a
university-affiliated laboratory school. The curriculums at public
schools and laboratory schools may differ in ways that promote or
inhibit organizational strategy use. For example, children who attend
the university-affiliated schools may receive explicit instruction in
organizing items by taxonomic categories, whereas children who attend
public schools may receive such instruction less often, if at all. Such
a curriculum difference would affect children's use of organizational
strategies, particularly category naming.
The near absence of category naming in the current study resulted
in fewer strategies being available for analyses of variability in
strategy use. Although a reduction in the total number of strategies
available for analyses could affect statistical outcomes, the age-
related patterns of variability and performance found in the current
study are comparable to those found in the very similar sort-recall
study by Coyle and Bjorklund (1997). Older children in both studies
used more strategies than younger children. Also, as reported later in
the Discussion, older children in both studies showed lower levels of
strategy change, and stronger relations between stable-strategy use and
recall, than did younger children. These results suggest that age-
related patterns in variability are relatively uninfluenced by changes
in the total number of strategies being assessed.
Children were expected to show increases in the number of
strategies used on trials with more words. The results revealed that
multiple-strategy use did not vary as a function of the number of words
on each trial, with strategy use being comparable across both the
ascending/descending and descending/ascending conditions. The absence
of any effects involving trial or condition cannot be attributed to
ceiling effects or children not being able to use the target strategies.
Children of all ages used an average of fewer than two strategies across
trials (of a possible four strategies), leaving ample opportunity for
increases in the number of strategies used. Furthermore, children in
the age range studied have demonstrated competence in using all
strategies assessed.
The absence of any effect of condition and trial on the number of
strategies used demonstrates that children did not modify their
strategic behavior in response to being presented with different numbers
of words. Why did children continue to use a fixed number of
strategies when presented with varying numbers of words? Perhaps the most
parsimonious explanation is that children did not consider altering
their strategic behavior on trials with different numbers of words.
Although children in the age range tested could use all the strategies
assessed in the current study, metacognitive limitations concerning when
and how to use strategies may have prevented them from doing so. An
implication is that children who do not produce strategies spontaneously
might do so if they are instructed to (Ringle & Springer, 1980). In
addition to metacognitive limitations, capacity limitations may have
prevented the use of additional strategies. Children have limited
mental capacity for executing cognitive operations such as strategies,
and such capacity constraints may impose limits on the number of
strategies that can be used (Guttentag, 1984). Capacity limits can
change as a result of task experience or familiarization, as may have
occurred when strategy use increased from Trial 2 to Trial 3. However,
capacity limits probably place an upper limit on the number of
strategies used, resulting in changes that occur in a restricted range.
Strategy Changes
In addition to the observed age differences in multiple-strategy
use, the current study also revealed age differences in strategy
changes. Fourth graders showed fewer strategy changes than second
graders for two of the three measures of strategy change (number of
trials with changes and number of trial-by-trial changes). No grade
difference was found for the third measure of strategy change, number of
unique strategy combinations, although the pattern was in the predicted
direction. Although age-related declines in variability have been noted
elsewhere (Siegler, 1996), these are the first results to demonstrate
empirically that strategy changes decline with age. More importantly,
these results, along with the results pertaining to multiple-strategy
use described above, demonstrate that different measures of variability
show different developmental patterns. Number of strategies used
increased with age, number of trials with changes and total strategy
changes decreased with age, and number of unique strategy combinations
was comparable across age. These results suggest considerable diversity
in the developmental pathways of different measures of variability, with
no one pattern accounting for all measures of strategy change and
multiple-strategy use.
Correlations among the various measures of strategy changes
differed in magnitude. Although correlations among all measures of
strategy change were significant, correlations between number of trials
with changes and total number of changes were consistently higher than
correlations between each of these measures and number of unique
combinations. Moreover, correlations between trials with changes and
total changes were near perfect (rs > .90) for 3 of the possible 4
correlations involving these measures, whereas none of the 8
correlations involving unique combinations was near perfect. These data
are the first to my knowledge to show differences in the strength of
relations among different measures of strategy change.
Differences in the relations among the various measures of
strategy change can be attributed to how each measure was computed. The
two most closely related measures, trials with changes and total
changes, both were computed based on the number of consecutive trials on
which different strategies were used. The third measure, unique
combinations, was computed based on the number of different strategy
combinations used, irrespective of whether the different strategies were
used on consecutive trials. These computational differences resulted in
differences in the magnitude of the correlations among the various
measures of strategy change, with correlations among measures based on
the same underlying index of strategy change being higher than
correlations among measures based on different indexes of strategy
change.
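The computational difference among the three measures can be sketched
as follows. This is an illustration only, not the scoring procedure
used in the study: the function name and data are hypothetical, and
the definition of total changes (one change for each strategy added or
dropped between consecutive trials) is an assumption made for the
example.

```python
def strategy_change_measures(trials):
    """Compute three illustrative strategy-change measures for one child.

    `trials` is a list of sets, one per trial, naming the strategies used.
    """
    pairs = list(zip(trials, trials[1:]))  # consecutive trials
    # Number of consecutive trial pairs on which the combination differed.
    trials_with_changes = sum(1 for prev, cur in pairs if prev != cur)
    # ASSUMPTION: each strategy added or dropped counts as one change.
    total_changes = sum(len(prev ^ cur) for prev, cur in pairs)
    # Distinct combinations, regardless of the trials on which they occurred.
    unique_combinations = len({frozenset(t) for t in trials})
    return trials_with_changes, total_changes, unique_combinations

# Hypothetical child who alternates between two combinations.
child = [{"sorting"}, {"sorting", "clustering"}, {"sorting"},
         {"sorting", "clustering"}, {"sorting"}]
print(strategy_change_measures(child))  # (4, 4, 2)
```

In this hypothetical case the two measures based on consecutive trials
are both high, whereas the number of unique combinations is low, which
illustrates how the latter measure can diverge from the other two.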
Mean levels of variability for each strategy change measure in the
current study were lower than those observed in the similar study by
Coyle and Bjorklund (1997). In the current study, the percentages of unique
combinations, trials with changes, and total changes across trials
(collapsed across grade) were 35, 51, and 55, respectively. In Coyle and
Bjorklund, the corresponding percentages were 46, 54, and 62,
respectively. This slight disparity in strategy change scores can be
attributed to the current study using more trials than the Coyle and
Bjorklund study. As shown in Table 5, the additional trials in the
current study allowed fourth graders to maintain a pattern of stable-
strategy use (i.e., few strategy changes) that began after Trial 4,
whereas second graders showed unstable-strategy use across all trials.
Consequently, the lower mean number of strategy changes averaged across
grade can be attributed to fourth graders showing substantially lower
strategy change scores on the later trials. Although Coyle and Bjorklund did
analyze strategy change patterns across trials, the fewer trials used in
that study limited the amount of stability that older children could
display and probably contributed to the slight disparity in strategy
change scores.
The age-related decline in strategy change was confirmed and
extended in analyses of trial-by-trial changes in strategy use for
individual children. An initial analysis revealed that fourth graders
were more likely to be classified as showing stable-strategy use than
second graders, who often switched strategies on adjacent trials.
Subsequent analyses revealed that this grade difference in stability
classification was attributable to a disproportionate number of fourth
graders being classified as stable on later trials (i.e., Trials 4 to
7). The distribution of children in each grade classified as stable and
unstable was comparable on early trials (i.e., Trials 1 to 4). These
findings were extended in an analysis of changes in individual-subject
stability classification across early and later trial blocks for
children whose initial stability classification was unstable. In that
analysis, fourth graders who showed unstable-strategy use on early
trials frequently showed stable-strategy use on later trials. In
contrast, second graders who showed unstable-strategy use on early
trials often remained unstable on later trials. These findings
demonstrate that stable-strategy use emerged during the later trials for
fourth graders but not for second graders. The fourth-grade data are
consistent with research demonstrating that variability in strategy use
declines with experience on a task (Coyle & Bjorklund, 1997; Siegler,
1996). Presumably, the second-grade children eventually would have
shown stability in strategy use if they had been given additional
practice and experience on the task. Microgenetic studies, assessing
children's strategy use over longer periods of time, are needed to
evaluate this hypothesis.
Children in all grades were predicted to show relatively few
strategy changes following trials with few words (i.e., trials with 6 or
9 words) and more frequent strategy changes following trials with many
words (i.e., trials with 12 or 15 words). Contrary to this prediction, the
results revealed that strategy changes did not vary across trials with
different numbers of words. Children of all ages showed comparable
numbers of strategy changes after trials with relatively few words or
many words.
Why did children fail to show strategy changes on trials with many
words when they were predicted to do so and when such changes may have
benefited their performance? The answer to this question may involve
the same factors that were reviewed in the section on multiple-strategy
use: metacognitive limitations and capacity limitations. Metacognitive
limitations may have limited children's ability to monitor changes in
the number of words presented on each trial and to alter their strategy
use in response to such changes. Capacity limitations may have limited
children's ability to add strategies on successive trials even if they
had the metacognitive awareness to do so. Future research, providing
metacognitive instruction on when and how to use strategies and reducing
the capacity demands for strategy production and utilization, is needed
to assess these possibilities.
Relation Between Multiple-Strategy Use and Recall
An important purpose of the current study was to investigate the
relation between multiple-strategy use and recall, identifying possible
evidence for utilization deficiencies. An initial analysis examined age
differences in correlations between multiple-strategy use (i.e., number
of strategies used) and recall, computed separately in each condition
and across trials. The findings in the ascending/descending condition
were very similar to those observed in previous research examining age
differences in the relation between strategy use and recall (Coyle &
Bjorklund, 1996, 1997). Fourth graders showed significant correlations
on all trials, whereas second graders showed significant correlations on
only three of the seven trials (Trials 3, 4, and 6). The pattern of
correlations for the fourth graders indicated that they were able to
benefit from using multiple strategies from the beginning of the task.
The second graders' pattern indicated that strategy use was rarely
linked to recall performance, which is evidence of a utilization
deficiency.
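The trial-by-trial correlational approach can be sketched as follows,
assuming that the number of strategies used and the number of words
recalled are available for each child on each trial. The data and
function names are hypothetical, significance testing is omitted, and
in practice the computation would be run separately for each grade and
condition.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Assumes both sequences vary; zero variance would divide by zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def per_trial_correlations(n_strategies, recall):
    """Correlate strategy count with recall separately at each trial.

    Both arguments are lists of per-child lists, indexed [child][trial].
    """
    n_trials = len(n_strategies[0])
    return [
        pearson_r([child[t] for child in n_strategies],
                  [child[t] for child in recall])
        for t in range(n_trials)
    ]

# Hypothetical data: four children, two trials.
n_strategies = [[1, 2], [2, 2], [3, 1], [4, 3]]
recall = [[2, 5], [4, 5], [6, 4], [8, 7]]
rs = per_trial_correlations(n_strategies, recall)
```

A trial on which the correlation is near zero for one grade but
reliably positive for the other would correspond to the pattern
interpreted above as a utilization deficiency.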
The findings in the descending/ascending condition revealed a
pattern opposite to that found in the ascending/descending condition.
Fourth graders now showed significant correlations on only two of the
seven trials (Trials 6 and 7), whereas second graders showed significant
correlations on five of the seven trials (Trials 2, 3, 5, 6, and 7).
The pattern of correlations for the second graders indicated that they
were using multiple strategies effectively. The fourth graders' pattern
was more difficult to interpret. Although it could be argued that
fourth graders were utilizationally deficient, such an interpretation is
probably incorrect because, with few exceptions, mean recall and
strategy use were higher for fourth graders than for second graders (see
Tables 2 and 4). Thus, fourth graders were probably not using
strategies ineffectively but likely using other means to recall the list
items, perhaps relying on nonstrategic factors (e.g., capacity, speed of
processing).
A comparison of the correlational data in each condition
demonstrates different patterns of strategy-recall relations across
trials for each age group. Fourth graders tended to use multiple
strategies effectively when an increasing number of words was presented
initially (i.e., Trials 1 to 4 in the ascending/descending condition),
but not when a decreasing number of words was presented initially (i.e.,
Trials 1 to 4 in the descending/ascending condition). Conversely,
second graders tended to use multiple strategies effectively when a
decreasing number of words was presented initially but not when an
increasing number of words was presented initially. Apparently, second
graders' effective use of multiple strategies occurred when an initially
large problem set (i.e., 12 or 15 words presented) was subsequently
reduced, whereas fourth graders' effective use of multiple strategies
occurred when an initially small problem set was subsequently increased.
I have no good explanation for these findings, and believe that lengthy
speculation is not warranted at this time. I conclude only that
strategy-recall relationships do vary as a function of age, amount of
information in the problem, and subsequent presentation of problems with
different amounts of information, and that the impact of these variables
on strategy variability warrants further investigation.
A second set of analyses examined grade comparisons in mean recall
on trials on which perfect clustering, sorting, or both sorting and
clustering were observed. Coyle and Bjorklund (1996) have argued that
utilization deficiencies can be inferred when grade differences in
recall are found despite comparable and perfect strategy use. The
analyses did not examine the effect of trials with different numbers of
words on recall because children rarely showed multiple trials with
perfect strategy use.
The analyses revealed that second graders who clustered perfectly
recalled fewer words than fourth graders who clustered perfectly. In
contrast, second graders who sorted perfectly, or both sorted and
clustered perfectly, recalled just as many words as comparably and
perfectly strategic fourth graders. Thus, perfectly strategic second
graders who used a sorting strategy, either alone or in combination with
a clustering strategy, showed levels of recall equivalent to comparably
and perfectly strategic fourth graders. However, perfectly strategic
second graders who used a clustering strategy by itself showed evidence
of a utilization deficiency. These findings demonstrate that
utilization deficiencies can occur for some but not all instances of
perfect strategy use.
Why did perfectly strategic second graders show a utilization
deficiency when using a clustering strategy but not a sorting strategy?
The answer to this question may have to do with the ontogeny of
organizational strategy development. Several studies have shown that
clustering early in development is caused by the automatic activation of
semantic memory relations, a nondeliberate and relatively ineffective
form of clustering (Frankel & Rollins, 1985; Schneider, 1986).* In
contrast, clustering later in development is caused by the deliberate
recall of words by taxonomic categories, a relatively effective way to
enhance recall. Sorting is very similar to the deliberate form of
clustering, in that it requires recognition and deliberate placement of
the items into categories, is relatively effective, and emerges later in
development. Extrapolating from these findings to the current study,
second graders who clustered perfectly were probably using a less mature
and less effective form of clustering than were fourth graders, who were
clustering deliberately and reaping the benefits of doing so. In
contrast, second graders who sorted perfectly were likely using a
relatively advanced and effective approach for their age, one that is
normally observed in older children and that eliminated a utilization
deficiency.
*Although stimulus items in the current study were selected to
minimize the automatic activation of semantic memory relations, in
practice it is not possible to eliminate entirely the occurrence of such
nondeliberate strategy use. Thus, the current study likely reduced but
did not eliminate the possibility of the automatic activation of
semantic memory relations.
The findings for the analyses of perfect strategy use were
confirmed and extended in an analysis of grade differences in recall on
trials when children used each of the 15 possible strategy combinations
(Table 11). Data were analyzed only for strategy combinations that were
used on more than one trial by each age group. Of the seven
combinations included in the analysis, every one showed that fourth
graders had higher levels of recall than comparably strategic second
graders, with recall on four of the seven combinations being
significantly higher for fourth graders. These results, along with
those in the analysis of perfect strategy use presented above, provide
substantial evidence of utilization deficiencies for the second graders.
Although in the current study utilization deficiencies were
evaluated for all possible strategy combinations, previous research,
including my own, has examined utilization deficiencies for only a
subset of all possible strategy combinations (Bjorklund, Schneider,
Cassel, & Ashley, 1994; Coyle & Bjorklund, 1996). As a result,
utilization deficiencies were evaluated for some but not all
combinations. For example, a sort-recall study by Bjorklund et al.
(1994) assessed utilization deficiencies separately for sorting and
clustering, but not for sorting and clustering used in combination.
Although evidence of a utilization deficiency for each strategy used
separately was found, utilization deficiencies for both strategies used
together were not evaluated. Similarly, a sort-recall study by Coyle
and Bjorklund (1996) assessed utilization deficiencies for sorting only,
clustering only, and sorting and clustering used together, but evidence
of utilization deficiencies for each strategy combination was aggregated
and examined in a single analysis. Because utilization deficiencies for
each possible strategy combination were not analyzed separately, the
possibility that a particular strategy exerted a disproportionate
influence on the outcome could not be examined.
The results of the current study show that assessing utilization
deficiencies for only a subset of strategy combinations may mask
utilization deficiencies for strategy combinations that are not
evaluated. For example, although utilization deficiencies in the
current study were found for the combinations of rehearsal only, sorting
and rehearsal, rehearsal and clustering, and the three-strategy
combination of sorting, rehearsal, and clustering, no utilization
deficiencies were found for the strategy combinations of sorting only,
clustering only, and sorting and clustering (see Table 11). These data
show that utilization deficiencies do not apply broadly to children's
strategic behavior, but apply to a specific set of strategies. Thus,
analyses that examine only a subset of possible strategy combinations
risk not detecting utilization deficiencies in strategy combinations
that are not evaluated.
The analyses that examined utilization deficiencies for children
who showed perfect strategy use and for children who showed equivalent
(but not necessarily perfect) strategy use did not yield identical
results (Tables 10 and 11). Both analyses examined the strategy
combinations of sorting, clustering, and sorting and clustering, and so
concordance for utilization deficiencies across analyses could be
evaluated for these strategies. Both analyses yielded no utilization
deficiency for the strategies of sorting, and sorting and clustering.
However, a utilization deficiency for second graders using clustering
was found in the analysis of children who showed perfect strategy use,
but not in the analysis of children who showed equivalent strategy use.
This discrepancy can be attributed to differences in how clustering was
assessed. The analysis of utilization deficiencies for children who
showed equivalent clustering assessed clustering by itself, not used in
combination with any other strategy. In contrast, the analysis of
utilization deficiencies for children who showed perfect clustering
ignored the possibility that clustering was used in combination with
rehearsal or category naming. Consequently, the latter analysis
probably included several trials on which clustering was used in
combination with another strategy. This possibility is supported by
data reported in the separate analyses of the 15 strategy combinations
(Table 11), which show that clustering was used frequently with sorting
and also with rehearsal. Although the analysis examining perfect
clustering did examine both clustering and sorting together, it did not
examine clustering and rehearsal together, which, as shown in the
separate analyses of the 15 strategy combinations, resulted in a
utilization deficiency for the younger children. Thus, the utilization
deficiency reported in the analysis of children who clustered perfectly
is likely a result of clustering being used in combination with a
rehearsal strategy that was not evaluated. Presumably, no utilization
deficiency would have been reported if perfect clustering had been separated
from other strategies and analyzed alone.
Relation Between Strategy Changes and Recall
The final aim of the present study was to investigate the relation
between recall and strategy change. A first set of analyses examined
whether average recall across trials varied according to three measures
of strategy change: number of unique combinations, number of trials with
changes, and total number of changes across trials. Correlations
between recall and each measure of strategy change were computed
separately within each grade. Fourth graders showed significant and
negative relations between recall and two of the three measures of
strategy change (trials with changes and number of changes). Second
graders showed no significant relations between recall and any measure
of strategy change. These findings demonstrate that fourth graders who
used a stable mixture of strategies across trials had higher levels of
recall than their agemates who made many strategy changes across trials.
Such findings are consistent with models of strategy and behavioral
development postulating that, with age and experience, children are more
likely to use consistently an approach that yields optimal performance
(Siegler, 1996; Thelen & Ulrich, 1991).
The findings for correlations computed at each grade were
qualified by the findings for correlations computed separately for each
grade in each condition. Those correlations showed that fourth graders
in the descending/ascending condition showed significant and negative
relations between recall and all three measures of strategy change. No
other significant relations between recall and strategy change were
found for children of any age in any condition, including fourth graders
in the ascending/descending condition. These findings demonstrate that
fourth graders in the descending/ascending condition who used the same
mix of strategies across trials tended to recall more than their
agemates who made many strategy changes across trials. Surprisingly,
fourth graders in the ascending/descending condition showed no such
pattern. These findings suggest that fourth graders in the
descending/ascending condition contributed disproportionately to the
observed relations between recall and strategy change when data for all
fourth graders (i.e., those in both conditions) were analyzed together.
Why was stability associated with high levels of recall for fourth
graders in the descending/ascending condition, but not for fourth
graders in the ascending/descending condition? Perhaps the answer is
related to the number of words presented on the initial trials (Trials 1
and 2) in each condition. The descending/ascending condition had 15 and
12 words presented on Trials 1 and 2, respectively, whereas the
ascending/descending condition had six and nine words presented on the
corresponding trials. The additional words presented on the initial
trials of the descending/ascending condition made perfect recall very
unlikely without the use of strategies. Consequently, the early trials
of the descending/ascending condition may have induced a pattern of
effective strategic responding that was retained for all subsequent
trials. That is, children in the descending/ascending condition, who
were unlikely to show perfect performance without using strategies, may
have developed a mental set which guided their subsequent strategy use
(cf. Langer, 1989). In contrast, children in the ascending/descending
condition, who were able to recall perfectly without using strategies, had
less opportunity to develop such a pattern of responding, at least on
the critical initial trials. This speculation must be interpreted
cautiously, however, because the effect of condition was not replicated
in the analyses of recall and strategy changes within individual
subjects.
The correlational findings pertaining to grade (but not condition)
were confirmed and extended in a series of analyses of recall for
individual children in each grade classified as stable or unstable. A
first analysis, using data on all trials for the classifications, showed
that fourth graders classified as stable had higher recall than fourth
graders classified as unstable. In contrast, second graders classified
as stable recalled no more words than second graders classified as
unstable. Thus, stability was beneficial to recall only for the older
children. Unlike the findings in the correlational analyses, the
pattern of results obtained in this analysis did not vary as a function
of condition. These findings were qualified by a second set of
analyses, which examined recall for children in each grade classified as
stable or unstable on early trials (Trials 1-4) and separately on later
trials (Trials 4-7). On early trials, recall varied as a function of
grade and stability classification but not condition. Fourth graders
classified as stable had marginally higher recall than fourth graders
classified as unstable, whereas no difference in recall was found for
second graders classified as stable or unstable. On later trials,
recall varied as a function of stability classification only; effects
involving grade and condition were not significant. Stable children
recalled more than unstable children, regardless of age or condition.
Together, the results pertaining to data on early and later trials
demonstrate that stable-strategy use on the early trials benefited
recall for fourth graders only, whereas stable-strategy use on later
trials benefited recall for both age groups. These findings qualify
previous reports of stable-strategy use being associated with high
levels of performance (Coyle & Bjorklund, 1997), showing that the
beneficial effect of stability on recall varies as a function of age and
task experience. Older children show recall benefits from stability
earlier than younger children, who show recall benefits from stability
only after they have acquired additional task experience.
Why did levels of recall vary as a function of variability in
strategy use? One possible answer to this question is that variability
is associated with high levels of capacity expenditure, which reduce
strategy effectiveness and task performance. For example, a study of
mathematical equivalence by Goldin-Meadow and her colleagues (Goldin-
Meadow, Nusbaum, Garber, & Church, 1993) found that children who showed
variability in strategy use, displaying one strategy in gesture and a
different one in speech, also showed relatively high levels of capacity
expenditure, indicated by decreased performance on a secondary task. By
contrast, children who showed low levels of variability in strategy use,
displaying the same strategy in gesture and speech, showed less capacity
expenditure. Extrapolating from these findings to the current study
suggests that the inverse relation between variability and recall was
mediated by capacity requirements. Presumably, variability in strategy
use, indicated by many trial-by-trial changes in strategy use, resulted
in increased capacity expenditure, which consequently reduced strategy
effectiveness and lowered recall. No such effect was found when low
levels of variability were observed, presumably because low levels of
variability do not result in any appreciable capacity expenditure.
A final analysis examined the question of why strategy changes
occur. The analysis was developed based on the findings of a serial-
recall study by McGilly and Siegler (1989). In that study, strategy
changes were observed less frequently after trials on which serial-
recall performance was perfect than after trials on which serial-recall
performance was not perfect. This pattern is referred to as the win-
stay/lose-shift approach in the decision-making literature (Eimas,
1969).
Evidence for the win-stay/lose-shift approach in the current study
was examined by analyzing at each trial the occurrence of strategy
changes when recall was perfect and when recall was less than perfect.
For Trials 1 to 3, strategy changes occurred approximately equally
following perfect recall and following less than perfect recall. For
Trials 4 to 6, strategy changes were less frequent following perfect
recall than following less than perfect recall. These latter findings
are consistent with the win-stay/lose-shift approach. These are the
first results to my knowledge to demonstrate that the win-stay/lose-
shift approach varies as a function of task experience. Children in the
current study continued to use a strategy that yielded optimal
performance only after they had some experience on the task. No such
pattern of strategic responding was found when children had relatively
little experience on the task. An additional set of analyses revealed
little evidence that the win-stay/lose-shift approach was more apt to
occur on trials with relatively few words, with only two of six possible
trials on which few words were presented showing a pattern consistent
with the win-stay/lose-shift approach.
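The win-stay/lose-shift comparison described above can be sketched in a few lines of Python. The function name `change_rates_by_outcome`, the data layout (one list of strategy sets and one list of perfect-recall flags per child), and the example values are assumptions introduced for illustration only; they are not the analysis code or the data of the current study.

```python
from typing import List, Set, Tuple


def change_rates_by_outcome(
    strategies: List[Set[str]],
    perfect: List[bool],
) -> Tuple[float, float]:
    """Return (change rate after perfect recall, change rate after imperfect recall).

    strategies[t] is the set of strategies used on trial t;
    perfect[t] is True when recall on trial t was perfect.
    A "change" means the strategy set on trial t + 1 differs from trial t.
    """
    after_perfect = []    # 1 if the strategy set changed after a perfect trial
    after_imperfect = []  # same, after a trial with less than perfect recall
    for t in range(len(strategies) - 1):
        changed = int(strategies[t] != strategies[t + 1])
        (after_perfect if perfect[t] else after_imperfect).append(changed)

    def rate(flags: List[int]) -> float:
        return sum(flags) / len(flags) if flags else float("nan")

    return rate(after_perfect), rate(after_imperfect)


# Hypothetical child: keeps a strategy after perfect recall, shifts otherwise.
example_strategies = [
    {"sorting"},
    {"rehearsal"},
    {"rehearsal"},
    {"sorting", "clustering"},
]
example_perfect = [False, True, False, True]
win_rate, lose_rate = change_rates_by_outcome(example_strategies, example_perfect)
```

A lower change rate after perfect recall than after imperfect recall is the pattern that would be read as consistent with win-stay/lose-shift; computing the two rates separately for early and later trials would mirror the comparison reported above.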
Conclusions
The findings of the current study extend research on multiple-
strategy use and strategy changes in two ways. First, whereas previous
research has described a single developmental pattern of variability
(variability declining with age and experience), the current study
showed that a single developmental pattern does not account for all
types of variability. Instead, the developmental course of variability
differed for different types of variability, with number of strategies
used increasing with age and strategy changes decreasing with age.
Second, whereas previous research has identified children as showing
stable- or unstable-strategy use, the current study demonstrated that
changes in the amount of stability observed in a brief testing session
vary with age. Specifically, older children tended to become more
stable in their strategy use over time, whereas younger children tended
to remain unstable.
The current study also extends research on the relation between
variability and performance in three ways. First, the current study
shows that utilization deficiencies may not apply to all possible
strategies but only to a specific subset of strategies. Second, whereas
previous research has reported the beneficial effect of stability on
performance, the current study shows that such benefits may occur
earlier in the task for older children than for younger children.
Third, the current study extended to a new domain evidence that strategy
changes occur less frequently following perfect performance than
following less than perfect performance. Such a pattern is consistent
with the win-stay/lose-shift approach (Eimas, 1969; McGilly & Siegler,
1989). Unlike previous research demonstrating the win-stay/lose-shift
approach (e.g., McGilly & Siegler, 1989), the current study found that
the win-stay/lose-shift approach occurred on later trials (i.e., Trials
4 to 6) only, suggesting that a pattern of strategic responding
consistent with the win-stay/lose-shift model emerges as a result of
task-related experience.
Perhaps more interesting than what the current study found is what
the current study did not find. A primary goal was to assess strategy
variability and the relation between variability and recall as a
function of the number of words presented on each trial. A number of
hypotheses that predicted effects involving the number of words on each
trial were made. Most of these hypotheses were based on the assumption
that children would adapt their strategy use to the demands of the task,
using more strategies when their memory capacity was insufficient for
perfect recall. For example, multiple-strategy use was predicted to be
highest on trials with relatively many words (i.e., 12 or 15 words),
because perfect recall on these trials was assumed to require additional
mnemonics. None of the predictions pertaining to the effect of number
of words were supported. Multiple-strategy use was not highest on
trials with relatively many words; strategy changes were not more common
following trials with relatively many words; and utilization
deficiencies were not less frequent on trials with relatively few words.
Why did the number of words manipulation have no effect on
variability in strategy use and the relation between variability and
recall? The answer, I believe, has to do with children's difficulty in
executing the chain of cognitive events that are required to alter
strategic behavior in response to changes in task demands (i.e., a
different number of words presented on consecutive trials) (cf. Miller,
1990). First, children must decide whether a change in task demands
warrants the use of a strategic approach different from the one used
previously. Second, if they decide a different approach is warranted,
children must then decide which of the approaches available in their
repertoire is most suitable. Third, children must access the strategy
that has been deemed most appropriate. Fourth, children must produce
the strategy that has been accessed. Fifth, children must execute
effectively the strategy that has been produced. Problems that occur at
any point in the chain may prevent task manipulations, including the one
used in the current study, from affecting strategic behavior. For
example, if children do not identify the problem as warranting a change
in strategic behavior, no new action would be taken. Even if a problem
is identified as warranting a change in strategic behavior, children may
not have in their repertoire a strategy that is (from their perspective)
appropriate to use, and so no new strategy would be accessed. Finally,
if a new and potentially effective strategy is accessed and produced,
children may not have the cognitive capacity to execute the strategy
effectively, and so no gain in performance would be realized (Miller,
Woody-Ramsey, & Aloise, 1991). The failure to find consistent effects
associated with the number of words manipulation can be attributed to
any one, or combination of, the above possibilities.
The current study has several implications for conceptualizing and
studying variability and the relation between variability and recall.
First, the current study demonstrates that variability is a multifaceted
phenomenon that can be assessed in various ways. Because no single
measure of variability is likely to describe fully the diversity of
variability in strategy use, future research on variability should
assess multiple measures of variability, or risk ignoring some
potentially important types of variability. Second, the current study
demonstrates that some measures of variability are highly related
whereas others are less strongly related. For example, although unique
combinations was significantly related to other measures of variability,
the magnitude of correlations involving unique combinations was lower
than the magnitude of the correlations involving the other measures of
variability (i.e., number of trials with changes and number of trial-by-
trial changes). These findings demonstrate that different measures of
variability are empirically, and perhaps conceptually, distinct.
Consequently, future research should define precisely what type of
variability is being measured and limit conclusions to that type of
variability. Third, the current study demonstrates that utilization
deficiencies are present for some but not all measures of strategy use.
For example, in the analysis of recall for children showing perfect
strategy use, utilization deficiencies were found for clustering but not
sorting or the combination of sorting and clustering. These findings
suggest that utilization deficiencies are not a general phase in
strategy development but are limited to the use of a particular strategy
in a particular context. Accordingly, future research should examine
the possibility of utilization deficiencies for each possible strategy
combination, not only for a subset of possible strategy combinations.
The results of the current study have implications for models of
strategy development postulating variability in strategy use. Such
models focus almost exclusively on multiple-strategy use, depicting
children as using multiple strategies early in development, later in
development, and at all points in between. The current study suggests
that models emphasizing multiple-strategy use might achieve even greater
descriptive power if they considered developmental changes in stability
in strategy use, defined as the consistent use of a particular strategy
or strategy combination. Children in the current study did show
considerable multiple-strategy use, but they also showed considerable
stability in strategy use, and such stability was particularly
pronounced for older children on the later trials of the sort-recall
task. In addition, stability in strategy use was associated with
relatively high levels of performance, suggesting that stable-strategy
use was a relatively efficient form of strategy production. These
findings are consistent with research showing that older, more mature
strategy users are likely to select and use consistently a strategy that
yields optimal performance (Thelen & Smith, 1994). More importantly,
these findings suggest that stability can provide additional information
regarding strategy use that, when incorporated into current models of
development emphasizing multiple-strategy use, may provide a more
complete account of strategy development.

REFERENCES
Baker-Ward, L., Ornstein, P. A., & Holden, D. J. (1984). The expression
of memorization in early childhood. Journal of Experimental Child
Psychology, 37, 555-557.
Bjorklund, D. F. (1988). Acquiring a mnemonic: Age and category
knowledge effects. Journal of Experimental Child Psychology, 45,
71-87.
Bjorklund, D. F. (1990). Children's strategies: Contemporary views of
cognitive development. Hillsdale, NJ: Erlbaum.
Bjorklund, D. F., & Bernholtz, J. F. (1986). The role of knowledge base
in the memory performance of good and poor readers. Journal of
Experimental Child Psychology, 41, 367-373.
Bjorklund, D. F., & Coyle, T. R. (1995). Utilization deficiencies in the
development of memory strategies. In F. E. Weinert & W. Schneider
(Eds.), Memory performance and competencies: Issues in growth and
development (pp. 161-180). Mahwah, NJ: Erlbaum.
Bjorklund, D. F., Coyle, T. R., & Gaultney, J. F. (1992). Developmental
differences in the acquisition and maintenance of an
organizational strategy: Evidence for the utilization deficiency
hypothesis. Journal of Experimental Child Psychology, 54, 434-448.
Bjorklund, D. F., & Harnishfeger, K. K. (1987). Developmental
differences in the mental effort requirements for the use of an
organizational strategy in free recall. Journal of Experimental
Child Psychology, 44, 109-125.
Bjorklund, D. F., Miller, P. H., Coyle, T. R., & Slawinski, J. L. (in
press). Instructing children to use memory strategies: Evidence
for utilization deficiencies in memory training studies.
Developmental Review.
Bjorklund, D. F., Schneider, W., Cassel, W. S., & Ashley, E. (1994).
Training and extension of a memory strategy: Evidence for
utilization deficiencies in the acquisition of an organizational
strategy in high- and low-IQ children. Child Development, 65,
951-965.
Bjorklund, D. F., Thompson, B. E., & Ornstein, P. A. (1983).
Developmental trends in children's typicality judgments.
Behavior Research Methods and Instrumentation, 15, 350-356.
Ceci, S. J., & Howe, M. J. (1978). Age-related differences in free
recall as a function of retrieval flexibility. Journal of
Experimental Child Psychology, 26, 432-442.
Coyle, T. R., & Bjorklund, D. F. (1996). The development of strategic
memory: A modified microgenetic assessment of utilization
deficiencies. Cognitive Development, 11, 295-314.
Coyle, T. R., & Bjorklund, D. F. (1997). Age differences in, and
consequences of, multiple- and variable-strategy use on a
multitrial sort-recall task. Developmental Psychology, 33, 372-
380.
Coyle, T. R., Colbert, C. T., & Read, L. E. (1997, April). Strategy
variability and memory performance in average- and high-IQ
children. Poster presented at the meeting of the Society for
Research in Child Development, Washington, D.C.
DeLoache, J. S. (1984). Oh where, oh where: Memory-based searching by
very young children. In C. Sophian (Ed.), Origins of cognitive
skills (pp. 57-80). Hillsdale, NJ: Erlbaum.
Dempster, F. N. (1992). The rise and fall of the inhibitory mechanism:
Toward a unified theory of cognitive development and aging.
Developmental Review, 12, 45-75.
Eimas, P. D. (1969). A developmental study of hypothesis behavior and
focusing. Journal of Experimental Child Psychology, 8, 160-172.
Flavell, J. H. (1970). Developmental studies of mediated memory. In H.
W. Reese & L. P. Lipsitt (Eds.), Advances in child development and
child behavior (Vol. 5, pp. 181-211). New York: Academic Press.
Flavell, J. H., Beach, D. H., & Chinsky, J. M. (1966). Spontaneous
verbal rehearsal in a memory task as a function of age. Child
Development, 37, 283-299.
Frankel, M. T., & Rollins, H. S. (1985). Associative and categorical
hypotheses of organization in the free recall of adults and
children. Journal of Experimental Child Psychology, 40, 304-318.
Gholson, B., & Barker, P. (1985). Kuhn, Lakatos, and Laudan:
Applications in the history of physics and psychology. American
Psychologist, 40, 755-769.
Goldin-Meadow, S., Alibali, M. W., & Church, R. B. (1993). Transitions
in concept acquisition: Using the hand to read the mind.
Psychological Review, 100, 279-297.
Goldin-Meadow, S., Nusbaum, H., Garber, P., & Church, R. B. (1993).
Transitions in learning: Evidence for simultaneously activated
hypotheses. Journal of Experimental Psychology: Human Perception
and Performance, 19, 1-16.
Guttentag, R. E. (1984). The mental effort requirements of cumulative
rehearsal: A developmental study. Journal of Experimental Child
Psychology, 37, 92-106.
Jacoby, L. L. (1991). A process dissociation framework: Separating
automatic from intentional uses of memory. Journal of Memory and
Language, 30, 513-541.
Kee, D. W. (1994). Developmental differences in associative memory:
Strategy use, mental effort, and knowledge-access interactions. In
H. W. Reese (Ed.), Advances in child development and behavior
(Vol. 25, pp. 232). New York: Academic Press.
Kee, D. W., & Davies, L. (1990). Mental effort and elaboration: Effects
of accessibility and instruction. Journal of Experimental Child
Psychology, 49, 264-274.
Lange, G., MacKinnon, C. E., & Nida, R. E. (1989). Knowledge, strategy,
and motivational contributions to preschool children's object
recall. Developmental Psychology, 25, 772-779.
Langer, E. J. (1989). Mindfulness. Reading, MA: Addison-Wesley.
Lemaire, P., & Siegler, R. S. (1995). Four aspects of strategic change:
Contributions to children's learning of multiplication. Journal of
Experimental Psychology: General, 124, 83-97.
McGilly, K., & Siegler, R. S. (1989). How children choose among serial
recall strategies. Child Development, 60, 172-182.
Miller, G. A. (1956). The magical number seven plus or minus 2: Some
limits on our capacity for processing information. Psychological
Review, 63, 81-97.
Miller, P. H. (1990). The development of strategies of selective
attention. In D. F. Bjorklund (Ed.), Children's strategies:
Contemporary views of cognitive development (pp. 157-184).
Hillsdale, NJ: Erlbaum.
Miller, P. H., & Seier, W. L. (1994). Strategy utilization deficiencies
in children: When, where, and why. In H. W. Reese (Ed.), Advances
in child development and behavior (Vol. 25, pp. 108-156). New
York: Academic Press.
Miller, P. H., Seier, W. L., Barron, K. L., & Probert, J. S. (1994).
What causes a memory strategy utilization deficiency? Cognitive
Development, 9, 77-102.
Miller, P. H., Woody-Ramsey, J., & Aloise, P. A. (1991). The effect of
strategy effortfulness on strategy effectiveness. Developmental
Psychology, 27, 738-745.
Ornstein, P. A., Naus, M. J., & Liberty, C. (1975). Rehearsal and
organizational processes in children's memory. Child Development,
46, 818-830.
Posnansky, C. J. (1978). Category norms for verbal items in 25
categories for children in grades 2-6. Behavior Research Methods
and Instrumentation, 10, 819-832.
Ringel, B. A., & Springer, C. J. (1980). On knowing how well one is
remembering: The persistence of strategy use during transfer.
Journal of Experimental Child Psychology, 29, 322-333.
Roenker, D. L., Thompson, C. P., & Brown, S. C. (1971). Comparison of
measures for the estimation of clustering in free recall.
Psychological Bulletin, 76, 45-48.
Schneider, W. (1986). The role of conceptual knowledge and metamemory in
the development of organizational processes in memory. Journal of
Experimental Child Psychology, 42, 218-236.
Siegler, R. S. (1995). Children's thinking: How does change occur? In F.
E. Weinert & W. Schneider (Eds.), Memory performance and
competencies: Issues in growth and development (pp. 405-430).
Mahwah, NJ: Erlbaum.
Siegler, R. S. (1996). Emerging minds: The process of change in
children's thinking. New York: Oxford University Press.
Siegler, R. S., & Jenkins, E. (1989). How children discover new
strategies. Hillsdale, NJ: Erlbaum.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the
development of cognition and action. Cambridge, MA: MIT
Press/Bradford Books.
Thelen, E., & Ulrich, B. D. (1991). Hidden skills: A dynamic systems
analysis of treadmill stepping during the first year. Monographs
of the Society for Research in Child Development, 56(1, Serial No.
223).
Uyeda, K. M., & Mandler, G. (1980). Prototypicality norms for 28
semantic categories. Behavior Research Methods and
Instrumentation, 12, 567-595.
Wellman, H. M., Ritter, K., & Flavell, J. H. (1975). Deliberate memory
behavior in the delayed reactions of very young children.
Developmental Psychology, 11, 780-787.

BIOGRAPHICAL SKETCH
Thomas R. Coyle was born in Philadelphia, Pennsylvania, on
February 13, 1968. He was raised in North Lauderdale, Florida, and Boca
Raton, Florida. Thomas graduated high school at Saint Andrew's School
in Boca Raton. He attended Palm Beach Community College in Boca Raton,
where he received an Associate of Arts in psychology in 1989. He then
transferred to Florida Atlantic University, also in Boca Raton, where he
received a Bachelor of Arts in psychology in 1991 and a Master of Arts
in psychology in 1993 under the direction of Dr. David F. Bjorklund.
Thomas then went on to the University of Florida, in Gainesville,
Florida, where he received a Doctor of Philosophy in psychology in 1997
under the direction of Dr. Patricia H. Miller. Thomas is currently
Assistant Professor of Psychology at the University of Texas at San
Antonio.

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Patricia H. Miller, Chair
Professor of Psychology

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Shari A. Ellis, Cochair
Assistant Professor of Psychology

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Scott A. Miller
Professor of Psychology

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
James J. Algina
Professor of Foundations of Education

I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Ira S. Fischler
Professor of Psychology

This dissertation was submitted to the Graduate Faculty of the
Department of Psychology in the College of Liberal Arts and Sciences and
to the Graduate School and was accepted as partial fulfillment of the
requirements for the degree of Doctor of Philosophy
August, 1997
Dean, Graduate School



and recall, and (c) show more instances of strategy increases over
trials with no corresponding increases in recall (see Miller & Seier,
1994). These findings demonstrate that young children are less likely
to benefit from strategy use than older children, which is evidence of a
utilization deficiency for young children.
A recent study by Coyle and Bjorklund (1996) showed that
utilization deficiencies have different developmental consequences for
memory depending on when they occur. In this study, children in second
through fourth grade received a multitrial sort-recall task, with
different sets of categorizable words on each trial. Children were
classified as utilizationally deficient or not based on their pattern of
clustering and recall over trials. Children were classified as
utilizationally deficient if they showed increases in clustering over
trials with no corresponding increases in recall. All other children
were classified as nonutilizationally deficient. Mean recall varied as
a function of grade and utilization deficiency classification. Second-
and third-grade utilizationally deficient children recalled more on
average than their nonutilizationally deficient agemates, most of whom
used no strategy at all. Conversely, fourth-grade utilizationally
deficient children recalled less than their nonutilizationally deficient
agemates, most of whom were using strategies effectively.
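The classification rule described above can be illustrated with a minimal Python sketch. The source states only that children were classified as utilizationally deficient when clustering increased over trials with no corresponding increase in recall; the specific criterion used below (final-trial score greater than first-trial score) and the function name `is_utilizationally_deficient` are assumptions made for illustration, not the procedure reported by Coyle and Bjorklund (1996).

```python
from typing import Sequence


def is_utilizationally_deficient(
    clustering: Sequence[float],
    recall: Sequence[float],
) -> bool:
    """Classify one child from per-trial clustering and recall scores.

    Assumption: an "increase over trials" is operationalized here as the
    final-trial score exceeding the first-trial score. The source does not
    specify the criterion, so this rule is illustrative only.
    """
    if len(clustering) != len(recall) or len(clustering) < 2:
        raise ValueError("need matching per-trial scores for at least two trials")
    clustering_increased = clustering[-1] > clustering[0]
    recall_increased = recall[-1] > recall[0]
    return clustering_increased and not recall_increased


# Hypothetical scores: clustering rises while recall stays flat.
deficient = is_utilizationally_deficient([0.1, 0.3, 0.6], [5, 5, 5])
# Hypothetical scores: clustering and recall rise together.
not_deficient = is_utilizationally_deficient([0.1, 0.3, 0.6], [5, 6, 8])
```

Under this sketch, every child who does not meet the rule falls into the nonutilizationally deficient group, which is why that group can include both children who use no strategy and children who use strategies effectively.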
The Coyle and Bjorklund findings demonstrate that utilization
deficiencies have different memory consequences depending on when they
occur in development. Utilization deficiencies that occur early in
development are associated with relatively high levels of memory
performance, because the dominant alternative pattern is no strategy use


Table 6
Mean Z-scores and Raw Scores for Unique Combinations, Trials with Changes, and
Total Changes, By Grade (Standard Deviations in Parentheses)

                        Strategy Change Type
             Unique          Trials          Total
             Combinations    with Changes    Changes
Grade 2
  Z-Score     .13 (1.00)      .25 (1.00)      .29 (1.04)
  Raw Score  2.64 (1.11)     3.51 (1.56)     4.67 (2.45)
Grade 4
  Z-Score    -.18 ( .99)     -.34 ( .91)     -.39 ( .80)
  Raw Score  2.29 (1.10)     2.59 (1.43)     3.06 (1.87)


they used the same combination of strategies on only one of three pairs
of consecutive trials.
The percentage of children in each grade classified as stable and
unstable on early trials and separately on later trials is presented in
columns three through six in Table 7. For the early trials, no grade
difference in the distribution of children classified as stable or
unstable was found, χ²(1, N = 120) < 1, with most children showing
unstable strategy use. For the later trials, fourth graders were
significantly more likely to be classified as stable than second
graders, who frequently showed unstable strategy use, χ²(1, N = 120) =
7.38. These findings demonstrate that both groups of children showed
considerable variability in strategy use on early trials. In contrast,
only second graders showed unstable strategy use on later trials; most
fourth graders showed stable strategy use.
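For readers who want to see how a grade-by-stability comparison of this kind is computed, the following Python sketch calculates a Pearson chi-square statistic for a 2 x 2 table. The cell counts are hypothetical and chosen only to sum to N = 120; they are not the frequencies from Table 7, and the function name `chi_square_2x2` is an illustrative assumption.

```python
from typing import List


def chi_square_2x2(table: List[List[int]]) -> float:
    """Pearson chi-square statistic (no continuity correction) for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of grade and stability.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical counts (rows: Grade 2, Grade 4; columns: stable, unstable).
# These are NOT the study's data; they only illustrate the computation.
hypothetical = [[20, 40], [35, 25]]
stat = chi_square_2x2(hypothetical)
no_difference = chi_square_2x2([[10, 10], [10, 10]])
```

With 1 degree of freedom, a statistic above roughly 3.84 is significant at the .05 level, so a value below 1 (as on the early trials) indicates no reliable grade difference.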
These findings were extended in an analysis that examined changes
in stability classification from early to later trials for individual
children. Children were classified as showing one of four possible
patterns of stability classification from early trials (i.e., Trials 1
to 4) to later trials (Trials 4 to 7): unstable on early trials,
unstable on later trials (unstable/unstable); unstable on early trials,
stable on later trials (unstable/stable); stable on early trials, stable
on later trials (stable/stable); stable on early trials, unstable on
later trials (stable/unstable).
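The four-way classification above can be sketched in Python as follows. The excerpt does not state the full stability criterion, so the rule used here (a block counts as stable when the same strategy combination is repeated on at least two of its three pairs of consecutive trials) is an assumption, as are the function names and the example data.

```python
from typing import FrozenSet, List

Combo = FrozenSet[str]


def is_stable(block: List[Combo]) -> bool:
    """Assumed rule: stable if the same strategy combination is repeated on at
    least two of the three pairs of consecutive trials in a four-trial block."""
    repeats = sum(1 for a, b in zip(block, block[1:]) if a == b)
    return repeats >= 2


def label(stable: bool) -> str:
    return "stable" if stable else "unstable"


def stability_pattern(trials: List[Combo]) -> str:
    """Label seven trials by early/later stability, e.g. 'unstable/stable'."""
    if len(trials) != 7:
        raise ValueError("expected seven trials")
    early = trials[0:4]  # Trials 1 to 4
    later = trials[3:7]  # Trials 4 to 7 (Trial 4 belongs to both blocks)
    return f"{label(is_stable(early))}/{label(is_stable(later))}"


# Hypothetical child who settles on sorting plus clustering from Trial 4 onward.
s = frozenset({"sorting"})
r = frozenset({"rehearsal"})
sc = frozenset({"sorting", "clustering"})
pattern = stability_pattern([s, r, s, sc, sc, sc, sc])
```

Because Trial 4 is shared by the early and later blocks, each block contributes exactly three pairs of consecutive trials.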
The percentage of children in each of the four pattern
classifications is shown by grade in Table 8. The data are presented in
terms of children whose stability classification did or did not change


in the developmental pathways of different measures of variability, with
no one pattern accounting for all measures of strategy change and
multiple-strategy use.
Correlations among the various measures of strategy changes
differed in magnitude. Although correlations among all measures of
strategy change were significant, correlations between number of trials
with changes and total number of changes were consistently higher than
correlations between each of these measures and number of unique
combinations. Moreover, correlations between trials with changes and
total changes were near perfect (rs > .90) for 3 of the possible 4
correlations involving these measures, whereas none of the 8
correlations involving unique combinations was near perfect. These data
are the first to my knowledge to show differences in the strength of
relations among different measures of strategy change.
Differences in the relations among the various measures of
strategy change can be attributed to how each measure was computed. The
two most closely related measures, trials with changes and total
changes, both were computed based on the number of consecutive trials on
which different strategies were used. The third measure, unique
combinations, was computed based on the number of different strategy
combinations used, irrespective of whether the different strategies were
used on consecutive trials. These computational differences resulted in
differences in the magnitude of the correlations among the various
measures of strategy change, with correlations among measures based on
the same underlying index of strategy change being higher than


stable-strategy use (i.e., few trial-by-trial changes) and recall,
whereas younger children showed no reliable relation between stability
and recall. This study extended previous research by showing that (a)
stable-strategy use emerges with experience (i.e., over trials) for
older children but not younger children, (b) utilization deficiencies
occur for some but not all instances of perfect strategy use, and (c)
memory benefits from stability occur on early trials (i.e., Trials 1 to
4) for older children but not for younger children, who show memory
benefits from stability on later trials (i.e., Trials 4 to 7) only.
Surprisingly, the results revealed few significant effects related to
changes in the measure of task difficulty (i.e., number of words to
remember). The findings are discussed in terms of how they advance our
knowledge and understanding of utilization deficiencies and variability
in strategy use.


INTRODUCTION
All scientific disciplines are based on a set of core assumptions
(Gholson & Barker, 1985). These assumptions are rarely stated
explicitly and rarely questioned. They serve to direct a researcher's
choice of research questions, data collection procedures, statistical
analyses, and interpretation of research findings.
Two such assumptions were central in early research on children's
memory strategies. The first was that memory strategies usually enhance
memory performance. The second was that memory strategy development
proceeds through a series of stages in which a unique strategy is used
fairly consistently in each stage. These assumptions were implicit in
much of the memory strategy research conducted throughout the 1960s and
1970s.
There are now a number of studies demonstrating that these
assumptions are at best misleading, and at worst, empirically
inaccurate. The next two sections provide a brief history of events
that led to the rise and fall of the view that memory strategies
generally enhance performance and that memory strategy development is
stagelike. The discussion will focus on two concepts central to this
dissertation. The first is the concept of utilization deficiency, which
refers to strategy use with no performance benefit. The second is the
concept of variability in strategy use, which refers to the use of not
one but several different approaches.


5.68 (mean number of intervals of examination per trial: 2.28, 2.19,
2.10, 1.98, 2.03, 2.00, 1.85 for Trials 1-7, respectively). These main
effects were qualified by a significant Condition x Trial interaction,
F(6, 696) = 2.91. Inspection of the significant interaction revealed
that ascending/descending versus descending/ascending comparisons were
significant at Trial 2 (1.89 versus 2.54), Trial 3 (1.81 versus 2.43),
and Trial 4 (1.69 versus 2.30), but not significant at Trial 1 (2.13
versus 2.46), Trial 5 (1.91 versus 2.16), Trial 6 (1.88 versus 2.14),
and Trial 7 (1.81 versus 1.89). These data demonstrate that attention
to the task materials was somewhat greater on the initial
descending/ascending trials than on the corresponding
ascending/descending trials.
Recall
Before presenting preliminary analysis of the recall data, data
concerning repetitions and intrusions in recall are examined.
Repetitions refer to recall of the same word more than once. Intrusions
refer to utterances of words not on the target list. The frequency of
occurrence of each type of data was analyzed separately using 2 (grade)
x 2 (condition) x 7 (trial) ANOVAs, with repeated measures on the trial
factor. Analysis of the repetition data revealed no significant main
effects or interactions. Repetitions were slightly greater for fourth
graders (M = .49) than for second graders (M = .40). Analysis of the
intrusion data revealed a significant main effect of grade, F(1, 116) =
5.64, with intrusions being greater for second graders (M = .22) than
for fourth graders (M = .05). All other main effects and interactions
for the intrusion data were not significant. The significant grade


correlations among measures based on different indexes of strategy
change.
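The computational difference among the three strategy change measures can be made concrete with a short Python sketch. Unique combinations and trials with changes follow the descriptions given in the text; the definition of total changes used here (the number of individual strategies added or dropped between consecutive trials) is an assumption, as are the function name and the example data.

```python
from typing import Dict, FrozenSet, List


def strategy_change_measures(trials: List[FrozenSet[str]]) -> Dict[str, int]:
    """Compute three illustrative variability scores for one child.

    unique_combinations: number of distinct strategy combinations used,
        regardless of trial order.
    trials_with_changes: number of consecutive-trial pairs on which the
        combination differed.
    total_changes: assumed here to be the number of individual strategies
        added or dropped between consecutive trials (symmetric difference).
    """
    pairs = list(zip(trials, trials[1:]))
    return {
        "unique_combinations": len(set(trials)),
        "trials_with_changes": sum(1 for a, b in pairs if a != b),
        "total_changes": sum(len(a ^ b) for a, b in pairs),
    }


# Hypothetical child alternating between two strategy combinations.
sort_only = frozenset({"sorting"})
sort_rehearse = frozenset({"sorting", "rehearsal"})
scores = strategy_change_measures(
    [sort_only, sort_rehearse, sort_only, sort_rehearse, sort_only]
)
```

In this hypothetical case the child changes strategies on every pair of consecutive trials yet uses only two unique combinations, which illustrates why the two change-based measures can track each other closely while correlating less strongly with unique combinations.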
Mean levels of variability for each strategy change measure in the
current study were lower than those observed in the similar study by
Coyle and Bjorklund (1997). In the current study, percentage of unique
combinations, trials with changes, and total changes across trials
(collapsed across grade) was 35, 51, and 55, respectively. In Coyle and
Bjorklund, the corresponding percentages were 46, 54, and 62,
respectively. This slight disparity in strategy change scores can be
attributed to the current study using more trials than the Coyle and
Bjorklund study. As shown in Table 5, the additional trials in the
current study allowed fourth graders to maintain a pattern of stable-
strategy use (i.e., few strategy changes) that began after Trial 4,
whereas second graders showed unstable-strategy use across all trials.
Consequently, the lower mean number of strategy changes averaged across
grade can be attributed to fourth graders showing substantially lower
strategy change scores on the later trials. Although Coyle and Bjorklund did
analyze strategy change patterns across trials, the fewer trials used in
that study limited the amount of stability that older children could
display and probably contributed to the slight disparity in strategy
change scores.
The age-related decline in strategy change was confirmed and
extended in analyses of trial-by-trial changes in strategy use for
individual children. An initial analysis revealed that fourth graders
were more likely to be classified as showing stable-strategy use than
second graders, who often switched strategies on adjacent trials.


57
Table 9
Correlations Between Number of Words Recalled and Number of Strategies
Used, by Condition, Grade, and Trial

                                     Trial
              1       2       3       4       5       6       7
Ascending/Descending
  Grade 2    .31     .29     .42**   .32*    .16     .50**   .30
  Grade 4    .41*    .46*    .41*    .65**   .73**   .45*    .55**
Descending/Ascending
  Grade 2   -.18     .50**   .43*    .34     .60**   .65**   .57**
  Grade 4    .31     .37     .38     .06     .43*    .46*    .36

* p < .05, ** p < .01


31
of items. The experimenter recorded children's sorting patterns on each
trial and the entire session was audiotaped.
Coding
During the 1 min 30 s study period on each trial, the experimenter
observed the incidence of sorting, rehearsal, category naming,
examination, and off-task behavior for each of three separate 30-s
intervals. Each type of study behavior was coded as occurring or not
during each of the three intervals. Sorting was recorded when children
physically moved or arranged cards. Rehearsal was recorded when
children verbalized out loud or mouthed the list items (no distinction
was made between single-word and cumulative rehearsal). Category naming
was recorded when children said the category name of a group of items
(e.g., FRUIT for apple, banana, peach). Examination was recorded when
children visually scanned the cards. Off-task behavior was recorded
when children looked away from the cards and were visually inattentive
to the task for a total of 5 consecutive seconds. Clustering during
recall was recorded when children recalled words by adult-defined
categories.
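The interval coding described above can be illustrated with a short sketch. It is not the coding sheet actually used in the study; the function names, the data layout, and the example trial are hypothetical and serve only to show how each study behavior is scored as occurring or not in each of the three 30-s intervals.

```python
# Illustrative sketch only: names and data layout are assumptions, not the
# study's actual coding materials.
BEHAVIORS = ("sorting", "rehearsal", "category naming", "examination", "off-task")
N_INTERVALS = 3  # three 30-s intervals in the 1 min 30 s study period


def code_trial(observed_by_interval):
    """Turn three sets of observed behaviors into an occurred/not-occurred sheet."""
    if len(observed_by_interval) != N_INTERVALS:
        raise ValueError("expected one observation set per 30-s interval")
    sheet = {}
    for behavior in BEHAVIORS:
        # One True/False flag per interval for this behavior.
        sheet[behavior] = [behavior in observed for observed in observed_by_interval]
    return sheet


def behaviors_seen(sheet):
    """Behaviors coded as occurring in at least one interval of the trial."""
    return {behavior for behavior, flags in sheet.items() if any(flags)}


# Hypothetical trial: the child sorts early, then rehearses while scanning cards.
example = code_trial(
    [{"sorting", "examination"}, {"rehearsal", "examination"}, {"rehearsal"}]
)
```

Because each behavior is scored only as present or absent within an interval, the sheet records incidence, not frequency or duration.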
Following Coyle and Bjorklund (1997), three of the five study
behaviors were classified and analyzed as strategies. These were
sorting, rehearsal, and category naming. Clustering during recall was
classified as a fourth strategy. Examination was not considered a
strategy because by itself examination reflects only attention to the
target information. Although children may be covertly using a strategy
(e.g., rehearsal) while examining the items, this cannot be discerned
from their overt behavior. For these reasons, examination was not


Table page
13. Correlations among measures of strategy change, by condition
and grade 67
14. Mean proportion recall (and standard deviations) for children
classified as stable or unstable across all trials, on early
trials, and on later trials, by grade 70
15. Percentage (and number) of trials on which strategy changes
did and did not occur immediately after recall was perfect
or not perfect 73
16. Percentage (and number) of trials on which strategy changes
did and did not occur immediately after recall was perfect
or not perfect, by condition 75
x




33
high intercoder agreement for the types of strategies coded in the
current study.


24
demonstrating a utilization deficiency for younger children. To date,
research on utilization deficiencies has examined the effectiveness of a
single strategy (e.g., clustering or rehearsal), or, in a few cases, the
effectiveness of multiple strategies. The current study examines the
effectiveness of both single- and multiple-strategy use in a single
paradigm, and compares directly the incidence of utilization deficiency
when children use one or several strategies.
Utilization deficiencies were predicted to be less frequent on
trials with relatively few words (i.e., 6 or 9 words) than on trials
with relatively many words (i.e., 12 or 15 words). This prediction was
based on the assumption that trials with few words would consume less of
children's limited mental capacity than trials with many words. Thus,
additional capacity should be available for efficient strategy
utilization on trials with few words. Consequently, utilization
deficiencies should be less frequent on trials with few words compared
to trials with many words. This prediction may be qualified by age, with
older children's superior processing capacity permitting effective
strategy use on all trials, irrespective of the number of words
presented.
The third and final goal of the current study was to examine the
relation between strategy changes and recall as a function of age. On
the basis of the findings in Coyle and Bjorklund (1997) and other
studies (Lemaire & Siegler, 1995), the relation between strategy change
and recall was predicted to be negative and significant for older but
not younger children. That is, few strategy changes across trials
(i.e., stable-strategy use) were predicted to result in high levels of


51
their pattern of strategy change across trials. Children were
classified as stable if they used the same combination of strategies on
at least four pairs of consecutive trials (of a possible six pairs of
consecutive trials). Children were classified as unstable if they used
the same combination of strategies on fewer than four pairs of
consecutive trials. These classifications were based on changes in the
mixture (rather than the number) of strategies used over trials.
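The classification rule just described can be expressed as a brief sketch. The function names and the single-letter strategy codes in the example are hypothetical; the only rule taken from the text is that a child is stable when the same combination of strategies appears on at least four of the six pairs of consecutive trials.

```python
# Illustrative sketch only: function names and example data are assumptions.
def count_same_pairs(trial_strategies):
    """Count pairs of consecutive trials that share the same strategy combination."""
    combos = [frozenset(strategies) for strategies in trial_strategies]
    return sum(1 for prev, curr in zip(combos, combos[1:]) if prev == curr)


def classify_stability(trial_strategies, min_same_pairs=4):
    """Label a child stable or unstable from the mixture of strategies per trial."""
    same = count_same_pairs(trial_strategies)
    return "stable" if same >= min_same_pairs else "unstable"


# Hypothetical child who settles on sorting plus rehearsal from Trial 2 to Trial 6.
settled = [{"S"}, {"S", "R"}, {"S", "R"}, {"S", "R"}, {"S", "R"}, {"S", "R"}, {"S", "R", "C"}]

# Hypothetical child who always uses one strategy but alternates which one.
alternating = [{"S"}, {"R"}, {"S"}, {"R"}, {"S"}, {"R"}, {"S"}]
```

The second example shows why the rule tracks the mixture of strategies: the number of strategies never changes for that child, yet no pair of consecutive trials matches, so the child is classified as unstable.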
The percentage of children in each grade classified as stable or
unstable is shown in the first and second columns of Table 7. Fourth
graders were significantly more likely to be classified as stable than
second graders, who showed considerable variability in strategy use,
χ²(1, N = 120) = 7.52. These data are consistent with the findings
reported in the previous section. However, the findings in the previous
section showed that although both groups tended to show variability on
early trials, only the second graders showed variability on later
trials. Thus, a second analysis examined the possibility that the
observed grade differences in stability classification were primarily
attributable to differences in variability on later rather than earlier
trials. Children were classified as stable or unstable on early trials
(Trials 1 to 4) and separately on later trials (Trials 4 to 7). (Trial
4 is both the last trial in the set of early trials and the first trial
in the set of later trials.) For each block of trials, children were
classified as stable if they used the same combination of strategies on
two or three pairs of consecutive trials (of a possible total of three
pairs of consecutive trials). Children were classified as unstable if


LIST OF TABLES
Table page
1. Word lists by category membership 27
2. Mean proportion recall by condition, grade, and trial, and by
condition and trial (i.e., collapsed across grade), and
grade differences in recall at each trial by condition 37
3. Percentage (and number) of trials on which each strategy was
used, by condition and grade 40
4. Mean number of strategies used, by grade and trial, and by
condition, grade, and trial 43
5. Mean number of trial-by-trial strategy changes, by grade and
trial transition, and by condition, grade, and trial
transition 47
6. Mean z-scores and raw scores for unique combinations, trials
with changes, and total changes, by grade (standard
deviations in parentheses) 50
7. Percentage (and number) of children classified as stable or
unstable across all trials, on early trials, and on later
trials, by grade 52
8. Percentage (and number) of children changing or not changing
their stability classification across trial blocks, by
grade 54
9. Correlations between number of words recalled and number of
strategies used, by condition, grade, and trial 57
10. Mean proportion recall when strategy use was perfect, by grade
and type of strategy used 60
11. Percentage (and number) of trials on which each strategy
combination was used, and mean proportion recall (and
standard deviations) for each combination, by grade (Codes
for strategies: S, sorting; R, rehearsal; C, clustering; N,
category naming) 62
12. Correlations between measures of strategy change and recall,
by condition and grade 66
ix


68
perfect for all Grade x Condition cells except one (second graders in
the ascending/descending condition).
Relation between strategy change and recall for individual
children. The finding that stability was related to high levels of
recall only for fourth graders in the descending/ascending condition was
only partially confirmed in an analysis of recall for children
classified as stable or unstable (see Table 7 for stability
classification data). A 2 (grade) x 2 (stability classification) x 2
(condition) x 7 (trial) ANOVA was conducted on mean proportion recall.
Because significant main effects and interactions involving the Grade,
Condition, and Trial factors have already been reported, only
significant main effects and interactions involving the Stability
Classification factor are reported here.
The analysis revealed a marginally significant main effect of
stability classification, F(1, 112) = 3.02, p = .09 (mean proportion
recall: .59 and .70, for unstable and stable, respectively), which was
qualified by a significant grade x stability classification interaction,
F(1, 112) = 3.78. No other significant effects involving the stability
classification factor were found, including effects involving the
condition factor. The failure to find a significant Grade x Stability
Classification x Condition interaction is inconsistent with the
correlational results reported in the previous section. Those results
showed that, in the descending/ascending condition, fourth graders
showing stable-strategy use had higher recall than fourth graders
showing unstable-strategy use. The findings in the current section,
along with those of the previous one, demonstrate that findings


29
defining items like the ones used in the current study (Coyle &
Bjorklund, 1997).
Each child received seven sort-recall trials. A different list of
words was presented on each trial. Children were assigned to one of two
conditions. In both conditions, three categories were represented in
the word lists on all trials. However, the number of words in each
category varied systematically across trials. In the
ascending/descending condition, the number of items in each category was
2, 3, 4, 5, 4, 3, and 2 on trials 1 through 7, respectively. Thus, the
total number of items presented on trials 1 through 7 was 6, 9, 12, 15,
12, 9, and 6. In the descending/ascending condition, the number of
items in each category was 5, 4, 3, 2, 3, 4, and 5 on trials 1 through
7, respectively. Thus, the total number of items presented on trials 1
through 7 was 15, 12, 9, 6, 9, 12, and 15. The sum of all items in the
descending/ascending condition was greater than the sum of all items in
the ascending/descending condition. In each condition, the seven lists
were presented in 1 of 10 predetermined random orders. Each list was
presented on each of the seven trials approximately equally, and all
items within a list were used approximately equally. This resulted in
a 2 (grade: second vs. fourth) x 2 (condition: ascending/descending vs.
descending/ascending) x 7 (trial) design, with repeated measures on the
trial factor.
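The list lengths in the two conditions follow directly from the number of items per category and the three categories on each trial. The sketch below reproduces that arithmetic; the variable and function names are hypothetical.

```python
# Illustrative sketch only: names are assumptions; the numbers come from the text.
N_CATEGORIES = 3  # three categories represented on every trial

ITEMS_PER_CATEGORY = {
    "ascending/descending": [2, 3, 4, 5, 4, 3, 2],
    "descending/ascending": [5, 4, 3, 2, 3, 4, 5],
}


def list_lengths(condition):
    """Total number of words presented on Trials 1 through 7 in a condition."""
    return [N_CATEGORIES * n for n in ITEMS_PER_CATEGORY[condition]]


# Total number of words seen across the session in each condition.
session_totals = {cond: sum(list_lengths(cond)) for cond in ITEMS_PER_CATEGORY}
```

Under these values the session totals are 69 words in the ascending/descending condition and 78 words in the descending/ascending condition, which is the difference noted above.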
Procedure
Children were tested by the author of this dissertation and two
undergraduate research assistants. Each child was seen individually in
a session lasting approximately 30 min. Prior to the presentation of


14
one strategy per trial is assessed, the possibility of variability
within a particular trial (i.e., intratrial variability) cannot be
examined. Instead, the focus is on variability across trials (i.e.,
intertrial variability).
Third, variability has been measured almost exclusively in terms
of the number of strategies used. Although the number of strategies
used is one measure of variability, it is not the only one. A handful
of other studies have shown that variability can be measured in other
ways, including the number of trial-by-trial changes in strategy use,
the degree of stability in the sequence of strategy production across
several trials, and the number of instances when one strategy is
expressed in gesture and a different one in speech (Coyle & Bjorklund,
1997; Coyle, Colbert, & Read, 1997; Goldin-Meadow et al., 1993). These
studies demonstrate that variability can be measured in not one but
several ways. A single measure, such as the number of strategies used,
does not capture all possible patterns of variability, and different
kinds of variability may have different causes and consequences.
A recent study by Coyle and Bjorklund (1997) addressed these
limitations. Some time will be spent describing this study because its
design and findings figure prominently in the study developed for this
dissertation. Children in second through fourth grade received five
sort-recall trials of categorizable words. Unlike other multitrial
experiments (e.g., Bjorklund, 1988), different items and categories were
used on each trial, so that any increases in strategy use could not be
attributed to increased familiarity with a particular set of stimulus
items.


100
of words were supported. Multiple-strategy use was not highest on
trials with relatively many words; strategy changes were not more common
following trials with relatively many words; and utilization
deficiencies were not less frequent on trials with relatively few words.
Why did the number of words manipulation have no effect on
variability in strategy use and the relation between variability and
recall? The answer, I believe, has to do with children's difficulty in
executing the chain of cognitive events that are required to alter
strategic behavior in response to changes in task demands (i.e., a
different number of words presented on consecutive trials) (cf. Miller,
1990). First, children must decide whether a change in task demands
warrants the use of a strategic approach different from the one used
previously. Second, if they decide a different approach is warranted,
children must then decide which of the approaches available in their
repertoire is most suitable. Third, children must access the strategy
that has been deemed most appropriate. Fourth, children must produce
the strategy that has been accessed. Fifth, children must execute
effectively the strategy that has been produced. Problems that occur at
any point in the chain may prevent task manipulations, including the one
used in the current study, from affecting strategic behavior. For
example, if children do not identify the problem as warranting a change
in strategic behavior, no new action would be taken. Even if a problem
is identified as warranting a change in strategic behavior, children may
not have in their repertoire a strategy that is (from their perspective)
appropriate to use, and so no new strategy would be accessed. Finally,
if a new and potentially effective strategy is accessed and produced,


DISCUSSION
The current study examined several measures of variability in
strategy use, relating each measure to memory performance. Whereas
previous investigations of variability in strategy use have assessed
only one strategy on each trial and one type of variability (Siegler,
1996), the current study examined the possibility of multiple strategies
on each trial and two different types of variability (multiple-strategy
use and strategy changes). The current study was very similar in design
to a study by Coyle and Bjorklund (1997). However, it had additional
trials with which to evaluate changes in variability over time and
included trials varying widely in the number of words to be recalled.
Strategy variability was assessed within and across trials and related
to mean levels of recall, with analyses focusing on utilization
deficiencies and stability-recall relations. The results revealed
developmental differences in multiple-strategy use and strategy change,
and more important, age-related changes in the relation between measures
of variability and recall. Surprisingly, the results revealed few
significant effects related to the number of words presented on each
trial or the pattern of increases and decreases in the number of words
presented across trials.
The goals of the current study were to examine the impact of age
and number of words presented on each trial on (a) measures of strategy
variability, including multiple-strategy use and strategy changes; (b)
78


103
regarding strategy use that, when incorporated into current models of
development emphasizing multiple-strategy use, may provide a more
complete account of strategy development.


93
utilization deficiency for the younger children. Thus, the utilization
deficiency reported in the analysis of children who clustered perfectly
is likely a result of clustering being used in combination with a
rehearsal strategy that was not evaluated. Presumably, no utilization
deficiency would have been reported if perfect clustering was separated
from other strategies and analyzed alone.
Relation Between Strategy Changes and Recall
The final aim of the present study was to investigate the relation
between recall and strategy change. A first set of analyses examined
whether average recall across trials varied according to three measures
of strategy change: number of unique combinations, number of trials with
changes, and total number of changes across trials. Correlations
between recall and each measure of strategy change were computed
separately within each grade. Fourth graders showed significant and
negative relations between recall and two of the three measures of
strategy change (trials with changes and number of changes). Second
graders showed no significant relations between recall and any measure
of strategy change. These findings demonstrate that fourth graders who
used a stable mixture of strategies across trials had higher levels of
recall than their agemates who made many strategy changes across trials.
Such findings are consistent with models of strategy and behavioral
development postulating that, with age and experience, children are more
likely to use consistently an approach that yields optimal performance
(Siegler, 1996; Thelen & Ulrich, 1991).
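The three measures of strategy change and their correlation with recall can be sketched as follows. The scoring rules here are assumptions made for illustration: a trial counts as having a change when its strategy combination differs from that of the preceding trial, and total changes are taken to be the number of strategies added or dropped between consecutive trials. The function names and example data are hypothetical.

```python
# Illustrative sketch only: the scoring rules below are assumptions, not the
# study's documented operational definitions.
from math import sqrt


def strategy_change_measures(trial_strategies):
    """Compute three strategy-change scores from per-trial strategy sets."""
    combos = [frozenset(strategies) for strategies in trial_strategies]
    transitions = list(zip(combos, combos[1:]))
    return {
        # Number of distinct strategy combinations used across trials.
        "unique_combinations": len(set(combos)),
        # Number of trials whose combination differs from the preceding trial.
        "trials_with_changes": sum(1 for a, b in transitions if a != b),
        # Assumed rule: strategies added or dropped between consecutive trials.
        "total_changes": sum(len(a ^ b) for a, b in transitions),
    }


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical child observed on five trials (S = sorting, R = rehearsal,
# C = clustering).
child = [{"S"}, {"S", "R"}, {"S", "R"}, {"R"}, {"S", "R", "C"}]
scores = strategy_change_measures(child)
```

In an analysis of this kind, `pearson_r` would be applied within each grade to one change score per child and that child's mean recall; a negative value would indicate that children who changed strategies less often recalled more.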
The findings for correlations computed at each grade were
qualified by the findings for correlations computed separately for each


107
Miller, P. H., Woody-Ramsey, J., & Aloise, P. A. (1991). The effect of
strategy effortfulness on strategy effectiveness. Developmental
Psychology, 27, 738-745.
Ornstein, P. A., Naus, M. J., & Liberty, C. (1975). Rehearsal and
organizational processes in children's memory. Child Development,
46, 818-830.
Posnansky, C. J. (1978). Category norms for verbal items in 25
categories for children in grades 2-6. Behavior Research Methods
and Instrumentation, 10, 819-832.
Ringel, B. A., & Springer, C. J. (1980). On knowing how well one is
remembering: The persistence of strategy use during transfer.
Journal of Experimental Child Psychology, 29, 322-333.
Roenker, D. L., Thompson, C. P., & Brown, S. C. (1971). Comparison of
measures for the estimation of clustering in free recall.
Psychological Bulletin, 76, 45-48.
Schneider, W. (1986). The role of conceptual knowledge and metamemory in
the development of organizational processes in memory. Journal of
Experimental Child Psychology, 42, 218-236.
Siegler, R. S. (1995). Children's thinking: How does change occur? In F.
E. Weinert & W. Schneider (Eds.), Memory performance and
competencies: Issues in growth and development (pp. 405-430).
Mahwah, NJ: Erlbaum.
Siegler, R. S. (1996). Emerging minds: The process of change in
children's thinking. New York: Oxford University Press.
Siegler, R. S., & Jenkins, E. (1989). How children discover new
strategies. Hillsdale, NJ: Erlbaum.
Thelen, E., & Smith, L. B. (1994). A dynamic systems approach to the
development of cognition and action. Cambridge, MA: MIT
Press/Bradford Books.
Thelen, E., & Ulrich, B. D. (1991). Hidden skills: A dynamic systems
analysis of treadmill stepping during the first year. Monographs
of the Society for Research in Child Development, 56(1, Serial No.
223).
Uyeda, K. M., & Mandler, G. (1980). Prototypicality norms for 28
semantic categories. Behavior Research Methods and
Instrumentation, 12, 567-595.
Wellman, H. M., Ritter, K., & Flavell, J. H. (1975). Deliberate memory
behavior in the delayed reactions of very young children.
Developmental Psychology, 11, 780-787.


I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Patricia H. Miller, Chair
Professor of Psychology
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Shari A. Ellis, Cochair
Assistant Professor of Psychology
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Scott A. Miller
Professor of Psychology
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
James J. Algina
Professor of Foundations of Education
I certify that I have read this study and that in my opinion it
conforms to acceptable standards of scholarly presentation and is fully
adequate, in scope and quality, as a dissertation for the degree of
Doctor of Philosophy.
Ira S. Fischler
Professor of Psychology


65
total changes, respectively). These findings were qualified by
correlations computed separately within each Grade x Condition cell and
reported in Table 12. These correlations showed that only fourth
graders in the descending/ascending condition showed significant and
negative relations between recall and strategy change, with correlations
involving all three measures of strategy change being significantly
related to recall. These latter findings demonstrate that the grade
differences reported above can be attributed to correlational data for
fourth graders in the descending/ascending condition. Fourth graders in
the ascending/descending condition showed no reliable relation between
recall and strategy change.
The data in Table 12 reveal that the difference between
correlations involving trials with changes and total changes was always
lower than the difference between correlations involving each of these
variables and unique combinations. This suggested possible differences
in the relations among the various measures of strategy change. To
assess this possibility, pairwise correlations among each of the three
measures of strategy change were computed, separately within each Grade
x Condition cell. These correlations are reported in Table 13.
As shown in Table 13, all correlations among the three measures of
strategy change were significant. However, the magnitude of
correlations involving unique combinations (i.e., unique combinations
and trials with changes; unique combinations and total changes) was
lower than the magnitude of correlations not involving unique
combinations (i.e., trials with changes and total changes).
Correlations between trials with changes and total changes were near


89
contrast, second graders who sorted perfectly, or both sorted and
clustered perfectly, recalled just as many words as comparably and
perfectly strategic fourth graders. Thus, perfectly strategic second
graders who used a sorting strategy, either alone or in combination with
a clustering strategy, showed levels of recall equivalent to comparably
and perfectly strategic fourth graders. However, perfectly strategic
second graders who used a clustering strategy by itself showed evidence
of a utilization deficiency. These findings demonstrate that
utilization deficiencies can occur for some but not all instances of
perfect strategy use.
Why did perfectly strategic second graders show a utilization
deficiency when using a clustering strategy but not a sorting strategy?
The answer to this question may have to do with the ontogeny of
organizational strategy development. Several studies have shown that
clustering early in development is caused by the automatic activation of
semantic memory relations, a nondeliberate and relatively ineffective
form of clustering (Frankel & Rollins, 1985; Schneider, 1986).* In
contrast, clustering later in development is caused by the deliberate
recall of words by taxonomic categories, a relatively effective way to
enhance recall. Sorting is very similar to the deliberate form of
clustering, in that it requires recognition and deliberate placement of
the items into categories, is relatively effective, and emerges later in
*Although stimulus items in the current study were selected to
minimize the automatic activation of semantic memory relations, in
practice it is not possible to eliminate entirely the occurrence of such
nondeliberate strategy use. Thus, the current study likely reduced but
did not eliminate the possibility of the automatic activation of
semantic memory relations.


17
variability was observed for all age groups, a (nonsignificant) age-
related decline in variability was observed. Older children showed
fewer changes on consecutive trials and had fewer trials with changes
than younger children. These findings were confirmed in an analysis of
strategy change within individual subjects. Although Coyle and
Bjorklund (1997) paid little attention to the age-related declines in
strategy changes, emphasizing instead pervasive variability at all ages,
subsequent research has found considerable evidence for age-related
declines in variability across a variety of tasks and for children
varying widely in age (Coyle, Colbert, & Read, 1997; for a review, see
Siegler, 1996). In general, older and more experienced strategy users
show fewer strategy changes than younger and less experienced strategy
users.
Further analysis revealed relations between variability and memory
performance. As predicted, multiple-strategy use was related to recall
for older children, who showed significant and positive relations
between number of strategies used and recall. Younger children showed
no reliable relation between number of strategies used and recall,
indicating a utilization deficiency. In addition, stable-strategy use
(i.e., few strategy changes across trials) was significantly related to
high levels of recall, but only for the older age groups. That is,
third- and fourth-grade children who consistently used a particular
strategy combination had higher levels of recall than their peers whose
strategy use was less consistent. No reliable relation between
variability and recall was found for the youngest children.


99
strategies but only to a specific subset of strategies. Second, whereas
previous research has reported the beneficial effect of stability on
performance, the current study shows that such benefits may occur
earlier in the task for older children than for younger children.
Third, the current study extended to a new domain evidence that strategy
changes occur less frequently following perfect performance than
following less than perfect performance. Such a pattern is consistent
with the win-stay/lose-shift approach (Eimas, 1969; McGilly & Siegler,
1989). Unlike previous research demonstrating the win-stay/lose-shift
approach (e.g., McGilly & Siegler, 1989), the current study found that
the win-stay/lose-shift approach occurred on later trials (i.e., Trials
4 to 6) only, suggesting that a pattern of strategic responding
consistent with the win-stay/lose-shift model emerges as a result of
task-related experience.
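The win-stay/lose-shift pattern described above amounts to a cross-tabulation of recall outcome on one trial by strategy change on the next. The sketch below shows one way to build that table for a single child; the function name, labels, and example data are hypothetical, and a shift is assumed to mean any difference in strategy combination between consecutive trials.

```python
# Illustrative sketch only: names, labels, and the definition of a shift are
# assumptions made for this example.
def tabulate_stay_shift(trial_strategies, recall_perfect):
    """Cross-tabulate recall on trial t by strategy change on trial t + 1."""
    combos = [frozenset(strategies) for strategies in trial_strategies]
    table = {
        ("perfect", "stay"): 0,
        ("perfect", "shift"): 0,
        ("not perfect", "stay"): 0,
        ("not perfect", "shift"): 0,
    }
    for t in range(len(combos) - 1):
        outcome = "perfect" if recall_perfect[t] else "not perfect"
        move = "stay" if combos[t + 1] == combos[t] else "shift"
        table[(outcome, move)] += 1
    return table


# Hypothetical child: keeps sorting after perfect recall, then changes
# strategies after each of two imperfect trials.
strategies = [{"S"}, {"S"}, {"R"}, {"R", "S"}]
perfect = [True, False, False, True]
counts = tabulate_stay_shift(strategies, perfect)
```

Recall on the final trial is not used, because no later trial exists on which a change could be observed. A win-stay/lose-shift pattern would appear as relatively high counts in the perfect/stay and not perfect/shift cells.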
Perhaps more interesting than what the current study found is what
the current study did not find. A primary goal was to assess strategy
variability and the relation between variability and recall as a
function of the number of words presented on each trial. A number of
hypotheses that predicted effects involving the number of words on each
trial were made. Most of these hypotheses were based on the assumption
that children would adapt their strategy use to the demands of the task,
using more strategies when their memory capacity was insufficient for
perfect recall. For example, multiple-strategy use was predicted to be
highest on trials with relatively many words (i.e., 12 or 15 words),
because perfect recall on these trials was assumed to require additional
mnemonics. None of the predictions pertaining to the effect of number


22
to increase with age. This prediction was based on research showing
that strategies are capacity-demanding operations and that strategy
production consumes less capacity with age (Kee, 1994). Thus, older
children, who use relatively little capacity during strategy production,
should produce more capacity-consuming strategies than younger children.
In addition, multiple-strategy use was predicted to be greater on
trials with relatively many words (i.e., 12 or 15 words) than on trials
with relatively few words (i.e., 6 or 9). This prediction was based on
the assumption that trials with many words would induce children to use
additional memory strategies because recall of all words on these trials
is beyond children's memory capacity (Miller, 1956). In contrast,
trials with few words should not have this effect because recall of all
words is within children's memory capacity. Thus, children are expected
to use multiple strategies only when they cannot perform optimally
without doing so (cf. McGilly & Siegler, 1989). These predictions may
be qualified by age, with older children having greater capacity for
using multiple strategies than younger children.
On the basis of the findings in Coyle and Bjorklund (1997) and in
other studies (Coyle, Colbert, & Read, 1997; Lemaire & Siegler, 1995),
strategy changes (e.g., trial-by-trial changes in strategy use) were
predicted to decrease with age. In addition, strategy changes were
predicted to decrease over the course of the testing session, especially
for older children. This latter prediction was based on models of
strategy variability proposing that task-relevant experience is
associated with decreases in trial-by-trial changes in strategy use
(Siegler, 1996; Thelen & Smith, 1994). Thus, children should show


6
Contemporary research has investigated the causes and consequences
of utilization deficiencies. Possible causes of utilization
deficiencies include inadequate capacity for both strategy production
and effective encoding, limited knowledge of stimulus items or task
requirements, and insufficient metamnemonic knowledge of when and how to
use strategies (Miller & Seier, 1994). Empirical support for these
causes has been demonstrated in studies showing that utilization
deficiencies are reduced or eliminated when (a) the capacity required
for accessing or executing a strategy is eliminated by having an
experimenter carry out the strategy (Miller, Woody-Ramsey, & Aloise,
1991); (b) the stimulus items or task requirements are highly familiar
and embedded in a meaningful context (Miller, Seier, Barron, & Probert,
1994); and (c) metamnemonic instruction is provided regarding the
cause-and-effect relation between strategy use and recall (Ringel &
Springer, 1980). Of the three causes mentioned, inadequate mental
capacity and limited knowledge base have received the most empirical
support. Other potential causes of utilization deficiencies, including
inadequate strategic monitoring, failure to link one strategy with
another, and failure to inhibit an earlier, ineffective strategy (see
Bjorklund & Coyle, 1995; Miller & Seier, 1994), have received little
attention.
Contemporary research has also examined the development of
utilization deficiencies. The general finding is that young children
are more apt to show a utilization deficiency than older children.
Compared to older children, young children (a) recall less when using
the same strategies, (b) show lower correlations between strategy use


Table 11--continued
Note. Boldface denotes significant age differences in recall or percentage of
trials on which a particular strategy combination was used, with all
significant results reported at p < .05. Recall data are omitted for
combinations used on one or zero trials. Grade differences in percentage of
trials on which each combination was used are evaluated using Yates-corrected
chi-squares with one degree of freedom. Grade differences in mean recall for
each combination are evaluated using t-tests.
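
The grade comparisons of percentage of trials can be illustrated with a short Python sketch. It assumes that each test is a 2 x 2 table of grade by whether a combination was used on a trial, and it applies the standard Yates continuity correction. The function name is hypothetical, and the trial totals (483 for Grade 2, 357 for Grade 4) are obtained by summing the trial counts listed in Table 11; the example uses the rehearsal-only counts from that table.

```python
def yates_chi_square(used_a, total_a, used_b, total_b):
    """Yates-corrected chi-square (1 df) for a 2 x 2 table of
    group (a vs. b) by outcome (combination used vs. not used)."""
    a, b = used_a, total_a - used_a
    c, d = used_b, total_b - used_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    # Continuity correction: shrink |ad - bc| by n/2, but never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / denominator


CRITICAL_VALUE_05 = 3.841  # chi-square critical value for 1 df at alpha = .05

# Rehearsal only (Table 11): 115 of 483 Grade 2 trials, 34 of 357 Grade 4 trials.
chi2 = yates_chi_square(115, 483, 34, 357)
print(round(chi2, 2), chi2 > CRITICAL_VALUE_05)
```

This sketch treats trials as independent observations, which is a simplifying assumption; it is not a statement of how the analyses reported here were computed.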


Nature is not economical of structures--only of principles
Abdus Salam




trials. That is, children in the descending/ascending condition, who
were unlikely to show perfect performance without using strategies, may
have developed a mental set which guided their subsequent strategy use
(cf. Langer, 1989). In contrast, children in the ascending/descending
condition, who were able to recall perfectly without using strategies, had
less opportunity to develop such a pattern of responding, at least on
the critical initial trials. This speculation must be interpreted
cautiously, however, because the effect of condition was not replicated
in the analyses of recall and strategy changes within individual
subjects.
The correlational findings pertaining to grade (but not condition)
were confirmed and extended in a series of analyses of recall for
individual children in each grade classified as stable or unstable. A
first analysis, using data on all trials for the classifications, showed
that fourth graders classified as stable had higher recall than fourth
graders classified as unstable. In contrast, second graders classified
as stable recalled no more words than second graders classified as
unstable. Thus, stability was beneficial to recall only for the older
children. Unlike the findings in the correlational analyses, the
pattern of results obtained in this analysis did not vary as a function
of condition. These findings were qualified by a second set of
analyses, which examined recall for children in each grade classified as
stable or unstable on early trials (Trials 1-4) and separately on later
trials (Trials 4-7). On early trials, recall varied as a function of
grade and stability classification but not condition. Fourth graders
classified as stable had marginally higher recall than fourth graders


Contrary to predictions, second graders did not overcome a utilization
deficiency on three of four trials with nine or fewer words presented
(i.e., Trials 1, 2, and 7).
The pattern of correlations in the descending/ascending condition
was nearly opposite to that observed in the ascending/descending
condition. Fourth graders now had significant correlations on only two
of seven trials. Second graders had significant correlations on five of
seven trials, with two of these correlations being found on trials with
nine or fewer words presented (i.e., Trials 3 and 5). The magnitude of
correlations for second graders was higher than that for fourth graders
on all trials except Trial 1. These data do not provide evidence of a
utilization deficiency for the younger children.
The failure to find significant correlations for fourth graders in
the descending/ascending condition, when such correlations were
significant in the ascending/descending condition, cannot be attributed
to restricted variance in number of words recalled or number of
strategies used. The standard deviations for number of strategies used
on Trials 1-7 were very similar for fourth graders in the
ascending/descending condition (SDs = .96, .94, 1.03, 1.10, 1.17, 1.12,
and 1.20) and in the descending/ascending condition (SDs = .98, 1.01,
1.13, 1.03, .80, 1.16, and 1.16). The standard deviations for recall
were also similar for both groups of fourth graders (see Table 2).
Recall for perfectly strategic children. Coyle and Bjorklund
(1996), as well as Miller and Seier (1994), have argued that utilization
deficiencies can be inferred when grade differences in recall are
observed despite comparable strategy use. In the current study, this


BIOGRAPHICAL SKETCH
Thomas R. Coyle was born in Philadelphia, Pennsylvania, on
February 13, 1968. He was raised in North Lauderdale, Florida, and Boca
Raton, Florida. Thomas graduated high school at Saint Andrew's School
in Boca Raton. He attended Palm Beach Community College in Boca Raton,
where he received an Associate of Arts in psychology in 1989. He then
transferred to Florida Atlantic University, also in Boca Raton, where he
received a Bachelor of Arts in psychology in 1991 and a Master of Arts
in psychology in 1993 under the direction of Dr. David F. Bjorklund.
Thomas then went on to the University of Florida, in Gainesville,
Florida, where he received a Doctor of Philosophy in psychology in 1997
under the direction of Dr. Patricia H. Miller. Thomas is currently
Assistant Professor of Psychology at the University of Texas at San
Antonio.


minimizing the likelihood of clustering as a result of the automatic
activation of semantic memory relations.
Approximately half the children in each grade were assigned to one
of two conditions, labeled ascending/descending and
descending/ascending. In the ascending/descending condition, children
received an increasing number of words on each successive trial until
Trial 4, and then received a decreasing number on each successive trial
(number of words on Trials 1-7, respectively, was 6, 9, 12, 15, 12, 9,
and 6). The descending/ascending condition was the complement of the
ascending/descending condition. In the descending/ascending condition,
children received a decreasing number of words on each successive trial
until Trial 4, and then received an increasing number of words on each
successive trial (number of words on Trials 1-7, respectively, was 15,
12, 9, 6, 9, 12, 15). Each condition had trials with the same number of
words, so that effects concerning number of words presented could be
teased apart from effects concerning the ascending or descending order
in which words in each condition were presented.
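
The structure of the two conditions can be checked with a brief Python sketch. The word counts are those given above; the variable and function names are hypothetical and serve only to illustrate the design.

```python
# Number of words presented on Trials 1-7 in each condition (from the text).
ASCENDING_DESCENDING = [6, 9, 12, 15, 12, 9, 6]
DESCENDING_ASCENDING = [15, 12, 9, 6, 9, 12, 15]


def step_sizes(words_per_trial):
    """Change in number of words from each trial to the next."""
    return [later - earlier
            for earlier, later in zip(words_per_trial, words_per_trial[1:])]


asc_steps = step_sizes(ASCENDING_DESCENDING)
desc_steps = step_sizes(DESCENDING_ASCENDING)

# The rate of change is held constant at three words in both conditions.
print({abs(step) for step in asc_steps + desc_steps})

# The conditions are complements: each transition moves in opposite directions.
print(all(a == -d for a, d in zip(asc_steps, desc_steps)))

# List lengths that occur in both conditions, so that number of words can be
# separated from the order of presentation.
shared_lengths = sorted(set(ASCENDING_DESCENDING) & set(DESCENDING_ASCENDING))
print(shared_lengths)
```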
These conditions were developed to examine changes in strategy use
and performance as a function of the number of words presented on
successive trials. Two additional sets of conditions were considered
but not selected. The first involved presenting trials in the
ascending/descending and descending/ascending conditions randomly,
without having a constant rate of increase or decrease across trials.
For example, Trials 1 to 7 in the ascending/descending condition might
be ordered 12, 15, 9, 9, 6, 12, and 6, respectively, whereas Trials 1 to
7 in the descending/ascending condition might be ordered 9, 15, 15, 6,


Table 8
Classification Across Trial Blocks, By Grade

                 No Change                    Change
          Unstable/    Stable/        Unstable/    Stable/
Grade     Unstable     Stable         Stable       Unstable
2         41 (28)      17 (12)        20 (14)      22 (15)
4         18 ( 9)      24 (12)        39 (20)      20 (10)


(i.e., Trials 3-5 in the ascending/descending series and Trials 1, 2, 6,
and 7 in the descending/ascending series). The number of strategies
used on each trial was analyzed by a 2 (grade) x 2 (condition) x 7
(trial) ANOVA, with repeated measures on the trial factor. The analysis
revealed a marginally significant effect of grade, F(1, 116) = 3.31, p =
.07, with fourth graders using more strategies (M = 1.57) than second
graders (M = 1.32). Also significant were the main effect of trial, F(6,
696) = 12.33, and the Grade x Trial interaction, F(6, 696) = 3.23. No
other significant effects were found. Inspection of the significant
main effect of trial revealed that the number of strategies used on
Trial 1 (M = 1.14) and Trial 2 (M = 1.18) was significantly less than
that used on Trials 3-7 (mean number of strategies used: 1.46, 1.49,
1.58, 1.57, and 1.58 for Trials 3-7, respectively). No other
significant comparisons across trials were found.
Data pertaining to the significant Grade x Trial interaction are
presented in Table 4, which also shows the number of strategies used for
each Condition x Grade x Trial cell. Examination of grade differences
in number of strategies used on each trial revealed that fourth graders
used more strategies than second graders on Trials 3, 4, and 6, with
strategy use being comparable for both grades on all other trials.
These data, along with the data presented immediately above, are
consistent with the predicted grade differences. In all cases where
grade differences were found, fourth graders used more strategies than
second graders. The absence of a significant Condition x Trial
interaction indicates that strategy use did not vary across trials with
different numbers of words presented.


recall for fourth graders but not second graders. A further prediction
was that the relation between stability and recall may be more apt to
occur on later trials (Trials 4 to 7) than on early trials (Trials 1 to
4). This prediction was based on research showing that children
initially show inconsistent and ineffective strategy use, but later
settle into a stable and optimal state of strategic responding (Siegler,
1996; Thelen & Smith, 1994). Because the measures of strategy
variability were computed from data aggregated across trials, no
predictions concerning the impact of number of words on the relation
between strategy change and recall could be made.
A final prediction concerned the conditions under which strategy
changes occur. Strategy changes were predicted to occur less frequently
when recall was perfect on the immediately preceding trial than when
recall was not perfect on the immediately preceding trial. This
prediction was based on the findings of a serial-recall study by McGilly
and Siegler (1989). In that study, children who had been given a series
of serial-recall trials tended to switch strategies when their
performance was less than perfect on the preceding trial, but not when
their performance was perfect on the preceding trial. That is, children
tended to stick with a particular approach when it had yielded optimal
performance but switched approaches when the previous one had yielded
less than optimal performance. This pattern is known as the win-
stay/lose-shift approach in the decision-making literature (Eimas,
1969). Such a pattern may vary with the number of words presented on
each trial. Trials with few words should provide greater opportunity
for perfect recall, which should result in few strategy changes.
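
The win-stay/lose-shift prediction amounts to cross-tabulating recall on one trial with strategy change on the next. The following Python sketch shows one way such a tabulation could be made for a single child. The function name, the data layout, and the example child are hypothetical, and a "change" is assumed here to mean any difference in the set of strategies used on consecutive trials.

```python
from collections import Counter


def tabulate_shifts(trials):
    """Cross-tabulate recall on the preceding trial with strategy change.

    trials: list of (strategies, words_recalled, words_presented) tuples,
            in trial order, for one child.
    """
    table = Counter()
    for previous, current in zip(trials, trials[1:]):
        prev_strategies, prev_recalled, prev_presented = previous
        outcome = "perfect" if prev_recalled == prev_presented else "not perfect"
        changed = set(current[0]) != set(prev_strategies)
        table[(outcome, "change" if changed else "no change")] += 1
    return table


# Hypothetical child: S = sorting, R = rehearsal.
child = [
    ({"S"}, 6, 6),
    ({"S"}, 7, 9),
    ({"S", "R"}, 10, 12),
    ({"S", "R"}, 15, 15),
    ({"S", "R"}, 12, 12),
]
shifts = tabulate_shifts(child)
for cell, count in sorted(shifts.items()):
    print(cell, count)
```

Under the prediction, the "perfect" row should be dominated by "no change" and the "not perfect" row by "change".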


Slawinski (in press) examined utilization deficiencies (e.g., increases
in strategy use but not recall following training) in memory strategy
training studies published between 1968 and 1994. Like Miller and
Seier, Bjorklund et al. selected only studies that included children
from normal populations and reported independent measures of strategy
use and recall. Because studies with multiple-training conditions could
provide multiple cases of evidence of utilization deficiencies, training
conditions within studies (rather than the studies themselves) served as
the units of analysis. Of the 76 relevant training conditions
identified, 39 (51%) showed evidence for utilization deficiencies.
Why did it take the field so long to identify utilization
deficiencies? The most parsimonious explanation, I believe, is that
utilization deficiencies did not make any sense given the dominant
assumption prevalent in much of the early research on production
deficiencies (i.e., strategies help performance). Consequently,
evidence for a utilization deficiency was often ignored or overlooked.
A conceptual shift occurred in the mid-1980s when a number of studies
examined independent measures of strategy use and recall (for a review
of these studies, see Miller & Seier, 1994). Several studies during
this period reported that strategy use resulted in no or little recall
benefit. The accumulation of such evidence made it difficult to ignore
data indicating that strategies did not always enhance memory
performance. A new view of strategy use emerged, one that considered
the possibility of ineffective strategy use. This view made possible
the discovery of utilization deficiencies in research on memory
strategies.


grade in each condition. Those correlations showed that fourth graders
in the descending/ascending condition showed significant and negative
relations between recall and all three measures of strategy change. No
other significant relations between recall and strategy change were
found for children of any age in any condition, including fourth graders
in the ascending/descending condition. These findings demonstrate that
fourth graders in the descending/ascending condition who used the same
mix of strategies across trials tended to recall more than their
agemates who made many strategy changes across trials. Surprisingly,
fourth graders in the ascending/descending condition showed no such
pattern. These findings suggest that fourth graders in the
descending/ascending condition contributed disproportionately to the
observed relations between recall and strategy change when data for all
fourth graders (i.e., those in both conditions) were analyzed together.
Why was stability associated with high levels of recall for fourth
graders in the descending/ascending condition, but not for fourth
graders in the ascending/descending condition? Perhaps the answer is
related to the number of words presented on the initial trials (Trials 1
and 2) in each condition. The descending/ascending condition had 15 and
12 words presented on Trials 1 and 2, respectively, whereas the
ascending/descending condition had six and nine words presented on the
corresponding trials. The additional words presented on the initial
trials of the descending/ascending condition made perfect recall very
unlikely without the use of strategies. Consequently, the early trials
of the descending/ascending condition may have induced a pattern of
effective strategic responding that was retained for all subsequent


(1994) assessed utilization deficiencies separately for sorting and
clustering, but not for sorting and clustering used in combination.
Although evidence of a utilization deficiency for each strategy used
separately was found, utilization deficiencies for both strategies used
together were not evaluated. Similarly, a sort-recall study by Coyle
and Bjorklund (1996) assessed utilization deficiencies for sorting only,
clustering only, and sorting and clustering used together, but evidence
of utilization deficiencies for each strategy combination was aggregated
and examined in a single analysis. Because utilization deficiencies for
each possible strategy combination were not analyzed separately, the
possibility that a particular strategy exerted a disproportionate
influence on the outcome could not be examined.
The results of the current study show that assessing utilization
deficiencies for only a subset of strategy combinations may mask
utilization deficiencies for strategy combinations that are not
evaluated. For example, although utilization deficiencies in the
current study were found for the combinations of rehearsal only, sorting
and rehearsal, rehearsal and clustering, and the three strategy
combination of sorting, rehearsal, and clustering, no utilization
deficiencies were found for the strategy combinations of sorting only,
clustering only, and sorting and clustering (see Table 11). These data
show that utilization deficiencies do not apply broadly to children's
strategic behavior, but apply to a specific set of strategies. Thus,
analyses that examine only a subset of possible strategy combinations
risk not detecting utilization deficiencies in strategy combinations
that are not evaluated.


Table 16--continued

Recall Not Perfect
Strategy Change       52 (29)   65 (36)   59 (27)   69 (20)   61 (25)   53 (28)
No Strategy Change    48 (27)   35 (19)   41 (19)   31 ( 9)   39 (16)   47 (25)

Note. Percentages computed separately at each trial.


Table 1--continued

List 5                 List 6         List 7

Weather Phenomenon     Flowers        Animals
Wind                   Daisy          Horse
Snow                   Orchid         Zebra
Rain                   Tulip          Pig
Fog                    Lily           Tiger
Hail                   Rose           Cat

Cloth                  Furniture      Vehicles
Cotton                 Couch          Bus
Satin                  Lamp           Plane
Silk                   Chair          Boat
Wool                   Bed            Car
Velvet                 Dresser        Motorcycle

Musical Instruments    Time           Body Parts
Drums                  Year           Foot
Tuba                   Decade         Elbow
Violin                 Month          Neck
Flute                  Hour           Mouth
Piano                  Century        Hand


related patterns in variability are relatively uninfluenced by changes
in the total number of strategies being assessed.
Children were expected to show increases in the number of
strategies used on trials with more words. The results revealed that
multiple-strategy use did not vary as a function of the number of words
on each trial, with strategy use being comparable across both the
ascending/descending and descending/ascending conditions. The absence
of any effects involving trial or condition cannot be attributed to
ceiling effects or children not being able to use the target strategies.
Children of all ages used an average of fewer than two strategies across
trials (of a possible four strategies), leaving ample opportunity for
increases in the number of strategies used. Furthermore, children in
the age range studied have demonstrated competence in using all
strategies assessed.
The absence of any effect of condition and trial on the number of
strategies used demonstrates that children did not modify their
strategic behavior in response to being presented with different numbers
of words. Why did children stick with using a certain number of
strategies when presented with varying numbers of words? Perhaps the most
parsimonious explanation is that children did not consider altering
their strategic behavior on trials with different numbers of words.
Although children in the age range tested could use all the strategies
assessed in the current study, metacognitive limitations concerning when
and how to use strategies may have prevented them from doing so. An
implication is that children who do not produce strategies spontaneously
might do so if they are instructed to (Ringel & Springer, 1980). In


Taken together, these findings extend current research on
utilization deficiency and variability in several ways. Specifically,
they provide evidence for (a) utilization deficiencies in multiple-
strategy use, (b) several different types of variability, including
multiple-strategy use and strategy change, and (c) variability in
strategy use within a particular trial, as well as between trials.
The Current Study
The purpose of the current study was to further examine issues
concerning utilization deficiencies and variability using the procedures
developed by Coyle and Bjorklund (1997). As in Coyle and Bjorklund,
children received a multitrial sort-recall task with different words and
categories on each trial. Also as before, multiple strategies were
assessed on each trial and variability was measured in several ways.
The strategies assessed were sorting, rehearsal, clustering, and
category naming. The measures of strategy variability were number of
strategies used on each trial, number of strategy changes on consecutive
trials, number of unique combinations, and number of trials with
strategy changes.
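
To make the four measures concrete, the following Python sketch computes them for one child. It is an illustration only: the function name and the example child are hypothetical, and a strategy change is assumed to be a strategy added or dropped from one trial to the next, which may differ from the scoring rules actually used.

```python
def variability_measures(trials):
    """Compute the four variability measures for one child.

    trials: list of sets of strategy codes (S = sorting, R = rehearsal,
            C = clustering, N = category naming), one set per trial.
    """
    # Strategies added or dropped between consecutive trials (assumed definition).
    changes = [len(earlier ^ later) for earlier, later in zip(trials, trials[1:])]
    return {
        "strategies_per_trial": [len(trial) for trial in trials],
        "strategy_changes": sum(changes),
        "unique_combinations": len({frozenset(trial) for trial in trials}),
        "trials_with_changes": sum(1 for change in changes if change > 0),
    }


# Hypothetical child across seven sort-recall trials.
child = [{"S"}, {"S"}, {"S", "R"}, {"S", "R", "C"}, {"S", "C"}, {"S", "C"}, set()]
measures = variability_measures(child)
for name, value in measures.items():
    print(name, value)
```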
The current study differed from the study by Coyle and Bjorklund
in two important ways, each of which permitted new research questions
concerning utilization deficiencies and variability in strategy use.
First, in the current study children received seven sort-recall trials,
two more than in Coyle and Bjorklund. The additional trials permitted a
more detailed analysis of strategy change during the testing session.
It was now possible to assess periods of stability and instability
within individual children during early trials and again during later


12, 9, and 12, respectively. Unlike the conditions in the current
study, these presentation orders would vary randomly the amount of
increase or decrease on successive trials. Consequently, they would
confound changes in the number of words presented on successive trials
with the magnitude of such changes. The design of the current study
eliminated this confound by holding constant the rate of change at three
words.
A second set of conditions that was considered included
an ascending-only series and a descending-only series. The idea was to
extend the pattern in the early trials of each condition in the current
study. Thus, Trials 1 to 7 in the ascending series would have 6, 9, 12,
15, 18, 21, and 24 words, respectively, whereas Trials 1 to 7 in the
descending series would have 24, 21, 18, 15, 12, 9, and 6 words,
respectively. Unlike the conditions in the current study, these
presentation orders do not reverse the pattern of change in the latter
trials. Consequently, effects regarding possible strategic adaptation
to reversal of presentation order could not be assessed. Furthermore,
it was not clear why differences in strategic adaptation and recall
performance would vary beyond 15 words, when the number of words
presented would exceed children's memory capacity (Miller, 1956).
Goals of the Current Study
The current study had three goals. The first was to examine
differences in measures of strategy variability (e.g., multiple-strategy
use and strategy change) as a function of grade and number of words
presented on each trial. As in Coyle and Bjorklund (1997), multiple-
strategy use (e.g., number of strategies used per trial) was predicted


Table 11
Percentage (and Number) of Trials on Which Each Strategy Combination Was Used,
and Mean Proportion Recall (and Standard Deviations) for Each Combination, By
Grade (Codes for Strategies: S, Sorting; R, Rehearsal; C, Clustering; N,
Category Naming)

              Percentage of Trials            Mean Recall
Strategy      Grade 2       Grade 4       Grade 2       Grade 4
None          16 ( 78)      22 (77)       .41 (.25)     .65 (.23)
S              8 ( 39)       8 (30)       .50 (.23)     .60 (.21)
R             24 (115)      10 (34)       .53 (.25)     .81 (.19)
C             14 ( 68)       5 (18)       .47 (.19)     .54 (.17)
N              0 (  0)       0 ( 0)
SR             8 ( 37)      10 (34)       .65 (.23)     .76 (.22)
SC            10 ( 50)      17 (61)       .70 (.26)     .80 (.22)
SN            <1 (  2)       0 ( 0)
RC            10 ( 49)       5 (17)       .48 (.20)     .69 (.27)
RN            <1 (  1)       0 ( 0)
CN             0 (  0)       0 ( 0)
SRC            7 ( 35)      23 (83)       .74 (.23)     .90 (.13)
SRN           <1 (  1)      <1 ( 1)
SCN           <1 (  1)      <1 ( 2)
RCN           <1 (  1)       0 ( 0)
SRCN           1 (  6)       0 ( 0)


type of utilization deficiency was evaluated by analyzing mean
proportion recall for trials on which children showed perfect sorting
only, perfect clustering only, and perfect sorting and clustering.
Clustering and sorting data on each trial were measured continuously by
ARC scores for this analysis. ARC scores can range from 1 to -1, with 1
indicating perfect sorting or clustering and 0 indicating chance sorting
or clustering. Because children rarely showed multiple trials with
perfect strategy use (i.e., two or more trials with sorting or
clustering scores of 1), repeated measures analysis of recall across
trials with perfect strategy use was not performed. Instead, each child
received a single score averaging recall across trials with perfect
strategy use. Such a recall score was computed separately for trials
with perfect clustering only, perfect sorting only, and perfect
clustering and sorting.
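
The ARC index is not defined in this section. As an illustration, the following Python sketch assumes the adjusted ratio of clustering in its standard form: the observed number of category repetitions, corrected for the number expected by chance and scaled by the maximum possible number. The function name and the example recall orders are hypothetical.

```python
from collections import Counter


def arc_score(categories_in_output_order):
    """Adjusted ratio of clustering for one recall (or sort) sequence.

    categories_in_output_order: the category label of each item, in the
    order the items were recalled. Returns None when the score is undefined.
    """
    sequence = list(categories_in_output_order)
    n_items = len(sequence)
    if n_items == 0:
        return None
    counts = Counter(sequence)
    # Observed repetitions: adjacent items drawn from the same category.
    observed = sum(1 for a, b in zip(sequence, sequence[1:]) if a == b)
    # Repetitions expected by chance, given how many items of each category
    # appear in the sequence.
    expected = sum(count ** 2 for count in counts.values()) / n_items - 1
    # Maximum possible repetitions: items minus number of categories present.
    maximum = n_items - len(counts)
    if maximum == expected:
        return None  # e.g., all items come from a single category
    return (observed - expected) / (maximum - expected)


print(arc_score(["animal", "animal", "animal", "flower", "flower", "flower"]))
print(arc_score(["animal", "flower", "animal", "flower", "animal", "flower"]))
```

The first sequence is perfectly clustered and the second strictly alternates categories, so the two calls bracket the scale: scores near 1 reflect strong clustering, and scores near 0 reflect chance-level clustering.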
Mean proportion recall for children in each grade showing each
measure of perfect strategy use is presented in Table 10, along with the
number of subjects in each grade who had at least one trial of perfect
strategy use. Separate 2 (grade) x 2 (condition) ANOVAs were performed
on proportion recall for trials with perfect sorting only, perfect
clustering only, and perfect sorting and clustering. The analysis of
recall on trials with perfect clustering revealed a significant main
effect of grade, F(1, 83) = 5.59. No other significant main effects or
interactions were found for any measure of perfect strategy use.
These findings demonstrate that second graders who clustered
perfectly recalled fewer words than comparably strategic fourth graders,
which is evidence for a utilization deficiency for the second graders.


REFERENCES
Baker-Ward, L., Ornstein, P. A., & Holden, D. J. (1984). The expression
of memorization in early childhood. Journal of Experimental Child
Psychology, 37, 555-557.
Bjorklund, D. F. (1988). Acquiring a mnemonic: Age and category
knowledge effects. Journal of Experimental Child Psychology, 45,
71-87.
Bjorklund, D. F. (1990). Children's strategies: Contemporary views of
cognitive development. Hillsdale, NJ: Erlbaum.
Bjorklund, D. F., & Bernholtz, J. F. (1986). The role of knowledge base
in the memory performance of good and poor readers. Journal of
Experimental Child Psychology, 41, 367-373.
Bjorklund, D. F., & Coyle, T. R. (1995). Utilization deficiencies in the
development of memory strategies. In F. E. Weinert & W. Schneider
(Eds.), Memory performance and competencies: Issues in growth and
development (pp. 161-180). Mahwah, NJ: Erlbaum.
Bjorklund, D. F., Coyle, T. R., & Gaultney, J. F. (1992). Developmental
differences in the acquisition and maintenance of an
organizational strategy: Evidence for the utilization deficiency
hypothesis. Journal of Experimental Child Psychology, 54, 434-448.
Bjorklund, D. F., & Harnishfeger, K. K. (1987). Developmental
differences in the mental effort requirements for the use of an
organizational strategy in free recall. Journal of Experimental
Child Psychology, 44, 109-125.
Bjorklund, D. F., Miller, P. H., Coyle, T. R., & Slawinski, J. L. (in
press). Instructing children to use memory strategies: Evidence
for utilization deficiencies in memory training studies.
Developmental Review.
Bjorklund, D. F., Schneider, W., Cassel, W. S., & Ashley, E. (1994).
Training and extension of a memory strategy: Evidence for
utilization deficiencies in the acquisition of an organizational
strategy in high- and low-IQ children. Child Development, 65,
951-965.
Bjorklund, D. F., Thompson, B. E., & Ornstein, P. A. (1983).
Developmental trends in children's typicality judgments.
Behavioral Research Methods and Instrumentation, 15, 350-356.


Why was variability reported in these studies when early research
on strategy development had reported a stagelike progression? As with
utilization deficiencies, the answer has to do with how strategies were
assessed. Early research on memory strategy development typically
classified children as using a single strategy only. Children might be
identified as rehearsing, sorting, or elaborating, but no child was
identified as using more than one strategy. Although variability was
present across individuals, data were often presented in terms of the
dominant strategy used at each age (Flavell, Beach, & Chinsky, 1966).
This type of data presentation, along with the strategy assessment
procedures, depicted memory strategy development as a series of stages.
Later research on memory strategy development assessed the possibility
of intraindividual variability in strategy use (i.e., multiple
strategies being used by a particular individual). This research
assessed several strategies on a particular trial or different
strategies across trials. Under these conditions, children showed
considerable variability in strategy use, often using a variety of
approaches within and across trials.
Evidence for variability in strategy use led to a new view of
strategy development championed by Robert Siegler (1996). Siegler
argued that strategy development does not involve the replacement of
different strategies, as implied by stage theories. Instead, he argued
that strategy development involves changes over time in the frequency of
occurrence of several strategic approaches. According to Siegler, at
any given age children use not one but a variety of strategies.
Strategy development consists of changes in the frequency of use of each


development. Extrapolating from these findings to the current study,
second graders who clustered perfectly were probably using a less mature
and less effective form of clustering than were fourth graders, who were
clustering deliberately and reaping the benefits of doing so. In
contrast, second graders who sorted perfectly were likely using a
relatively advanced and effective approach for their age, one that is
normally observed in older children and that eliminated a utilization
deficiency.
The findings for the analyses of perfect strategy use were
confirmed and extended in an analysis of grade differences in recall on
trials when children used each of the 15 possible strategy combinations
(Table 11). Data were analyzed only for strategy combinations that were
used on more than one trial by each age group. Of the seven
combinations included in the analysis, every one showed that fourth
graders had higher levels of recall than comparably strategic second
graders, with recall on four of the seven combinations being
significantly higher for fourth graders. These results, along with
those in the analysis of perfect strategy use presented above, provide
substantial evidence of utilization deficiencies for the second graders.
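
The count of 15 possible strategy combinations follows from the four strategies assessed: every nonempty subset of four strategies is a combination, giving 2^4 - 1 = 15. A short Python sketch, using the strategy codes from Table 11 (the function name is hypothetical), enumerates them:

```python
from itertools import combinations

# Strategy codes from Table 11: Sorting, Rehearsal, Clustering, category Naming.
STRATEGIES = ["S", "R", "C", "N"]


def strategy_combinations(strategies):
    """All nonempty combinations of the given strategies, smallest first."""
    return ["".join(combo)
            for size in range(1, len(strategies) + 1)
            for combo in combinations(strategies, size)]


combos = strategy_combinations(STRATEGIES)
print(len(combos))
print(combos)
```

Trials on which no strategy was used form a sixteenth category, listed as "None" in Table 11.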
Although in the current study utilization deficiencies were
evaluated for all possible strategy combinations, previous research,
including my own, has examined utilization deficiencies for only a
subset of all possible strategy combinations (Bjorklund, Schneider,
Cassel, & Ashley, 1994; Coyle & Bjorklund, 1996). As a result,
utilization deficiencies are evaluated for some but not all
combinations. For example, a sort-recall study by Bjorklund et al.


Table 5
Mean Number of Trial-by-Trial Strategy Changes, By Grade and Trial Transition,
and By Condition, Grade, and Trial Transition

                                     Trial Transition
             1 to 2        2 to 3        3 to 4        4 to 5        5 to 6        6 to 7

Grade 2
  M          .77           [illegible]   .73           .93           [illegible]   .70
  SD         .75           .58           .75           .85           .89           .69
Grade 4
  M          .49           .80           .57           .47           .45           .43
  SD         .64           .83           .67           .83           .50           .61

Ascending/Descending
Grade 2
  M          [illegible]   .64           .67           .95           .82           .69
  SD         .83           .63           .74           .79           .82           .69
Grade 4
  M          .40           .76           .64           .60           .40           .44
  SD         .50           .78           .76           1.08          .50           .65

Descending/Ascending
Grade 2
  M          .63           .73           [illegible]   .90           .93           .70
  SD         .62           .52           .76           .92           [illegible]   .70


The analyses that examined utilization deficiencies for children
who showed perfect strategy use and for children who showed equivalent
(but not necessarily perfect) strategy use did not yield identical
results (Tables 10 and 11). Both analyses examined the strategy
combinations of sorting, clustering, and sorting and clustering, and so
concordance for utilization deficiencies across analyses could be
evaluated for these strategies. Both analyses yielded no utilization
deficiency for the strategies of sorting, and sorting and clustering.
However, a utilization deficiency for second graders using clustering
was found in the analysis of children who showed perfect strategy use,
but not in the analysis of children who showed equivalent strategy use.
This discrepancy can be attributed to differences in how clustering was
assessed. The analysis of utilization deficiencies for children who
showed equivalent clustering assessed clustering by itself, not used in
combination with any other strategy. In contrast, the analysis of
utilization deficiencies for children who showed perfect clustering
ignored the possibility that clustering was used in combination with
rehearsal or category naming. Consequently, the latter analysis
probably included several trials on which clustering was used in
combination with another strategy. This possibility is supported by
data reported in the separate analyses of the 15 strategy combinations
(Table 11), which shows that clustering was used frequently with sorting
and also with rehearsal. Although the analysis examining perfect
clustering did examine both clustering and sorting together, it did not
examine clustering and rehearsal together, which, as shown in the
separate analyses of the 15 strategy combinations, resulted in a


13
Thelen & Smith, 1994), and that memory strategies have costs as well as
benefits (Bjorklund & Coyle, 1995; Miller & Seier, 1994). Second,
several studies have been designed with the explicit intent of assessing
utilization deficiencies and variability (Bjorklund, Coyle, & Gaultney,
1992; Coyle & Bjorklund, 1996; Miller, Seier, Barron, & Probert, 1994;
Siegler & Jenkins, 1989). These studies do not view utilization
deficiencies and variability as anomalies to be discounted, but consider
them as being worthy of study in their own right and deserving of
explanation.
Although utilization deficiencies and variability have received
considerable attention in contemporary strategy research, current
research investigating these phenomena is limited in at least three
ways. First, utilization deficiencies have been described almost
exclusively on tasks in which only a single strategy is assessed on all
trials. In most studies, children are said to be utilizationally
deficient when they use a particular strategy (e.g., rehearsal or
elaboration) and show little or no memory benefit, or less benefit than
that shown by more experienced strategy users. Because only a single
strategy is assessed, the possibility of utilization deficiencies in
multiple-strategy use cannot be examined. Instead, the focus is on the
ineffective use of a particular strategy.
Second, variability generally has been assessed on multitrial
tasks in which only one strategy per problem solving trial is assessed.
In most studies, children are credited with using a single strategy each
time they are presented with a problem, although they can (and usually
do) use a variety of strategies across different problems. Because only


49
(grade) x 2 (condition) x 3 (strategy change type) ANOVA, with repeated
measures on the strategy change type factor. This analysis permitted
examination of possible grade and condition differences across the three
measures of strategy change.
The analysis revealed a significant main effect of grade, F(1,
116) = 10.65 (mean z-scores summed across the three measures of strategy
change: .22 and -.30 for second and fourth grade, respectively), which
was qualified by a significant Grade x Strategy Change Type interaction,
F(2, 232) = 3.90. No other significant main effects or interactions
were found. Data pertaining to the significant Grade x Strategy Change
Type interaction are presented in Table 6. Fourth graders had
significantly lower levels of strategy change than second graders for
two of the three measures (trials with changes and total changes). The
grade difference for unique combinations was in the predicted direction
but only approached significance, p < .10. Paired comparisons among the
change measures within each grade revealed that fourth graders had
significantly fewer total strategy changes than unique combinations. No
other comparisons among the change measures within each grade were
significant. These data, along with the data in the preceding section,
demonstrate that older children show fewer strategy changes than younger
children across a variety of measures of strategy change, with the
exception of unique combinations.
Variability within individual children. Although these findings
demonstrate that strategy changes decline with age, they are based on
analyses of group data, which often mask patterns of individual strategy
use. Thus, children were classified as stable or unstable based on


105
Ceci, S. J., & Howe, M. J. (1978). Age-related differences in free
recall as a function of retrieval flexibility. Journal of
Experimental Child Psychology, 26, 432-442.
Coyle, T. R., & Bjorklund, D. F. (1996). The development of strategic
memory: A modified microgenetic assessment of utilization
deficiencies. Cognitive Development, 11, 295-314.
Coyle, T. R., & Bjorklund, D. F. (1997). Age differences in, and
consequences of, multiple- and variable-strategy use on a
multitrial sort-recall task. Developmental Psychology, 33, 372-380.
Coyle, T. R., Colbert, C. T., & Read, L. E. (1997, April). Strategy
variability and memory performance in average- and high-IQ
children. Poster presented at the meeting of the Society for
Research in Child Development, Washington, D.C.
DeLoache, J. S. (1984). Oh where, oh where: Memory-based searching by
very young children. In C. Sophian (Ed.), Origins of cognitive
skills (pp. 57-80). Hillsdale, NJ: Erlbaum.
Dempster, F. N. (1992). The rise and fall of the inhibitory mechanism:
Toward a unified theory of cognitive development and aging.
Developmental Review, 12, 45-75.
Eimas, P. D. (1969). A developmental study of hypothesis behavior and
focusing. Journal of Experimental Child Psychology, 8, 160-172.
Flavell, J. H. (1970). Developmental studies of mediated memory. In H.
W. Reese & L. P. Lipsitt (Eds.), Advances in child development and
child behavior (Vol. 5, pp. 181-211). New York: Academic Press.
Flavell, J. H., Beach, D. H., & Chinsky, J. M. (1966). Spontaneous
verbal rehearsal in a memory task as a function of age. Child
Development, 37, 283-299.
Frankel, M. T., & Rollins, H. S. (1985). Associative and categorical
hypotheses of organization in the free recall of adults and
children. Journal of Experimental Child Psychology, 40, 304-318.
Gholson, B., & Barker, P. (1985). Kuhn, Lakatos, and Laudan:
Applications in the history of physics and psychology. American
Psychologist, 40, 755-769.
Goldin-Meadow, S., Alibali, M. W., & Church, R. B. (1993). Transitions
in concept acquisition: Using the hand to read the mind.
Psychological Review, 100, 279-297.


2
Memory Strategies Enhance Performance
The origin of the assumption that memory strategies usually
facilitate memory performance can be traced to strategy training studies
(for a review, see Flavell, 1970). In a typical training study,
children who did not spontaneously use a strategy (e.g., rehearsal) were
trained to do so, frequently showing marked improvements in memory
performance. Such children were said to be production deficient because
they were unable to spontaneously produce a strategy, even though they
could do so and show memory benefits when instructed.
The discovery of production deficiencies was followed by a number
of studies that examined the effectiveness of strategy training. In
general, these studies, like the earlier ones, demonstrated that
children who do not produce a strategy initially can be trained to do so
and show corresponding memory improvements. These findings led to the
assumption that memory strategies typically improve performance and that
the failure to use memory strategies is associated with relatively low
levels of performance (for examples, see Flavell, 1970). This view was
not limited to memory strategies but was implicit in the descriptions of
other cognitive strategies, including those used in analogical
reasoning, arithmetic, and reading (for a historical review, see
Bjorklund, 1992).
The view that memory strategies generally improve performance
began to be questioned in the middle to late 1980s. A number of
developmental studies during this period examined the effectiveness of
various mnemonics. Several of these studies showed that memory
strategies sometimes resulted in little or no benefit to memory,


75
Table 16
Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not
Occur Immediately After Recall was Perfect or Not Perfect, By Condition

                                            Trial
                          1         2         3         4         5         6

Ascending/Descending
Number of Words
  Presented               6         9        12        15        12         9
Recall Perfect
  Strategy Change      42 ( 8)   50 ( 3)   33 ( 1)   67 ( 2)   33 ( 2)   29 ( 5)
  No Strategy Change   58 (11)   50 ( 3)   67 ( 2)   33 ( 1)   67 ( 4)   71 (12)
Recall Not Perfect
  Strategy Change      56 (25)   57 (33)   51 (31)   59 (36)   53 (31)   55 (26)
  No Strategy Change   44 (20)   43 (25)   49 (30)   41 (25)   47 (27)   45 (21)

Descending/Ascending
Number of Words
  Presented              15        12         9         6         9        12
Recall Perfect
  Strategy Change       0 ( 0)  100 ( 1)   30 ( 3)   26 ( 7)   27 ( 4)    0 ( 0)
  No Strategy Change    0 ( 0)    0 ( 0)   70 ( 7)   74 (20)   73 (11)  100 ( 3)


80
Bjorklund (1997). The reason for this difference is not clear. Both
studies used similar tasks and designs, had very similar testing
procedures, and involved children in the same age range. One possible
explanation for the disparity is that children in each study attended
different types of schools. Whereas children in the current study
attended public schools, children in Coyle and Bjorklund attended a
university-affiliated laboratory school. The curriculums at public
schools and laboratory schools may differ in ways that promote or
inhibit organizational strategy use. For example, children who attend
the university-affiliated schools may receive explicit instruction in
organizing items by taxonomic categories, whereas children who attend
public schools may receive such instruction less often, if at all. Such
a curriculum difference would affect children's use of organizational
strategies, particularly category naming.
The near absence of category naming in the current study resulted
in fewer strategies being available for analyses of variability in
strategy use. Although a reduction in the total number of strategies
available for analyses could affect statistical outcomes, the age-
related patterns of variability and performance found in the current
study are comparable to those found in the very similar sort-recall
study by Coyle and Bjorklund (1997). Older children in both studies
used more strategies than younger children. Also, as reported later in
the Discussion, older children in both studies showed lower levels of
strategy change, and stronger relations between stable-strategy use and
recall, than did younger children. These results suggest that age-


56
fourth graders frequently switched from unstable-strategy use on early
trials (i.e., Trials 1 to 4) to stable-strategy use on later trials
(i.e., Trials 4 to 7). In contrast, second graders who showed unstable-
strategy use on early trials frequently also showed unstable-strategy
use on later trials.
Relation Between Strategy Use and Recall
Utilization Deficiencies
Correlations between number of strategies used and recall. Miller
and Seier (1994) have argued that significant and positive correlations
between strategy use and recall for older but not younger children
indicate a utilization deficiency for younger children. In the current
study, utilization deficiencies of this type were expected on most
trials for second graders. However, second graders were predicted to
overcome a utilization deficiency on trials with relatively few words,
when capacity requirements for strategy use were presumably minimal.
Utilization deficiencies were evaluated by computing correlations
between number of strategies used and percentage of words recalled,
separately for each Condition x Grade x Trial cell (see Table 9). The
pattern of correlations in the ascending/descending condition showed
clear age differences in the significance and magnitude of the relation
between strategy use and recall. Fourth graders showed significant and
positive correlations on all trials, whereas second graders showed
significant and positive correlations on only three of seven trials.
The magnitude of correlations for the fourth graders was higher than
that for second graders on all trials except Trial 6. These data
provide evidence of a utilization deficiency for the younger children.
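
The cell-wise correlations described above can be illustrated with a minimal sketch. It assumes a hypothetical record layout (one record per child per trial, with the keys condition, grade, trial, n_strategies, and pct_recall); these names, and any values passed in, are illustrative assumptions and are not the study's data or analysis code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation; returns None when either variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlations_by_cell(records):
    """Correlate number of strategies with percentage recall in each
    Condition x Grade x Trial cell.

    records: iterable of dicts with the (hypothetical) keys
    'condition', 'grade', 'trial', 'n_strategies', 'pct_recall'.
    """
    cells = {}
    for rec in records:
        key = (rec["condition"], rec["grade"], rec["trial"])
        strategies, recall = cells.setdefault(key, ([], []))
        strategies.append(rec["n_strategies"])
        recall.append(rec["pct_recall"])
    return {key: pearson_r(xs, ys) for key, (xs, ys) in cells.items()}
```

Under the Miller and Seier (1994) criterion, a utilization deficiency would be suggested when the resulting coefficient is significant and positive in a cell for fourth graders but not in the corresponding cell for second graders. Significance testing is omitted from this sketch.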


what he was talking about when he said, "Theory is a good thing but a
good experiment lasts forever."
I wish to thank Dr. Shari Ellis for suggesting that I examine
patterns of variability within individual subjects and the effectiveness
of individual strategy combinations. Dr. Ellis's suggestions were
incorporated into this dissertation and into Coyle and Bjorklund (1997).
I also thank Dr. Ellis for discussions, which I initiated, on funding
for education and on cross-cultural research.
I wish to thank Dr. Ira Fischler for bringing to my attention
several articles in the adult literature that utilize procedures for
assessing intentionality in cognition, notably Jacoby's (1991) process-
dissociation approach. The intentionality issue is often neglected in
strategy research, even though some researchers have made intentionality
the sine qua non of strategy use.
I wish to thank Dr. Patricia Miller for emphasizing the continuous
nature of strategy classifications. Her contribution is acknowledged in
Bjorklund and Coyle (1995, p. 166), and can be identified in the
analyses presented in this dissertation. I also thank Dr. Miller for
suggesting that I analyze qualitative differences in strategy use. Such
an analysis was performed for this dissertation, and it yielded some
interesting results.
I wish to thank Dr. Scott Miller for suggesting that I think
carefully about defining and measuring cognitive strategies. It is
interesting that defining cognitive strategies never has been a
favorite pastime of strategy researchers who study them. I also thank
Dr. Miller for his careful and timely reviews of my manuscripts,
including my dissertation. I have yet to find anyone whose knowledge of
IV


16
older children would have the capacity to produce additional strategies.
Strategy changes (e.g., trial-by-trial changes in strategy use) were
predicted to be high and comparable for both age groups. This
prediction was based on research showing that strategy changes occur
frequently in development, across a wide range of ages and on a variety
of tasks (Siegler, 1995, 1996).
Coyle and Bjorklund also predicted age differences in the relation
between variability and recall. Multiple-strategy use was predicted to
correlate with recall for older but not younger children. This
prediction was based on the assumption that older children would have
the mental capacity to produce and use effectively multiple strategies.
In contrast, multiple-strategy use was expected to consume so much of
young children's limited mental capacity that little would remain for
recall, resulting in a utilization deficiency. Strategy change, in
particular stable-strategy use (i.e., few trial-by-trial changes in
strategy use), was predicted to correlate with recall for older but not
younger children. This prediction was based on research showing that
older children are likely to stick with a single approach that yields
optimal performance, whereas younger children frequently use a variety
of ineffective approaches (Lemaire & Siegler, 1995).
The findings were generally consistent with the predictions.
Multiple-strategy use was greater for older than for younger children.
Although children of all ages used more than one strategy across trials,
older children used more strategies and had more trials with multiple
strategies than did younger children. Strategy changes were high and
comparable for children in all age groups. Although considerable


27
Table 1
Word Lists By Category Membership

List 1              List 2       List 3        List 4

Occupations         Trees        Metals        Buildings
Carpenter           Willow       Copper        Tepee
Lawyer              Maple        Brass         Castle
Nurse               Palm         Tin           Igloo
Dentist             Oak          Iron          Church
Farmer              Pine         Silver        Barn

Parts of a House    Beverages    Weapons       Reading Material
Window              Tea          Sword         Book
Roof                Milk         Grenade       Journal
Door                Soda         Cannon        Newspaper
Stairs              Water        Spear         Magazine
Ceiling             Coffee       Knife         Letter

Sports              Jewelry      Vegetables    Birds
Soccer              Earrings     Cabbage       Sparrow
Golf                Necklace     Onion         Eagle
Tennis              Crown        Celery        Parrot
Football            Watch        Peas          Dove
Hockey              Bracelet     Corn          Owl


96
classified as unstable, whereas no difference in recall was found for
second graders classified as stable or unstable. On later trials,
recall varied as a function of stability classification only; effects
involving grade and condition were not significant. Stable children
recalled more than unstable children, regardless of age or condition.
Together, the results pertaining to data on early and later trials
demonstrate that stable-strategy use on the early trials benefited
recall for fourth graders only, whereas stable-strategy use on later
trials benefited recall for both age groups. These findings qualify
previous reports of stable-strategy use being associated with high
levels of performance (Coyle & Bjorklund, 1997), showing that the
beneficial effect of stability on recall varies as a function of age and
task experience. Older children show recall benefits from stability
earlier than younger children, who show recall benefits from stability
only after they have acquired additional task experience.
Why did levels of recall vary as a function of variability in
strategy use? One possible answer to this question is that variability
is associated with high levels of capacity expenditure, which reduce
strategy effectiveness and task performance. For example, a study of
mathematical equivalence by Goldin-Meadow and her colleagues (Goldin-
Meadow, Nusbaum, Garber, & Church, 1993) found that children who showed
variability in strategy use, displaying one strategy in gesture and a
different one in speech, also showed relatively high levels of capacity
expenditure, indicated by decreased performance on a secondary task. By
contrast, children who showed low levels of variability in strategy use,
displaying the same strategy in gesture and speech, showed less capacity


67
Table 13
Correlations Among Measures of Strategy Change, By Condition and Grade
Correlation
Combinations and
Trials with Changes
Combinations and
Total Changes
Trials with Changes and
Total Changes
Ascending/Descending
Grade 2
.33*
.77***
Grade 4
.60**
.65***
.93***
Descending/Ascending
Grade 2
.66***
.64***
.94***
Grade 4
.59***
.68***
.91***
*p < .05, **p < .01, ***p < .001


RESULTS
All analyses are reported at p < .05, with post-hoc tests
evaluated with t-tests unless otherwise specified.
Preliminary Analyses
Some of the results were pertinent to general issues in cognition
and memory development but not to the focus of the current study. These
results are presented here. The next section reports results concerning
issues of strategy variability and the relation between variability and
recall.
Off-Task Behavior and Examination
Off-task behavior and examination were observed during each of the
three 30-s intervals of the study period (range: 0 to 3 per trial).
Each type of data was analyzed separately using 2 (grade) x 2
(condition) x 7 (trial) analyses of variance (ANOVAs), with repeated
measures on the trial factor. The analysis of off-task behavior
revealed no significant main effects or interactions. As shown in
previous research (Coyle & Bjorklund, 1997), off-task behavior was
slightly greater for younger than for older children (mean frequency of
off-task behavior per trial: .29 and .16 for second and fourth grade,
respectively). The analysis of examination revealed significant main
effects of condition, F(1, 116) = 5.51 (mean number of intervals of
examination per trial: 1.87 and 2.27 for ascending/descending and
descending/ascending conditions, respectively), and trial, F(6, 696) =
34


77
few words would provide greater opportunity for perfect recall, and
consequently result in few strategy changes.


page
REFERENCES 104
BIOGRAPHICAL SKETCH 108
viii


82
addition to metacognitive limitations, capacity limitations may have
prevented the use of additional strategies. Children have limited
mental capacity for executing cognitive operations such as strategies,
and such capacity constraints may impose limits on the number of
strategies that can be used (Guttentag, 1984). Capacity limits can
change as a result of task experience or familiarization, as may have
occurred when strategy use increased from Trial 2 to 3. However,
capacity limits probably place an upper limit on the number of
strategies used, resulting in changes that occur in a restricted range.
Strategy Changes
In addition to the observed age differences in multiple-strategy
use, the current study also revealed age differences in strategy
changes. Fourth graders showed fewer strategy changes than second
graders for two of the three measures of strategy change (number of
trials with changes and number of trial-by-trial changes). No grade
difference was found for the third measure of strategy change, number of
unique strategy combinations, although the pattern was in the predicted
direction. Although age-related declines in variability have been noted
elsewhere (Siegler, 1996), these are the first results to demonstrate
empirically that strategy changes decline with age. More importantly,
these results, along with the results pertaining to multiple-strategy
use described above, demonstrate that different measures of variability
show different developmental patterns. Number of strategies used
increased with age, number of trials with changes and total strategy
changes decreased with age, and number of unique strategy combinations
was comparable across age. These results suggest considerable diversity


This dissertation was submitted to the Graduate Faculty of the
Department of Psychology in the College of Liberal Arts and Sciences and
to the Graduate School and was accepted as partial fulfillment of the
requirements for the degree of Doctor of Philosophy
August, 1997
Dean, Graduate School


52
Table 7
Percentage (and Number) of Children Classified as Stable or Unstable Across
All Trials, on Early Trials, and on Later Trials, By Grade

             All Trials             Early Trials            Later Trials
Grade     Stable    Unstable     Stable    Unstable      Stable    Unstable

2         23 (16)   77 (53)      39 (27)   61 (42)       38 (26)   62 (43)
4         47 (24)   53 (27)      43 (22)   57 (29)       63 (32)   37 (19)


86
different numbers of words. Children of all ages showed comparable
numbers of strategy changes after trials with relatively few words or
many words.
Why did children fail to show strategy changes on trials with many
words when they were predicted to do so and when such changes may have
benefited their performance? The answer to this question may involve
the same factors that were reviewed in the section on multiple-strategy
use: metacognitive limitations and capacity limitations. Metacognitive
limitations may have limited children's ability to monitor changes in
the number of words presented on each trial and to alter their strategy
use in response to such changes. Capacity limitations may have limited
children's ability to add strategies on successive trials even if they
had the metacognitive awareness to do so. Future research, providing
metacognitive instruction on when and how to use strategies and reducing
the capacity demands for strategy production and utilization, is needed
to assess these possibilities.
Relation Between Multiple-Strategy Use and Recall
An important purpose of the current study was to investigate the
relation between multiple-strategy use and recall, identifying possible
evidence for utilization deficiencies. An initial analysis examined age
differences in correlations between multiple-strategy use (i.e., number
of strategies used) and recall, computed separately in each condition
and across trials. The findings in the ascending/descending condition
were very similar to those observed in previous research examining age
differences in the relation between strategy use and recall (Coyle &
Bjorklund, 1996, 1997). Fourth graders showed significant correlations


8
and low levels of recall. Conversely, utilization deficiencies that
occur later in development are associated with relatively poor recall,
because the dominant alternative pattern is effective strategy use and
high levels of recall.
Memory Strategy Development is Stagelike
The origin of the assumption that memory strategy development is
stagelike can be traced to research on spontaneous (i.e., uninstructed)
strategy use in the late 1960s and 1970s. One goal of this research was
to identify the types of strategies used by different age groups. To do
this, children of different ages were presented with a memory task and
their mnemonic behaviors were recorded and compared. The general
finding was that children in each age group typically used a different
and unique strategy for remembering. This finding was remarkably
consistent across a variety of research paradigms. Serial recall
studies showed that young children often use no rehearsal strategy,
older children often use single-word rehearsal, and still older children
use cumulative rehearsal (Flavell, Beach, & Chinsky, 1966; Ornstein,
Naus, & Liberty, 1975). Organizational memory tasks showed that young
children often organize words along thematic dimensions whereas older
children often organize words along taxonomic dimensions (Ceci & Howe,
1978). Paired-associate learning tasks showed that young children often
form arbitrary links between word pairs whereas older children often
form relational links between word pairs (for a review, see Kee, 1994).
These early findings depicted memory strategy development as a
stagelike progression (Siegler, 1995). Stage descriptions were not
limited to memory strategies but included strategies in such diverse


70
Table 14
Stable or Unstable Across All Trials, on Early Trials, and on Later Trials,
By Grade

              All Trials               Early Trials              Later Trials
Grade     Stable      Unstable      Stable      Unstable      Stable      Unstable

2         .53 (.18)   .54 (.15)     .53 (.15)   .59 (.15)     .59 (.18)   .47 (.15)
4         .81 (.12)   .70 (.16)     .78 (.12)   .70 (.16)     .82 (.16)   .72 (.20)


37
Table 2
Trial (i.e., Collapsed Across Grade), and Grade Differences in Recall at Each
Trial By Condition

Trial                      1     2     3     4     5     6     7

Ascending/Descending
Maximum Recall             6     9    12    15    12     9     6
Grade 2
  M                      .81   .59   .49   .45   .42   .50   .71
  SD                     .17   .22   .16   .18   .19   .28   .22
Grade 4
  M                      .84   .73   .69   .70   .76   .89   .94
  SD                     .18   .16   .23   .23   .21   .17   .14
Collapsed Across Grade
  M                      .82   .64   .57   .55   .55   .65   .80
  SD                     .17   .21   .21   .24   .26   .31   .22
Grade 4 - Grade 2
  Difference             .03   .14   .20   .25   .34   .39   .24

Descending/Ascending
Maximum Recall            15    12     9     6     9    12    15
Grade 2
  M                      .38   .48   .53   .76   .56   .41   .36
  SD                     .18   .24   .24   .22   .30   .24   .18


Table 5
Grade 4
M
48
continued
58
.85
.50
.35
.50
.42
76
CO
CO
.58
.49
.51
.58
SD


64
section, these data demonstrate that fourth graders outperform
comparably strategic second graders, which is evidence of a utilization
deficiency for the second graders.
Strategy Change and Recall
Correlations between strategy change and recall. Previous
research (Coyle & Bjorklund, 1997) involving procedures and age groups
similar to those in the current study has shown that measures of
strategy change are significantly and negatively correlated with recall
for older but not younger children. That is, older children who showed
the fewest strategy changes across trials (i.e., high levels of
stability in strategy use) had the highest levels of recall. The
current study attempted to replicate this finding with a different
sample.
Correlations were computed separately between each measure of
strategy change and mean proportion recall across trials. The three
measures of strategy change were number of unique strategy combinations,
number of consecutive trials with strategy changes, and total number of
strategy changes on consecutive trials.
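
The three change measures named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring procedure used in the study: each trial is represented as the set of strategies coded as present, "trials with changes" is taken to be the number of consecutive-trial pairs whose strategy combinations differ, and "total changes" is taken to be the number of individual strategies added or dropped across those pairs. The function name and these operational definitions are hypothetical.

```python
def strategy_change_measures(trials):
    """Compute three measures of strategy change for one child.

    trials: list of sets, one per trial, each holding the strategies
            coded as present on that trial (dichotomous coding).
    Returns (unique_combinations, trials_with_changes, total_changes).
    """
    combos = [frozenset(t) for t in trials]

    # Number of distinct strategy combinations used across all trials.
    unique_combinations = len(set(combos))

    trials_with_changes = 0
    total_changes = 0
    for previous, current in zip(combos, combos[1:]):
        # Strategies that were added or dropped between consecutive trials.
        changed = previous ^ current
        if changed:
            trials_with_changes += 1
            total_changes += len(changed)

    return unique_combinations, trials_with_changes, total_changes
```

For example, a child coded as using sorting alone on every trial would score one unique combination and zero on both change counts, which corresponds to fully stable strategy use.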
Correlations computed separately for each grade revealed a pattern
very similar to that observed in previous research. Fourth graders
showed significant and negative relations between recall and strategy
change for two of the three measures (trials with changes, r(51) = -.42,
p < .01, and total changes, r(51) = -.36, p < .05), but not for unique
combinations, r(51) = -.11, p > .10. Second graders showed no reliable
relation between recall and any measure of strategy change (rs(69) =
.23, -.15, and -.13 for unique combinations, trials with changes, and


32
included as a strategy for purposes of analyses. Unless specified
otherwise, strategy data were coded dichotomously, with each strategy
being coded as occurring or not occurring on each trial.
The strategies assessed during the study period (i.e., sorting,
rehearsal, and category naming) could be observed between zero and three
times during the 1 min 30 s study period. A child was credited with
using a strategy on a trial if he or she was observed to use that
strategy during at least one of the three 30-s intervals. The strategy
assessed during recall, clustering, was measured by the adjusted ratio
of clustering (ARC) score (Roenker, Thompson, & Brown, 1971). Following
Coyle and Bjorklund (1997), a child was credited with using a clustering
strategy if his or her ARC score was .50 or greater. This represents a
value of slightly more than one standard deviation greater than
clustering expected by chance. Children could be classified as using
any one of the four strategies or any combination of the four
strategies on a particular trial.
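
The two crediting rules described above can be sketched in code. The sketch assumes the ARC score in its commonly cited form, ARC = (R - E(R)) / (max R - E(R)), where R is the number of observed category repetitions in recall, E(R) is the number expected by chance, and max R is the maximum possible number. The function names and the input format (a list of category labels in recall order, and a list of three interval observations) are hypothetical conveniences, not part of the original procedure.

```python
from collections import Counter


def arc_score(recall_categories):
    """Adjusted ratio of clustering for one recall protocol.

    recall_categories: category label of each recalled word, in output order.
    Returns None when the score is undefined (no items, or max R equals E(R)).
    """
    n = len(recall_categories)
    if n == 0:
        return None
    counts = Counter(recall_categories)

    # R: adjacent recalled words that come from the same category.
    observed = sum(
        1 for a, b in zip(recall_categories, recall_categories[1:]) if a == b
    )
    # E(R): repetitions expected by chance, given the items recalled.
    expected = sum(c * c for c in counts.values()) / n - 1
    # max R: number of items recalled minus number of categories recalled.
    maximum = n - len(counts)

    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


def credited_with_clustering(recall_categories, criterion=0.50):
    """Credit clustering when the ARC score meets the .50 criterion."""
    score = arc_score(recall_categories)
    return score is not None and score >= criterion


def credited_with_study_strategy(interval_observations):
    """Credit a study-period strategy if it was observed in at least one
    of the three 30-s intervals (True/False per interval)."""
    return any(interval_observations)
```

With this formulation, perfectly clustered recall yields a score of 1 and chance-level clustering yields a score of 0, so the .50 criterion falls midway between the two.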
Reliability has been assessed in previous research that examined
the same study behaviors and strategies (Coyle & Bjorklund, 1997). This
research demonstrated that percentage of agreement for two independent
coders coding the study behaviors (i.e., sorting, rehearsal, category
naming, examination, and off-task behavior) was very high (92%).
Percentage agreement for coding the strategies of sorting, rehearsal,
and category naming was even higher (97%). These data, along with data
from other studies reporting reliability for similar strategies (Lange,
MacKinnon, & Nida, 1989; Wellman, Ritter, & Flavell, 1975), demonstrate


38
Table 2 continued

Grade 4
  M                      .55   .66   .80   .90   .80   .69   .61
  SD                     .19   .21   .21   .22   .22   .24   .24
Collapsed Across Grade
  M                      .46   .56   .66   .82   .67   .54   .48
  SD                     .20   .25   .26   .23   .29   .28   .24
Grade 4 - Grade 2
  Difference             .17   .18   .27   .14   .24   .28   .25

Note. Maximum recall indicates the maximum number of words that could be
recalled on a particular trial.


98
current study continued to use a strategy that yielded optimal
performance only after they had some experience on the task. No such
pattern of strategic responding was found when children had relatively
little experience on the task. An additional set of analyses revealed
little evidence that the win-stay/lose-shift approach was more apt to
occur on trials with relatively few words, with only two of six possible
trials on which few words were presented showing a pattern consistent
with the win-stay/lose-shift approach.
Conclusions
The findings of the current study extend research on multiple-
strategy use and strategy changes in two ways. First, whereas previous
research has described a single developmental pattern of variability
(variability declining with age and experience), the current study
showed that a single developmental pattern does not account for all
types of variability. Instead, the developmental course of variability
differed for different types of variability, with number of strategies
used increasing with age and strategy changes decreasing with age.
Second, whereas previous research has identified children as showing
stable- or unstable-strategy use, the current study demonstrated that
changes in the amount of stability observed in a brief testing session
vary with age. Specifically, older children tended to become more
stable in their strategy use over time, whereas younger children tended
to remain unstable.
The current study also extends research on the relation between
variability and performance in three ways. First, the current study
shows that utilization deficiencies may not apply to all possible


88
initially (i.e., Trials 1 to 4 in the ascending/descending condition),
but not when a decreasing number of words was presented initially (i.e.,
Trials 1 to 4 in the descending/ascending condition). Conversely,
second graders tended to use multiple strategies effectively when a
decreasing number of words was presented initially but not when an
increasing number of words was presented initially. Apparently, second
graders' effective use of multiple strategies occurred when an initially
large problem set (i.e., 12 or 15 words presented) was subsequently
reduced, whereas fourth graders' effective use of multiple strategies
occurred when an initially small problem set was subsequently increased.
I have no good explanation for these findings, and believe that lengthy
speculation is not warranted at this time. I conclude only that
strategy-recall relationships do vary as a function of age, amount of
information in the problem, and subsequent presentation of problems with
different amounts of information, and that the impact of these variables
on strategy variability warrants further investigation.
A second set of analyses examined grade comparisons in mean recall
on trials on which perfect clustering, sorting, or both sorting and
clustering were observed. Coyle and Bjorklund (1996) have argued that
utilization deficiencies can be inferred when grade differences in
recall are found despite comparable and perfect strategy use. The
analyses did not examine the effect of trials with different numbers of
words on recall because children rarely showed multiple trials with
perfect strategy use.
The analyses revealed that second graders who clustered perfectly
recalled fewer words than fourth graders who clustered perfectly. In


30
the first list, children were told that they would be presented seven
lists of words (each printed on a 3 x 5 in. [7.6 x 12.7 cm] index card)
to remember and later recall in any order they wished. They were told
that the lists and items would be presented one at a time and that some
lists would have a different number of words. They were not told how
many words would be presented on each list, nor were they told about the
categorical structure of the lists.
The experimenter presented each card (on which a word was printed)
to the child at a rate of about one card every 2 s. The experimenter
named the item and children repeated the name. Cards were placed in
front of children in rows, with the stipulation that no two items from
the same category were presented contiguously. Each row contained six
cards, unless the number of cards presented was not a multiple of six
(i.e., 9 or 15 cards). In this case, the row closest to the child
contained three cards. After the cards were presented, children were
instructed to "study the words and do whatever you want to remember them
later." After 1 min 30 s, the cards were covered with an opaque cloth
and then children solved problems on the Matching Familiar Figures Test
(Kagan, 1965) for approximately 30 s. Children were then asked to
recall as many items as they could in any order they wished. If the
child was silent for 10 s, the experimenter asked if there were any more
words that he or she could remember. When either another 10 s interval
elapsed with no more words recalled or the child stated that he or she
could remember no more words, the trial was ended. Trials 2-7 followed
immediately after Trial 1, using the same procedure with different sets


69
pertaining to analyses that examine patterns of variability for
individual subjects may not always be consistent with those pertaining
to analyses that examine patterns of variability in group data.
Data relevant to the significant Grade x Stability Classification
interaction are reported in columns one and two of Table 14.
Differences in recall between stable and unstable children were analyzed
separately within each grade. Second-grade children in each stability
classification showed equivalent levels of recall. In contrast, fourth
graders classified as stable recalled significantly more than fourth
graders classified as unstable. A further analysis examined grade
differences in recall separately within each stability group. Fourth
graders recalled significantly more than second graders within both
stability groups. However, the magnitude of this grade difference in
recall was greater for stable children than for unstable children (mean
fourth grade minus second grade recall difference: .28 and .16, for
stable and unstable children, respectively).
These findings were confirmed and extended in a final set of
analyses that examined grade differences in recall for children
classified as stable and unstable on early trials (Trials 1-4) and
separately on later trials (Trials 4-7). A 2 (grade) x 2 (condition) x
2 (stability classification) ANOVA was performed on mean proportion
recall on early trials and separately on later trials. As before, only
significant effects involving the stability classification factor are
reported. The recall data for these analyses are reported in columns
three through six in Table 14.


39
grade differences in recall were found on all trials. As shown in Table
2, the magnitude of grade differences in recall was least pronounced on
trials with the fewest words presented (Trials 1 and 7 in ascending/
descending and Trial 4 in descending/ascending), compared to the data on
adjacent trials.
Strategy Use
The percentage and mean number of trials on which children in each
grade used each strategy are presented by condition in Table 3. The
percentages within each grade do not sum to 100 because multiple
strategies were frequently used in combination on a single trial. The
number of trials on which each strategy was used (range = 0 to 7) was
examined by a 2 (grade) x 2 (condition) x 4 (strategy) ANOVA. The
analysis revealed a significant main effect of strategy, F(1, 116) =
73.09, and significant interactions of grade x strategy, F(3, 348) =
4.63, and condition x strategy, F(3, 348) = 5.72. Inspection of the
significant main effect of strategy revealed that sorting, rehearsal,
and clustering were used more often than category naming, with all other
strategy comparisons being nonsignificant (mean number of trials on
which each strategy was used: 3.18, 3.44, 3.23, and .12 for sorting,
rehearsal, clustering, and category naming, respectively). The floor
levels of category naming are inconsistent with previous research
showing that category naming was used relatively frequently by fourth
graders who received a sort-recall task similar to the one used here.
Although category naming was almost never used in the current study, the
near absence of this strategy did not prevent the detection of


4
child. Utilization deficiencies are inferred empirically when (a) the
correlation between strategy use and recall is nonsignificant for
younger children but significant for older children, (b) young strategy
users recall no more than their nonstrategic peers, (c) older children
recall more than comparably strategic younger children, and (d) strategy
use increases over trials with no corresponding improvements in memory
performance (for additional examples of utilization deficiencies, see
Miller & Seier, 1994). Evidence for utilization deficiencies has been
found in studies using a variety of memory paradigms and involving
participants ranging in age from preschool to late adolescence (for
reviews, see Bjorklund & Coyle, 1995; Miller & Seier, 1994).
Recent reviews of memory development research have revealed the
ubiquity of utilization deficiencies (Bjorklund & Coyle, 1995; Miller &
Seier, 1994). One such review was conducted by Miller and Seier (1994),
who examined the memory development literature from 1974 through mid-
1992 for evidence of utilization deficiencies in normal populations
(e.g., greater recall for older than comparably strategic younger
children). Miller and Seier used three criteria to select studies
appropriate for the examination of utilization deficiencies: (a)
independent measures of strategy use and recall, (b) spontaneous
strategy production (i.e., training studies were excluded), and (c)
analyses examining age differences in strategy use and performance. Of
the 59 studies they evaluated, 56 (95%) provided evidence for a
utilization deficiency.
Although Miller and Seier limited their review to spontaneous
strategy use, a more recent review by Bjorklund, Miller, Coyle, and


9
domains as arithmetic, number conservation, and scientific reasoning
(Siegler, 1996). Young children were described as using one approach,
older children as using a different approach, and still older children
as using yet another approach. At each age children were described as
using a single and unique strategy. Strategy development consisted of
one strategy being replaced by another more advanced strategy.
Although a stagelike pattern of strategy development appeared to
describe well the pattern of data in early studies, evidence
inconsistent with a stagelike progression was reported in the mid- to
late-1980s. Several studies during this period demonstrated that
children of a particular age used not one but several strategies. Such
variability was found across a variety of tasks, including ones
assessing memory strategies. For example, children asked to remember a
series of digits sometimes used no rehearsal strategy, sometimes
rehearsed only one digit at a time, and sometimes rehearsed all digits
together (McGilly & Siegler, 1989). Children asked to remember the
location of a hidden object sometimes talked about where the object was
hidden, sometimes stayed near the hiding place, and sometimes pointed to
the hiding location (DeLoache, 1984). Children presented with a paired-
associate learning task sometimes repeated the names of the items and
sometimes formed a sentence or image linking the word pairs (reviewed in
Kee, 1994). Children asked to remember a series of objects sometimes
visually inspected the objects, sometimes named the objects, and
sometimes physically manipulated the objects (Baker-Ward, Ornstein, &
Holden, 1984; Lange, MacKinnon, & Nida, 1989).


I wish to thank my cousin, Annette Fields, for providing me with
support and guidance throughout the years. Annette is an accomplished
lawyer and she has taught me by example the rules and standards of good
argumentation. She has shown me that anyone can rise to the top with
lots of hard work and discipline. Annette's best friend and confidant,
Ellen Ross, always has believed in me and my talents, and her support is
appreciated.
I wish to thank Deborah Hooks for loving me for what I am and,
more importantly, for what I can become. Deborah entered nearly all of
the data for this dissertation, and she provided numerous useful
suggestions about possible analyses. One of her suggestions, to examine
intrusions in children's recall protocols, turned out to be very
promising and provides a possible basis for a new view of strategy
development that includes developmental differences in resistance to
interference. Deborah has taught me that love is the best part of life,
and without it, you really don't have much of a life at all.
To all the members of my family, I love you all.
vi


Table 4
Grade 4
M
SD
continued
1.08 1.27
.98 1.00
44
1.65 1.50
1.13 1.03
1.65 1.65 1.65
.80 1.16 1.16


60
Table 10
Mean Proportion Recall When Strategy Use Was Perfect, By Grade and Type of
Strategy Used

                       Strategies Used Perfectly
             Sorting      Clustering     Sorting and
             Only         Only           Clustering
Grade 2
  M          .72          .50            .87
  SD         .19          .17            .17
  n          15           63             14
Grade 4
  M          .80          .60            .89
  SD         .15          .23            .10
  n          14           24             28
Note. ns are the number of children who showed at least one trial of perfect
strategy use.


85
Subsequent analyses revealed that this grade difference in stability
classification was attributable to a disproportionate number of fourth
graders being classified as stable on later trials (i.e., Trials 4 to
7). The distribution of children in each grade classified as stable and
unstable was comparable on early trials (i.e., Trials 1 to 4). These
findings were extended in an analysis of changes in individual-subject
stability classification across early and later trial blocks for
children whose initial stability classification was unstable. In that
analysis, fourth graders who showed unstable-strategy use on early
trials frequently showed stable-strategy use on later trials. In
contrast, second graders who showed unstable-strategy use on early
trials often remained unstable on later trials. These findings
demonstrate that stable-strategy use emerged during the later trials for
fourth graders but not for second graders. The fourth-grade data are
consistent with research demonstrating that variability in strategy use
declines with experience on a task (Coyle & Bjorklund, 1997; Siegler,
1996). Presumably, the second-grade children eventually would have
shown stability in strategy use if they had been given additional
practice and experience on the task. Microgenetic studies, assessing
children's strategy use over longer periods of time, are needed to
evaluate this hypothesis.
Children in all grades were predicted to show relatively few
strategy changes following trials with few words (i.e., trials with 6 or
9 words) and more frequent strategy changes following trials with many
words (i.e., trials with 12 or 15). Contrary to this prediction, the
results revealed that strategy changes did not vary with trials with


79
the relation between multiple-strategy use and recall, with particular
attention to patterns indicative of utilization deficiencies; and (c)
the relation between strategy changes and recall, with particular
attention to stability-recall relations. The pages that follow are
organized around these goals.
Variability in Strategy Use
Multiple-Strategy Use
As predicted, fourth graders tended to use more strategies than
second graders, with the number of strategies used increasing from Trial
2 to Trial 3 and remaining stable thereafter. These results were
confirmed and extended in the analysis of grade differences in the use
of each of the 15 unique strategy combinations (Table 11). In that
analysis, second graders used the strategies of rehearsal and clustering
and the two-strategy combination of rehearsal and clustering more than
fourth graders. In contrast, fourth graders used the two-strategy
combination of sorting and clustering and the three-strategy combination
of sorting, rehearsal, and clustering more often than second graders.
These data demonstrate that, when grade differences in strategy use were
found, second graders tended to use combinations with the fewest
strategies (i.e., single-strategy combinations) whereas fourth graders
tended to use combinations with the most strategies (i.e., three-
strategy combinations).
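
The figure of 15 unique strategy combinations is what one would expect if every non-empty subset of the four coded strategies (sorting, rehearsal, clustering, and category naming) counts as a combination, since 2^4 - 1 = 15. The sketch below enumerates the subsets under that assumption; the function and variable names are illustrative and are not taken from the study's coding scheme.

```python
from itertools import combinations

# The four strategies coded in the study.
STRATEGIES = ("sorting", "rehearsal", "clustering", "category naming")

def strategy_combinations(strategies):
    """Return every non-empty combination of strategies, smallest first."""
    combos = []
    for size in range(1, len(strategies) + 1):
        combos.extend(combinations(strategies, size))
    return combos

all_combos = strategy_combinations(STRATEGIES)

# Number of combinations containing one, two, three, or four strategies.
by_size = {
    size: sum(1 for combo in all_combos if len(combo) == size)
    for size in range(1, len(STRATEGIES) + 1)
}
```

Grouping the combinations by size gives the single-, two-, three-, and four-strategy classes that the grade comparisons refer to.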
The very low frequency of category naming in the current study is
inconsistent with the results obtained by Coyle and Bjorklund (1997).
Whereas category naming was observed on only 2% of all trials in the
current study, it was observed on almost 31% of all trials in Coyle and


40
Table 3
Percentage (and Number) of Trials on Which Each Strategy Was Used, By
Condition and Grade

                                    Strategy
                                                           Category
                 Sorting      Rehearsal     Clustering     Naming
Ascending/Descending
  Grade 2        36 (2.54)    60 (4.21)     37 (2.56)       2 (.15)
  Grade 4        58 (4.04)    58 (4.04)     49 (3.40)      <1 (.04)
Descending/Ascending
  Grade 2        35 (2.47)    38 (2.67)     50 (3.53)       3 (.20)
  Grade 4        59 (4.15)    37 (2.62)     53 (3.69)      <1 (.04)


72
after serial recall performance that was perfect or less than perfect.
They found that children were more likely to show strategy changes when
recall was less than perfect than when recall was perfect. That is,
children tended to stick with a particular strategy on the next trial
when it had yielded perfect performance, but changed strategies on the
next trial when it had yielded less than perfect performance. This
pattern is consistent with the win-stay/lose-shift approach that has
been reported in decision-making literature (Eimas, 1969).
In the current study, evidence for the win-stay/lose-shift
approach was examined by classifying each trial as a trial on which
recall was perfect or not perfect and on which strategy changes were or
were not observed on the next trial. This resulted in four possible
classifications: recall perfect/strategy change; recall perfect/no
strategy change; recall not perfect/strategy change; recall not
perfect/no strategy change. Classifications were performed separately
for Trials 1-6, with each child contributing a single data point at each
trial. (Trial 7 was omitted from the analysis because a strategy change
following Trial 7 is not possible.)
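
The four-way classification can be sketched in code. The example assumes that recall is "perfect" when every presented word is recalled and that a "strategy change" means the set of strategies on the next trial differs from the set on the current trial; both definitions, the function name, and the sample records are illustrative assumptions, not the study's coding rules or data.

```python
from collections import Counter

def classify_transition(recalled, presented, strategies_now, strategies_next):
    """Label one trial by its recall outcome and by what happened on the next trial."""
    recall = "recall perfect" if recalled == presented else "recall not perfect"
    changed = set(strategies_now) != set(strategies_next)
    change = "strategy change" if changed else "no strategy change"
    return f"{recall}/{change}"

# Hypothetical records for Trial 1, one tuple per child:
# (words recalled, words presented, strategies on Trial 1, strategies on Trial 2).
trial_1_records = [
    (6, 6, {"rehearsal"}, {"rehearsal"}),
    (6, 6, {"rehearsal"}, {"sorting", "rehearsal"}),
    (4, 6, {"clustering"}, {"clustering"}),
    (5, 6, set(), {"rehearsal"}),
]

# Each child contributes a single data point to the tally for this trial.
tally = Counter(classify_transition(*record) for record in trial_1_records)
```

Repeating the tally for Trials 1 through 6 yields one 2 x 2 table of counts per trial.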
The percentage of trials on which recall was perfect or not and
followed by a strategy change or not is shown in Table 15. The
classification data on each trial were analyzed separately by 2 (recall
perfect vs. recall not perfect) x 2 (strategy change vs. no strategy
change) chi-squares. For Trials 1 through 3, perfect recall was
followed by strategy changes or no strategy change approximately
equally, χ²s(1, N = 120) < 1. For Trials 4 through 6, however, perfect
recall was followed by no strategy change more frequently than by


23
relatively few strategy changes during the later trials of the sort-
recall task, when they have had considerable task-related experience.
Strategy changes were predicted to vary according to the number of
words presented on each trial. Specifically, strategy changes were
predicted to rarely follow trials with relatively few words (i.e.,
trials with 6 and 9 words), but to frequently follow trials with
relatively many words (i.e., trials with 12 and 15 words). These
predictions were based on research showing that strategy changes rarely
follow perfect performance but frequently follow less than perfect
performance (McGilly & Siegler, 1989). Because perfect recall was
likely on trials with few words but not on trials with many words, it
was predicted that strategy changes would be less frequent on trials
with few words compared to trials with many words.
The second goal of the current study was to examine the relation
between multiple-strategy use and recall as a function of age and number
of words on each trial. A specific aim was to examine data for possible
evidence of utilization deficiencies. Utilization deficiencies were
predicted to be less frequent for older children than for younger
children. This prediction was based on research examining evidence of
utilization deficiencies for children of different ages. For example,
Miller and Seier (1994) have shown that correlations between strategy
use and recall are often positive and significant for older but not
younger children, and have interpreted this as evidence of a utilization
deficiency for the younger children. Similarly, Coyle and Bjorklund
(1996) have shown that younger children recall less than comparably
strategic older children and have interpreted this finding as


36
difference in intrusions is consistent with findings demonstrating that
younger children have problems inhibiting task-inappropriate responses
(Dempster, 1992). The repetition and intrusion data are excluded from
all subsequent analyses.
Because possible recall varied trial-by-trial, the number of words
recalled on each trial was converted to the proportion of words recalled
relative to possible recall. Mean proportion recall on each trial, by
grade and condition, is presented in Table 2, which also shows mean
proportion recall by condition and trial (i.e., collapsed across grade)
and fourth grade minus second grade recall differences by condition and
trial. Proportion recall was examined by a 2 (grade) x 2 (condition) x
7 (trial) ANOVA, with repeated measures on the trial factor. The
analysis revealed significant main effects of grade, F(1, 116) = 10.08
(mean proportion recall: .54 and .75 for second and fourth grade,
respectively), condition, F(1, 116) = 6.83 (mean proportion recall: .65
and .60 for ascending/descending and descending/ascending conditions,
respectively), and trial, F(6, 696) = 3.69 (mean proportion recall: .65,
.61, .61, .67, .60, .60, and .65 for Trials 1-7, respectively). Also
significant were interactions of grade x trial, F(6, 696) = 7.01, and
condition x trial, F(6, 696) = 57.68, both of which were qualified by a
significant interaction of grade x condition x trial, F(6, 696) = 2.85.
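
The conversion from raw recall to proportion recall is a simple ratio, illustrated below. The list lengths of 6, 9, 12, and 15 words are reported in the study, but the trial-by-trial order shown for the ascending/descending condition is inferred from the text, and the recall counts are hypothetical values for a single child.

```python
def proportion_recall(words_recalled, words_presented):
    """Proportion of the presented words that were recalled on one trial."""
    if words_presented <= 0:
        raise ValueError("words_presented must be positive")
    return words_recalled / words_presented

# Words presented on Trials 1-7 (order inferred for ascending/descending).
presented = [6, 9, 12, 15, 12, 9, 6]

# Hypothetical number of words one child recalled on each trial.
recalled = [5, 6, 7, 9, 8, 6, 5]

proportions = [proportion_recall(r, p) for r, p in zip(recalled, presented)]

# Mean proportion recall across the seven trials for this child.
mean_proportion = sum(proportions) / len(proportions)
```

Expressing recall as a proportion puts trials with different list lengths on a common 0 to 1 scale, which is what allows trial to be treated as a repeated-measures factor.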
The significant three-way interaction was evaluated by comparing
grade differences in recall at each trial, separately for each
condition. In the ascending/descending condition, significant grade
differences in recall were observed on all trials except Trial 1, when
only six words were presented. In the descending/ascending condition,


Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
VARIABILITY AND UTILIZATION DEFICIENCIES IN CHILDREN'S MEMORY
STRATEGIES: A DEVELOPMENTAL STUDY
By
Thomas R. Coyle
August, 1997
Chair: Patricia H. Miller
Cochair: Shari A. Ellis
Major Department: Psychology
The goal of this study was to examine variability in memory
strategy use, and the relation between such variability and recall, as a
function of age and a measure of task difficulty (number of words to
remember). Second and fourth graders received seven sort-recall trials
of different categorizable words (e.g., nurse, lawyer, wall, roof, rose,
lily). The number of words presented varied across trials, whereas the
number of categories represented in all word lists remained constant.
Variability in strategy use was measured in terms of multiple-strategy
use (e.g., number of strategies used across trials) and strategy changes
(e.g., number of trial-by-trial changes in the types of strategies
used). Consistent with previous research, (a) older children used more
strategies and made fewer trial-by-trial changes than younger children;
(b) older children recalled more than comparably strategic younger
children, indicating a utilization deficiency for the younger children;
and (c) older children showed significant and positive relations between
xi


74
strategy changes, χ²s(1, N = 120) > 5.43. These results demonstrate
that children were more likely to continue to use a strategy that
yielded perfect performance on later but not earlier trials.
Additional analyses examined the prediction that trials with
relatively few words (i.e., Trials 1, 2, and 6 in ascending/descending
and Trials 3, 4, and 5 in descending/ascending) would provide greater
opportunity for perfect recall, and consequently result in relatively
few strategy changes. To test this prediction, a series of 2 (recall
perfect vs. recall not perfect) x 2 (strategy change vs. no strategy
change) chi-squares were performed separately at each trial in each
condition (see Table 16). This resulted in a total of 12 individual
chi-squares (2 conditions x 6 trials). (Because including the grade
factor would have resulted in insufficient data to perform significance
tests for several of the grade x condition x trial combinations, the
grade factor was excluded from these analyses.) One of the 12 chi-
squares (Trial 1 in the descending/ascending condition) did not contain
sufficient data for a significance test. Of the remaining 11, only two
were significant. As predicted, the pattern of data on Trials 4 and 5
in the descending/ascending condition revealed that perfect recall was
followed by no strategy change more frequently than by strategy changes,
χ²s(1, N = 56) > 5.18. The four other trials on which this pattern was
predicted (i.e., Trials 1, 2, and 6 in ascending/descending and Trial 3
in descending/ascending) showed that perfect recall and strategy changes
did not vary as a function of number of words presented. These data
provide little evidence for the prediction that trials with relatively


41
significant effects concerning measures of strategy variability, as
shown in later analyses.
Data relevant for the significant interactions concerning strategy
use are presented in Table 3. Inspection of the Grade x Strategy
interaction revealed that sorting was used more by fourth graders (M =
4.13) than by second graders (M = 2.51), with grade comparisons for the
other strategies being nonsignificant. Evaluation of the Condition x
Strategy interaction revealed that rehearsal was used more in the
ascending/descending condition (M = 4.13) than in the
descending/ascending condition (M = 2.65), with the other strategies
being used approximately equally in both conditions.
These strategy data provide information concerning the frequency
of occurrence of each individual strategy. Subsequent analyses examine
the possibility of several strategies being used in combination on a
single trial, and changes in the mixture of strategies used across
trials.
Variability in Strategy Use
Two general types of variability were examined: multiple-strategy
use and strategy change. Multiple-strategy use refers to the number of
strategies used within a given trial. Strategy change refers to the
number of different strategies used across trials and trial-by-trial
changes in strategy use.
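
These measures can be made concrete with a short sketch. It assumes that each trial is coded as the set of strategies observed on that trial and that a trial-by-trial change occurs whenever two adjacent trials have different sets. Excluding trials with no strategy from the count of unique combinations is a further assumption, chosen to match the reported range of 0 to 7. The function names and the sample child are hypothetical.

```python
def strategies_per_trial(trials):
    """Multiple-strategy use: number of strategies observed on each trial."""
    return [len(trial) for trial in trials]

def count_strategy_changes(trials):
    """Number of adjacent trial pairs on which the set of strategies differs."""
    return sum(1 for prev, cur in zip(trials, trials[1:]) if prev != cur)

def count_unique_combinations(trials):
    """Number of distinct non-empty strategy combinations used across trials."""
    return len({trial for trial in trials if trial})

# Hypothetical coding of one child's seven trials.
child = [
    frozenset({"rehearsal"}),
    frozenset({"rehearsal"}),
    frozenset({"sorting", "rehearsal"}),
    frozenset({"sorting", "clustering"}),
    frozenset({"sorting", "clustering"}),
    frozenset({"sorting", "clustering"}),
    frozenset({"sorting", "clustering"}),
]

per_trial = strategies_per_trial(child)
changes = count_strategy_changes(child)
unique = count_unique_combinations(child)
```

With seven trials there are six transitions, so the number of strategy changes for a child can range from 0 to 6.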
Multiple-Strategy Use
An initial analysis examined the prediction that multiple-strategy
use would increase with age and that the number of strategies used would
be greatest for trials on which relatively many words were presented


106
Goldin-Meadow, S., Nussbaum, H., Garber, P., & Church, R. B. (1993).
Transitions in learning: Evidence for simultaneously activated
hypotheses. Journal of Experimental Psychology: Human Perception
and Performance, 19, 1-16.
Guttentag, R. E. (1984). The mental effort requirements of cumulative
rehearsal: A developmental study. Journal of Experimental Child
Psychology, 37, 92-106.
Jacoby, L. L. (1991). A process dissociation framework: Separating
automatic from intentional uses of memory. Journal of Memory and
Language, 30, 513-541.
Kee, D. W. (1994). Developmental differences in associative memory:
Strategy use, mental effort, and knowledge-access interactions. In
H. W. Reese (Ed.), Advances in child development and behavior
(Vol. 25, pp. 232). New York: Academic Press.
Kee, D. W., & Davies, L. (1990). Mental effort and elaboration: Effects
of accessibility and instruction. Journal of Experimental Child
Psychology, 49, 264-274.
Lange, G., MacKinnon, C. E., & Nida, R. E. (1989). Knowledge, strategy,
and motivational contributions to preschool children's object
recall. Developmental Psychology, 25, 772-779.
Langer, E. J. (1989). Mindfulness. Reading, MA: Addison-Wesley.
Lemaire, P., & Siegler, R. S. (1995). Four aspects of strategic change:
Contributions to children's learning of multiplication. Journal of
Experimental Psychology: General, 124, 83-97.
McGilly, K., & Siegler, R. S. (1989). How children choose among serial
recall strategies. Child Development, 60, 172-182.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some
limits on our capacity for processing information. Psychological
Review, 63, 81-97.
Miller, P. H. (1990). The development of strategies of selective
attention. In D. F. Bjorklund (Ed.), Children's strategies:
Contemporary views of cognitive development (pp. 157-184).
Hillsdale, NJ: Erlbaum.
Miller, P. H., & Seier, W. L. (1994). Strategy utilization deficiencies
in children: When, where, and why. In H. W. Reese (Ed.), Advances
in child development and behavior (Vol. 25, pp. 108-156). New
York: Academic Press.
Miller, P. H., Seier, W. L., Barron, K. L., & Probert, J. S. (1994).
What causes a memory strategy utilization deficiency? Cognitive
Development, 9, 77-102.


87
on all trials, whereas second graders showed significant correlations on
only three of the seven trials (Trials 3, 4, and 6). The pattern of
correlations for the fourth graders indicated that they were able to
benefit from using multiple strategies from the beginning of the task.
The second graders' pattern indicated that strategy use was rarely
linked to recall performance, which is evidence of a utilization
deficiency.
The findings in the descending/ascending condition revealed a
pattern opposite to that found in the ascending/descending condition.
Fourth graders now showed significant correlations on only two of the
seven trials (Trials 6 and 7), whereas second graders showed significant
correlations on five of the seven trials (Trials 2, 3, 5, 6, and 7).
The pattern of correlations for the second graders indicated that they
were using multiple strategies effectively. The fourth graders' pattern
was more difficult to interpret. Although it could be argued that
fourth graders were utilizationally deficient, such an interpretation is
probably incorrect because, with few exceptions, mean recall and
strategy use were higher for fourth graders than for second graders (see
Tables 2 and 4). Thus, fourth graders were probably not using
strategies ineffectively but likely using other means to recall the list
items, perhaps relying on nonstrategic factors (e.g., capacity, speed of
processing).
A comparison of the correlational data in each condition
demonstrates different patterns of strategy-recall relations across
trials for each age group. Fourth graders tended to use multiple
strategies effectively when an increasing number of words was presented


TABLE OF CONTENTS
Page
ACKNOWLEDGMENTS iii
LIST OF TABLES ix
ABSTRACT xi
INTRODUCTION 1
Memory Strategies Enhance Performance 2
Memory Strategy Development is Stagelike 8
Evaluation of Research on Variability and Utilization
Deficiencies 12
The Current Study 18
Goals of the Current Study 21
METHOD 26
Participants 26
Stimuli and Design 26
Procedure 29
Coding 31
RESULTS 34
Preliminary Analyses 34
Off-Task Behavior and Examination 34
Recall 35
Strategy Use 39
Variability in Strategy Use 41
Multiple-Strategy Use 41
Strategy Change 45
Relation Between Strategy Use and Recall 56
Utilization Deficiencies 56
Strategy Change and Recall 64
DISCUSSION 78
Variability in Strategy Use 79
Multiple-Strategy Use 79
Strategy Changes 82
Relation Between Multiple-Strategy Use and Recall 86
Relation Between Strategy Changes and Recall 93
Conclusions 98
vii


46
measures on the trial transition factor. The analysis revealed a
significant main effect of grade, F(1, 116) = 11.41 (mean number of
strategy changes: .78 and .54 for second and fourth grade,
respectively), and a significant Grade x Trial Transition interaction,
F(5, 580) = 2.67. No other significant effects were found.
Data relevant to the significant Grade x Trial Transition
interaction are presented in Table 5, which also shows the number of
strategy changes for each Condition x Grade x Trial cell. Inspection of
grade differences in strategy changes on each trial transition revealed
that fourth graders had significantly fewer changes than second graders
on all trial transitions except transitions 2 to 3 and 3 to 4. These
data demonstrate that the grade difference mentioned above is primarily
a result of fourth graders having fewer strategy changes than second
graders on later rather than earlier trials. These findings are
consistent with the hypothesis that strategy changes decrease with age.
The absence of a significant Condition x Trial interaction indicates
that strategy changes did not vary across trials with different numbers
of words presented.
Other types of variability. Although number of strategy changes
across trials is one measure of strategy change, other measures of
strategy change are possible. Two additional measures of strategy
change are examined here: number of unique strategy combinations used
across all trials (range: 0 to 7), and number of consecutive trials
with strategy changes (range: 0 to 6). These measures, along with the
average number of strategy changes across trials (an average of the
measure analyzed above), were converted to z-scores and entered into a 2


12
limited, several different strategies are used when experience is
moderate, and few strategies are again used when experience is
extensive. Thus, the number of strategies used, plotted as a function
of task experience, produces an inverted-U shaped pattern. This pattern
has been found across a variety of strategic tasks.
Finally, contemporary research has shown that initial levels of
variability have implications for subsequent learning. For example,
Goldin-Meadow and her colleagues (Goldin-Meadow, Alibali, & Church,
1993) have shown that children who displayed high levels of variability
on a conceptual learning task showed increases in task performance
following instruction or practice. In contrast, children who displayed
low levels of variability typically showed no or relatively little
improvement in performance. Similarly, Siegler (1995) has shown that
children who used several strategies on a number conservation task
showed increases in subsequent learning. In contrast, children who used
few strategies on a number conservation task showed relatively little
change in subsequent learning. These findings raise the intriguing
possibility that variability may provide an index of when change is
likely to occur and when change can be induced to occur (cf. Thelen &
Smith, 1994).
Evaluation of Research on Variability and Utilization Deficiencies
The discovery of utilization deficiencies and variability has had
two important consequences in strategy development research. First,
several models of memory strategy development now explicitly account for
utilization deficiencies and variability. These models assume that
variability is present at all points in development (Siegler, 1996;


Table 15
Percentage (and Number) of Trials on Which Strategy Changes Did and Did Not
Occur Immediately After Recall Was Perfect or Not Perfect

                                          Trial
                        1        2        3        4        5        6
Recall Perfect
  Strategy Change       42 ( 8)  57 ( 4)  31 ( 4)  30 ( 9)  29 ( 6)  25 ( 5)
  No Strategy Change    58 (11)  43 ( 3)  69 ( 9)  70 (21)  71 (15)  75 (15)
Recall Not Perfect
  Strategy Change       53 (54)  61 (69)  54 (58)  62 (56)  57 (56)  54 (54)
  No Strategy Change    47 (47)  39 (44)  46 (49)  38 (34)  43 (43)  46 (46)

Note. Percentages computed separately at each trial.
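
The note to Table 15 can be illustrated with a short worked example. The sketch below is an illustration only, not the analysis code used in the study: it recomputes each percentage as the number of trials with a strategy change divided by the total number of trials in that recall category at that trial. The function name is hypothetical, and the Trial 1 count of 11 for "No Strategy Change" after perfect recall is an assumption inferred from the reported 42/58 split.

```python
# Counts of trials taken from Table 15, listed for Trials 1 to 6.
# The first no_change count under perfect recall (11) is an assumed reading
# of a garbled table entry; it is consistent with the reported 42% / 58%.
perfect = {"change": [8, 4, 4, 9, 6, 5], "no_change": [11, 3, 9, 21, 15, 15]}
not_perfect = {"change": [54, 69, 58, 56, 56, 54],
               "no_change": [47, 44, 49, 34, 43, 46]}


def percent_change(change, no_change):
    """Percentage of trials followed by a strategy change, per trial.

    Each percentage is computed separately at each trial, as in the table note.
    """
    return [round(100 * c / (c + n)) for c, n in zip(change, no_change)]


pct_perfect = percent_change(perfect["change"], perfect["no_change"])
pct_not_perfect = percent_change(not_perfect["change"], not_perfect["no_change"])

for trial, (p, q) in enumerate(zip(pct_perfect, pct_not_perfect), start=1):
    print(f"Trial {trial}: after perfect recall {p}%, after imperfect recall {q}%")
```

Read this way, the percentage of strategy changes after perfect recall falls to 30% or less on Trials 4 to 6, whereas it stays above 50% after imperfect recall, which is the win-stay/lose-shift pattern.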


strategy use, utilization deficiencies were found for clustering but not
sorting or the combination of sorting and clustering. These findings
suggest that utilization deficiencies are not a general phase in
strategy development but are limited to the use of a particular strategy
in a particular context. Accordingly, future research should examine
the possibility of utilization deficiencies for each possible strategy
combination, not only for a subset of possible strategy combinations.
The results of the current study have implications for models of
strategy development postulating variability in strategy use. Such
models focus almost exclusively on multiple-strategy use, depicting
children as using multiple strategies early in development, later in
development, and at all points in between. The current study suggests
that models emphasizing multiple-strategy use might achieve even greater
descriptive power if they considered developmental changes in stability
in strategy use, defined as the consistent use of a particular strategy
or strategy combination. Children in the current study did show
considerable multiple-strategy use, but they also showed considerable
stability in strategy use, and such stability was particularly
pronounced for older children on the latter trials of the sort-recall
task. In addition, stability in strategy use was associated with
relatively high levels of performance, suggesting that stable-strategy
use was a relatively efficient form of strategy production. These
findings are consistent with research showing that older, more mature
strategy users are likely to select and use consistently a strategy that
yields optimal performance (Thelen & Smith, 1994). More importantly,
these findings suggest that stability can provide additional information


expenditure. Extrapolating from these findings to the current study
suggests that the inverse relation between variability and recall was
mediated by capacity requirements. Presumably, variability in strategy
use, indicated by many trial-by-trial changes in strategy use, resulted
in increased capacity expenditure, which consequently reduced strategy
effectiveness and lowered recall. No such effect was found when low
levels of variability were observed, presumably because low levels of
variability do not result in any appreciable capacity expenditure.
A final analysis examined the question of why strategy changes
occur. The analysis was developed based on the findings of a serial-
recall study by McGilly and Siegler (1989). In that study, strategy
changes were observed less frequently after trials on which serial-
recall performance was perfect than after trials on which serial-recall
performance was not perfect. This pattern is referred to as the win-
stay/lose-shift approach in the decision-making literature (Eimas,
1969).
Evidence for the win-stay/lose-shift approach in the current study
was examined by analyzing at each trial the occurrence of strategy
changes when recall was perfect and when recall was less than perfect.
For Trials 1 to 3, strategy changes occurred approximately equally
following perfect recall and following less than perfect recall. For
Trials 4 to 6, strategy changes were less frequent following perfect
recall than following less than perfect recall. These latter findings
are consistent with the win-stay/lose-shift approach. These are the
first results to my knowledge to demonstrate that the win-stay/lose-
shift approach varies as a function of task experience. Children in the


APA guidelines is as expansive as Dr. Miller's, and I probably never
will. I also wish to thank Dr. Miller for his contribution to the
Developmental area while he was on sabbatical.
I wish to thank Jennifer L. Slawinski for her suggestions
regarding the design of my dissertation. I also thank Miss Slawinski
for the clever idea of applying a sequential design in the context of a
microgenetic experiment. Finally, I thank Miss Slawinski for her
incisive comments on examining gender effects and on reanalyzing
archival data.
I wish to thank the research assistants who helped with data
collection, analyses, and interpretation. These include Joshua List,
Chad Colbert, Victoria Otero, and Jerusha Azel. I suspect I learned as
much from them as they learned from me.
I wish to thank my mother and father, Oceania and Roger Coyle, for
their enduring support during my academic career. My mother and father
have taught me that hard work and perseverance will in the end always
pay off. Most important, my mother and father have taught me that the
most important thing in secular life is family. Their marriage of 35
years (and counting) is why I have acknowledged them together. This
dissertation is a testament to their love and support throughout the
years.
I wish to thank my brother, James Coyle, for his interest in my
work and his continued support of my goals, including the completion of
this dissertation. James is an exceptional guitar player, partly
because of exceptional talent, and partly because of exceptional
practice. I have learned much from observing his work ethic and
dedication to the instrument he loves so much.


ACKNOWLEDGMENTS
Dissertation acknowledgments typically say little about how those
acknowledged contributed to the student's academic development. Perhaps
this is how it should be, for the main purpose of a dissertation is to
present a student's original contribution to a recognized body of
knowledge. I will adhere to the academic tradition of brevity in these
Acknowledgments, but do so in a way that allows me to recognize specific
contributions of individuals who helped and supported me in completing
this dissertation. I first acknowledge faculty, and then acknowledge my
family.
I wish to thank Dr. James Algina for his contribution in teaching
me how to do some of the statistics in this dissertation, particularly
the analysis in which measures of variability were converted to z-scores
and analyzed simultaneously. I also thank Dr. Algina for discussions,
which I initiated, on issues pertaining to tenure in the university and
E. D. Hirsch's notion of cultural literacy.
I wish to thank Dr. David Bjorklund for convincing me to devote my
life to studying developmental psychology. Dr. Bjorklund was
instrumental in my training during the early part of my career, and he
deserves much credit for my achievements. Dr. Bjorklund has shown me
that the most exciting aspect of science is discovery, that description
is a reasonable goal for science, that the best research questions are
those that can account for the most data, and that Peter Kapitsa knew


particularly for younger, less practiced strategy users. For example,
research by Miller and her colleagues (reviewed in Miller & Seier, 1994)
demonstrated that young children using a selective attention strategy
had lower levels of recall than equally strategic older children. Such
findings were not limited to selective attention strategies. Similar
patterns were found in tasks assessing organizational and elaboration
strategies (Bjorklund & Harnishfeger, 1987; Kee & Davies, 1990).
Why did these studies show ineffective strategy use when earlier
research on production deficiencies showed effective strategy use? The
answer may have to do with how strategy use was measured. Production
deficiency research inferred strategy use from patterns of recall
following training. Improvements in recall were interpreted as
indicating the effective use of a mnemonic. No improvements in recall
were interpreted as indicating ineffective strategy training. This
latter pattern of data is ambiguous, however. No improvement in recall
may indicate ineffective training, but it may also indicate ineffective
strategy use. The only sure way to discriminate between these
alternatives is to assess recall and strategy use independently. Later
research on strategy effectiveness did assess recall and strategy use
independently. As mentioned above, these studies showed that increases
in strategy use do not always result in benefits to memory.
In 1990, Miller formally identified evidence of strategy use with
no recall benefits, and labeled such evidence a utilization deficiency
(Miller, 1990). According to Miller, utilization deficiencies occur
when a child produces an appropriate strategy but does not benefit from
it in terms of recall, or benefits less than an equally strategic older


Strategy Change
Number of strategy changes across trials. Although the analysis
above demonstrates that fourth graders used more strategies than second
graders, it did not examine possible changes in strategy use across
trials (i.e., additions and deletions in strategy use on consecutive
trials). For example, a child using two strategies across all trials
could be using sorting and rehearsal on all seven trials, sorting and
rehearsal on Trials 1-4 and sorting and clustering on Trials 5-7, or
sorting and rehearsal on all even trials and sorting and clustering on
all odd trials. In each case the child uses two strategies on all
trials but shows a different number of strategy changes. A child using
sorting and rehearsal on all trials shows no strategy changes; a child
using sorting and rehearsal on Trials 1-4 and sorting and clustering on
Trials 5-7 shows two strategy changes (i.e., dropping rehearsal and
adding clustering from Trial 4 to Trial 5); and a child using sorting
and rehearsal on all even trials and sorting and clustering on all odd
trials shows 12 changes (i.e., dropping a strategy and adding a
strategy on each of the six trial transitions: Trials 1 to 2, 2 to 3, 3
to 4, 4 to 5, 5 to 6, and 6 to 7).
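
The counting rule in the examples above can be sketched in a few lines of code. This is a minimal illustration, not the scoring procedure actually used in the study: it assumes that each trial is coded as the set of strategies observed on that trial, and it counts one change for every strategy added or dropped between consecutive trials. The function and variable names are hypothetical.

```python
from typing import List, Set


def count_strategy_changes(trials: List[Set[str]]) -> int:
    """Sum the additions and deletions of strategies across consecutive trials."""
    total = 0
    for previous, current in zip(trials, trials[1:]):
        # The symmetric difference holds strategies present on exactly one of
        # the two trials, i.e., each strategy that was added or dropped.
        total += len(previous ^ current)
    return total


# The three hypothetical children described in the text (seven trials each).
stable = [{"sorting", "rehearsal"} for _ in range(7)]
switch_once = ([{"sorting", "rehearsal"} for _ in range(4)]
               + [{"sorting", "clustering"} for _ in range(3)])
alternating = [
    {"sorting", "clustering"} if trial % 2 == 1 else {"sorting", "rehearsal"}
    for trial in range(1, 8)
]

print(count_strategy_changes(stable))       # 0
print(count_strategy_changes(switch_once))  # 2
print(count_strategy_changes(alternating))  # 12
```

All three children use two strategies on every trial, yet the counts differ (0, 2, and 12), which is why the number of strategies used and the number of strategy changes are treated as separate measures.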
An analysis of strategy change evaluated the prediction that
strategy changes would decrease with age and that strategy changes would
occur most frequently on transitions to trials with more words (i.e.,
Trials 2 to 3 and 3 to 4 in the ascending/descending series and Trials 5
to 6 and 6 to 7 in the descending/ascending series). The number of
strategy changes on each of the six trial transitions was analyzed by a
2 (grade) x 2 (condition) x 6 (trial transition) ANOVA, with repeated


Table 12

                              Measure of Strategy Change
                        Unique          Trials          Total
Grade                   Combinations    with Changes    Changes
Ascending/Descending
  Grade 2                .23            -.21            -.06
  Grade 4                .28            -.16            -.05
Descending/Ascending
  Grade 2                .22            -.13            -.18
  Grade 4               -.53**          -.66**          -.66**

**p < .01


METHOD
Participants
Participants were 69 second graders, 36 boys and 33 girls (mean
age = 7 years 8 months, SD = 6.42 months), and 51 fourth graders, 21
boys and 30 girls (mean age = 9 years 7 months, SD = 5.00 months).
Children were recruited from schools and recreation centers in
Gainesville, Florida. The majority of children were White (80%) and
came from middle- and upper-middle-income households.
Stimuli and Design
Seven lists of categorically related words were constructed (three
categories per list, five words per category; see Table 1). The lists
were composed of words reported in three analyses of category norms
(Bjorklund, Thompson, & Ornstein, 1983; Posnansky, 1978; Uyeda &
Mandler, 1980). Each word was printed on a 3 x 5 in. (7.6 x 12.7 cm)
index card. Different words and categories were used on each list.
Items in each list varied in category typicality, with most items being
in the top-third frequency ranking for a particular category. Highly
associated words within a particular category (e.g., dog, cat; salt,
pepper) were avoided, thus minimizing the likelihood that clustering
would result from the automatic activation of semantic memory relations
(Frankel & Rollins, 1985; Schneider, 1986). Previous research has shown
that children in the age range tested here had little difficulty