Citation
Self-Serving Bias: A Possible Contributor of Construct-Irrelevant Variance in High-Stakes Testing

Material Information

Title:
Self-Serving Bias: A Possible Contributor of Construct-Irrelevant Variance in High-Stakes Testing
Creator:
BERGERON, JENNIFER M (Author, Primary)
Copyright Date:
2008

Subjects

Subjects / Keywords:
Classrooms ( jstor )
Departmental majors ( jstor )
Elementary school students ( jstor )
Grade levels ( jstor )
High school students ( jstor )
High schools ( jstor )
Mathematics ( jstor )
Report cards ( jstor )
Students ( jstor )
Teachers ( jstor )

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Jennifer M Bergeron. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Embargo Date:
12/31/2015
Resource Identifier:
658210148 ( OCLC )



Full Text


SELF-SERVING BIAS: A POSSIBLE CONTRIBUTOR OF CONSTRUCT-IRRELEVANT VARIANCE IN HIGH-STAKES TESTING

By

JENNIFER M. BERGERON

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2005


ACKNOWLEDGMENTS

I would like to take this opportunity to thank my research advisor, Dr. David Miller, without whose guidance and support, both personal and professional, this work would not have been possible. I would also like to thank all the members of my committee for their help and encouragement: Dr. Anne Seraphine and Dr. David Therriault of the Department of Educational Psychology, University of Florida, and Dr. Carole Kimberlin of the Department of Pharmacy and Health Care Administration, University of Florida. I am also grateful to Dr. Brian Marchman and all the teachers at P.K. Yonge for their help and guidance in the recruitment of participants for this study; without their contributions this project would not have been completed.

Last, but not least, I would like to thank my friends and colleagues here in Gainesville for their support and encouragement. Special thanks go to Jan MacInnes for her special friendship and advice, to my best friend in graduate school, Bruce Louis Rich, who always believed in me and was there for me along the way, and finally and most especially to my parents, Raymond and Kathleen Bergeron, for their unwavering love and support throughout my education.


TABLE OF CONTENTS

ACKNOWLEDGMENTS ..........ii
LIST OF TABLES ..........v
ABSTRACT ..........x

CHAPTER

1 INTRODUCTION ..........1
    Statement of the Problem ..........1
    Purpose of the Study ..........5
    Theoretical Significance of the Study ..........6
    Practical Significance of the Study ..........7

2 LITERATURE REVIEW ..........9
    Historical Background ..........9
    Validity and Construct-Irrelevant Variance ..........12
    Haladyna and Downing's Taxonomy for Studying CIV ..........17
    Attribution and the Self-Serving Bias ..........21
    Explaining the Development of the Self-Serving Bias ..........26
    Classroom Grades and Accountability Testing ..........29

3 METHOD ..........34
    Derivation of General Research Hypotheses and Specific Research Hypotheses ..........34
    Research Participants ..........36
    Instrument ..........37
    Measures ..........38
    Pilot Study ..........40
    Procedure ..........41

4 RESULTS ..........46
    Consequences Associated with Poor Performance ..........46
    Beliefs about Ability to Control Report Card Grades and FCAT Math Scores ..........58
    The Self-Serving Bias and CIV ..........66


5 DISCUSSION AND RECOMMENDATIONS ..........78
    Overview of the Findings ..........78
    Discussion of Results ..........80
    Recommendations for Future Research ..........95

REFERENCES ..........97

APPENDIX

A INTER-ITEM CORRELATIONS ..........102
B HIGH SCORING VERSUS LOW SCORING ..........109
C GRADES VERSUS FCAT ATTRIBUTIONS ..........114

BIOGRAPHICAL SKETCH ..........119


LIST OF TABLES

1 Description of Variables ..........38
2 Item Correlations for Beliefs and Feelings about Consequences of Performance for Report Card ..........43
3 Item Correlations for Beliefs and Feelings about Consequences of Performance for FCAT ..........44
4 Item Correlations for Beliefs about Ability to Improve for Report Card ..........45
5 Item Correlations for Beliefs about Ability to Improve for FCAT ..........46
6 Means and Standard Deviations about Feeling Poorly on Each Item for Report Card Grades and FCAT Scores ..........47
7 Percentages of Student Responses (Agree or Strongly Agree) for Feeling Poorly about Report Card Grade and FCAT Performance ..........48
8 Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for Elementary School Students ..........49
9 Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for Middle School Students ..........50
10 Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for High School Students ..........50
11 Means and Standard Deviations by Grade Level for Composite Score for Feeling Poorly about Performance on Each Item for Report Card and FCAT ..........51
12 ANOVA Test on Student Responses for Feeling Poorly about Report Card and FCAT Performance for Elementary, Middle, and High School Students ..........52
13 ANOVA Test on Student Responses for "You might feel embarrassed" on Report Card and FCAT for Elementary, Middle, and High School Students ..........53
14 ANOVA Test on Student Responses for "You might feel frightened" on Report Card and FCAT for Elementary, Middle, and High School Students ..........54
15 ANOVA Test on Student Responses for "Your parents would be disappointed" on Report Card and FCAT for Elementary, Middle, and High School Students ..........54
16 ANOVA Test on Student Responses for "You would be punished" for Report Card and FCAT for Elementary, Middle, and High School Students ..........55


17 ANOVA Test on Student Responses for "Your teacher would be disappointed" on Report Card and FCAT for Elementary, Middle, and High School Students ..........55
18 ANOVA Test on Student Responses for "Nothing would happen" on Report Card and FCAT for Elementary, Middle, and High School Students ..........56
19 ANOVA Test on Student Responses for "Your friends might make fun of you" on Report Card and FCAT for Elementary, Middle, and High School Students ..........56
20 ANOVA Test on Student Responses for "You might be held back" on Report Card and FCAT Performance for Elementary, Middle, and High School Students ..........57
21 Means and Standard Deviations for Students' Beliefs about Controlling Performance on Report Card and FCAT ..........58
22 Percentages of Student Responses (Agree or Strongly Agree) for Beliefs about Controlling Performance on Report Card and FCAT ..........60
23 Means and Standard Deviations by Grade Level for Composite Score for Students' Beliefs about Improving Performance on Report Card and FCAT ..........60
24 Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for Elementary School Students ..........61
25 Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for Middle School Students ..........62
26 Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for High School Students ..........62
27 ANOVA Test on Student Responses for Students' Beliefs about Controlling Performance on Report Card and FCAT for Elementary, Middle and High School Students ..........63
28 ANOVA Test on Student Responses for Students' Beliefs about Role of Ability on Report Card and FCAT for Elementary, Middle and High School Students ..........64
29 ANOVA Test on Student Responses for Students' Beliefs about Role of Effort on Report Card and FCAT for Elementary, Middle and High School Students ..........65
30 Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Students for Performance on Report Card and FCAT ..........67
31 Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Elementary School Students for Performance on Report Card and FCAT ..........68


32 Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Middle School Students for Performance on Report Card and FCAT ..........69
33 Means and Standard Deviations for Internal and External Attributions of High and Low Scoring High School Students for Performance on Report Card and FCAT ..........69
34 Regression Analysis for High Scoring Children's Responses to FCAT Question "You studied a lot" (i45) ..........74
35 Regression Analysis for High Scoring Children's Responses to FCAT Question "You studied the right things" (i46) ..........75
36 Regression Analysis for High Scoring Children's Responses to FCAT Question "You are smart" (i47) ..........75
37 Regression Analysis for High Scoring Children's Responses to FCAT Question "The teacher explained things well" (i48) ..........75
38 Regression Analysis for High Scoring Children's Responses to FCAT Question "Someone helped you" (i49) ..........75
39 Regression Analysis for High Scoring Children's Responses to FCAT Question "The work was easy" (i50) ..........76
40 Regression Analysis for High Scoring Children's Responses to FCAT Question "You didn't study much" (i51) ..........76
41 Regression Analysis for Low Scoring Children's Responses to FCAT Question "You did study the right things" (i52) ..........76
42 Regression Analysis for Low Scoring Children's Responses to FCAT Question "You are not smart" (i53) ..........76
43 Regression Analysis for Low Scoring Children's Responses to FCAT Question "The teacher did not help you" (i54) ..........77
44 Regression Analysis for Low Scoring Children's Responses to FCAT Question "You weren't helped by anyone" (i55) ..........77
45 Regression Analysis for Low Scoring Children's Responses to FCAT Question "The work was hard" (i56) ..........77
46 Inter-Item Correlations for Feelings and Beliefs about the Consequences of Performing Poorly for Report Card Grades ..........102
47 Inter-Item Correlations for Beliefs in Ability to Improve Report Card Grades ..........103
48 Inter-Item Correlations for Feelings and Beliefs about the Consequences of Performing Poorly for FCAT ..........104


49 Inter-Item Correlations for Beliefs in Ability to Improve FCAT ..........105
51 Item-total Correlations for Beliefs in Ability to Improve Report Card Grade ..........106
52 Item-total Correlations for Feelings and Beliefs about Consequences of Performance for FCAT ..........106
53 Item-total Correlations for Beliefs in Ability to Improve for FCAT ..........107
54 Item-total Correlations for Self-Serving Bias for FCAT for High-Scoring Students ..........107
55 Item-total Correlations for Self-Serving Bias for FCAT for Low-Scoring Students ..........108
56 ANOVA Test on Student Responses about Report Card Grade for "You studied a lot" (i22 v 28) for High versus Low Scoring Students ..........109
57 ANOVA Test on Student Responses about Report Card Grade for "You studied the right things" (i23 v 29) for High versus Low Scoring Students ..........109
58 ANOVA Test on Student Responses about Report Card Grade for "You are smart" (i24 v 30) for High versus Low Scoring Students ..........110
59 ANOVA Test on Student Responses about Report Card Grade for "The teacher explained things well" (i25 v 31) for High versus Low Scoring Students ..........110
60 ANOVA Test on Student Responses about Report Card Grade for "You were helped by someone" (i26 v 32) for High versus Low Scoring Students ..........110
62 ANOVA Test on Student Responses about FCAT 04 for "You studied a lot" (45 v 51) for High versus Low Scoring Students ..........111
63 ANOVA Test on Student Responses about FCAT 04 for "You studied the right things" (46 v 52) for High versus Low Scoring Students ..........111
65 ANOVA Test on Student Responses about FCAT 04 for "The teacher explained things well" (48 v 54) for High versus Low Scoring Students ..........112
66 ANOVA Test on Student Responses about FCAT 04 for "You were helped by someone" (49 v 55) for High versus Low Scoring Students ..........112
67 ANOVA Test on Student Responses about FCAT 04 for "The work was easy" (50 v 56) for High versus Low Scoring Students ..........113
68 ANOVA Test on Student Responses for "You studied a lot" (i22 v 45) for High Scoring Students for Report Card and FCAT ..........114
69 ANOVA Test on Student Responses for "You studied the right things" (i23 v 46) for High Scoring Students for Report Card and FCAT ..........114


70 ANOVA Test on Student Responses for "You are smart" (i24 v 47) for High Scoring Students for Report Card and FCAT ..........115
71 ANOVA Test on Student Responses for "The teacher explained things well" (i25 v 48) for High Scoring Students for Report Card and FCAT ..........115
72 ANOVA Test on Student Responses for "Someone helped you" (i26 v 49) for High Scoring Students for Report Card and FCAT ..........115
74 ANOVA Test on Student Responses for "You didn't study much" (i28 v 51) for Low Scoring Students for Report Card and FCAT ..........116
75 ANOVA Test on Student Responses for "You didn't study the right things" (i29 v 52) for Low Scoring Students for Report Card and FCAT ..........116
76 ANOVA Test on Student Responses for "You are not smart" (i30 v 53) for Low Scoring Students for Report Card and FCAT ..........117
77 ANOVA Test on Student Responses for "The teacher did not explain things well" (i31 v 54) for Low Scoring Students for Report Card and FCAT ..........117
78 ANOVA Test on Student Responses for "You weren't helped by anyone" (i32 v 55) for Low Scoring Students for Report Card and FCAT ..........117
79 ANOVA Test on Student Responses for "The work was hard" (i33 v 56) for Low Scoring Students for Report Card and FCAT ..........118


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SELF-SERVING BIAS: A POSSIBLE CONTRIBUTOR OF CONSTRUCT-IRRELEVANT VARIANCE IN HIGH-STAKES TESTING

By

Jennifer M. Bergeron

December 2005

Chair: David Miller
Major Department: Educational Psychology

The current study focuses on the self-serving bias as a potential source of construct-irrelevant variance (CIV) arising from students, particularly in interpreting scores on high-stakes tests such as the Florida Comprehensive Assessment Test (FCAT). To this end, the study investigated 1) children's understanding of the consequences associated with poor performance on the FCAT, a high-stakes test, as opposed to performance in the classroom; 2) their perceptions of control over future performance on the FCAT as opposed to classroom grades; 3) their actual attributions for performance on the FCAT versus in the classroom; and 4) how these attributional processes are related to future test performance. Of secondary interest was the extent to which attributional processes might differ across grade levels. One hundred sixty children in grades four through seven, nine, and ten, recruited from a university-affiliated developmental research school, were individually surveyed. Results of this study


revealed that, overall, these students would feel some level of embarrassment, fright, and negative peer pressure if they were to perform poorly on the FCAT, and that the consequences of poor performance would be greater for the FCAT than for classroom performance. Although all children were relatively optimistic about improving, they felt they had a higher level of control over their report card grades than over their FCAT scores. In general, high-scoring children made more internal and external attributions for their performance, attributing their high scores to both teacher help and study strategies, while low-performing children failed to make any internal or external attributions; i.e., there was no variance in their attribution responses. Finally, results concerning attributions as an explanation of differential test performance were inconclusive. Findings of this study led to recommendations for future research in attribution training, with the caveat that such training should be administered uniformly to ensure that the training does not itself become a source of CIV.


CHAPTER 1
INTRODUCTION

Statement of the Problem

Over the past two decades, test scores have come to dominate the discourse about schools and their progress. In recent years almost all states have implemented some type of statewide accountability system as a requirement of the No Child Left Behind Act of 2002, with the vast majority relying heavily on testing of state-developed content standards that can result in high-stakes consequences for students, schools, and districts (Glass, 1991). Florida was one of the first states to initiate mandatory testing (the Florida Comprehensive Assessment Test) and has become a laboratory for school reform programs.

As the impact of test performance increases, so does the need for validation, a complex process that consists of documenting the steps involved in test development, administration, scoring, and interpretation (Linn, 2002). While the gathering of evidence should strengthen the validity of a particular test score or interpretation, the validity argument can also be strengthened by eliminating or reducing alternative hypotheses or threats to validity (Cronbach, 1988). Although at least five major sources of invalidity threats stand out, the current study focuses on one important threat to high-stakes testing, construct-irrelevant variance (CIV), and its possible social consequences.

Several researchers have attempted to explain CIV. In their discussion of classical true score theory, Lord and Novick (1968) were precursors in the development of the concept of CIV as they discussed a redefined true score that is "essentially biased." Samuel Messick (1984, p. 216), in his influential work on validity, defined test-irrelevant


variance, a construct analogous to CIV, as assessment that is too broad, containing excess reliable variance associated with other distinct constructs, including both situational and psychological variables. Finally, the Standards (AERA, APA, & NCME, 1999, p. 10) address CIV, discussing both the importance of studying systematic error among groups where no differences are believed to exist (Standard 7.10) and the validity of inferences when CIV is present (Standard 7.2). In their review, Haladyna and Downing (2004) develop a taxonomy, a simple classification of four variables that produce CIV: uniformity and types of test preparation; test development, administration, and scoring; cheating; and student characteristics. These researchers suggest that, compared to other sources of CIV, student characteristics pose the most serious threat to validity since they produce individual-specific as opposed to group-specific CIV, which is more difficult to quantify. They stress the need for more research on CIV that results from students, and in particular on how motivation and test anxiety affect groups differentially. The purpose of the current study is to provide some documentation about one potential source of CIV that derives from student characteristics, the self-serving bias, an attributional bias that arises from students' beliefs about their test performances, how these beliefs may affect their scores differentially, and consequently the constructs being measured in accountability assessment.

The attributions that students make in order to enhance, protect, or maintain positive self-concept are key elements in what is defined as the self-serving bias, a framework which initially suggested that individuals tend to credit success to internal factors such as ability or preparation and to blame failure on external factors such as bad luck, illness, or test unfairness (Campbell & Sedikides, 1999). Although some individuals have negative


global or specific self-views, most have positive self-concepts (Campbell & Sedikides, 1999). Consequently, when they are faced with feedback that threatens self-concept, such as doing poorly on the FCAT or receiving a bad math grade, they undergo temporary decreases in state self-esteem. To avoid this binding state they engage in self-serving attributions that offset and mitigate these threats. In fact, Campbell and Sedikides (1999) demonstrated that as threat potential increases, for example when tasks are difficult rather than simple and important rather than unimportant, so do self-serving tendencies. They identify this as the self-threat model of the self-serving bias.

Recently, Duval and Silvia (2002) broadened existing views about attributions as they relate to the self-serving bias, creating a dual-systems model. Believing that self-enhancement is not the only motive associated with self-concept, they suggest that people are also motivated to seek accurate information about their performance and the correctness of their opinions about themselves, even if such knowledge poses a threat to self-concept. They call this self-assessment motivation. In this model, both protection from self-threat and self-awareness motivations jointly determine success and failure attributions. When the two motivations exist harmoniously, i.e., in success situations, little difference exists between these two models. However, when the two motivations are in conflict, individuals who perform poorly may attribute performance to either internal or external causes depending on whether they believe they can improve (Duval & Silvia, 2002). In either case, failure elicits a self-threat scenario, a situation "when favorable views about oneself are questioned, contradicted, impugned, mocked, challenged, or otherwise put in jeopardy" (Baumeister et al., 1993, p. 8). If failing students attribute their performance to external causes that are beyond their personal control, they may


experience loss of motivation and fail to take the steps necessary to improve future performance. Thus test performance may reflect a motivational component and could be linked to an attribution process that may differentially affect students' scores, contributing to CIV.

Finally, although the appearance of the self-serving bias in children's explanations for their academic achievement has been well documented (Nicholls & Miller, 1983), a secondary issue is attribution as it relates to students' development across grade levels; i.e., does the self-serving bias as a source of CIV change as children develop? Children of various ages may differ in how they experience self-threat and consequently may display the self-serving bias to different degrees. Research on the development of academic self-concept has established that younger children are less likely to discriminate effort from ability, even though they do understand the concept of ability (Chapman & Skinner, 1989). On the other hand, older children understand that ability may limit performance in some situations and compensate for lack of effort in others. Children also differ in what they consider more important, ability or effort (Chapman & Skinner, 1989). Nicholls and Miller (1983) found that younger children perceive effort as more important to performance than ability, while older children see ability as more important. Thus younger children perceive classroom performance to be more controllable, internal, and subject to change, while older students see it as less controllable, external, and stable.

Previous studies that focused only on earlier grades and examined scenarios where students' grades or scores might have reflected a greater emphasis on effort than ability are limited. However, by examining high-stakes test scores, researchers may gain a clearer understanding of the developmental processes associated with CIV.


In summary, the self-serving bias is a naturally occurring explanatory pattern that assists individuals in making sense of their behavior and serves as a self-protection mechanism in settings where self-threat is high, such as when individuals are judged by their performances. The question of attributional patterns and how they relate to test performance appears to involve a construct-irrelevant variable and therefore raises important concerns as to the validity of scores for children who experience negative attributional patterns; i.e., there could be more reflected in the math FCAT scores, as in classroom tests, than just math performance. This evidence supports the need to investigate children's understanding and awareness of the consequences of both classroom and FCAT performance, as well as their feelings about how much control they have over their ability to improve, and finally their attributional patterns.

Purpose of the Study

Although many factors that affect performance on the Florida Comprehensive Assessment Test (FCAT) are not completely understood, the current study focuses on the self-serving bias as a potential source of construct-irrelevant variance (CIV) arising from students, particularly in interpreting their scores on FCAT math. In order to do this, it is necessary to address a) the extent to which students understand the consequences of scores on high-stakes tests such as the FCAT; b) how high-stakes tests, as opposed to classroom performance, affect students' attributions of their performance; and c) how these attributional processes are related to future student performance. An issue of secondary interest is the extent to which the attributional processes of students differ at different grade levels, producing varying degrees of CIV.


Theoretical Significance of the Study

The theoretical focus of the present study was framed from the perspective of attribution theory and its relationship to the validation of high-stakes test scores. The self-serving bias is posited to be an attributional behavior that has its roots in how individuals relate to high-threat situations. In this view, individuals, when exposed to contexts that threaten their self-concepts, will readily accept credit for success and blame failure on external causes. A number of theories have been proposed to account for this naturally occurring human behavior. Recent conceptualizations of the self-serving bias have emphasized the importance of self-awareness and the belief in the ability to improve as determinants of these behavioral patterns (Duval & Silvia, 2002). However, despite widespread interest, little is known about how the self-serving bias could be a potential threat to test validity.

Although Haladyna and Downing's (2004) taxonomy of sources of CIV includes motivation, they point out the need for increased research that systematically identifies understudied sources of CIV. They believe that the first step is a careful identification of sources so that "this research should build a robust literature that provides a clear picture of the seriousness of this threat to validity" (Haladyna & Downing, 2004, p. 25). The present study would add one more dimension to their interpretation of student characteristics as sources of CIV.

Of secondary interest is an understanding of the ontogeny of the self-serving bias and when it develops. This will add to the body of knowledge because it is not known which age groups rely on this behavior pattern or when it emerges. One viewpoint that complements the lifespan perspective is the contextual view, which posits that behavior must be understood in terms of the total setting or context in which it occurs.
Development is viewed as a dynamic, changing process in which the individual and the environment continuously interact. The individual is both affected by the environment and participates in its changes. Thus behavior cannot be interpreted out of context; in this case, contexts include both classroom activities and accountability tests.

In addition, this study will expand knowledge of how children view their academic self-concept in light of the new accountability movement. Because of the relative importance and prevalence of the standards-based movement, an understanding of the impact of children's academic self-concept on accountability testing is needed.

Practical Significance of the Study

The practical relevance of this study stems from the emerging importance of the high-stakes testing movement, highlighted by the passage of the No Child Left Behind Act of 2001, and from the influence of CIV on the understanding of the social consequences both of score interpretation and test use and, more importantly, of their effects on students. In light of this legislation, reforms in educational assessment are being made, with an increased push for statewide testing programs. However, this testing will remain ineffective unless the validity of test interpretations and their social impact on students can be understood. Psychometric and policy changes in the design, test administration, and interpretation of scores are all important and beneficial, but a broader understanding is needed, one that includes the test taker and possibly both the potential and actual maladaptive behaviors that may cause differences in scores and in their relevance and utility. This turns out to be a question of CIV. The meaning of a score, along with its interpretations and uses, are all pieces of this concept (Messick, 1995).
This study is aimed not only at examining developmental differences in children's attributions for performance but also at discovering how the SSB, as a potential source of CIV, affects scores. If such a relationship can be established, then the obvious extension of this study calls for a reevaluation of score interpretation as well as age-targeted interventions that may aid children in modifying some of their maladaptive beliefs and behaviors.
CHAPTER 2
LITERATURE REVIEW

This review begins with a historical description of the high-stakes testing movement in Florida. It will then address several issues related to construct-irrelevant variance (CIV) that result from the FCAT. First, it will address issues of validity, with a special focus on construct validity; next, CIV will be defined; then, a taxonomy will be presented with a special focus on students as contributors to CIV; and finally it will address attribution theory and the self-serving bias as possible contributors to CIV. The review will then examine the importance of context, i.e., classroom grades versus FCAT scores, and conclude with a brief overview of developmental issues.

Historical Background

The current faith in and reliance on accountability tests began in the Sputnik era of the late 1950s, when political rival Soviet Russia launched the first artificial satellite, causing Americans to question the state of education. The focus on education continued with Lyndon Johnson's War on Poverty in 1965, which aimed to improve educational opportunities for the socio-economically disadvantaged (National Conference of State Legislators, 2005). Federal and state governments became increasingly active in educational efforts, including supporting increased use of assessment in school learning, and federal funding increased by 200% (National Conference of State Legislators, 2005). However, in the five years that followed, the economy entered a slump and funding for education began to slip. In the 1980s many states discontinued minimum competency exams as complaints emerged that they promoted low standards, and the federal
government reduced its level of funding for education by 21% (National Conference of State Legislators, 2005). It was widely viewed that minimum competency tests were dumbing down the curriculum in American schools. Nevertheless, while Ronald Reagan opposed an expanded role for the federal government in education, he was instrumental in the establishment of the National Commission on Excellence in Education (NCEE) (National Conference of State Legislators, 2005).

In 1983, the NCEE released A Nation at Risk, the most influential report on education to date. It called for an end to the minimum competency movement and the beginning of the high-stakes movement. Although not entirely accurate in its information, it argued that American students were performing poorly in comparison to students from other countries and that America could be close to losing its global advantage. The report, which cited declining test scores and an overall deterioration in the quality of curriculum, elicited a nationwide panic regarding the weakened state of American education (National Conference of State Legislators, 2005).

Despite its lack of empirical support, A Nation at Risk had profound effects. The National Commission on Education required more rigorous standards and an accountability system to bring America out of its educational recession. As a result, every state except Iowa developed educational standards, and every state except Nebraska implemented assessment policies to evaluate these standards. In several states, serious consequences were attached to tests in order to hold schools, administrators, teachers and students accountable for their performance.

The Clinton administration joined the standards and assessment movement when President Clinton signed the Improving America's Schools Act in 1994, and states were
required to meet "adequate yearly progress" (National Conference of State Legislators, 2005). In his 2000 presidential campaign, Governor George Bush put the example of Texas's high-stakes program at the forefront of his domestic policy. After his election, a modified Texas approach to accountability, including testing of all students in grades 3-8, became the model for the entire nation. On January 8, 2002, President Bush signed the No Child Left Behind Act (NCLB), a bipartisan education bill that increased the federal role in public education in return for more accountability efforts from state education programs. States were required to implement testing programs, supply result data, ensure quality instruction in every classroom and guarantee that every student, regardless of demographics, achieved a "proficient" level of education by 2014 (National Conference of State Legislators, 2005).

At present, the trend among all states has been the initiation of some type of comprehensive accountability system designed to align government-mandated content standards with the curriculum, instruction and assessment used to evaluate student learning. Florida was one of the first states to initiate early mandatory testing of its students. In 1976, after some difficult policy decisions, legislators developed and implemented a statewide minimum competency test that students were required to pass prior to graduation (Bloom, 2005). Florida's touted initial results seemed to be an example of how standards and accountability systems could enhance education, but legal challenges to the graduation provision ended the program temporarily (Bloom, 2005).

In 1995-96, Florida's Educational Reform and Accountability Commission returned to the idea of mandatory testing and recommended developing a statewide assessment system to provide information for improving public schools by maximizing
learning gains of all students and keeping parents informed about their children's educational progress. The commission's findings led to the development of the FCAT, designed to assess student achievement in the higher-order cognitive skills set forth in the Florida Sunshine State Standards (Amrein & Berliner, 2002).

In 1999, the legislature mandated an annual assessment of students in grades 3-10 in the areas of reading, math and writing. It also broadened the consequences of testing, requiring that student test scores be factored into the School Accountability Report used to evaluate school performance, assign school grades, and provide monetary rewards for high-achieving schools and sanctions for failing schools (Goldhaber & Hannaway, 2004). As a result, stakes became high for administrators and teachers, but especially for children: students who failed to meet achievement standards would ultimately find themselves tracked, retained and even failing to graduate (Goldhaber & Hannaway, 2004). In fact, school personnel were required to use test scores to make decisions about third-grade retention, even if students earned A's in their regular classes. Moreover, at the high school level, failing students were assigned to remedial classes and required to take the test until they passed or failed to graduate (Goldhaber & Hannaway, 2004).

Validity and Construct-Irrelevant Variance

One modern test theorist, Samuel Messick (1995), wrote that "Validity includes the evidence and rationales supporting the trustworthiness of score interpretation in terms of the explanatory concepts that account for both test performance and score relationships with other variables" (p. 13), where construct validity represents the evidential basis of test interpretation. A complex process, validation involves the accumulation of validity evidence that consists of the documentation of the processes associated with development, administration, scoring and interpretation.
Different types of inferences
from test scores may require different types of evidence and may involve the examination of the content, cognitive processes, internal structure of items, reliability and the relationships of test scores with other constructs to ensure that the evidence supports the interpretations of test scores. According to the Standards (AERA, APA, & NCME, 1999), "Validity refers to the degree to which evidence and theory support the interpretation of test scores entailed by proposed use of tests" (p. 9). In validation the most fundamental step is construct formulation, or defining the construct (Cronbach & Meehl, 1955). Two kinds of achievement constructs seem to be represented in the Florida Comprehensive Assessment Test and in national content standards (Haladyna & Downing, 2004). The first construct can be conceptualized as a large domain of knowledge and skills that includes measures of declarative and procedural knowledge. All achievement tests are to include items that are representative of that particular domain. In Florida this domain is the Sunshine State Standards (SSS), which measure selected benchmarks in mathematics, reading, and writing that are embedded in students' core classes. Teachers, administrators, and curriculum supervisors across the state review these items to ensure curricular relevance and representativeness. Although test specifications are usually employed to develop statewide exams, the sample of this domain is usually small (Haladyna & Downing, 2004). Students are expected to demonstrate adequate performance in this domain to earn passing scores.

The second type of achievement construct represented in state and national content standards is cognitive ability, including reading, math, and problem-solving ability. This domain is represented by complex tasks that are difficult to teach and learn. Messick
(1989) labels measurement of this domain as construct-referenced because test items represent ability itself and not the domain of knowledge or skills. This domain is sometimes referred to as fluid ability, developing ability (Messick, 1984), or learned ability (Sternberg, 1998). On the FCAT this domain includes questions and performance tasks that incorporate thinking and problem-solving skills that match the complexity of the standards being measured. Both types of achievement constructs are subject to CIV.

Messick (1989) argued that in constructing a test for a particular purpose, one should address both its evidential basis, i.e., the adequacy of the test in measuring the intended construct of interest as described above, and its consequential basis, i.e., how the test will be used. The former is determined by evaluating the psychometric properties of the test, its content, administration and scoring, and appraising evidence for construct validity. The latter addresses whether the test should be used for the particular intended purpose, a more ethical question that can be examined by evaluating the potential social consequences of the proposed use in terms of its social values. Messick suggests that a system of assessment could be designed that is psychometrically sound in terms of its interpretations and uses but may still have a negative impact on its consumers by way of its social consequences and labels. The Test Standards state that evidence based on consequences incorporates "intended and unintended consequences of test use into the concept of validity" (p. 16). Messick likewise reflects on the importance of value implications for test interpretation. He argues that value labels can both directly bias score-based inferences and actions and indirectly influence meanings and implications attributed to test scores, not only for individuals but also for society and institutions at large.
Messick (1995) proposes that the appraisal of consequences depends not only on their worth but also on their causes; it is not the adverse social consequences themselves that cause test use to be invalid, but rather the fact that they are not accounted for by any source of test invalidity. The main measurement concern is with adverse consequences that result from what he calls construct under-representation or construct-irrelevant variance (Messick, 1995). For example, construct under-representation is present when low scores occur because the test misses something important related to the focal construct, which, if present, could have permitted the affected student to demonstrate competence. On the other hand, construct-irrelevant variance is present when low scores occur due to something irrelevant that interferes with the performance of the student. It does not matter whether consequences are positive or negative as long as their sources are known and accounted for.

To increase validity in assessment tests, Messick suggests it is necessary to examine what may cause variance in student scores that is not related to what the test is intended to measure. Information that produces variance in scores not based on the construct is what Messick (1989) calls CIV, the "excess reliable variance that is irrelevant to the interpreted construct" (p. 216). Messick identifies two basic kinds of construct-irrelevant variance: construct-irrelevant difficulty and construct-irrelevant easiness. The former refers to a situation in which "aspects of the task that are extraneous to the focal construct make the test irrelevantly more difficult for some individuals or groups" (Messick, 1995, p. 34). The latter refers to situations in which "extraneous clues in items or test formats permit some individuals to respond correctly in ways irrelevant to the construct being assessed" (Messick, 1995, p. 34). The Standards (AERA, APA,
NCME, 1999) also address CIV, stating that "the term bias in tests and testing refers to construct-irrelevant components that result in systematically higher or lower scores for identifiable groups of examinees" (p. 76).

Prior to Messick, Lord and Novick (1968) had conceptualized CIV in terms of a redefined true score that is biased. Comparing it to random error, they illustrated in their classical true score model that any observed score y is the linear composite of true score t, random error e_r, and systematic error e_s, that is, y = t + e_r + e_s, where systematic error represents an "undesirable change in true score" due to variables that are "irrelevant" to the construct being measured. While random error is uncorrelated with true score and observed score, systematic error (CIV) is correlated with both, so that the unaffected group of individuals will have observed scores that are closer to true score, and the expectation of systematic error, or CIV, is a non-zero value.

Kane (2002) also posits that test scores can be interpreted in two ways. The first, which he labels descriptive interpretation, assesses scores in terms of a specific content domain; in the case of the FCAT, scores are used to estimate achievement on the specific Sunshine State Standards contained in the test and then extrapolated to estimate the degree to which students have mastered all the standards and, finally, the degree to which overall achievement is evident.

The second, which he labels decision-based interpretation, refers to the ways in which stakeholders will use the information obtained from the test: in the case of the FCAT, those with scores above a certain level will qualify for graduation while those with scores below this level will not. Kane's delineation of interpretations is supported by the Standards, which state that "validation logically begins with an explicit statement of
the proposed interpretation of test scores, along with a rationale for the relevance of the interpretation to the proposed use" (AERA et al., 1999, p. 9).

Haladyna and Downing (2004) suggest that CIV is "error variance that arises from systematic error", error that is not random but is group- or person-specific. They identify two types of systematic error. The first is a systematic over- or understatement of the true score for all members of a particular group, caused by situations such as rater inconsistency or different test forms. The second, CIV, is the over- or underestimation of an individual's score that may be compromised by elements such as confusing vocabulary, readability (especially when the test is not a reading test), cultural knowledge, and quality of instructions. Additional examples of what Haladyna and Downing (2004) refer to as person-specific CIV include motivation to perform on a test, test anxiety and fatigue, elements that are relevant to the present study.

Haladyna and Downing's Taxonomy for Studying CIV

In their 2004 review, Haladyna and Downing discuss the need for studies investigating possible sources of CIV in high-stakes testing programs by systematically identifying potential sources of CIV. These researchers develop a simple taxonomy for classifying variables that produce CIV, along with logical arguments, hypotheses and empirical evidence for each. They identify four major sources of CIV. The first deals with uniformity and types of test preparation: a disparity in the amount and extensiveness of test preparation across classrooms and the use of unethical test preparation contribute to sources of CIV. Next, in the area of test development, administration and scoring, although guidelines for constructing test items are well documented, poorly crafted items, non-uniform test item formats, faulty handling of the test administration process and time extensions, rater unreliability, and the exclusion of low-scoring students in school
districts are all sources of CIV. Third, cheating at both the institutional and individual levels, even though limited in scope, contributes to CIV. Finally, students provide the most serious source of CIV and will be the focus of this study.

In their research, Haladyna and Downing (2004) found that the influence of verbal abilities on test performance, the unique problems of special populations, and finally test anxiety, motivation and fatigue are all threats to the validity of any given test. For students who are deficient in reading comprehension, reading speed and vocabulary, measures of achievement in other subject domains may be contaminated: in the case of an FCAT math test, students' ability to apply problem-solving techniques to the task at hand might be adversely impacted by their inability to read the questions. Second, the absence of uniformity in the way accommodations are administered can contaminate interpretations of test scores involving comparisons of the performance of students with disabilities, LEP students, students living in poverty and students living in cultural isolation to those in the general population.

Finally, and most important to this study, affective variables arising from students also contribute to CIV and generally lower test performance. Younger students, for example, may often be more susceptible to fatigue in longer testing situations than older students, and testing conditions may affect some students differently than others. Students on average also experience threats of various magnitudes from standardized tests. In one study, researchers examined attitudes of 360 middle school students in grades 6 through 9 and found that a large proportion exhibit fear when taking achievement tests and dissatisfaction with results (Karmos & Karmos, 1984). Forty-seven percent of those surveyed reported that achievement tests were a waste of time; 30% said they thought
more about being done with the test than performing well; 36% thought the tests themselves were dumb; 22% thought there was no reason to perform well; while 21% said that they did not even try at all (Kellaghan et al., 1982). Another study conducted in Ireland reported similar results. When students were asked to report their sentiments about a norm-referenced test, 16% said they did not care; 11% reported that they disliked the tests; 21% felt afraid; 19% did not feel confident; 29% felt nervous; and 16% felt bored (Kellaghan et al., 1982). The results of these studies suggest that such fear and apprehension may be absent in normal classroom settings, where students perceive they have greater control over outcomes.

There is also a direct relationship between test anxiety and lower test performance. Hancock (2001) demonstrates that an evaluative threat may increase anxiety, while Zohar (1998) establishes a link between the disposition to test anxiety and high-stakes situations. Thornton (2001) reports that teachers in training, under pressure to perform well on a qualification test, dropped out of the program and made alternative career plans because of their anxiety.

Research in the area of test anxiety reveals that as many as ten million students in elementary and secondary schools experience some type of test anxiety that interferes with performance (Hill & Wigfield, 1984). Hill and Sarason (1996) further found that these students were a year behind district averages in math and reading achievement and that test anxiety was an accurate predictor of achievement test scores. Moreover, these effects became progressively worse as children proceeded through the grades. Low-performing students had poor study habits and incomplete mastery of material that led to poor performance and increased failure experience. When children perceived that failures
were caused by low ability, these feelings led to shame, humiliation and increased anxiety in evaluative situations (Weiner, 1979). Interestingly, not just low-performing children experience anxiety. High-achieving students may also perceive unrealistic parental, peer and self-imposed expectations to perform well and likewise experience high anxiety (Wigfield & Eccles, 1989).

Finally, the motivational level of students may affect test performance regardless of student achievement level. Individuals who experience low motivation often do not take the test seriously: they may omit items, "Christmas tree" their answers or fail to finish the test. Paris and others (1991) were interested in examining the developmental trend in student motivation with respect to standardized tests. A forty-item survey was administered to one thousand students in Michigan, California, Arizona, and Florida. The study found a decline in motivation to excel on standardized tests over grade levels. They attributed this occurrence to students' belief in the irrelevance of standardized test scores as compared to school grades. For instance, some older students who felt that tests only serve a political agenda expressed their discontent by reducing effort. Researchers explained that such attitudes reflect students' attempts to protect self-esteem by self-handicapping. For example, in response to one survey item, older students reported more often that "People think I'm stupid if I get a low score on a test" and "Most students know each other's test scores" (Paris et al., 1991, p. 32).

When schools or districts take measures to motivate students through positive or negative reinforcement, they are contributing to CIV, since such motivational tactics are not uniformly applied. Another possible source of CIV related to motivation is students' attributions about their performance. If failing students attribute their performance to
external causes that are beyond their personal control, they may experience loss of motivation and fail to take the steps necessary to improve. Thus low test scores may reflect a multitude of components, one of which may be motivational, that could be linked to the attribution process, which may differentially affect student scores. This source of CIV is an important one to consider when interpreting and utilizing test scores.

Attribution and the Self-Serving Bias

The phenomenon known as the self-serving bias (SSB) appears to be a naturally occurring human trait that is ubiquitous across contexts. Because most individuals have high opinions of themselves, as they process self-relevant information, a bias intrudes that may temporarily allow them to alleviate threat to their self-concepts. Research has clearly demonstrated that people readily accept credit when told they have succeeded, attributing their performance to their ability or effort, yet attribute failure to external factors such as bad luck or some circumstance of the task (Campbell & Sedikides, 1999). This is apparent in school settings, where students try to justify how they perform on exams. Students display the SSB when they make external attributions, such as blaming the teacher or bad luck for a low grade. They also display the SSB when they overcredit themselves for their success.

Over the last thirty years, empirical studies have attempted to explain this behavioral phenomenon in terms of why individuals make self-serving attributions, when they make them and how these attributions affect performance. Heider (1958) was the first to link attributions to the self-concept, and he was instrumental in developing a classification system for attributions in which he labeled causes of behavior as internal versus external, dispositional versus fixed, and intentional versus unintentional. Rotter (1966) framed his conceptualization of attribution theory in the
stimulus-response tradition, a by-product of the behaviorist movement. Rotter's work (1990) sparked much empirical research linking academic achievement with locus of control. In the late 1960s, psychology became dominated by cognitive theories, and Kelley (1971) presented research for understanding how students make attributions about test performance. Kelley's (1971) covariation model provided a framework for understanding how students explain test scores, using three pieces of information to infer causality: consensus, distinctiveness, and consistency. The combination of these elements leads individuals to make specific attributions. For example, when a student has failed an exam, if most other students pass the test (low consensus), this student fails other exams (low distinctiveness), and he has failed frequently in the past (high consistency), he should make internal attributions (e.g., "I lack ability"). On the other hand, if most students fail the test (high consensus), the student receives good grades on other tests (high distinctiveness), and he has typically received high grades in the past (high consistency), the student should make an external attribution for the failure (e.g., it was something unique to this exam).

In the mid-seventies, Weiner (1979) and colleagues posited that an individual's perceptions of the causes of success and failure affect the quality of future performance. His model is particularly applicable to testing contexts because it explains the attributions individuals make in competitive environments. Weiner (1979) asserted that commonplace and routine events usually do not bring about causal attributions, whereas unexpected events and unattained goals are antecedents that elicit causal searches. He concluded that the importance of an event (such as passing the FCAT) influences attribution formation.
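The covariation logic described above can be sketched as a simple decision rule. The function below is an illustrative toy sketch, not part of Kelley's (1971) work; the function name, the string-coded cues, and the "mixed" fallback for other cue patterns are simplifications of my own.

```python
def attribute_failure(consensus, distinctiveness, consistency):
    """Toy sketch of Kelley's (1971) covariation logic for a failed exam.

    Each cue is "high" or "low":
      consensus       -- did most other students also fail this exam?
      distinctiveness -- does this student do well on other exams?
      consistency     -- has this student's result recurred in the past?
    """
    if (consensus, distinctiveness, consistency) == ("low", "low", "high"):
        # Others passed, this student fails other exams too, and has failed
        # frequently before: an internal attribution (e.g., "I lack ability").
        return "internal"
    if (consensus, distinctiveness, consistency) == ("high", "high", "high"):
        # Most students failed, yet this student typically does well elsewhere:
        # an external attribution (e.g., "something unique to this exam").
        return "external"
    # Cue patterns outside the two textbook cases are left undetermined here.
    return "mixed"

print(attribute_failure("low", "low", "high"))    # internal
print(attribute_failure("high", "high", "high"))  # external
```

The two branches mirror only the two example patterns given in the text; in Kelley's full account, other combinations of the three cues support person, stimulus, or circumstance attributions.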
One key element of his model is that the specific cause attributed to an event is less important than its latent dimensionality. In this context, he examines four critical dimensions: locus, control, stability, and globality. Locus refers to whether the event is due to internal or external causes and distinguishes between causes within the individual, such as intelligence and effort, and external causes, such as task difficulty or luck. Controllability refers to whether one could have affected or influenced an event. Stability refers to how frequently an event is or has occurred; in this context, ability is perceived as long lasting, as opposed to effort, which may vary over time, and outcomes viewed as due to ability predict future performance more accurately than those attributed to effort. Globality refers to cross-situational factors, which may be specific, such as failing a reading test due to poor reading ability, or global, such as low intelligence. Together these dimensions determine the type of affective, behavioral and cognitive outcomes likely to result from attributions.

In the late 1970s, some cognitive theorists began to view the attribution process as a psychological strategy employed to protect the individual's self-esteem (Campbell & Sedikides, 1999). Some researchers believed that individuals exhibit the self-serving bias, an explanatory pattern of behavior that occurs when they readily take credit for their success while excusing failure. However, researchers could not agree as to whether the SSB was truth or fiction. Numerous laboratory tests of the SSB were conducted during this time period, in which researchers examined a multitude of moderator variables. Three meta-analytic reviews conducted by Miller and Ross (1975), Weary Bradley (1978), and Zuckerman (1979) found some evidence for the SSB but were unable to determine its size and scope.
The SSB received little attention in the literature until Campbell and Sedikides (1999) revisited the topic, explaining the SSB in terms of a self-threat model in which individuals are motivated to protect, enhance, and maintain their self-concepts. When a negative event occurs, individuals experience a momentary drop in self-esteem and make self-serving attributions in an attempt to minimize it. Campbell and Sedikides posit that the greater the threat to the individual, the more self-serving the attribution will be, i.e., that self-threat will magnify the SSB. In a meta-analysis examining 14 moderator variables, including task importance and perceived task difficulty, two variables relevant to the present study, they found that individuals are more likely to exhibit the SSB when tasks are difficult and important and therefore more likely to cause self-threat. This is an important finding that demonstrates the existence of the SSB as a self-protective mechanism.

Recently, Duval and Silvia (2002) derived a model to explain the self-serving bias in terms of a dual-systems approach. Recognizing that some individuals make external attributions for failure while others make internal attributions for it, Duval and Silvia propose that this contradiction occurs because of competition between two systems. Individuals want to be consistent with their self-standards, i.e., they want to maintain positive self-concepts, but they also want to accurately link events with their causes. These two goals are in concert in success situations: when an individual is successful and attributes success internally, he will simultaneously satisfy the self-concept as well as provide accurate information to the self. However, the two goals are in conflict in failure situations. If an individual makes an internal attribution for failure, he

PAGE 36

may satisfy the attributional motive but harm the self-to-standard motive. On the other hand, if the individual makes an external attribution for failure, the opposite may occur. Duval and Silvia (2002) demonstrate that this conflict is resolved when individuals believe that they have a chance to improve after failure: when this occurs, the discrepancy between the self and others would only be short lived, and failure could be attributed internally. Conversely, if people feel that their performance is stable, then an internal attribution would result in a lowered self-concept and negative affect. Furthermore, these researchers point out that high levels of self-awareness exaggerate the belief in one's ability to improve. Thus, the dual-process model is an important contribution to the self-serving bias literature because it not only addresses individuals' explanations for past behaviors but also empirically takes into account self-awareness and the consideration of future behavior. The effect of the attributional process on future performance ultimately is the most important relationship to understand, since there is really no cure for the past.

To summarize, attributions arise which may be internal or external and stable or unstable. They may elicit feelings of pride and accomplishment or shame and guilt, as well as some type of expectancy for future performance that is said to influence future behavior. Although such patterns may be self-serving in the short run because they temporarily alleviate decreases in self-esteem, they may have detrimental effects on students' long-term overall performance. If failing students attribute their performance to external causes that are beyond their personal control, they may experience loss of motivation and fail to take the necessary steps to improve. Failing grades thus may reflect a multitude of components, one of which may be motivational, that could be linked to the attribution
process. This issue is an important one to consider, especially when interpreting and utilizing test scores.

Explaining the Development of the Self-Serving Bias

Despite the widespread research on attribution theory, little is known about how self-serving attributions develop over the elementary and secondary school years. This section will focus on changes in children's processing of the evaluative feedback they receive, as well as in their perceptions of intelligence and ability, as mechanisms for explaining why the self-serving bias may change for some children across grade levels.

Children in early elementary grades are very optimistic about performance on different tasks, even when they are not doing well (Nicholls, 1979). On average they rank their performance at or near the best in their class, even when this does not correspond to their actual performance. When asked to respond to a situation in which one child does worse on an activity than another, they believe that the lower scoring individual can still do well if he keeps on trying (Nicholls, 1979). Parsons and Ruble (1977) measured the expectancies of three groups of children exposed to success and failure conditions over time. They noted that younger children were able to sustain high expectations for performance on an achievement task even when they had performed poorly on similar tasks before.

Despite children's awareness of their performance, they do not seem to use this information in their self-assessments until the second grade (Ruble, 1983). One reason that children's attributions are often so optimistic is that cognitively they cannot separate effort and ability in explaining their performance, because they may not integrate temporally separated events (Parsons & Ruble, 1977). Instead they treat all positive attributes as mutually inclusive; i.e., a person who tries hard is going to do well
(Nicholls, 1979). Belief in their ability to succeed is also supported by adults who are patient and offer support and praise rather than criticism.

By mid-elementary school, children's views begin to change, becoming more realistic. Their self-perceptions appear to become more consistent with their actual performance: children who succeed continue to have high expectations, while children who fail begin to lower their expectations (Parsons & Ruble, 1977; Rholes, Blackwell, Jordan, & Walters, 1980). They no longer see themselves as the best in their class, and their perceptions of their grades correspond to their actual grades. Older children decrease their expectations for success after performing poorly on a task (Nicholls, 1979). They also begin to understand performance in terms of how it relates to that of their peers (Ruble, 1983).

Children's perceptions of intelligence and ability also change as they develop. Children around five or six years old tend to equate ability in school with effort; they view smartness as trying hard and therefore believe that if they work harder they might increase their ability. By age eleven or twelve, however, children are able to separate ability from effort, and they discover that those with more ability may do well while studying less and, conversely, that those with low ability may not succeed even after expending high effort (Nicholls, 1979). In addition, Dweck and Bempechat (1983) found that young children adopt incremental views of ability; that is, they believe that their ability may increase through effort, while older children believe that ability is stable and cannot be improved very much.

Children's perceptions of ability and greater reliance on social comparisons in the higher grade levels may lead to an increase in self-threat or anxiety over repeated failures
(Dweck & Goetz, 1978). When children repeatedly fail, they experience negative affect such as shame or humiliation. Covington (1986) demonstrated that perceptions of low ability result in anxiety in evaluative contexts. This pattern is most likely to occur for children who view ability as stable (Wigfield & Eccles, 1989). Over time, learned helplessness may develop in students who believe that they are not competent, because they fail to make the connection between prior knowledge and what they can do to improve. These children fail to develop the metacognitive strategies necessary for high achievement (Wigfield & Eccles, 1989). Lack of effective strategies, reduced persistence, and a sense of being controlled by external factors become a vicious cycle. Thus it appears that younger children are protected from self-threat more than older children because they maintain optimistic views of themselves, and in this case they may display lower degrees of the SSB than older children.

At present, researchers have not systematically examined how these different processes relate to the development of the SSB and how it relates to CIV. It is believed that the SSB is most likely to develop in children when they come to view ability as separate from effort and come to compare their performance to that of others. It is important to understand the ontogeny of the self-serving bias so that it can be determined in which grades attribution interventions might work best.

The literature overall suggests that the self-serving bias fluctuates based on the level of threat to the self that a specific outcome engenders, as well as on an individual's perceptions of being able to improve his performance. Given this information, one would expect that school tasks differ in their level of importance, difficulty, and perceived controllability, elements which will elicit different explanatory patterns of behavior. It
would seem, based on anecdotal reports encountered in the news, that an accountability system such as the FCAT would be the perfect laboratory in which to investigate the SSB as a potential source of CIV, because its level of importance and salience in the community seem very different from what normally occurs in the classroom.

Classroom Grades and Accountability Testing

Because the self-serving bias is not an invariant explanatory pattern of behavior, it is necessary to consider the importance of context in the evaluation of the attribution process. In this study, learning contexts include both classroom activities and accountability testing in school, since children's progress is measured in both ways. These contexts may differ in the level of self-threat they impose as well as in the autonomy they allow students to improve upon their performance, and therefore they may produce differences in how students explain their performance. Hence, one purpose of this study is to determine the relative effect of CIV on high-stakes tests by comparing it to classroom grades.

Although much attention has focused on accountability testing over the last decade, students spend a great deal of school time engaged in classroom activities that teachers evaluate (Crooks, 1988). These may include formal teacher-made tests, curriculum-embedded tests, and assessment of motivational as well as attitudinal variables associated with learning. Salient and summative in nature, classroom evaluations inform students about their current achievement and, over time, also convey to students a message about their academic potential. Therefore, they play a key role in any academic system. However, a significant body of literature has focused on the subjective nature of classroom assessment and grading, and this research indicates that teachers use grades to motivate
students in addition to evaluating their performance and putting closure on content or skill areas (Guskey, 1994), thus calling into question the validity of such assessment.

The tendency of teachers to use attitude, effort, and achievement to calculate grades, often at the expense of more established objective measurement processes, has been well documented. In one study, Stiggins, Frisbie, and Griswold (1989) found that teachers sometimes valued student motivation and effort more than actual achievement and set different expectations for students with differing abilities. Other research has shown that teachers often use class discussion, participation, and student behavior in addition to test scores to assign grades (Gullickson, 1985). Overall, it appears that student achievement is not the sole determinant of grades.

Students' awareness of teacher grading practices may well be reflected in their academic self-concepts and therefore in what they believe they can accomplish and control. As they progress through school, students hold different conceptions of intelligence that affect their perceptions of their grades. Six- to ten-year-old children view intelligence in terms of the subjective difficulty of subject material and, more importantly, in terms of obedience to authority (Leahy & Hunt, 1983). What they perceive as fair depends more on the observable consequences of completing assignments in exchange for reward opportunities than on the amount of learning (Thorkildsen, 1989). However, children aged ten to eighteen view intellectual skills differently, i.e., in terms of the amount of information that they can learn through effort and memorization. For students in this age group, intelligence is a matter of working hard and expending the necessary effort (Leahy & Hunt, 1983). They perceive the fairness of grades in terms of the amount of learning obtainable by most students.


While students see that teachers have a level of control over the grades they receive, they also perceive, as a reflection of the way teachers grade, that they themselves have a level of control over their own grades as a result of the effort and motivation they expend in their studies. Such responses to grades affect subsequent learning through a combination of motivational and cognitive factors as well as students' perceptions about the fairness of grades. However, grading practices that focus mainly on effort, which is within students' control, rather than on more objective means of quantifying performance, have met with criticism on many fronts.

Although such subjective grading practices have become a driving force in the movement toward state and nationwide accountability testing, there is ample evidence in the literature to suggest that high-stakes tests and their consequences have a negative impact on the stakeholders. The pressure on schools to perform well on high-stakes tests and the consequences connected to poor performance are present at all levels, but especially for students. High-stakes exams promote anxiety and reinforce negative ability perceptions in many children. The focus on grades and test scores fosters increased attention to ability perceptions, competition, and social comparisons (Hill & Wigfield, 1984). Comparisons to peers based on grades and test results force students to become competitive and anxious about their performance to avoid scrutiny. Older students are also less likely to expend effort and may respond randomly to standardized test items ("Christmas-treeing"). These students frequently discount the tests and schooling in general; hostility and lack of participation are a last resort (Thorkildsen, 1989). Finally, use of test scores causes many children to focus on the extrinsic value of learning and not on its intrinsic value.


The threat of retention or failure to graduate as a result of unsatisfactory performance on high-stakes tests also adds to student stress, since greater consequences for students' futures are attached to these tests. Retention is occurring more and more in schools across the country for students who fail high-stakes tests, and its impact on them is devastating. They perceive it as punishment and proof of their inability to succeed in school. Parents frequently add to this stress by punishing children for failing. Jimerson and others (2002) surveyed children about their most stressful life events. Their results indicate that school pressure appears more and more frequently on these lists. For example, in the 1980s children reported that by the time they reached the sixth grade, they feared retention most after the loss of a parent and going blind. When the study was replicated in 2001, sixth graders reported that retention was the most stressful life event, even greater than the loss of a parent (Jimerson et al., 2002).

Byrnes and Yamamoto (1986) interviewed seventy-one retained elementary students and their teachers about their views on academic retention. Eighty-four percent of the children reported feeling sad, bad, and upset; 3% used the word embarrassed; 7% reported feelings such as being angry at themselves or shy. When questioned about how their parents felt, 46% reported that their parents felt mad, 28% reported that their parents felt sad, while only 8% said that their parents did not care. Of the fifty-five children who responded to a follow-up question, "Were you punished?", 47% responded yes. The key point emerging from these findings is that many children have concerns about school retention and that they feel an overall increase in school pressure brought on by high-stakes testing.


Research examining the overall effects of nineteen empirical studies on retention conducted in the 1990s suggests that grade retention has a negative impact on all areas of achievement and socio-emotional adjustment. Retention of adolescents is linked to their dropping out of high school by age nineteen (Jimerson et al., 2002). Research at the secondary level also demonstrates that students who are retained are more likely to experience emotional distress; to engage in cigarette smoking, alcohol use, drug abuse, and early sexual activity; and to experience violent behaviors and suicidal intentions (Jimerson et al., 2002).

In summary, the research reviewed in this section suggests that children's self-serving attributions may differ in the classroom versus in high-stakes situations. This may be a reflection of teacher grading practices that give children a sense of control over their achievement. More importantly, however, it could be a reflection of children's perceptions of the consequences associated with high-stakes testing, which may be perceived as highly threatening. Hence, contexts that emphasize social comparison and strict ability-based evaluation with little autonomy for improvement foster the development of self-threat, which has been shown to magnify the self-serving bias.


CHAPTER 3
METHOD

The main purpose of this study was to determine whether children's self-serving biases contribute to CIV in children's scores on the FCAT as compared to classroom grades. In this chapter, I describe the hypotheses, research participants, procedures, and measures used in the current study.

Derivation of General Research Hypotheses and Specific Research Hypotheses

A variety of literature supports the need to consider CIV in the interpretation of high-stakes test scores and suggests that the self-serving bias may be a potential source of CIV arising from students. The literature presented in Chapter 2 indicates that it is also necessary to examine children's understanding of the consequences of high-stakes testing and their beliefs about improving future performance. These variables, when considered in the context of classroom grading practices, add a further dimension to the study in terms of the relative importance of CIV in high-stakes testing, a context where CIV may be most detrimental; i.e., considering all these variables in both contexts strengthens the argument for the contribution of the self-serving bias as a potential source of CIV in high-stakes testing. Therefore, the following general research hypotheses were generated:

General Research Hypothesis 1. Children will report that the consequences associated with poor performance on the FCAT, a high-stakes test, are more serious than the consequences of performing poorly in the classroom. A split-plot 2 X 3 X 2 ANOVA
will be used to investigate this question, with FCAT 2004 and grade level as the between-subjects factors and consequences as the within-subjects factor.

Specific Research Hypothesis 1a. (a) Overall, regardless of performance, the mean score on negative feelings about poor performance on the FCAT will be greater than the mean score for feelings about poor performance in the classroom; and (b) there will be an interaction between children's grade levels and their feelings about performance on the FCAT and in the classroom.

General Research Hypothesis 2. Children will report that they have more control over improving their classroom grades in math than over improving their FCAT math scores. This hypothesis will be tested with a 2 X 3 X 2 split-plot ANOVA with FCAT 2004 and grade level as the between-subjects factors and improvement as the within-subjects factor.

Specific Research Hypothesis 2a. Overall, regardless of performance and grade level, the mean score on control over future performance on math grades will be greater than the mean for FCAT performance.

General Research Hypothesis 3. The self-serving bias is a source of construct-irrelevant variance. Three types of evidence for this hypothesis will be examined.

Specific Research Hypothesis 3a. High-scoring children will make more internal attributions about their performance than low-scoring children on both the FCAT and classroom grades (i.e., the mean scores on items capturing internal attributions will be higher for high performers than for low performers); conversely, low-scoring children will make more external attributions about their performance on the FCAT and classroom grades than high-performing children (i.e., the mean scores for items capturing external
attributions will be higher for low-performing children than for high-performing children). These relationships may differ across grade levels. This hypothesis will be tested with a series of two-way between-subjects ANOVAs with grade level and performance group as the between-subjects factors.

Specific Research Hypothesis 3b. Low-performing children will make more external attributions on the FCAT than on classroom grades (i.e., the mean scores on items capturing external attributions will be higher for the FCAT than for grades). High-performing children will make more internal attributions for the FCAT than for grades (i.e., the mean scores on items capturing internal attributions will be higher for the FCAT than for grades). These relationships may differ across grade levels. This question will be examined with a series of two-way between-subjects ANOVAs with grade level and performance assessment as the between-subjects factors.

Specific Research Hypothesis 3c. There should be a relationship between children's FCAT scores in 2005 and children's attributions about their 2004 FCAT scores while controlling for performance in 2004; specifically, there should be a negative relationship between external attributions and performance in 2005. These relationships may vary across grade level. This hypothesis will be tested with a series of regression analyses.

Research Participants

Participants consisted of 160 children in grades four through seven, nine, and ten, recruited from a university-affiliated developmental research school located in North Central Florida. The school must enroll a student population that approximates the demographic composition of the student-age population of Florida as a whole. Students are evenly distributed by gender. The racial/ethnic distribution consists of a population that is 59% white, 25% black, 11% Hispanic, 1% Asian, and 4% other. The number of
exceptional children permitted to enroll in the school is limited. As part of its admissions process, the school also requires that students be in good academic standing, have a minimum GPA of 2.0, and have achieved a certain level of performance on an achievement-type test.

Information about children's academic status was obtained from the school. This included each participant's last two report card grades in math as well as his or her math scores on the spring 2004 and 2005 Florida Comprehensive Assessment Tests, with scores at levels 1-2 representing unsatisfactory performance and scores above level 2 representing satisfactory performance. Participants and their parents were asked to sign permission forms and were assured that their personal information would be kept confidential to the extent provided by law. Approval of the study's procedures was obtained from the University's Institutional Review Board.

Instrument

The survey consisted of 66 items divided into scales that measured reactions to doing poorly on the report card (items 2-11) versus the FCAT (items 35-44), beliefs about ability to improve report card grades (items 12-21) and FCAT scores (items 57-63), attribution scales for report card grades (items 22-33) and FCAT scores (items 45-56), and comparisons of beliefs about changing classroom versus FCAT scores (items 63-66). In addition, respondents were asked to indicate their last report card grades and 2004 FCAT scores. Each scale used a 5-point Likert response format. All respondents were instructed to answer items 1-21, 34-44, and 57-66 with teacher assistance. Students who scored highly on report cards and/or the FCAT were to respond to items 22-27 and/or 45-50, respectively. Those who performed poorly in the classroom or on the 2004 FCAT were to respond to items 28-33 and/or 51-56, respectively.
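The hypotheses above call for two-way between-subjects ANOVAs with grade level and performance group (or assessment) as factors. As a rough illustration of the computation behind such an analysis, not the dissertation's actual procedure (which would ordinarily be run in a statistical package), the following sketch computes the F statistics for a balanced two-factor design by hand; the data passed in are invented.

```python
import numpy as np

def two_way_anova(y, a, b):
    """F statistics for a balanced two-way between-subjects ANOVA.

    y : responses (1-D array), e.g., attribution-scale scores
    a, b : factor labels per observation, e.g., grade level and
           performance group
    Returns (F_a, F_b, F_ab).
    """
    y, a, b = np.asarray(y, float), np.asarray(a), np.asarray(b)
    n = y.size
    grand = y.mean()
    lev_a, lev_b = np.unique(a), np.unique(b)
    I, J = len(lev_a), len(lev_b)

    # Main-effect sums of squares: weighted squared deviations of the
    # marginal means from the grand mean.
    ss_a = sum((a == ai).sum() * (y[a == ai].mean() - grand) ** 2
               for ai in lev_a)
    ss_b = sum((b == bj).sum() * (y[b == bj].mean() - grand) ** 2
               for bj in lev_b)

    # Cell sum of squares; the interaction is what the cells explain
    # beyond the two main effects.
    ss_cells = sum(((a == ai) & (b == bj)).sum()
                   * (y[(a == ai) & (b == bj)].mean() - grand) ** 2
                   for ai in lev_a for bj in lev_b)
    ss_ab = ss_cells - ss_a - ss_b
    ss_err = ((y - grand) ** 2).sum() - ss_cells

    ms_err = ss_err / (n - I * J)
    return (ss_a / (I - 1) / ms_err,
            ss_b / (J - 1) / ms_err,
            ss_ab / ((I - 1) * (J - 1)) / ms_err)
```

Under Hypothesis 3a, for example, a large F for the performance-group factor on the internal-attribution items, with cell means ordered as predicted, would be consistent with the self-serving pattern.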


Measures

The methods used in the assessment of the self-serving bias, as well as of students' beliefs in their ability to improve and their beliefs about the consequences of their performance, were borrowed and adapted from a scale developed by Weiner (1986). The instrument was composed of parallel items pertaining to both FCAT and classroom performance. Table 1 presents descriptions of the variables used in this study.

Table 1. Description of Variables

Variable | Description | Range
FCAT 04 | Ability; previous performance on FCAT math 2004 | 1 = fail (levels 1-2); 0 = pass (levels 3 and above)
FCAT 05 | Future performance on FCAT math 2005 | 1 = fail (levels 1-2); 0 = pass (levels 3 and above)
Report card 1 | Report card grade in math, time 1 | Continuous scale scores
Report card 2 | Report card grade in math, time 2 | Continuous scale scores
Feelings and beliefs about consequences (FCAT and report card) | Children's feelings and beliefs about the consequences of performing poorly | 0 = strongly disagree; 4 = strongly agree; neutral option included
Beliefs about ability to improve (FCAT and report card) | Children's beliefs in their ability to improve performance | 0 = strongly disagree; 4 = strongly agree; neutral option included
Self-serving bias (FCAT and report card) | Children's attributions about the causes of their performance on FCAT 04 and report card 1 | 0 = strongly disagree; 4 = strongly agree; neutral option included

Assessment

The Florida Comprehensive Assessment Test (FCAT) is part of Florida's overall accountability plan to increase student achievement by implementing higher standards in
areas of mathematics, reading, science, and writing from the Sunshine State Standards for students in grades 3-10. From these tests it is possible to obtain achievement levels, scale scores, and developmental scale scores, as well as performance on specific content standards. The scale scores are divided into five categories, called achievement levels, from 1 (lowest) to 5 (highest), with levels 3 and above designated passing. The school supplied children's scores on the FCAT for 2004 and 2005 as well as their math report card grades for the two spring 2005 grading periods. Levels 1 and 2 were designated as the cutoff for low performance on the FCAT, and U (unsatisfactory), D, and F as the cutoffs for low performance in school.

Feelings and Beliefs about Consequences of Performance

Children's knowledge of their performance on the FCAT and classroom grades was measured by two items: (a) the math grade on their last report card and (b) the math score on the 2004 FCAT. Children's beliefs and feelings about the consequences of their performance on both tasks were measured by a ten-item scale borrowed and adapted from Covington (1987). This scale asked children to report the feelings and beliefs that they, as well as their parents, peers, and teachers, would have if they received a bad math grade or did poorly on the FCAT. Examples of questions included: "If you were to do poorly on your report card/FCAT, you might feel sad"; "If you were to do poorly on your report card/FCAT, you might feel embarrassed"; and "If you were to do poorly on your report card/FCAT, your parents would be disappointed." Children answered the questions using a 5-point Likert scale ranging from 1 (strongly agree) to 5 (strongly disagree). Internal consistency for this subscale was .71 for the report card score and .79 for the FCAT.

Self-Serving Bias Measures
Children's explanations for the causes of their academic outcomes as well as their performance on the FCAT were measured by six items borrowed and adapted from Stipek (2002). Survey items asked children to rate various explanations for their performance on the FCAT and for their academic outcomes, including effort, ability, and help from the teacher as reasons for performance. Examples of questions included "You studied a lot"; "You are smart"; and "The work was hard." Children answered these questions using a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree), with internal consistencies of .40 for the internal scale for the report card grade and .70 for the internal scale for the FCAT. The internal consistency for the external scale was .70 for the report card grade and .84 for the FCAT.

Beliefs in Ability to Improve Measures

Children's beliefs in their ability to improve their performance on both the FCAT and their report card grades in math were measured by six questions borrowed and adapted from Dweck (2000, p. 177). Examples of questions included "Your FCAT ability in math/report card grade in math is something that you can't change about yourself"; "You can learn new things, but you can't really change your basic math FCAT score/report card grade in math"; and "You can always change your FCAT score in math/report card grade by studying harder." Children answered the questions using a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). Internal consistency for this subscale was .70 for the report card score and .78 for the FCAT.

Pilot Study

A pilot study was conducted to determine whether the survey items proposed for the project were appropriate for use with children in grades four through seven, nine, and ten. Two fourth-grade teachers were selected from the university-affiliated developmental lab
school who reviewed the survey for clarity of directions, reading level, and instrument length. The teachers also nominated three fourth-grade students to participate in the pilot study. Younger students were selected since they would most likely have the lowest reading levels. The teachers were asked to comment on directions, comprehension, and test length. Results of the pilot study revealed that teacher assistance would be required to ensure that students correctly completed the forms. Teachers also indicated that some students might feel uncomfortable answering a few of the questions, and they strongly suggested comparing students' self-reported grades with actual grades. Finally, it was determined that younger students would require more time to complete the surveys than their older counterparts.

Procedure

The survey was administered on the campus of the laboratory school in spring 2005, in children's math classes, before the administration of the FCAT for 2005. The school also provided follow-up report card grades and 2005 FCAT scores. Information about the study and parental permission forms were delivered to participants in class, and permission forms were returned to class before children filled out surveys. During the survey session, I obtained the children's assent, discussed the study, and answered questions about the study's protocol. Children were told that they would be asked questions about their feelings about their math performance on their last FCAT and their last report card grade in math. Children were given as much time as needed to complete the 66-item survey.

The responses were electronically scanned and tabulated. In turn, each survey was inspected to ensure adherence to instructions. Although the survey items were supported by ample precedents, the instrument did present some limitations. One hundred sixty
participants responded after their most recent math report card grades and FCAT scores for 2004 and 2005 were obtained, and their answers were tallied. However, actual report card grades and FCAT scores provided by the school were used in lieu of student-reported items 1 and 34 to ensure the accuracy of the SSB scales. Eight students did not have FCAT scores for both 2004 and 2005, and seventeen had either not completed surveys beyond the first 11 items or indicated "Neither agree nor disagree" for the entire survey; therefore these students' responses were not included in the analysis. Since these responses came primarily from students in the low classroom grade and FCAT categories, this served to reduce the population of low-scoring students, already in the minority.

In addition, some responses resulted in problems that had to be addressed, corrected, or eliminated. First, it became apparent that respondents whose teachers did not explain the format of the instrument did not recognize that items 22-33 and 45-56 were to be answered based on the grades/FCAT scores they received. Responses to these items were adjusted accordingly, and irrelevant scores were eliminated. In addition, items 17 through 21 were eliminated because of redundancy. An item analysis was conducted on the scale items, including Cronbach's alpha if item deleted. According to Ebel's criteria, there were no serious problems for any of the items on the Beliefs in Ability to Improve measures or the Feelings and Beliefs about Consequences of Performance measures. However, for the Self-Serving Bias measures, the low reliabilities of the scales do present some problems. As a result, analyses at the item level were conducted. Item correlation tables are presented in Tables 2-5. Item-to-total correlations are given in Appendix A, Tables 46-55.
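The internal consistencies reported in this chapter are Cronbach's alpha coefficients. For readers unfamiliar with the computation, a minimal sketch follows; the responses it would be applied to are the survey's item matrices, but the example data in the usage test are invented.

```python
import numpy as np

def cronbach_alpha(responses):
    """Cronbach's alpha for an (n_respondents, n_items) response matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    x = np.asarray(responses, float)
    k = x.shape[1]
    item_vars = x.var(axis=0, ddof=1)      # variance of each item
    total_var = x.sum(axis=1).var(ddof=1)  # variance of respondents' totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```

Perfectly redundant items yield alpha = 1, while weakly related items drive alpha toward 0; the latter is the pattern behind the low reliabilities flagged for the Self-Serving Bias scales.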


Table 2. Item Correlations for Beliefs and Feelings about Consequences of Performance for Report Card

Variable                                     I2     I3     I4     I5     I6     I7     I8     I9     I10    I11
You might feel sad (I2)                    1.000
You might feel embarrassed (I3)             .474  1.000
You might feel frightened (I4)              .346   .227  1.000
You wouldn't care (I5)                      .334   .179   .247  1.000
Your parents would be disappointed (I6)     .357   .250   .147   .270  1.000
You would be punished (I7)                  .435   .230   .396   .135   .518  1.000
Your teacher would be disappointed (I8)     .251   .099   .109   .164   .182   .257  1.000
Nothing would happen (I9)                   .142   .142   .173   .173   .264   .370   .115  1.000
Your friends might make fun of you (I10)    .056   .170   .207  -.002   .055   .147   .115  -.048  1.000
You might be held back (I11)                .205   .203   .209  -.012   .103   .220   .093   .044  -.048  1.000


Table 3. Item Correlations for Beliefs and Feelings about Consequences of Performance for FCAT

Variable                                     I35    I36    I37    I38    I39    I40    I41    I42    I43    I44
You might feel sad (I35)                   1.000
You might feel embarrassed (I36)            .584  1.000
You might feel frightened (I37)             .449   .490  1.000
You wouldn't care (I38)                     .440   .265   .252  1.000
Your parents would be disappointed (I39)    .396   .296   .249   .303  1.000
You would be punished (I40)                 .367   .261   .328   .192   .411  1.000
Your teacher would be disappointed (I41)    .078   .254   .130   .048   .308   .352  1.000
Nothing would happen (I42)                  .167   .214   .153   .307   .401   .215   .321  1.000
Your friends might make fun of you (I43)    .191   .394   .320   .034   .159   .307   .220  -.005  1.000
You might be held back (I44)                .385   .298   .237   .141   .201   .258   .244   .152   .309  1.000


Table 4. Item Correlations for Beliefs about Ability to Improve for Report Card

Variable                                                I12    I13    I14    I15    I16
You have a certain amount of math intelligence,
and you can't do much to change your report card… (I12)  1.000
Your math ability is something about you so you
can't change your grade… (I13)                            .427  1.000
You can learn new things but you can't really
change your grade (I14)                                   .208   .516  1.000
You can change your math grade by studying
harder (I15)                                              .267   .155   .174  1.000
You can always change your math grade by
preparing (I16)                                           .219   .285   .082   .418  1.000


Table 5. Item Correlations for Beliefs about Ability to Improve for FCAT

Variable                                                I57    I58    I59    I60    I61
You have a certain amount of math intelligence,
and you can't do much to change your FCAT score… (I57)   1.000
Your math ability is something about you so you
can't change your FCAT score… (I58)                       .554  1.000
You can learn new things but you can't really
change your FCAT score (I59)                              .419   .456  1.000
You can change your FCAT score by studying
harder (I60)                                              .316   .332   .195  1.000
You can always change your FCAT score by
preparing (I61)                                           .148   .371   .064   .701  1.000
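Matrices like Tables 2-5 are ordinary Pearson correlations computed over the item columns. A small sketch with simulated data (not the actual survey responses), where the five items share one common factor:

```python
import numpy as np

# Simulated responses: 160 students x 5 items driven by a shared latent factor.
rng = np.random.default_rng(0)
factor = rng.normal(size=(160, 1))                      # shared concern level
items = factor + rng.normal(scale=1.0, size=(160, 5))   # plus item-specific noise

# 5 x 5 Pearson correlation matrix; rowvar=False treats columns as variables.
r = np.corrcoef(items, rowvar=False)
print(np.round(r, 3))
```

The diagonal is always 1.000 and the matrix is symmetric, which is why Tables 2-5 report only the lower triangle.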


CHAPTER 4
RESULTS

The main purpose of this study was to determine whether students' self-serving attributions were a potential source of construct-irrelevant variance that might differentially affect student performance and threaten the validity of test scores and their interpretations. In order to arrive at a conclusion, it was first necessary to establish that students exhibited an understanding of the consequences associated with both report card grades and FCAT scores and that they possessed certain thoughts and beliefs about control over improvement of their performance. Developmental differences from elementary through high school were also explored in this study.

One hundred sixty participants were surveyed after their most recent math report card grades and FCAT scores for 2004 and 2005 were obtained, and their responses were tallied. Eight of these students did not have FCAT scores for both 2004 and 2005, and seventeen had either not completed surveys beyond the first 11 items or had indicated "Neither agree nor disagree" for the entire survey; therefore, these students' responses were not included in the analysis. A variety of statistical methods were used to answer the research questions. This chapter presents the results and findings of the analyses of data examined in the study.

Consequences Associated with Poor Performance

The first research hypothesis posited that children would report that the consequences associated with poor performance on the FCAT, a high-stakes test, were more serious than the consequences of performing poorly in the classroom. Ten survey


items were used to assess children's beliefs about their report card grades, and ten parallel items to assess their beliefs about FCAT math scores. Children were asked how sad, embarrassed, and frightened they would be; what the reactions of their parents, teachers, and peers would be; whether they recognized consequences of their performance; and whether poor results would lead to retention.

Descriptive Statistics

Table 6 shows the means and standard deviations on each of these items. Mean scores from Table 6 indicate that there is some difference between how children would feel if they were to perform poorly on either their report cards or on the FCAT. Mean scores reveal that responding students would experience greater embarrassment, peer ridicule, and fear of retention for poor FCAT performance, but greater parental disappointment and fear of punishment for doing poorly on their report cards.

Table 6. Means and Standard Deviations about Feeling Poorly on Each Item for Report Card Grades and FCAT Scores

                                Report Card           FCAT
Variable                        M     N    SD     M     N    SD
You might feel sad              2.84  135  1.09   2.91  129  1.32
You might feel embarrassed      2.20  135  1.25   2.78  131  1.31
You might feel frightened       2.20  135  1.44   2.15  131  1.57
You wouldn't care                .74  135  1.19    .85  131  1.25
Parents disappointed            3.30  134  1.04   3.02  131  1.19
You would be punished           2.25  135  1.35   1.78  131  1.31
Teacher disappointed            2.47  133  1.17   2.38  131  1.27
Nothing would happen             .98  135  1.08   1.19  130  1.13
Friends would make fun          1.25  135  1.21   1.53  129  1.35
You might be held back          1.46  135  1.27   2.10  131  1.36
Note. 0 = Strongly disagree; 1 = Disagree; 2 = Neither disagree nor agree; 3 = Agree; 4 = Strongly agree
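The per-item N in Table 6 varies (129-135) because not every student answered every item, so each item's mean and sample standard deviation must be computed over its own valid responses. A minimal sketch with hypothetical responses, using NaN to mark an unanswered item:

```python
import numpy as np

# Hypothetical responses (rows = students, cols = items) on the 0-4 scale;
# np.nan marks an item a student skipped.
resp = np.array([
    [3.0, 2.0, np.nan],
    [4.0, 1.0, 2.0],
    [2.0, np.nan, 3.0],
    [3.0, 3.0, 1.0],
])

n = np.sum(~np.isnan(resp), axis=0)       # valid N per item
m = np.nanmean(resp, axis=0)              # item means over valid responses
sd = np.nanstd(resp, axis=0, ddof=1)      # sample SDs (ddof=1, as in Table 6)
print(n, np.round(m, 2), np.round(sd, 2))
```

Using ddof=1 gives the unbiased sample standard deviation conventionally reported in descriptive tables.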


Similarly, frequencies of student responses are presented in Table 7 below. These percentages combine responses 3 = Agree and 4 = Strongly agree for each of the twenty items. They likewise reveal children's embarrassment, fear of peer ridicule, and fear of retention associated with performing poorly on the FCAT, and parental disappointment and punishment for poor grades. Very few reported that they would not care or that nothing would happen. Finally, relative comparisons show that students are fairly evenly split (46.7%) on whether poor FCAT performance or poor report card performance is worse.

Table 7. Percentages of Student Responses (Agree or Strongly agree) for Feeling Poorly about Report Card Grade and FCAT Performance

If you were to do poorly on your…?          Report Card    FCAT
You might feel sad                          69.6%          70.3%
You might feel embarrassed                  48.2%          63.0%
You might feel frightened                   33.3%          41.5%
You wouldn't care                            9.7%          13.4%
Your parents would be disappointed          84.4%          70.4%
You would be punished                       43.7%          27.4%
Your teacher would be disappointed          52.6%          49.6%
Nothing would happen                         9.6%          11.8%
Your friends might make fun of you          17.8%          28.1%
You might be held back                      17.0%          41.5%

Relative comparisons (Overall)
It is worse to do poorly on the FCAT math section than to earn a bad grade on my report card    46.7%
It is worse to do poorly on the FCAT math section than to earn a poor quiz grade                60.0%
Note. Percentages are a combination of responses 3 = Agree and 4 = Strongly agree

Because developmental trends were of interest, data were separated into three grade-level categories: grades 4 and 5 were classified as elementary school, grades 6 and 7 as


middle school, and grades 9 and 10 as high school. Mean scores are presented by grade level in Tables 8-10. These scores showed negligible differences across grade levels for internal items such as sadness, embarrassment, and fright between students' feelings about poor performance on their report cards versus their FCAT scores overall. However, children seemed to report more external than internal pressure for poor performance on both report cards and FCAT grades. Levels of parental dissatisfaction were the highest reported scores on the scales for FCAT and report card grades. In addition, children reported that they would be more likely to be punished for poor report card performance than for poor FCAT grades. Interestingly, students' sense of teacher disapproval appeared to decrease, particularly at the high school level. However, students at all grade levels reported that their peers would not make fun of them. Finally, elementary students reported feeling slightly more embarrassed, afraid, and fearful of retention for the FCAT.

Table 8. Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for Elementary School Students

                                Report Card          FCAT
Variable                        M     N   SD     M     N   SD
You might feel sad              2.72  50  1.14   2.94  49  1.45
You might feel embarrassed      1.76  50  1.32   2.58  48  1.47
You might feel frightened       1.58  50  1.37   2.17  48  1.69
You wouldn't care                .88  50  1.32   1.02  48  1.47
Parents disappointed            3.12  49  1.15   2.97  48  1.40
You would be punished           2.08  50  1.43   1.81  48  1.41
Teacher disappointed            2.55  50  1.14   2.56  48  1.49
Nothing would happen            1.12  50  1.14   1.15  48  1.22
Friends would make fun          1.28  50  1.28   1.79  47  1.44
You might be held back          1.74  50  1.45   2.70  47  1.30
Note. 0 = Strongly disagree; 1 = Disagree; 2 = Neither disagree nor agree; 3 = Agree; 4 = Strongly agree


Table 9. Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for Middle School Students

                                Report Card          FCAT
Variable                        M     N   SD     M     N   SD
You might feel sad              2.92  36  1.11   2.91  34  1.36
You might feel embarrassed      2.25  36  1.20   2.80  35  1.28
You might feel frightened       2.00  36  1.55   1.86  35  1.67
You wouldn't care                .64  36  1.20    .60  35  1.09
Parents disappointed            3.22  36  1.17   3.06  35  1.19
You would be punished           2.28  36  1.47   1.71  35  1.24
Teacher disappointed            2.57  36  1.29   2.20  35  1.34
Nothing would happen            1.00  36  1.17    .94  35  1.21
Friends would make fun          1.11  36  1.26   1.09  34  1.03
You might be held back          1.21  36  1.21   1.50  34  1.44
Note. 0 = Strongly disagree; 1 = Disagree; 2 = Neither disagree nor agree; 3 = Agree; 4 = Strongly agree

Table 10. Means and Standard Deviations for Feeling Poorly about Performance on Each Item for Report Card and FCAT for High School Students

                                Report Card          FCAT
Variable                        M     N   SD     M     N   SD
You might feel sad              2.87  45  1.08   2.83  42  1.21
You might feel embarrassed      2.69  45  1.02   2.98  44  1.15
You might feel frightened       1.84  45  1.40   2.27  44  1.35
You wouldn't care                .73  45  1.07    .91  44  1.14
Parents disappointed            3.53  45   .79   3.00  44   .94
You would be punished           2.38  45  1.13   1.73  44  1.17
Teacher disappointed            2.24  45  1.11   2.32  44  1.12
Nothing would happen             .84  45   .93   1.45  44  1.04
Friends would make fun          1.33  45  1.11   1.73  44  1.17
You might be held back          1.33  45  1.04   1.80  44  1.13
Note. 0 = Strongly disagree; 1 = Disagree; 2 = Neither disagree nor agree; 3 = Agree; 4 = Strongly agree


An analysis of the overall scale score was then conducted by grade level as described above. Mean scores and standard deviations for overall scale scores are reported in Table 11.

Table 11. Means and Standard Deviations by Grade Level for Composite Score for Feeling Poorly about Performance on Each Item for Report Card and FCAT

                        Report Card           FCAT
Variable                M      N   SD     M      N   SD
Elementary School       22.85  48  7.08   25.51  47  8.78
Middle School           24.23  39  6.47   24.23  35  7.98
High School             24.64  45  5.34   24.38  42  6.07
Note. 0 = Strongly disagree; 1 = Disagree; 2 = Neither disagree nor agree; 3 = Agree; 4 = Strongly agree

Analysis

To test hypothesis 1, a 2 x 3 x 2 split-plot analysis of variance was conducted on the consequence scale. Grade levels were again collapsed into elementary, middle, and high school for ease of interpretation. FCAT scores for 2004 were designated high (0) for students who performed at level three or above and low (1) for students who performed at level 2 or below. These were the two between-subjects factors. Criterion was the within-subjects factor because each participant answered questions about the consequences of performing poorly on the two criteria, i.e., FCAT and grades. A Huynh-Feldt epsilon was applied to adjust for any departures from sphericity. A summary ANOVA is presented in Table 12 below. An alpha level of .005 was used in these analyses in order to control for inflation of the Type 1 error rate associated with running multiple tests on individual items. The Bonferroni adjustment technique was applied in order to determine the statistical significance of each item in this scale.
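The .005 threshold above is the Bonferroni adjustment: the familywise alpha is divided by the number of tests in the family (here, ten item-level comparisons at a familywise .05). A minimal sketch of that arithmetic:

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level that caps the familywise Type I error rate."""
    return family_alpha / n_tests

# Ten parallel item-level ANOVAs at a familywise alpha of .05:
print(round(bonferroni_alpha(0.05, 10), 5))  # → 0.005
```

Each item-level F test is then declared significant only when its p value falls below this adjusted per-test alpha.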


Table 12. ANOVA Test on Student Responses for Feeling Poorly about Report Card and FCAT Performance for Elementary, Middle, and High School Students

Source                              df    SS       MS     F     p
Between subjects
  Grade level                       2     21.40    10.70  .134  .875
  FCAT 04                           1     .265     .265   .003  .954
  Grade level X FCAT 04             2     .691     .346   .001  .996
  Error between                     108   8618.36  79.80
Within subjects
  Criterion                         1     8.43     8.43   .376  .54
  Criterion X Grade level           2     79.85    39.93  1.78  .17
  Criterion X FCAT 04               1     9.54     9.54   .43   .52
  Criterion X Grade level X FCAT    2     3.41     1.70   .08   .93
  Error within                      108   2423.21  22.44
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

The results of the split-plot analysis on the scale for feeling poorly about performance on report card and FCAT for the three grade levels (elementary, middle, and high school) and two ability levels (low FCAT 04 score and high FCAT 04 score), together with the descriptive data on the scale scores as a whole, indicated that children did not differ in feelings about performing poorly on the FCAT versus their report card grades; i.e., there were no significant main effects or interactions.

However, a correlation matrix of the 10 items was examined. Based on the nature of the item correlations and a re-examination of the substantive content of the items, it was determined that it would be more appropriate to evaluate each item separately, since more than one distinct construct appeared to emerge on inspection of the correlation matrix. A series of split-plot ANOVAs on the ten individual items did reveal some significant findings. As before, FCAT 04 and grade level were the between-subjects factors and criterion was the within-subjects factor, since each subject answered the same items about beliefs about performing poorly on the report card and the FCAT. This time


the outcome variables for each analysis were individual items instead of composites. In order to control for inflation of the Type 1 error rate, the Bonferroni adjustment was applied, α = .005. Summary ANOVA tables for these data are presented below.

For items 3/36 (You might feel embarrassed), Table 13, there was a significant main effect for criterion, F(1,121) = 15.77, p = .000. These results, along with the descriptive data, revealed that overall children would feel more embarrassed if they were to perform poorly on the FCAT than if they were to perform poorly on their report card grades.

Table 13. ANOVA Test on Student Responses for "You might feel embarrassed" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS     F      p
Between subjects
  Grade level                       2     17.29   8.65   3.92   *.022
  FCAT 04                           1     4.23    4.23   1.92   .168
  Grade level X FCAT 04             2     2.17    1.08   .492   .613
  Error between                     121   266.58  2.20
Within subjects
  Criterion                         1     14.60   14.60  15.77  *.000
  Criterion X Grade level           2     1.94    .969   1.05   .354
  Criterion X FCAT 04               1     .000    .000   .000   .986
  Criterion X Grade level X FCAT    2     .683    .341   .369   .693
  Error within                      121   112.08  .926
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

For items 4/37 (You might feel frightened), Table 14, there was a significant main effect for criterion, F(1,121) = 10.02, p = .002, and a significant main effect for grade level, F(2,121) = .077, p = .000. This revealed that, across previous performance levels, children would feel more frightened about performing poorly on the FCAT than on their report cards. In general, the descriptive data reveal that older children are more frightened about performing poorly.


Table 14. ANOVA Test on Student Responses for "You might feel frightened" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS     F      p
Between subjects
  Grade level                       2     2.84    1.42   .077   *.000
  FCAT 04                           1     .270    .270   .406   .781
  Grade level X FCAT 04             2     1.36    .679   .194   .824
  Error between                     121   422.95  3.50
Within subjects
  Criterion                         1     10.46   10.46  10.02  *.002
  Criterion X Grade level           2     5.49    1.84   5.26   .024
  Criterion X FCAT 04               1     3.68    5.49   1.76   .176
  Criterion X Grade level X FCAT    2     .963    .481   .461   .632
  Error within                      121   126.32  1.04
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

For items 6/39 (Your parents would be disappointed), Table 15, there was a significant main effect for criterion, F(1,120) = 9.00, p = .003. Overall, children reported that their parents would be more disappointed if they were to perform poorly on their report cards than on their FCAT scores.

Table 15. ANOVA Test on Student Responses for "Your parents would be disappointed" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS    F     p
Between subjects
  Grade level                       2     3.60    1.80  1.01  .367
  FCAT 04                           1     4.40    4.40  2.47  .119
  Grade level X FCAT 04             2     1.03    .515  .289  .750
  Error between                     120   213.99  1.78
Within subjects
  Criterion                         1     6.82    6.82  9.00  *.003
  Criterion X Grade level           2     1.38    .688  .908  .406
  Criterion X FCAT 04               1     2.09    2.09  2.76  .099
  Criterion X Grade level X FCAT    2     .999    .500  .659  .519
  Error within                      120   120     .758
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring.


For items 7/40 (You would be punished), Table 16, there was a significant main effect for criterion, F(1,121) = 10.08, p = .002. Overall, children had stronger feelings about being punished for their report card grades than for their FCAT scores.

Table 16. ANOVA Test on Student Responses for "You would be punished" for Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS     F      p
Between subjects
  Grade level                       2     5.21    2.61   1.14   .324
  FCAT 04                           1     2.65    2.65   1.58   .284
  Grade level X FCAT 04             2     2.04    1.02   .466   .641
  Error between                     121   277.00  2.29
Within subjects
  Criterion                         1     12.30   12.30  10.08  *.002
  Criterion X Grade level           2     1.62    .808   .662   .518
  Criterion X FCAT 04               1     .176    .176   .144   .705
  Criterion X Grade level X FCAT    2     .596    .298   .244   .784
  Error within                      121   147.71  1.22
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

For items 8/41 (Your teacher would be disappointed), Table 17, there were no significant main effects or interactions. Overall, children were neutral about their teachers' reactions for both report card grades and FCAT scores: half of the students surveyed believed that teachers would be disappointed by poor performance on both report card grades and FCAT scores, while the other half believed teachers would not care.

Table 17. ANOVA Test on Student Responses for "Your teacher would be disappointed" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS     F     p
Between subjects
  Grade level                       2     1.57    .786   .394  .675
  FCAT 04                           1     8.06    8.61   4.31  .040
  Grade level X FCAT 04             2     1.19    .592   .297  .744
  Error between                     119   237.45  1.10
Within subjects
  Criterion                         1     .305    .31    .306  .581
  Criterion X Grade level           2     1.50    .75    .752  .474


Table 17. Continued

Source                              df    SS      MS    F     p
  Criterion X FCAT 04               1     .912    .91   .917  .340
  Criterion X Grade level X FCAT    2     3.09    1.55  1.55  .216
  Error within                      119   118.37  .955
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

For items 9/42 (Nothing would happen), Table 18, there were no significant main effects or interactions. Overall, children felt strongly that something would happen if they were to perform poorly in general.

Table 18. ANOVA Test on Student Responses for "Nothing would happen" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS    F     p
Between subjects
  Grade level                       2     3.17    1.59  1.12  .329
  FCAT 04                           1     .036    .036  .025  .874
  Grade level X FCAT 04             2     .273    .137  .10   .908
  Error between                     120   169.56  1.41
Within subjects
  Criterion                         1     4.98    4.98  5.18  .025
  Criterion X Grade level           2     4.52    2.26  2.35  .100
  Criterion X FCAT 04               1     .011    .011  .011  .917
  Criterion X Grade level X FCAT    2     .252    .126  .131  .877
  Error within                      120   115.38  .961
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

For items 10/43 (Your friends might make fun of you), Table 19, there were no significant main effects or interactions. Overall, children did not feel that their friends would make more fun of them for doing poorly on either report cards or the FCAT.

Table 19. ANOVA Test on Student Responses for "Your friends might make fun of you" on Report Card and FCAT for Elementary, Middle, and High School Students

Source                              df    SS      MS    F     p
Between subjects
  Grade level                       2     8.89    4.43  1.80  .170


Table 19. Continued

Source                              df    SS      MS    F     p
  FCAT 04                           1     .683    .683  .276  .600
  Grade level X FCAT 04             2     1.12    .560  .226  .798
  Error between                     119   294.35  2.47
Within subjects
  Criterion                         1     4.53    4.53  5.48  .021
  Criterion X Grade level           2     4.83    2.41  2.92  .058
  Criterion X FCAT 04               1     .876    .876  1.06  .305
  Criterion X Grade level X FCAT    2     .743    .372  .450  .639
  Error within                      119   98.26   .826
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.

Finally, for items 11/44 (You might be held back), Table 20, there was a significant main effect for criterion, F(1,121) = 18.81, p = .000, and a significant main effect for grade level, F(2,121) = 7.04, p = .001. Overall and across grade levels, children felt that poor FCAT performance would be more likely to lead to retention. Elementary school students were the most fearful of being held back.

Table 20. ANOVA Test on Student Responses for "You might be held back" on Report Card and FCAT Performance for Elementary, Middle, and High School Students

Source                              df    SS      MS     F      p
Between subjects
  Grade level                       2     31.19   15.59  7.04   *.001
  FCAT 04                           1     .23     .232   .105   .747
  Grade level X FCAT 04             2     8.60    4.30   1.94   .148
  Error between                     121   268.19  2.22
Within subjects
  Criterion                         1     18.70   18.70  18.81  *.000
  Criterion X Grade level           2     4.68    2.34   2.35   .100
  Criterion X FCAT 04               1     1.10    1.10   1.10   .296
  Criterion X Grade level X FCAT    2     1.65    .822   .827   .440
  Error within                      121   120.34  1.00
Note. Dashes indicate not applicable. * p < .005. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT.
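The F ratios in these ANOVA tables follow from the reported sums of squares and degrees of freedom: each mean square is SS/df, and F is the effect's mean square over the corresponding error mean square. A quick check against the criterion row of Table 20:

```python
def anova_f(ss_effect, df_effect, ss_error, df_error):
    """F ratio from sums of squares and degrees of freedom."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error

# Criterion row of Table 20: SS = 18.70, df = 1; error within: SS = 120.34, df = 121.
print(round(anova_f(18.70, 1, 120.34, 121), 2))  # → 18.8, matching F = 18.81 up to rounding
```

The same arithmetic reproduces the within-subjects F values throughout Tables 12-20, which is a useful sanity check when transcribing ANOVA output.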


Beliefs about Ability to Control Report Card Grades and FCAT Math Scores

Ten survey items, 5 for report cards and 5 for FCAT math scores, were used to assess children's beliefs about whether they would have more control over improving their classroom grades in math than over improving their FCAT math scores.

Descriptive Statistics

In Table 21, presented below, mean scores indicate that children are more optimistic in their beliefs that they can improve their performance in the classroom than their performance on the FCAT. Mean scores reflecting the relationship between level of math intelligence and subsequent grades earned on report cards are higher than mean scores for the corresponding FCAT math items, indicating that children agree that intelligence and ability are not necessarily limiting. Conversely, higher mean scores for the relationship between effort expended and preparation undertaken and report card grades, relative to the parallel FCAT items, show that children believe they can do more through effort to improve their school grades than their FCAT scores.

Table 21. Means and Standard Deviations for Students' Beliefs about Controlling Performance on Report Card and FCAT

                                                Report Card          FCAT
Variable                                        M      N    SD      M      N    SD
You have a certain amount of math
intelligence, and you can't do much to
change your report card grade/FCAT score…       *2.77  135  1.15    *2.44  131  1.22
Your math ability is something about you
so you can't change your grade/FCAT score…      *2.80  135  1.15    *2.42  130  1.16


Table 21. Continued

                                                Report Card          FCAT
Variable                                        M      N    SD      M      N    SD
You can learn new things but you can't
really change your grade/FCAT score             *2.84  135  1.18    *2.45  131  1.27
You can change your math grade/FCAT score
by studying harder                              3.28   134  1.00    2.88   130  1.12
You can always change your math grade/
FCAT score by preparing                         3.09   134  1.01    2.84   129  1.23
Note. * indicates items were reverse-coded for ease of interpretation.

In Table 22, percentages reveal that, for both report card grades and FCAT math scores, students disagree with statements that intelligence and ability cannot change, and agree that studying harder and putting forth greater effort will improve their performance. Overall, students' percentages suggest that they believe that both intelligence and ability play a greater role in performing well on their report cards than on the FCAT, a high-stakes test. Relative comparisons demonstrate that only 22.3% believe it is easier to change an FCAT score than a report card grade. Overall, it appears that students take a more incremental view when thinking about improving report card grades than FCAT performance. However, children do not appear to feel that their ability is stable.

Data were separated by grade level (elementary, middle, and high school) to determine whether age differences affected beliefs. Composite mean scores are presented in Table 23. These scores for grades and FCAT show an inverse relationship between control over improvement for report cards and FCAT scores, most strongly at the high school level; i.e., as grade level increases, beliefs about control over grades increase from elementary/middle to high school, while beliefs about control over FCAT


scores decrease for elementary and middle school students, and most strongly for high school students.

Table 22. Percentages of Student Responses (Agree or Strongly agree) for Beliefs about Controlling Performance on Report Card and FCAT

What are your thoughts and beliefs about your…              Report Card    FCAT
You have a certain amount of math intelligence, and
you can't really do much to change your…                    *15.6%         *21.5%
Your math ability is something about you so you can't
really change your…                                         *16.3%         *20.8%
You can learn new things in class but you can't really
change your…                                                *14.8%         *25.9%
No matter who you are, you can change your… by
studying harder.                                            84.4%          63.7%
You can always change your… greatly by preparing            76.3%          64.4%

Relative comparisons (Overall)
It is easier to change my FCAT math score than it is my math grade    *22.3%
Note. * indicates 0 = Strongly disagree and 1 = Disagree; non-starred items indicate 3 = Agree and 4 = Strongly agree

Table 23. Means and Standard Deviations by Grade Level for Composite Score for Students' Beliefs about Improving Performance on Report Card and FCAT

                        Report Card           FCAT
Variable                M      N   SD     M      N   SD
Elementary School       14.50  50  3.65   13.45  49  3.57
Middle School           14.53  38  3.58   13.15  34  4.01
High School             15.31  45  3.46   12.60  45  4.87
Note. The composite includes items 12-16 and 57-61, with items 12, 13, 14, 57, 58, and 59 reverse-coded for interpretation.

Mean scores and standard deviations for students at all three grade levels were then also separated by item to determine the influences of effort and ability in shaping students' perceptions about the controllability of grades and scores (Tables 24-26). Mean scores are once again higher for report card grades than for FCAT scores; for the first two items this suggests that students disregard ability and intelligence, two stable


characteristics, as factors that may impede their performance, most strongly at the high school level. Responses to the effort items appear to show that, as students mature, they feel that effort has less to do with the controllability of performance on the FCAT relative to report card grades. Also, elementary school students seemed to make less differentiation between the role of effort on report cards versus the FCAT than did their counterparts in middle and high school.

Table 24. Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for Elementary School Students

                                                Report Card         FCAT
Variable                                        M      N   SD      M      N   SD
You have a certain amount of math
intelligence, and you can't do much to
change your report card grade/FCAT score…       *2.86  50  1.31    *2.55  49  1.32
Your math ability is something about you
so you can't change your grade/FCAT score…      *2.68  50  1.12    *2.43  49  1.19
You can learn new things but you can't
really change your grade/FCAT score             *2.62  50  1.31    *2.18  49  1.41
You can change your math grade/FCAT score
by studying harder                              3.30   50  1.04    3.16   49  1.18
You can always change your math grade/
FCAT score by preparing                         3.04   50  1.16    3.12   49  1.33
Note. * indicates items were reverse-coded for ease of interpretation.
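The composites in Table 23 are formed by reverse-coding the three fixed-ability items (12-14 for report card, 57-59 for FCAT) on the 0-4 scale and summing all five items, so that higher totals consistently mean stronger belief in control. A minimal sketch of that scoring with one hypothetical student's responses (the function and constant names are illustrative, not from the survey materials):

```python
import numpy as np

REVERSED = [0, 1, 2]   # positions of the fixed-ability items (12-14 or 57-59)

def control_composite(item_scores, scale_max=4):
    """Sum five 0-4 control-belief items after reverse-coding the first three."""
    s = np.asarray(item_scores, dtype=float).copy()
    s[REVERSED] = scale_max - s[REVERSED]   # disagreeing with "can't change" scores high
    return float(s.sum())

# Hypothetical student: disagrees with the fixed-ability items (1s), agrees with
# the effort items (3, 4):
print(control_composite([1, 1, 1, 3, 4]))  # → 16.0
```

The resulting composite ranges from 0 to 20, consistent with the grade-level means of roughly 12.6-15.3 reported in Table 23.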


Table 25. Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for Middle School Students

                                                Report Card         FCAT
Variable                                        M      N   SD      M      N   SD
You have a certain amount of math
intelligence, and you can't do much to
change your report card grade/FCAT score…       *2.67  36  1.31    *2.28  49  1.32
Your math ability is something about you
so you can't change your grade/FCAT score…      *2.86  36  1.20    *2.40  35  1.04
You can learn new things but you can't
really change your grade/FCAT score             *2.67  36  1.20    *2.54  35  1.22
You can change your math grade/FCAT score
by studying harder                              3.23   35  1.06    2.94   35  .94
You can always change your math grade/
FCAT score by preparing                         3.20   35  .901    2.91   34  1.06
Note. * indicates items were reverse-coded for ease of interpretation.

Table 26. Means and Standard Deviations by Item for Students' Beliefs about Controlling Performance on Report Card and FCAT for High School Students

                                                Report Card         FCAT
Variable                                        M      N   SD      M      N   SD
You have a certain amount of math
intelligence, and you can't do much to
change your report card grade/FCAT score…       *2.82  45  1.01    *2.47  45  1.18
Your math ability is something about you
so you can't change your grade/FCAT score…      *2.89  45  1.09    *2.47  45  1.20
You can learn new things but you can't
really change your grade/FCAT score             *3.29  45  .79     *2.64  45  1.13


Table 26. Continued

                                                Report Card         FCAT
Variable                                        M      N   SD      M      N   SD
You can change your math grade/FCAT score
by studying harder                              3.29   45  .97     2.53   45  1.12
You can always change your math grade/
FCAT score by preparing                         3.02   45  .941    2.49   45  1.18
Note. * indicates items were reverse-coded for ease of interpretation.

Analysis

To test hypothesis 2, a 2 x 3 x 2 split-plot analysis of variance was conducted on the control data. The design had three factors: FCAT 2004, criterion, and grade level. There were two between-subjects factors: grade level (elementary, middle, and high school) and FCAT 2004 (0 = high performing and 1 = low performing). Criterion (report card versus FCAT) was a within-subjects factor because each participant was asked questions at each level, i.e., FCAT and report card grades. A Huynh-Feldt epsilon was applied to adjust for any departures from sphericity. A summary ANOVA is presented in Table 27 below.

Table 27. ANOVA Test on Student Responses for Students' Beliefs about Controlling Performance on Report Card and FCAT for Elementary, Middle and High School Students

Source                              df    SS       MS      F      p
Between subjects
  Grade level                       2     1.14     .557    .03    .97
  FCAT 04                           1     89.48    89.48   4.49   *.04
  Grade level X FCAT 04             2     129.04   14.52   .73    .49
  Error between                     117   2330.22  19.92
Within subjects
  Criterion                         1     178.58   178.58  20.26  *.00
  Criterion X Grade level           2     15.08    7.54    .86    .43
  Criterion X FCAT 04               1     17.65    7.65    .87    .35


Table 27. Continued

Source  df  SS  MS  F  p
Criterion X Grade level X FCAT  2  23.52  11.76  1.33  .27
Error within  117  1031.05  8.82
Note. Dashes indicate not applicable. * p < .05. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the 5-item control scale.

The results of the split-plot analysis indicated a significant main effect for criterion, F(1,117) = 20.26, p = .000, and a significant effect for FCAT 04, F(1,117) = 4.49, p = .04, indicating that, overall, mean difference in control over improvement was higher for report card grades than FCAT scores and that high scoring children overall felt as if they had more control over their performance than low performing children. There were no other significant main effects or interactions. Because the scale was composed of both effort and ability components, and because items within each of the components were sufficiently correlated, r(15,16) = .42, r(12,13) = .44, additional analyses were conducted. A series of split-plot ANOVAs on the two components did reveal some significant findings. Summary ANOVAs are presented in Tables 28 and 29 below.

Table 28. ANOVA Test on Student Responses for Students' Beliefs about Role of Ability on Report Card and FCAT for Elementary, Middle and High School Students

Source  df  SS  MS  F  p
Between subjects
Grade level  2  .691  .345  .066  .937
FCAT 04  1  37.56  37.56  7.13  *.009
Grade level X FCAT 04  2  2.93  1.47  .279  .76
Error between  120  631.76  5.27
Within subjects
Criterion  1  30.63  30.63  12.89  .000*
Criterion X Grade level  2  2.07  1.04  .44  .65
Criterion X FCAT 04  1  2.42  2.42  1.02  .32
Criterion X Grade level X FCAT  2  23.77  11.89  5.00  .008*
Error within  120  285.12  2.38
Note. Dashes indicate not applicable. * p < .05. Grade level included three levels: elementary, middle, and high school.
FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the ability items 12, 13, 57, and 58.
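The reverse coding flagged in the notes to Tables 25 and 26 can be illustrated with a short sketch. The 0-4 response range below is an assumption; the questionnaire's actual anchors are not reported in this excerpt.

```python
import numpy as np

# Hypothetical 0-4 Likert responses to a negatively worded control item,
# e.g. "You can't do much to change your FCAT score" (assumed scale range).
responses = np.array([0, 1, 2, 3, 4])

SCALE_MIN, SCALE_MAX = 0, 4  # assumed response range, not stated in the text

# Reverse coding: strong agreement with a negative item becomes a low
# perceived-control score, so all items point in the same direction.
reversed_responses = SCALE_MIN + SCALE_MAX - responses
print(reversed_responses.tolist())  # [4, 3, 2, 1, 0]
```

With reverse-coded items aligned this way, higher composite scores uniformly mean greater perceived control, which is what allows the 5-item scale to be summed.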


Table 29. ANOVA Test on Student Responses for Students' Beliefs about Role of Effort on Report Card and FCAT for Elementary, Middle and High School Students

Source  df  SS  MS  F  p
Between subjects
Grade level  2  7.61  3.81  .83  .439
FCAT 04  1  .15  .15  .03  .857
Grade level X FCAT 04  2  16.67  8.33  1.82  .167
Error between  118  541.203  4.59
Within subjects
Criterion  1  25.33  25.33  8.66  .004*
Criterion X Grade level  2  13.88  6.94  2.37  .098
Criterion X FCAT 04  1  .005  .005  .002  .967
Criterion X Grade level X FCAT  2  .78  .39  .13  .875
Error within  118  345.31  2.93
Note. Dashes indicate not applicable. * p < .05. Grade level included three levels: elementary, middle, and high school. FCAT 04 consisted of two levels: high and low scoring. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the effort items 15, 16, 60, and 61.

The results of the split-plot analysis for ability showed a significant criterion X grade level X FCAT 04 interaction, F(2,120) = 5.00, p = .008, a significant main effect for criterion, F(1,120) = 12.90, p = .000, and a significant main effect for FCAT 04, F(1,120) = 7.13, p = .009, indicating that low scoring children felt that amount of ability placed more limitations on improvement for the FCAT than for report card grades. For high scoring children, a developmental pattern was observed. Elementary school children felt that amount of ability placed more limitations on improvement of grades than FCAT. However, at higher grade levels, the opposite was the case: children felt amount of ability imposed greater limitations on FCAT than grades. The results of the split-plot analysis for effort showed a significant main effect for criterion, F(1,118) = 8.66, p = .004. Children felt that effort played a greater role in improving report card grades than FCAT scores. There were no other significant effects.
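The criterion main effect reported above is, at bottom, a within-subjects comparison of each student's two ratings (report card versus FCAT). The sketch below illustrates that contrast with simulated data, not the study's; a paired t-test stands in for the full split-plot ANOVA, which with balanced data gives the same within-subjects comparison for the criterion factor.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 120  # close to the study's within-subjects error df of ~117

# Simulated composite control scores: each student rates control over
# report card grades and over FCAT scores (the within-subjects factor).
report_card = rng.normal(loc=15.0, scale=4.0, size=n)
fcat = report_card - 2.0 + rng.normal(scale=3.0, size=n)  # built-in deficit

# The criterion main effect corresponds to a paired comparison of the
# two repeated measures taken on the same students.
t_stat, p_value = stats.ttest_rel(report_card, fcat)
print(t_stat > 0, p_value < .05)
```

A positive t with a small p mirrors the dissertation's finding that perceived control is higher for report card grades than for FCAT scores.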


The Self-Serving Bias and CIV

Twenty-four survey questions divided between high and low scoring students for report card and FCAT grades were used to assess the internal and external attributions of students. Participants were asked about how much they studied, what they studied, how smart they were, how much the teacher or someone else helped them to prepare, and how difficult the work was. Responses to these items would indicate the existence of a self-serving bias, which might be a potential source of CIV.

Internal and External Attributions about Performance

It was expected that high scoring children would make more internal attributions than low scoring children about their performance on both the FCAT and in the classroom and, conversely, that low scoring children would make more external attributions than high performing children about their performance on the FCAT and classroom grades.

Descriptive Statistics

Means and standard deviations are presented in Table 30 below. Mean scores indicate that high scoring students make more internal effort and ability attributions for report card grades and for FCAT scores than low performing students. However, high scoring students also make more external attributions for report card grades and for FCAT scores than low scoring students, with the exception of items 27 and 50 (The work was easy) and items 33 and 56 (The work was hard). Here mean scores of low scoring students were slightly higher than those of high scoring students for report card grades, while slightly lower for low performing students than high scorers for FCAT. High scoring students appear to attribute intelligence to success on the FCAT more than for


report card grades, while low scoring students reject the idea that poor performance is a result of low aptitude. Mean scores across grade levels are reported in Tables 31-33 below. These scores reveal that, overall across grade levels, high scoring children made more internal and external attributions than their low scoring counterparts for both report card grades and FCAT scores. High scoring elementary school students, as compared to their low scoring counterparts, made the greatest internal and external attributions for performance of the grade levels surveyed for both report card grades and FCAT scores.

Table 30. Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Students for Performance on Report Card and FCAT

Variable (Report Card: M, N, SD | FCAT: M, N, SD)
High Scoring Students
You studied a lot  2.64 91 1.19 | 2.34 88 1.35
You studied the right things  2.76 91 1.17 | 2.51 87 1.27
You are smart  2.74 91 1.25 | 3.08 86 .95
Teacher explained well*  2.76 91 1.16 | 2.51 86 1.23
Someone helped you*  1.97 90 1.32 | 1.79 85 1.37
The work was easy*  2.16 91 1.14 | 2.46 84 1.32
Low Scoring Students
You studied a lot  2.25 44 1.38 | 1.59 34 1.33
You studied the right things  1.48 44 1.37 | 1.85 33 1.25
You are smart  1.05 43 1.29 | 1.03 33 1.24
Teacher didn't explain well*  1.64 44 1.30 | 1.53 32 1.24
No one helped you*  1.45 44 1.36 | 1.42 33 1.20
The work was hard*  2.34 44 1.40 | 2.00 33 1.37
Note. * indicates items which are external attributions; non-starred items are internal attributions.

In middle school there appears to be a greater difference in the means of internal attributions and external attributions for high and low scoring children, but the pattern


remained the same; i.e., high scoring children made more internal and external attributions than low scoring children. Finally, in high school, the pattern of relationship changes. High scoring students made more internal and external attributions for report card grades, but low scoring students made more internal and external attributions for FCAT. However, while this difference does not appear to be very large for the low scoring children, it does suggest that when children reach high school, they become more cognizant of the consequences of performance, particularly when failure threatens to deny them a diploma.

Table 31. Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Elementary School Students for Performance on Report Card and FCAT

Variable (Report Card: M, N, SD | FCAT: M, N, SD)
High Scoring Students
You studied a lot  3.04 28 1.20 | 2.90 30 1.27
You studied the right things  2.71 28 1.30 | 3.17 30 1.05
You are smart  2.96 28 1.32 | 3.20 30 1.10
Teacher explained well*  2.46 28 1.17 | 2.57 30 1.38
Someone helped you*  1.89 28 1.50 | 1.80 30 1.54
The work was easy*  2.54 28 1.04 | 2.45 29 1.43
Low Scoring Students
You didn't study much  2.22 23 1.28 | 1.29 17 1.31
You didn't study the right things  1.52 23 1.34 | 1.65 17 1.34
You are not smart  .95 23 1.33 | 1.00 17 1.33
Teacher didn't explain well*  2.00 23 1.27 | 1.53 17 1.23
No one helped you*  1.22 23 1.31 | 1.18 17 1.01
The work was hard*  2.70 23 1.40 | 2.29 17 1.40
Note. * indicates items which are external attributions; non-starred items are internal attributions.
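Per-group means, counts, and standard deviations like those in Tables 30-33 can be produced with a groupby aggregation. The toy data below is hypothetical and only illustrates the computation, not the study's responses.

```python
import pandas as pd

# Toy attribution responses for one item ("You studied a lot"),
# split by high/low scoring group; all values are illustrative only.
df = pd.DataFrame({
    "group": ["high", "high", "high", "low", "low", "low"],
    "studied_a_lot": [3, 2, 3, 1, 2, 2],
})

# Mirrors the M / N / SD columns of the descriptive tables.
summary = df.groupby("group")["studied_a_lot"].agg(["mean", "count", "std"])
print(summary.round(2))
```

Repeating this per item, per criterion (report card versus FCAT), and per grade level reproduces the layout of the descriptive tables above.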


Table 32. Means and Standard Deviations for Internal and External Attributions of High and Low Scoring Middle School Students for Performance on Report Card and FCAT

Variable (Report Card: M, N, SD | FCAT: M, N, SD)
High Scoring Students
You studied a lot  2.39 33 1.12 | 3.00 25 .96
You studied the right things  2.67 33 1.16 | 3.00 24 .84
You are smart  2.33 33 1.36 | 3.09 23 .79
Teacher explained well*  3.00 28 1.17 | 2.57 30 1.38
Someone helped you*  1.94 28 1.50 | 1.80 30 1.54
The work was easy*  1.91 28 1.04 | 2.45 29 1.43
Low Scoring Students
You didn't study much  2.09 11 1.64 | 1.00 8 .93
You didn't study the right things  1.27 11 1.56 | 1.00 7 1.00
You are not smart  1.09 11 1.22 | .57 7 .79
Teacher didn't explain well*  1.18 11 1.33 | 1.14 7 1.22
No one helped you*  1.73 11 1.49 | 1.00 7 .82
The work was hard*  1.82 11 1.54 | .86 7 .90
Note. * indicates items which are external attributions; non-starred items are internal attributions.

Table 33. Means and Standard Deviations for Internal and External Attributions of High and Low Scoring High School Students for Performance on Report Card and FCAT

Variable (Report Card: M, N, SD | FCAT: M, N, SD)
High Scoring Students
You studied a lot  2.53 30 1.20 | 1.33 33 1.05
You studied the right things  2.90 30 1.06 | 1.55 33 1.06
You are smart  2.97 30 .93 | 2.97 33 .92
Teacher explained well*  2.77 30 1.17 | 2.12 33 1.17
Someone helped you*  2.07 30 1.23 | 1.67 33 1.16
The work was easy*  2.10 30 1.02 | 2.75 33 1.27
Low Scoring Students
You didn't study much  2.50 10 1.43 | 2.67 9 1.12
You didn't study the right things  1.60 10 1.35 | 2.89 9 .93
You are not smart  1.20 10 1.40 | 1.44 9 1.59


Table 33. Continued

Variable (Report Card: M, N, SD | FCAT: M, N, SD)
Teacher didn't explain well*  1.70 10 1.34 | 1.88 8 1.36
No one helped you*  1.70 10 1.34 | 2.22 9 1.48
The work was hard*  2.10 10 1.10 | 2.33 9 1.23
Note. * indicates items which are external attributions; non-starred items are internal attributions.

Analysis

These results were confirmed with a series of two-way between-subjects ANOVA designs that were conducted on the attribution data for both report card and FCAT scores. There were two between-subjects factors, grade level (elementary, middle, and high school) and FCAT 04 (high and low). Both report card and FCAT attribution scales were analyzed item by item because the reliability of the first scale was not sufficiently high and because previous research suggested that ability attribution items are conceptually distinct from effort attribution items (Passer, 1977). In order to control for inflation of the Type 1 error, the Bonferroni adjustment was applied, α = .008. Summary ANOVA tables are presented in Appendix B, Tables 56-67, for each pair of items for high and low performers and for each performance category, i.e., report card grades and FCAT.

The first three items on each scale measured internal attributions, both effort and ability. The first two groups asked about time spent studying as a possible attribution for performance: for report card items 22/28 ("You studied a lot"; "You didn't study much") there were no significant interactions or main effects, i.e., both high and low performing students were neutral in their attributions about how much they studied (Table 56). However, for corresponding FCAT items 45/51 (You studied a lot; You didn't study much), Table 62, there was a significant grade level X FCAT 04 interaction, F(2,116) = 18.84, p = .000, and a significant main effect for FCAT 04, F(1,116) = 10.19, p = .002. The second group of questions for both groups asked about material studied. For


report card items 23/29 (You studied the right things; You didn't study the right things), Table 57, there was a significant main effect for FCAT 04, F(1,129) = 28.89, p = .000, but no significant main effect for grade level or grade level X FCAT 04 interaction. For parallel FCAT items 46/52 (You studied the right things; You didn't study the right things), Table 63, there was a significant grade level X FCAT 04 interaction, F(2,114) = 21.24, p = .000, and a significant main effect for FCAT 04, F(1,114) = 10.49, p = .002. For ability items 24/30 (You are smart; You are not smart), Table 58, for report card and parallel FCAT ability items 47/53, Table 64, there were significant main effects for FCAT 04, F(1,128) = 47.85, p = .000 and F(1,113) = 86.24, p = .000, respectively.

External items addressed three areas: teacher help, help from others, and degree of task difficulty. For external items regarding teacher assistance, report card items 25/31 (The teacher explained things well; The teacher didn't explain things well), Table 59, revealed a significant grade level effect F(2,129) = .113, p = .002, while for corresponding FCAT items 48/54 (Table 65), there was a main effect for FCAT 04, F(1,112) = 15.48, p = .000. For the second area, addressing help from others, on report card items 26/32 (Someone helped you; You weren't helped by anyone), Table 60, there was a significant grade level X FCAT 04 interaction F(2,128) = .323, p = .005; for corresponding FCAT items 49/55 (Table 66) there were no significant main effects or interactions. Finally, as to degree of task difficulty, for report card items 27/33 (Table 61) and the corresponding FCAT items 50/56, there were no significant main effects or interactions.

Comparing Internal and External Attributions between Report Card and FCAT


Our second specific research hypothesis posited that low performing children would make more external attributions for FCAT than for classroom grades and that high performing children would make more internal attributions for FCAT than for grades.

Descriptive Statistics

Descriptive statistics are presented in Table 30 above. Mean scores for low performing students on items capturing external attributions (The teacher didn't explain; No one helped you; and The work was hard), Table 30, were lower for FCAT than for report card scores. However, mean scores for high performing students on items capturing internal attributions were mixed: for effort items (You studied a lot and You studied the right things), mean scores were higher for report card grades than FCAT, while for the ability item (You are smart), the mean score was higher for FCAT than for report card.

Mean scores by grade level are reported in Tables 31-33 above. For low performing elementary and middle school students, mean scores for external attribution items (Teacher didn't explain; No one helped you; and The work was hard) were higher for report card grades than for FCAT, while for low performing high school students, mean scores for these external items were higher for FCAT. Differences at all three levels were negligible. However, mean scores for high performing students were more consistent with what was predicted. High scoring elementary and middle school students had higher mean scores for internal attribution items (You studied a lot; You studied the right things; and You are smart) for FCAT than for grades, but at the high school level high scoring students' mean scores were lower for internal FCAT effort items (You studied a lot; You studied the right things), and equal for the ability item (You are smart).

Analysis


Again, these results were confirmed with a series of two-way between-subjects ANOVA designs that were conducted on the attribution data for both report card grades and FCAT scores. There were two between-subjects factors, grade level (elementary, middle, and high school) and criterion (report card grade and FCAT). Both report card and FCAT attribution scales were analyzed item by item. In order to control for inflation of the Type 1 error, the Bonferroni adjustment was applied, α = .008. Summary ANOVA Tables 68-79 are presented in Appendix C for each pair of items for high and low performers and for each performance category, i.e., report card grades and FCAT.

The first three items on each scale measured internal attributions for both effort and ability on report cards and FCAT. For high scoring children responding to effort items 22/45 (You studied a lot), Table 68, there was a grade level X criterion interaction, F(2,173) = 9.56, p = .000; there was also a significant main effect for grade level, F(2,173) = 13.51, p = .000. For low scoring students responding to parallel items 28/51 (You didn't study much), Table 74, there were no main effects or interactions. For high scoring children responding to effort items 23/46 (You studied the right things), Table 69, there was both a grade level X criterion interaction, F(2,172) = 12.70, p = .000, and a significant main effect for grade level, F(2,172) = 7.46, p = .001. For low scoring children responding to corresponding items 29/52 (You didn't study the right things), Table 75, there were no main effects or interactions. For ability items 24/47 (You are smart), Table 70, and parallel items 30/53 (You are not smart), Table 76, there were no significant main effects or interactions. The first group of external items asked students to respond to how well the teacher explained


things well). For high scoring students' external items 25/48 (The teacher explained things well), Table 71, and for low scoring students' items 31/54 (The teacher didn't explain things well), Table 77, there were no significant main effects or interactions. For both high and low scoring students' external items 26/49, Table 72, and, correspondingly, 32/55 (Someone helped you / No one helped you), Table 78, there were no significant main effects or interactions. Finally, for both high scoring students on external items 27/50 (The work was easy), Table 73, and for low scoring students on items 33/56 (The work was hard), Table 79, there were no significant main effects or interactions.

Self-Serving Attributions and CIV

The third specific research hypothesis concerned whether students' self-serving attributions were a potential source of construct-irrelevant variance. More specifically, I predicted that there should be a relationship between children's FCAT scores in 2005 and their attributions about their scores in 2004, while controlling for performance in 2004, and that these relationships would vary across grade level. These hypotheses were tested with a series of regression analyses with individuals' raw scores for FCAT 2005 as the dependent variable and children's scores on each attribution item, grade level, and raw score for FCAT 2004 as the independent variables. Regression results are presented in Tables 34-45 below. In order to control for inflation of the Type 1 error, the Bonferroni adjustment was applied, α = .008.

Table 34. Regression Analysis for High Scoring Children's Responses to FCAT Question "You studied a lot" (i45)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  157.054  30.327  5.179  .000
fcat04  .568  .079  .664  7.186  .000
Grade level  -.340  3.909  -.008  -.087  .931
i45  -.656  2.518  -.025  -.260  .795
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .45.


Table 35. Regression Analysis for High Scoring Children's Responses to FCAT Question: "You studied the right things" (i46)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  171.097  30.967  5.525  .000
fcat04  .542  .078  .645  6.912  .000
Grade level  -.898  3.994  -.023  -.225  .823
i46  -1.939  2.807  -.071  -.691  .492
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .44.

Table 36. Regression Analysis for High Scoring Children's Responses to FCAT Question "You are smart" (i47)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  158.093  26.646  5.933  .000
fcat04  .552  .078  .656  7.037  .000
Grade level  .434  3.617  .011  .120  .905
i47  .629  3.270  .017  .192  .848
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .44.

Table 37. Regression Analysis for High Scoring Children's Responses to FCAT Question "The teacher explained things well" (i48)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  150.871  27.502  5.486  .000
fcat04  .561  .077  .667  7.283  .000
Grade level  .657  3.591  .017  .183  .855
i48  2.131  2.384  .077  .894  .374
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .45.

Table 38. Regression Analysis for High Scoring Children's Responses to FCAT Question "Someone helped you" (i49)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  162.821  26.707  6.097  .000
fcat04  .551  .078  .656  7.103  .000
Grade level  .335  3.604  .009  .093  .926
i49  -1.058  2.114  -.043  -.500  .618
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .44.
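The α = .008 threshold cited in the table notes is consistent with a Bonferroni correction of the family-wise .05 level over six item-level tests per family (.05 / 6 ≈ .0083); that divisor is an inference from the six item pairs tested, not stated explicitly in the text.

```python
# Bonferroni adjustment: split the family-wise alpha across the number of
# item-level tests. Six tests per family is an assumption that reproduces
# the .008 threshold used in the table notes.
family_alpha = 0.05
n_tests = 6
adjusted_alpha = family_alpha / n_tests
print(round(adjusted_alpha, 4))  # 0.0083
```

Any single item-level effect is then declared significant only when its p value falls below this adjusted threshold, which keeps the chance of at least one false positive across the family near .05.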


Table 39. Regression Analysis for High Scoring Children's Responses to FCAT Question "The work was easy" (i50)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  163.705  26.751  6.120  .000
fcat04  .527  .084  .626  6.257  .000
Grade level  .733  3.620  .019  .202  .840
i50  1.910  2.380  .075  .802  .425
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .44.

Table 40. Regression Analysis for Low Scoring Children's Responses to FCAT Question "You didn't study much" (i51)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  187.087  46.611  4.014  .000
fcat04  .465  .161  .489  2.890  .007
Grade level  -.139  7.983  -.003  -.017  .986
i51  -6.285  5.294  -.215  -1.187  .245
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .35.

Table 41. Regression Analysis for Low Scoring Children's Responses to FCAT Question "You didn't study the right things" (i52)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  157.538  57.802  2.725  .011
fcat04  .558  .195  .579  2.860  .008
Grade level  -5.462  8.634  -.121  -.633  .532
i52  1.347  6.676  .044  .202  .842
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .30.

Table 42. Regression Analysis for Low Scoring Children's Responses to FCAT Question "You are not smart" (i53)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  183.052  45.634  4.011  .000
fcat04  .486  .157  .505  3.091  .005
Grade level  -3.263  7.243  -.072  -.451  .656
i53  -6.407  5.139  -.202  -1.247  .223
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .34.


Table 43. Regression Analysis for Low Scoring Children's Responses to FCAT Question: "The teacher did not help you" (i54)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  172.086  51.463  3.344  .003
fcat04  .517  .170  .537  3.037  .005
Grade level  -4.595  7.793  -.098  -.590  .561
i54  -1.565  5.693  -.048  -.275  .785
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .30.

Table 44. Regression Analysis for Low Scoring Children's Responses to FCAT Question: "You weren't helped by anyone" (i55)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  175.470  45.359  3.868  .001
fcat04  .508  .157  .528  3.236  .003
Grade level  -2.108  7.753  -.047  -.272  .788
i55  -5.094  5.647  -.154  -.902  .375
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .32.

Table 45. Regression Analysis for Low Scoring Children's Responses to FCAT Question: "The work was hard" (i56)

Model  Unstandardized beta  Std. error  Standardized beta  t  p
Constant  193.665  53.835  3.597  .001
fcat04  .464  .172  .482  2.695  .012
Grade level  -4.442  7.261  -.098  -.612  .546
i56  -4.592  5.007  -.162  -.917  .367
Note. * p < .008. Grade level included three levels: elementary, middle and high school. FCAT 04 was the scale score for FCAT 04. The dependent variable was scale score for FCAT 05. R2 = .32.

For high scoring students on FCAT 2004, for responses to items 45-50, there were no significant main effects for attributions. Overall, previous FCAT performance (i.e., FCAT 2004) was the greatest predictor of performance on FCAT 2005; i.e., the only significant main effect was for FCAT 2004. The same was true for low scoring children on FCAT 2004 for responses to items 51-56.
Again, previous performance was the greatest predictor of FCAT 2005 performance.
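Each regression above fits FCAT 2005 on an intercept, the FCAT 2004 scale score, grade level, and one attribution item. A minimal ordinary-least-squares sketch of that structure follows; the data are simulated and every coefficient is illustrative, not the dissertation's.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 90

# Simulated predictors mirroring the reported model; all values hypothetical.
fcat04 = rng.normal(300.0, 30.0, n)           # FCAT 2004 scale score
grade = rng.integers(0, 3, n).astype(float)   # 0/1/2 = elem/middle/high
item = rng.integers(0, 5, n).astype(float)    # attribution item response

# FCAT 2005 driven mostly by FCAT 2004, as the tables above report.
fcat05 = 150.0 + 0.55 * fcat04 + rng.normal(0.0, 20.0, n)

# Ordinary least squares: FCAT05 ~ intercept + FCAT04 + grade + item.
X = np.column_stack([np.ones(n), fcat04, grade, item])
beta, *_ = np.linalg.lstsq(X, fcat05, rcond=None)
print(beta.round(2))  # the FCAT04 slope (beta[1]) sits near 0.55
```

Because the simulated outcome depends only on FCAT 2004, the grade-level and item coefficients hover near zero, which is the same qualitative pattern the regression tables report: prior FCAT performance carries essentially all of the predictive weight.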


CHAPTER 5
DISCUSSION AND RECOMMENDATIONS

This study examined the effect of students' self-serving biases on performance both on the FCAT and in the classroom. This chapter presents a discussion and interpretation of the results as they relate to the original research hypotheses and concludes with recommendations for future research.

Overview of the Findings

The purpose of this research was to determine if attributional patterns would be a source of construct-irrelevant variance in high-stakes testing and thereby affect the validity of test scores and their interpretations. To ascertain this, it was necessary to show 1) that participants understand the seriousness of consequences associated with poor performance on the FCAT as opposed to in the classroom; 2) that they perceive possessing a greater level of control over classroom performance than high-stakes tests; and 3) that these conditions would adversely affect their attributions about current and future performance on high-stakes tests. Of secondary interest was the influence of age on responses and outcomes.

One hundred sixty children, grades four through seven, nine, and ten, recruited from a university-affiliated developmental research school located in North Central Florida, were individually surveyed. The children completed a sixty-six item questionnaire that measured reactions to doing poorly on report card (items 2-11) versus FCAT (items 35-44), beliefs about ability to improve report card grades (items 12-21) and FCAT scores


(items 57-63), attribution scales for report card grades (items 22-33) and FCAT scores (items 45-56), and comparisons of beliefs about changing classroom grades versus FCAT scores (items 63-66). In addition, respondents were asked to indicate their last report card grades and 2004 FCAT scores. These grades and scores for FCAT 2004 and FCAT 2005 were also obtained from the school.

Initially, the results of the composite scores for the feeling-poorly scale indicated that children would feel equally as bad about poor performance on the FCAT and in the classroom. However, it was determined, based on the nature of the item correlations and after examination of the substantive nature of the items, that it was more appropriate to evaluate the items separately. Inspection of individual items revealed higher levels of embarrassment, fright, and negative peer pressure among both high and low scoring students across grade levels for poor FCAT performance, but higher levels of parent disappointment for poor classroom performance. Consequences of poor performance, most notably retention, would be greater for FCAT than for classroom performance.

Next, although high scoring children believed they would have more control in general over both FCAT and classroom performance, children regardless of grade level and ability level reported that they perceived a higher level of control over their report card grades than their FCAT scores. They differed in the strength of their feelings and beliefs about their ability to control grades and scores, as well as in their beliefs about the influence of effort and ability in controlling performance. Low performers strongly believed that level of ability placed more limitations on potential for success on the FCAT than in the classroom.

Third, an examination of attributions about performance in both arenas revealed that overall high scoring students made more internal and external attributions than did


low scoring students for both FCAT and classroom performance. This is a departure from what is reported in the self-serving bias literature, which indicates high scoring children should make more internal attributions and low scoring children more external attributions (Campbell & Sedikides, 1999). Finally, children's attributions about previous performance had no significant effect on their future performance.

Some developmental differences were also found with fright. Levels of fright were strongest for high school students for FCAT, an indication that at these grade levels, students are more aware of negative consequences of low scores. Furthermore, their ability attributions for report card grades increased with age, while the same attributions for FCAT performance decreased. Finally, fear of retention as a consequence of poor performance on the FCAT was highest for elementary school students.

Discussion of Results

Beliefs about Performance

The first question of interest was whether children would report that the consequences of doing poorly on the FCAT would be more serious than for doing poorly in the classroom. Upon examining individual scale items, it was found that students experienced high levels of negative feelings about themselves (embarrassment and fright) and social pressure from their parents in this study, even though analyses of the composite scales for both report card and FCAT were not significant. Overall, both high and low scoring students were aware that certain consequences would be associated with poor performance and would care more about failure on the FCAT than on report card grades. The results of these analyses confirmed those of earlier researchers whose findings suggest greater threat, fear, and anxiety are associated with high-stakes tests like FCAT (Paris et al., 1991). Of the feelings explored in the present study, students reported that


poor performance in general would elicit sadness, but there were no differences between report card and FCAT. However, the level of fright students reported experiencing was higher for high school students for FCAT. More specifically, this feeling of fright could be associated with fear of retention, since our results also found that fear of retention for FCAT over report card was strongest at the elementary level, where the threat of being held back is perhaps most salient because of third-grade retention, and at the high school level, where students realize passing the FCAT is necessary for graduation. These findings are in agreement with Karmos and Karmos (1984), who found that a large proportion of students exhibit fear when taking achievement tests, resulting in reduced effort by many. Other researchers (Jimerson et al., 2002; Byrnes & Yamamoto, 1986) have also reported on the increases of stress and anxiety among students who fear retention when they do not pass a high-stakes test.

A majority of students in this study, regardless of their ability levels, also reported they would feel more embarrassed about performing poorly on the FCAT. These results are consistent with previous research in motivation and emotion, where researchers have emphasized that emotions such as embarrassment and shame are affective states that reflect awareness of social standards and the concerns of others about them. In one study, Paris et al. (1991) surveyed high school students about low test scores. These students indicated people "would think they were stupid" if they received a low score on a test and were concerned that other students knew their test scores. The media has also recorded the emotions of students who drop out due to the embarrassment caused by exit exams. The National Center for Fair and Open Testing Coalition for Authentic Reform (2002) quoted a student from Massachusetts who said, "A lot of students who don't pass MCAS
(Massachusetts high stakes exam) are going to drop out next year... MCAS makes students think they are stupid" (Widening Gaps, Growing Vulnerability, p. 14). In the present study, false survey reports of FCAT scores and/or report card grades by several tenth grade students substantiate the high levels of embarrassment associated with failure.

On the other hand, some interesting results emerged suggesting that in some instances children's concerns are more severe for poor classroom performance. Overall, most children acknowledged that their parents would be more disappointed by poor grades than by poor FCAT scores. These results are in keeping with research suggesting a high level of attachment to family members and a desire to please them (McDevitt & Ormrod, 2002). They are further supported by the responsibility literature, which suggests that adults are often more understanding when children fail because of something beyond their control, such as an FCAT test. In other words, parents would be more upset by children's classroom failures if they perceive that effort dictates classroom grades more than FCAT scores (Juvonen, 2000). Although these few instances of feeling worse about certain aspects of report card grades exist, the overall results support Hypothesis 1, which states that children report the consequences of poor performance on the FCAT to be more serious than those of poor performance in math courses.

Beliefs about Controllability

The second research question addressed whether children differ in their beliefs about the degree of control they have over improving their FCAT math scores versus their classroom math grades. Results of this study reveal that while children were optimistic that they could improve both their grades and FCAT scores, overall they believed that
they could control their report card grades to a greater degree. In fact, 77% of all respondents indicated that it was easier to improve report card grades than FCAT scores. This was true across both grade level and ability level, even though high performing children displayed a greater degree of optimism overall, i.e., for both report card grades and FCAT scores, than their low performing counterparts.

The thoughts and beliefs scales for both report card and FCAT in this study asked subjects to respond to items about the importance of math ability, intelligence, and effort as they relate to improving performance. Researchers report that beginning in elementary school, students start to make distinctions between effort and ability as possible causes of failure and success. They attribute their success to effort and hard work and so are willing to persevere even when they fail. However, as students progress through middle and high school, research suggests that they attribute success and failure to ability, a factor that is beyond their control, and therefore they become more discouraged by poor performance. In fact, researchers who deal with high school age subjects note their tendency to recognize that ability, more than effort, plays an important role in successful performance (Covington, 1992). Thus children gradually develop expectations for their future performance. They either attribute their accomplishments to their own ability and effort or, when they frequently fail despite their best efforts, come to think that success is beyond their control, perhaps due to lack of ability (Covington, 1987). On the basis of this research, I examined effort and ability items separately, and some interesting patterns emerged.

Children in our study believed that they could improve their report card grades more than their FCAT scores by putting forth greater effort. Their responses indicated
that studying harder and preparing more would yield better results in school than on standardized tests, a result that is consistent with previous research on teacher attitudes and classroom grading. Several researchers have examined the role of the teacher in influencing student attitudes about performance. Covington (1984) notes that teachers recognize hard work and reward student persistence because they presume that effort is under a student's control. Thus teacher attitudes toward effort lead children to internalize a work ethic, and, given teacher reinforcement patterns, children come to value effort as a source of self-worth. According to Stiggins et al., children also appear to learn as they progress through school that effort is a large component of classroom grades and that classroom performance is controllable. Students understand early on that grades depend to a large extent on completing assignments, and they further recognize that teachers reward students for effort even if actual performance scores are low. This, however, is not the case with high stakes tests; in fact, one of the major aims of the accountability movement has been to remove student-recognized teacher bias in assessing student progress.

Results for ability items varied depending on previous performance. Overall, low scoring students in our study felt that their level of ability placed greater limitations on their potential to perform well on the FCAT than it did for their report card grades. As previous research suggests, as failing children become more exposed to the pressures of school, they develop a pattern of learned helplessness: when children continue to fail, they experience shame and humiliation, and many fail to persist (Burhans & Dweck, 1995). Our results seem to suggest that students believe the FCAT is a greater measure of ability, a stable characteristic, than of effort, and therefore, unlike classroom performance
where the outcome for low performers in particular is at the teacher's mercy, is beyond their control. Perhaps this explains what our subjects are experiencing.

For high scoring students there is a developmental trend for ability that is in line with previous research indicating that younger children believe effort is a more reliable predictor of performance than ability, while for older children ability is more important than effort in predicting performance (Nicholls & Miller, 1983). Elementary school children in this study felt that their level of ability placed greater limitations on improvement of report card grades than of FCAT scores. Although at first these results appear puzzling, examination of previous research suggests that young children do not necessarily differentiate between ability and effort. According to Covington (1986), the psychological equivalence of effort and ability occurs for several reasons. Of foremost importance, younger children believe that those who try harder are smarter than those who do not (Nicholls, 1979; Stipek, 1998). Children at this age view effort and ability as being the same, so that high effort is an indicator of ability. Secondly, younger children also believe that increases in effort can lead to increased ability. As a result, they view ability, as well as effort, as within their control. Finally, the main reason, as Covington (1986) suggests, is immaturity in children's information processing systems. Thus in this study, children view ability, which they see as linked to effort, as more detrimental to classroom performance, a context in which grades are linked to effort, than to FCAT performance.

However, at higher grade levels, children begin to change their perceptions about ability. High scoring students in this study indicated their belief that innate ability, more than study or preparation, would determine success on the FCAT to a greater degree than
in the classroom. Older children view ability as a stable, unchanging factor. As researchers (Chapman & Skinner, 1989; Nicholls, 1979) point out, older children recognize that ability can often compensate for lack of effort on a given task. In fact, when students perceive that their ability is limited, they frequently expend minimal effort because they believe that someone who is not smart can only do so well. Given these attitudes, it is reasonable to expect that they would recognize the role of ability in performance on a one-time test as opposed to a classroom situation in which a grade is the result of a series of assignments that include both effort and ability components.

In addition, these results are in keeping with self-worth theory, which stresses ability perceptions as the primary activators of achievement. Specifically, self-worth theory focuses on the need for success and the avoidance of failure, which causes a sense of worthlessness and social disapproval. It is widely known in our society that a person's worth depends on his accomplishments. Because ability is seen as a necessary element of success and low ability as a primary cause of failure, self-perceptions of ability play an important role in an individual's self-definition.

Beliefs about the Self-Serving Bias: Internal and External Attributions about Performance

The third research question addressed whether high scoring children would make more internal attributions about their performance than low scoring children on both their report card grades and FCAT scores, while low scoring children would make more external attributions. The attribution scales for both report card and FCAT scores were divided between high and low scoring students and asked children about how much they studied (internal), what they studied (internal), how smart they were (internal/ability), how much the teacher helped them (external), and how difficult the work was (external).
Results revealed that, for the most part, high scoring students in this study made more internal attributions for effort for report card grades than low scoring students. These students indicated that studying the right things is important in making good grades in the classroom, while low scoring students seemed to believe that what they studied did not account for their low performance, even though high and low scoring children did not seem to differ in their causal beliefs about studying a lot to achieve success in the classroom. As research has shown, the most successful students are those who have a propensity to employ meta-cognitive strategies that guide their studying habits. For example, these students realize that the amount of study time invested is not directly related to performance; that is, performance also depends on the ability to employ the correct strategies and skills necessary to perform a specific task (Clifford, 1984). Low scoring students, on the other hand, do not employ these meta-cognitive strategies, and this may explain why in this study they failed to differentiate between studying a lot and studying the right things.

For the most part, effort attributions reported by students for FCAT performance paralleled those for classroom grades. Whereas low scoring elementary and middle school students once again disagreed that poor performance was a result of lack of effort, high scoring elementary and middle school students felt that studying a lot played an important role in FCAT success. They further indicated that studying the right things was important in making high FCAT scores. This is in keeping with previous research suggesting that younger children believe peers value diligence and hard work and thus manage the impressions they make on others by attributing their high test performance to working hard (Juvonen, 2000). However, for high school students, the results were reversed. High
scorers indicated that their performance was dictated not by how much they studied or what they studied but, as will be discussed later in this section, by their perceived high level of ability. As Covington (1984) suggests, by late middle school, students begin to devalue effort because of threats to their self-worth that result from their realization that effort is no longer the "supreme virtue." Low scoring high school students in this study indicated that they had neither studied a lot nor studied the right things. Covington, in his 1984 research, refers to effort as a "double-edged sword," especially for older students. Increased effort on their part elicits praise and approval from teachers, thereby reducing the guilt associated with low effort, but high effort combined with failure reinforces students' suspicions of low ability. Thus students with a history of failure must balance the level of effort they expend so as to garner teacher approval while protecting themselves from the humiliation that results from trying hard and failing anyway.

Previous research findings indicate a high correlation between ability and self-worth. Perceptions of high ability imply worthiness, even in the absence of accomplishments. In addition, these perceptions influence worth indirectly, suggesting that ability enhances performance. Individuals define themselves in terms of their successes and failures; thus they see ability as a primary component of success and its absence as a cause of failure (Covington, 1984). Our results support previous findings. High scoring students in this study believe that their good grades in school are at least partially due to their high ability level. Low scoring children, however, disagree with the statement "you are not smart," thus indicating that in their minds ability has nothing to do with their poor performance in the classroom. This is supported by previous findings that individuals with a history of failure tend to act in ways that minimize the implication that
they lack ability. In fact, low scoring children in previous studies were even apt to describe themselves as lazy and unmotivated so as to keep from blaming poor performance on low ability (Covington, 1984). This pattern was exactly the same for FCAT.

Surprisingly, high scoring students also made more external attributions than low scoring students. Their responses demonstrated an overall belief that help from others contributed to some degree to their success in the classroom, even though their feelings about the influence of teacher help on classroom success were less clear. The importance of teacher assistance was most evident for middle school students. This might be in keeping with the trend in middle school to foster interpersonal relationships between students and their peers and between students and teachers. Thus students come to see interactions with teachers as of prime importance in achieving successful outcomes, since teachers focus to a large degree on improving their students' sense of self-worth even when it does not coincide with academic improvement (Midgley, 1995). Elementary school students, on the other hand, actually discounted help from others as a source of their success. These findings are in keeping with the optimistic attitudes that such young children exhibit about their own potential to perform well based on the effort and ability they put forth. In fact, young children in particular were unlikely to make many external attributions at all, for either outside help or task difficulty, so strong were their internal attributions (Normandeau & Gobeil, 1998). Contrary to previously reported research, low scoring students did not blame their poor classroom performance on lack of teacher help or help from others. Like their high scoring counterparts, low scoring middle school students actually credited teachers for assistance, while low scoring
elementary school students disagreed most strongly that outside help was instrumental to classroom success.

For FCAT, overall, regardless of grade level, high scoring students believed that the teacher helped them prepare for the test and that this help was influential in their performance. Low scoring students disagreed that lack of teacher help negatively impacted their performance. Across both ability levels, students uniformly discredited help from outside sources as a reason for performance on the FCAT. For FCAT, high and low scoring students were neutral in their feelings about task difficulty.

Beliefs about the Self-Serving Bias: Comparing Internal and External Attributions for Report Card and FCAT

The fourth research question addressed whether high performing students would make more internal attributions, and whether low performing students would make more external attributions, for FCAT than for classroom grades. The same attribution scales were used for both report card and FCAT performance as for the previous research question. Results of the study for high performing students revealed the following.

High performing middle school students indicated that studying a lot had a greater impact on FCAT performance than on classroom performance. This is supported by previous research by Midgley (1995), who pointed to the emphasis on self-esteem and emotional well-being in middle school classrooms, frequently at the expense of academic performance and achievement. In fact, when children move from elementary to middle school, Midgley reports, there were no results supporting an increase in perceptions of an emphasis on performance goals in the classroom. Thus we would expect that middle school students would be more focused on FCAT and the potential threats it presents.
However, elementary and high school students reported that expended effort had a larger impact on classroom performance than on FCAT. These results were surprising given that high performing students should be making greater internal attributions for FCAT because it is a high threat test. Thus students, especially at grade three, where retention is a direct result of failure, and at grade ten, where failure impacts future class assignments and ultimately graduation, should be more cognizant of the consequences of studying for the test. There are, however, several factors which might explain these unusual findings. At the elementary level, students value effort most highly, especially since it results in teacher reinforcement in the form of praise and ultimately high grades in the classroom (Covington, 1984). However, in spite of the fact that these children are aware of the threat of retention because of FCAT performance, they are more concerned with pleasing their teachers and parents, a situation that students in our study reported in their surveys. By the time high scoring students reach high school, they become aware that effort is not the most important contributor to success on a test such as FCAT and place more emphasis on ability. These students have had sufficient exposure to FCAT to realize that the tasks it tests are ability related. In addition, by high school, students realize that classroom grades result in the accumulation of credits necessary for graduation and that high grades result in rewards in the form of college acceptance and ancillary academic rewards. Because they already believe they possess sufficient ability to pass FCAT, students in our study reported that they spent little time studying a lot or studying the right things, since the threat of failing FCAT for them has long abated.

While high school students, as indicated above, believed that studying the right things resulted in higher grades in the classroom than on FCAT, both elementary and
middle school students said that studying the right things had a greater impact on FCAT performance than in the classroom. Their reactions are consistent with our previous discussion of the development of strategies by these students and the high level of threat they associate with FCAT as opposed to classroom grades.

Finally, high scoring students failed to report that being smart played a greater role in FCAT success than in report card grades. These findings were not consistent with research already reported about how children move from believing that effort and ability are the same to a realization that ability, more than effort, determines success and self-worth.

For low performing students, there were no differences in attributions for report card grades and FCAT scores. These students did not blame their poor FCAT performance on lack of help from teachers or outside sources and did not indicate that the task was especially difficult. In fact, they made no internal or external attributions for performance. This is inconsistent with previous attribution research, which suggests that failure situations lead to more self-serving attributions, allowing these students to self-protect. It appears that high performing students understand exactly what factors contribute to their success: they indicate that both their own effort and outside assistance affect performance. However, low performing students may either fail to understand reasons for poor performance that are internal or external to themselves or simply select responses that allow them to disengage.

Self-Serving Bias and CIV
Finally, the last research question addressed whether students' self-serving attributions were a potential source of CIV. It was predicted that there would be a relationship between children's FCAT scores in 2005 and their attributions about their scores in 2004, and that this relationship would vary across grade level. More specifically, it was expected that the attributions of low scoring children would be most affected by CIV; that is, for low scoring students making internal effort attributions, scores would increase, while for low scoring students making external attributions, scores would either decrease or remain the same. The portion of the attribution scales used for research questions three and four was again utilized. Performance measures, i.e., students' scale scores for FCAT 2004 and 2005, were used. Results of the study reveal that for both high and low performing students, previous performance on FCAT 2004 was the greatest predictor of success in 2005; that is, attributions did not make a difference in future performance.

For high scoring students, these findings were not particularly surprising in light of previous research suggesting that individuals' perceptions of the causes of success influence the quality of future performance (Weiner, 1986). The majority of students in this group performed successfully on both tests and indicated on surveys that, for the most part, they were making attributions in keeping with success outcomes and with previous research findings. Thus, because high scoring students made internal effort attributions for previous performance on the FCAT, it would be expected that they would continue to make high scores on FCAT 2005. It is interesting to note that high scoring participants in this study also made external attributions for successful performance. However, despite the fact that students believed both teacher help and task difficulty partly
contributed to performance, results suggest that these children still invested effort and strategies to maintain their high scores.

However, for low scoring students, results were inconsistent with our hypothesis that attributions would be a possible source of CIV which would differentially affect FCAT scores. This was not surprising, since low performing children in this study made neither internal nor external attributions; that is, there was very little variance in their attribution responses at all. Thus it appears that CIV as an explanation for poor performance is essentially unrealistic for this group of students.

The main question still unanswered by these findings is why low scoring children failed to make attributions and why they were optimistic about improving performance when they did not seem to understand what went wrong on the previous year's test. There are a number of reasons that might explain these results.

The first has to do with the timing of the survey. Responses about attributions for previous test performance were gathered almost a year after FCAT 2004 scores were returned but also two weeks before these same students were to take FCAT 2005. Attribution effects of test results could have dissipated and/or been contaminated, as children were asked to remember reasons for their performance on a test administered a year before, and during a time when teachers were preparing students for the upcoming test by enhancing their confidence for future performance through test preparation and motivational activities.

Another explanation is that results could be due to false reporting on the part of low performing students who do not want to reveal any information about their performance or their attributions for it. Examination of individual surveys revealed that several
students' grades and FCAT scores were not the same as what was provided by school administrators. In addition, a number of students gave neutral responses for every item, making neither internal nor external attributions. This behavior is in keeping with social desirability bias, in which individuals self-report incorrect information to protect their egos or manage the impressions they present to others (Paulhus & Reid, 1991).

Another reason for these results may be that children failed to understand why they were not successful. Low scorers reported with reasonable consistency that they studied a lot, studied the right things, and were smart enough to pass the test. In addition, they blamed neither their teachers nor others for their failures. It appears they were not thinking about how to improve or what went wrong, but simply assuming that they could improve somehow, since their responses indicated they felt relatively optimistic about improvement. Finally, results could also reflect the climate of a laboratory school that places little emphasis on the test itself.

However, the most important finding of the study was not that attributions were not found to be a source of CIV, but that low performing children did not report making any attributions at all. This is problematic, since previous research shows that effort attributions lead to improved performance. Without internal attributions, low scoring children have nothing to guide them in their preparation.

Recommendations for Future Research

Determining why low performing students failed to make any kind of attributions for poor performance on high stakes tests should be a prime focus for future research. This research should seek to discover whether these findings are an artifact of a laboratory school framed in an "A" school context or whether they generalize across the public school population. This could be easily determined by replicating this study in
other schools with different school grades. Future research should also seek to eliminate possible sources of confounding, such as the timing of survey administration and social desirability bias. This could be achieved by including a larger number of low performing respondents and by limiting the amount of time between when students receive FCAT results and when surveys are administered.

Findings of this study also lead to recommendations for further research on attribution training. Numerous studies in this area have documented improvement in academic achievement. For example, Dweck (1975) demonstrated that through training, children are able to change to effort attributions in both success and failure outcomes and to develop motivation to learn as well as academic persistence. Forsyth et al. (2002) also found support for educational interventions which were able to change students' self-serving attributions to performance-facilitating patterns that emphasized attributing failure to internal, controllable causes. There is also the question of the cause and effect relationship between improvement and performance. Observing the pattern of behaviors in poor performing individuals who were able to adjust their own attributions after failure may help to fill in some of these gaps, since there is a group of students who changed attributions without retraining. Although these techniques are viable, they should be uniformly administered to ensure that attributions do not become a source of CIV.
REFERENCES

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18).

Baumeister, R. F., Heatherton, T. F., & Tice, D. M. (1993). When ego threat leads to self-regulation failure: Negative consequences of high self-esteem. Journal of Personality and Social Psychology, 64, 141-156.

Bloom, Karen. The civil rights project at Harvard (FCAT). Retrieved May 5, 2005, from http://www.fcar.info/trend.gest/docs/bloom.htm

Burhans, K. K., & Dweck, C. S. (1995). Helplessness in early childhood: The role of contingent worth. Child Development, 66, 1719-1738.

Byrnes, D., & Yamamoto, K. (1986). Views on grade repetition. Journal of Research and Development in Education, 20, 14-20.

Campbell, W. K., & Sedikides, C. (1999). Self-threat magnifies the self-serving bias: A meta-analytic integration. Review of General Psychology, 3(1), 23-43.

Chapman, M., & Skinner, E. A. (1989). Children's agency beliefs, cognitive performance, and conceptions of effort and ability: Individual and developmental differences. Child Development, 60, 1229-1238.

Clifford, M. M. (1984). Thoughts on a theory of constructive failure. Educational Psychologist, 19, 108-120.

Covington, M. V. (1984). Motivation for self-worth. In R. Ames & C. Ames (Eds.), Research on motivation in education. New York: Academic Press.

Covington, M. V. (1986). Anatomy of failure-induced anxiety: The role of cognitive mediators. In R. Schwarzer (Ed.), Self-regulated cognitions in anxiety and motivation (pp. 247-263). Hillsdale, NJ: Lawrence Erlbaum Associates.

Covington, M. V. (1987). Achievement motivation, self-attributions, and the exceptional learner. In J. D. Day & J. G. Borkowski (Eds.), Intelligence and exceptionality. Norwood, NJ: Ablex.
98 Covington, M. V. (1992). Making the grade: A self-worth perspective on motivation and school reform. Cambridge, England: Cambridge University Press. Covington, M. V., & Omelich, C. L. (1981). As failures mount: Affective and cognitive consequences of ability demotion in the classroom. Journal of Educational Psychology, 73 (6), 796-808. Cronbach, L. J. (1988). Five perspectives on va lidity argument. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 3-17). Hillsdale, NJ: Erlbaum. Cronbach, L. J., & Meehl, P. E. (1955). C onstruct validity in psychological tests. Psychological Bulletin , 52 , 281-302. Crooks, T. (1988). The impact of classr oom evaluation practices on students. Review of Educational Research, 58(4) , 438-481. Duval, S. T., & Silvia, P. J. (2002). Self-a wareness, probability of improvement, and the self-serving bias. Journal of Personality and Social Psychology, 82 (1), 49-61. Dweck, C. (2000). Self-theories: Their role in motivation, personality and development. Essays in social psychology . Florence, KY: Psychology Press Dweck, C. S. (1975). The role of expectations and attributions in the alleviation of learned helplessness. Journal of Personality and Social Psychology, 31, 674-685. Dweck, C. S., & Bempechat, J. (1983). Childre nÂ’s theories of intelligence: Consequences for learning. In S. G. Paris, G. M. Olson, & H. W. Stevenson (Eds.), Learning and motivation in the classroom (pp. 239-256). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Dweck, C. S., & Goetz, T. E. (1978). Attribu tions and learned helplessness. In J. H. Harvey, W. Ickes, & R. F. Kidd (Eds.), New directions in attribution research (Vol 2, pp.157-179). Hillsdale, NJ: Lawrence Erlbaum Associates Inc. Forsyth, D. R., & McMillen, J. H. (1981). Attrib utions, affect and expectations: A test of WeinerÂ’s three-dimensional model. Journal of Educational Psychology, 73 , 393403. Glass, G. V. (1991). Politics, markets and American schools. Educational Researcher, 20, 24-27 . 
Goldhaber, D., & Hannaway, J. (2004). Accountability with a kicker: Observations on the Florida accountability plan. Phi Delta Kappan, 85, 598.

Gullickson, A. R. (1985). Student evaluation techniques and their relationship to grade and curriculum. The Journal of Educational Research, 79, 96-100.

Guskey, T. R. (1994). Making the grade: What benefits students? Educational Leadership, 52(2), 14-20.

Haladyna, T. M., & Downing, S. M. (2004). Construct-irrelevant variance in high-stakes testing. Educational Measurement: Issues and Practice, 17-27.

Hancock, D. R. (2001). Effects of test anxiety on students' achievement. Journal of Educational Research, 58, 47-77.

Heider, F. (1958). Psychology of interpersonal relations. New York: John Wiley and Sons, Inc.

Hill, K. T., & Sarason, S. B. (1996). The relation of test anxiety and defensiveness to test and school performance over the elementary school years. Paper presented at the Monographs of the Society for Research in Child Development.

Hill, K. T., & Wigfield, A. (1984). Test anxiety: A major educational problem and what can be done about it. Elementary School Journal, 85, 105-126.

Jimerson, S. R., Gabrielle, E., & Whipple, A. D. (2002). Winning the battle and losing the war: Examining the relation between grade retention and dropping out of high school. Psychology in the Schools, 39(4), 441-457.

Juvonen, J. (1988). Outcome and attributional disagreements between students and their teachers. Journal of Educational Psychology, 80, 330-336.

Kane, M. (2002). Validating high-stakes testing programs. Educational Measurement: Issues and Practice, 21(1), 31-41.

Karmos, A., & Karmos, J. (1984). Attitudes toward standardized achievement tests and their relation to achievement test performance. Measurement & Evaluation in Counseling & Development, 17(2), 56-66.

Kellaghan, T., Madaus, G. F., & Airasian, P. M. (1982). The effects of standardized testing. Boston, MA: Kluwer-Nijhoff.

Kelley, H. H. (1971). Causal schemata and the attribution process. In E. E. Jones et al. (Eds.), Attribution: Perceiving the causes of behavior (pp. 151-174). Morristown, NJ: General Learning Press.

Leahy, R. L., & Hunt, T. M. (1983). A cognitive-developmental approach to the conceptions of intelligence. In R. L. Leahy (Ed.), The child's conception of social inequality (pp. 79-107). New York: Academic Press.

Linn, R. (2000). Assessments and accountability. Educational Researcher, 29, 4-16.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

McDevitt, T. M., & Ormrod, J. E. (2002). Child development: Educating and working with children and adolescents. New York: Prentice Hall.

Messick, S. (1984). The psychology of educational measurement. Journal of Educational Measurement, 21, 215-237.

Messick, S. (1989). Validity. New York: Macmillan Publishing Co.

Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50, 741-749.

Midgley, C. (1995). Differences between elementary and middle school teachers and students: A goal theory approach. Journal of Early Adolescence, 15, 90-113.

Miller, D. T., & Ross, M. (1975). Self-serving bias in attribution of causality: Fact or fiction? Psychological Bulletin, 82, 213-225.

National Center for Fair and Open Testing Coalition for Authentic Reform. Retrieved May 5, 2005, from http://www.parentscare.org

National Conference of State Legislators. (2005). Retrieved May 5, 2005, from http://www.ncsl.org/public/ncsl/ncslDocsearch.cfm

Nicholls, J. G. (1979). The development of the concepts of effort and ability, perception of academic attainment, and the understanding that difficult tasks require more ability. Child Development, 49, 800-814.

Nicholls, J. G., & Miller, A. T. (1983). Differentiation of the concepts of difficulty and ability. Child Development, 54, 951-959.

Normandeau, S., & Gobeil, A. (1998). A developmental perspective on children's understanding of causal attributions in achievement-related situations. International Journal of Behavioral Development, 22, 611-632.

Paris, S. G., Lawton, T. A., Turner, J. C., & Roth, J. L. (1991). A developmental perspective on standardized achievement testing. Educational Researcher, 20(5), 12-20.

Parsons, J., & Ruble, D. N. (1977). The development of achievement-related expectancies. Child Development, 48, 1075-1079.

Passer, M. W. (1977). Perceiving the causes of success and failure revisited: A multidimensional scaling approach. Unpublished doctoral dissertation, University of California at Los Angeles.

Paulhus, D., & Reid, D. B. (1991). Enhancement and denial in socially desirable responding. Journal of Personality and Social Psychology, 60, 307-317.

Rholes, W. S., Blackwell, J., Jordan, C., & Walters, C. (1980). A developmental study of learned helplessness. Developmental Psychology, 16, 616-624.

Rotter, J. B. (1966). Generalized expectancies for internal versus external control of reinforcement. Psychological Monographs, 80.

Rotter, J. B. (1990). Internal versus external control of reinforcement: A case history of a variable. American Psychologist, 45(4), 489-493.

Ruble, D. N. (1983). The role of social comparison processes and their role in achievement-related self-socialization. New York: Cambridge University Press.

Sternberg, R. J. (1998). Abilities are forms of developing expertise. Educational Researcher, 27(3), 11-20.

Stiggins, R. J., Frisbie, D. A., & Griswold, P. A. (1989). Inside high school grading practices: Building a research agenda. Educational Measurement: Issues and Practice, 8(2), 5-14.

Stipek, D. J. (2002). Motivation to learn: Integrating theory and practice (4th ed.). Boston: Allyn and Bacon.

Thorkildsen, T. A. (1989). Justice in the classroom: The student's view. Child Development, 60, 323-334.

Thornton, K. (2001, May 4). Test pressure forces trainees to quit. The Times Educational Supplement, No. 4427.

Wear-Bradley, G., & Zuckermann, M. (1979). Self-serving bias in the attribution process: A reexamination of the fact or fiction question. Journal of Personality and Social Psychology, 36, 56-71.

Weiner, B. (1979). A theory of motivation for some classroom experiences. Journal of Educational Psychology, 71(1), 3-25.

Weiner, B. (1986). Attribution, emotion and action. In R. M. Sorrentino & E. T. Higgins (Eds.), Handbook of motivation and cognition: Foundations of social behavior (pp. 281-312). New York: Guilford Press.

Widening gaps, growing vulnerability. Retrieved May 5, 2005, from http://www.fairtest.org/care

Wigfield, A., & Eccles, J. (1989). Test anxiety in elementary and secondary students. Educational Psychologist, 24, 159-183.

Zohar, D. (1998). An additive model of test anxiety: Role of exam-specific expectations. Journal of Educational Psychology, 90(2), 330-340.

APPENDIX A
INTER-ITEM CORRELATIONS

Table 46. Inter-Item Correlations for Feelings and Beliefs about the Consequences of Performing Poorly for Report Card Grades

Variable                                    I2     I3     I4     I5     I6     I7     I8     I9     I10    I11
You might feel sad (I2)                    1.000
You might feel embarrassed (I3)             .474  1.000
You might feel frightened (I4)              .346   .227  1.000
You wouldn't care (I5)                      .334   .179   .247  1.000
Your parents would be disappointed (I6)     .357   .250   .147   .270  1.000
You would be punished (I7)                  .435   .230   .396   .135   .518  1.000
Your teacher would be disappointed (I8)     .251   .099   .109   .164   .182   .257  1.000
Nothing would happen (I9)                   .142   .142   .173   .173   .264   .370   .115  1.000
Your friends might make fun of you (I10)    .056   .170   .207  -.002   .055   .147   .115  -.048  1.000
You might be held back (I11)                .205   .203   .209  -.012   .103   .220   .093   .044  -.048  1.000

Table 47. Inter-Item Correlations for Beliefs in Ability to Improve Report Card Grades

Variable                                                                     I12    I13    I14    I15    I16
You have a certain amount of math intelligence, and you can't do much
  to change your report card... (I12)                                       1.000
Your math ability is something about you, so you can't change your
  grade... (I13)                                                             .427  1.000
You can learn new things, but you can't really change your grade (I14)       .208   .516  1.000
You can change your math grade by studying harder (I15)                      .267   .155   .174  1.000
You can always change your math grade by preparing (I16)                     .219   .285   .082   .418  1.000

Table 48. Inter-Item Correlations for Feelings and Beliefs about the Consequences of Performing Poorly for FCAT

Variable                                    I35    I36    I37    I38    I39    I40    I41    I42    I43    I44
You might feel sad (I35)                   1.000
You might feel embarrassed (I36)            .584  1.000
You might feel frightened (I37)             .449   .490  1.000
You wouldn't care (I38)                     .440   .265   .252  1.000
Your parents would be disappointed (I39)    .396   .296   .249   .303  1.000
You would be punished (I40)                 .367   .261   .328   .192   .411  1.000
Your teacher would be disappointed (I41)    .078   .254   .130   .048   .308   .352  1.000
Nothing would happen (I42)                  .167   .214   .153   .307   .401   .215   .321  1.000
Your friends might make fun of you (I43)    .191   .394   .320   .034   .159   .307   .220  -.005  1.000
You might be held back (I44)                .385   .298   .237   .141   .201   .258   .244   .152   .309  1.000

Table 49. Inter-Item Correlations for Beliefs in Ability to Improve FCAT

Variable                                                                     I57    I58    I59    I60    I61
You have a certain amount of math intelligence, and you can't do much
  to change your FCAT score (I57)                                           1.000
Your math ability is something about you, so you can't change your
  FCAT score... (I58)                                                        .554  1.000
You can learn new things, but you can't really change your FCAT
  score (I59)                                                                .419   .456  1.000
You can change your FCAT score by studying harder (I60)                      .316   .332   .195  1.000
You can always change your FCAT score by preparing (I61)                     .148   .371   .064   .701  1.000

Table 50. Item-total Correlations for Feelings about Doing Poorly on Report Cards

Variable                                    Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You might feel sad (I2)                     .566                                .654
You might feel embarrassed (I3)             .410                                .678
You might feel frightened (I4)              .441                                .672
You wouldn't care (I5)                      .300                                .696
Your parents would be disappointed (I6)     .451                                .674
You would be punished (I7)                  .582                                .643
Your teacher would be disappointed (I8)     .280                                .700
Nothing would happen (I9)                   .283                                .698
Your friends might make fun of you (I10)    .157                                .719
You might be held back (I11)                .230                                .710

Table 51. Item-total Correlations for Beliefs in Ability to Improve Report Card Grade

Variable                                                                  Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You have a certain amount of math intelligence, and you can't do much
  to change your report card (I12)                                        .417                                .417
Your math ability is something about you, so you can't change your
  grade (I13)                                                             .549                                .549
You can learn new things, but you can't really change your grade (I14)    .366                                .366
You can change your math grade by studying harder (I15)                   .361                                .361
You can always change your math grade by preparing (I16)                  .355                                .355

Table 52. Item-total Correlations for Feelings and Beliefs about Consequences of Performance for FCAT

Variable                                    Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You might feel sad (I35)                    .594                                .750
You might feel embarrassed (I36)            .597                                .749
You might feel frightened (I37)             .498                                .762
You wouldn't care (I38)                     .362                                .778
Your parents would be disappointed (I39)    .508                                .762
You would be punished (I40)                 .510                                .760
Your teacher would be disappointed (I41)    .355                                .779
Nothing would happen (I42)                  .346                                .779
Your friends might make fun of you (I43)    .369                                .778
You might be held back (I44)                .418                                .772

Table 53. Item-total Correlations for Beliefs in Ability to Improve for FCAT

Variable                                                              Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You have a certain amount of math intelligence, and you can't do
  much to change your FCAT score (I57)                                .500                                .679
Your math ability is something about you, so you can't change your
  FCAT score (I58)                                                    .623                                .632
You can learn new things, but you can't really change your FCAT
  score (I59)                                                         .380                                .728
You can change your FCAT score by studying harder (I60)               .546                                .663
You can always change your FCAT score by preparing (I61)              .424                                .709

Table 54. Item-total Correlations for Self-Serving Bias for FCAT for High-Scoring Students

Variable                                    Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You studied a lot (I45)                     .541                                .537
You studied the right things (I46)          .544                                .540
You are smart (I47)                         .396                                .605
The teacher explained things well (I48)     .525                                .549
Someone helped you (I49)                    .269                                .648
The work was easy (I50)                     .077                                .713

Table 55. Item-total Correlations for Self-Serving Bias for FCAT for Low-Scoring Students

Variable                                         Corrected Item-Total Correlation    Cronbach's Alpha if Item Deleted
You didn't study much (I51)                      .566                                .824
You didn't study the right things (I52)          .569                                .822
You are not smart (I53)                          .667                                .803
The teacher didn't explain things well (I54)     .645                                .808
You weren't helped by anyone (I55)               .698                                .800
The work was hard (I56)                          .575                                .823
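For readers who want to check the scale statistics above, a standardized coefficient alpha can be recovered directly from an inter-item correlation matrix. The sketch below uses the ten pairwise correlations from Table 47 (items I12-I16); note this standardized form may differ slightly from the raw-score alpha underlying the alpha-if-item-deleted columns, so it is an illustration rather than a reproduction of the reported analysis.

```python
# Standardized Cronbach's alpha from the lower-triangle inter-item
# correlations in Table 47 (Beliefs in Ability to Improve Report
# Card Grades, items I12-I16).
# Formula: alpha = k * rbar / (1 + (k - 1) * rbar),
# where rbar is the mean inter-item correlation.
k = 5  # number of items in the scale
pairwise_r = [.427, .208, .516, .267, .155, .174, .219, .285, .082, .418]
rbar = sum(pairwise_r) / len(pairwise_r)        # mean inter-item correlation
alpha = (k * rbar) / (1 + (k - 1) * rbar)       # standardized alpha
print(round(alpha, 3))  # 0.655
```

The same computation applied to Table 49's correlations gives the corresponding FCAT-scale estimate; only the `pairwise_r` list changes.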

APPENDIX B
HIGH SCORING VERSUS LOW SCORING

Table 56. ANOVA Test on Student Responses about Report Card Grade for "You studied a lot" (i22 v i28) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                    3.99           1      3.99         2.56    .112
Grade level                2.98           2      1.49          .955   .388
Grade level * FCAT 04      3.19           2      1.60         1.02    .363
Error                    201.63         129      1.56

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 57. ANOVA Test on Student Responses about Report Card Grade for "You studied the right things" (i23 v i29) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   45.26           1     45.26        28.89    .000
Grade level                1.24           2       .620         .396   .674
Grade level * FCAT 04       .207          2       .103         .066   .936
Error                    202.07         129      1.57

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 58. ANOVA Test on Student Responses about Report Card Grade for "You are smart" (i24 v i30) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   75.03           1     75.03        47.85    .000
Grade level                2.30           2      1.15          .734   .482
Grade level * FCAT 04      2.93           2      1.47          .935   .395
Error                    200.73         128      1.57

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 59. ANOVA Test on Student Responses about Report Card Grade for "The teacher explained things well" (i25 v i31) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   37.19           1     37.19        25.88    .167
Grade level                 .324          2       .162         .113   .002
Grade level * FCAT 04      6.96           2      3.48         2.42    .036
Error                    185.37         129      1.44

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 60. ANOVA Test on Student Responses about Report Card Grade for "You were helped by someone" (i26 v i32) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                    4.69           1      4.69          .109   .020
Grade level                2.59           2      1.29          .490   .011
Grade level * FCAT 04      1.16           2       .581         .323   .005
Error                    230.62         128      1.80

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 61. ANOVA Test on Student Responses about Report Card Grade for "The work was easy" (i27 v i33) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                     .014          1       .014         .010   .921
Grade level               12.34           2      6.17         4.23    .016
Grade level * FCAT 04       .334          2       .167         .115   .892
Error                    187.80         129      1.46

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 62. ANOVA Test on Student Responses about FCAT 04 for "You studied a lot" (i45 v i51) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   12.96           1     12.97        10.19    .002
Grade level                 .224          2       .112         .088   .916
Grade level * FCAT 04     47.92           2     23.96        18.84    .000
Error                    147.56         116      1.27

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 63. ANOVA Test on Student Responses about FCAT 04 for "You studied the right things" (i46 v i52) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   11.33           1     11.33        10.49    .002
Grade level                2.45           2      1.23          .325   .020
Grade level * FCAT 04     45.88           2     22.94        21.24    .000
Error                    123.12         114      1.08

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 64. ANOVA Test on Student Responses about FCAT 04 for "You are smart" (i47 v i53) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   92.75           1     92.75        86.24    .000
Grade level                1.81           2       .908         .844   .433
Grade level * FCAT 04      3.35           2      1.67         1.55    .215
Error                                   113      1.08

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 65. ANOVA Test on Student Responses about FCAT 04 for "The teacher explained things well" (i48 v i54) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                   22.74           1     22.74        15.48    .000
Grade level                 .069          2       .035         .024   .977
Grade level * FCAT 04      7.62           2      3.80         2.59    .080
Error                    164.85         112      1.47

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 66. ANOVA Test on Student Responses about FCAT 04 for "You were helped by someone" (i49 v i55) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                    2.48           1      2.48         1.42    .235
Grade level                4.16           2      2.08         1.19    .307
Grade level * FCAT 04      8.55           2      4.27         2.45    .091
Error                    195.11         112      1.74

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.

Table 67. ANOVA Test on Student Responses about FCAT 04 for "The work was easy" (i50 v i56) for High versus Low Scoring Students

Source                   Type III SS    df    Mean Square    F        p
FCAT 04                    7.68           1      7.68         4.55    .035
Grade level               15.89           2      7.95         4.707   .011
Grade level * FCAT 04      4.18           2      2.09         1.24    .294
Error                    187.39         111      1.69

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. FCAT 04 included two levels: high-scoring and low-scoring students. The dependent variable was responses to the single attribution item listed above.
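The table notes throughout these appendices flag significance at p < .008. That threshold is consistent with a Bonferroni-style adjustment of a .05 familywise alpha over the six attribution items tested per comparison set, though that rationale is my assumption here rather than something stated in the tables themselves. A minimal sketch:

```python
# Bonferroni-adjusted per-comparison alpha, assuming a familywise
# alpha of .05 spread over six attribution-item ANOVAs.
family_alpha = 0.05
n_comparisons = 6                      # six attribution items per set
per_test_alpha = family_alpha / n_comparisons   # ~0.0083, reported as .008

# Illustrative check against the FCAT 04 main-effect p-values reported
# in Tables 62-67 (values copied from the tables; labels as in the items).
p_values = {
    "You studied a lot": .002,
    "You studied the right things": .002,
    "You are smart": .000,
    "The teacher explained things well": .000,
    "You were helped by someone": .235,
    "The work was easy": .035,
}
significant = [item for item, p in p_values.items() if p < per_test_alpha]
print(round(per_test_alpha, 3))  # 0.008
print(len(significant))          # 4
```

Under this criterion, "The work was easy" (p = .035) would not reach significance even though it clears the conventional .05 level, which is exactly what the stricter per-comparison alpha is meant to guard against.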

APPENDIX C
GRADES VERSUS FCAT ATTRIBUTIONS

Table 68. ANOVA Test on Student Responses for "You studied a lot" (i22 v i45) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                35.031          2    17.516        13.507   .000
Criterion                   2.623          1     2.623         2.023   .157
Grade level * Criterion    24.768          2    12.384         9.550   .000
Error                     224.343        173     1.297

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 69. ANOVA Test on Student Responses for "You studied the right things" (i23 v i46) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                18.231          2     9.116         7.463   .001
Criterion                   1.581          1     1.581         1.294   .257
Grade level * Criterion    31.026          2    15.513        12.700   .000
Error                     210.096        172     1.221

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 70. ANOVA Test on Student Responses for "You are smart" (i24 v i47) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 4.050          2     2.025         1.674   .191
Criterion                   4.770          1     4.770         3.943   .049
Grade level * Criterion     4.228          2     2.114         1.747   .177
Error                     206.860        171     1.210

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 71. ANOVA Test on Student Responses for "The teacher explained things well" (i25 v i48) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                10.341          2     5.170         3.759   .025
Criterion                   1.428          1     1.428         1.038   .310
Grade level * Criterion     4.984          2     2.492         1.812   .166
Error                     235.213        171     1.376

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 72. ANOVA Test on Student Responses for "Someone helped you" (i26 v i49) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                  .301          2      .151          .082   .922
Criterion                   1.081          1     1.081          .587   .445
Grade level * Criterion     1.374          2      .687          .373   .690
Error                     311.508        169     1.843

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 73. ANOVA Test on Student Responses for "The work was easy" (i27 v i50) for High Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 7.981          2     3.991         2.704   .070
Criterion                   2.629          1     2.629         1.781   .184
Grade level * Criterion     4.166          2     2.083         1.412   .247
Error                     249.390        169     1.476

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 74. ANOVA Test on Student Responses for "You didn't study much" (i28 v i51) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                11.983          2     5.991         3.455   .037
Criterion                   6.448          1     6.448         3.719   .058
Grade level * Criterion     4.752          2     2.376         1.370   .261
Error                     124.852         72     1.734

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 75. ANOVA Test on Student Responses for "You didn't study the right things" (i29 v i52) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                11.378          2     5.689         3.509   .035
Criterion                   2.381          1     2.381         1.469   .230
Grade level * Criterion     6.319          2     3.159         1.949   .150
Error                     115.092         71     1.621

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 76. ANOVA Test on Student Responses for "You are not smart" (i30 v i53) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 2.401          2     1.200          .728   .486
Criterion                    .096          1      .096          .058   .810
Grade level * Criterion     1.436          2      .718          .436   .649
Error                     115.400         70     1.649

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 77. ANOVA Test on Student Responses for "The teacher did not explain things well" (i31 v i54) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 4.122          2     2.061         1.254   .292
Criterion                    .046          1      .046          .028   .868
Grade level * Criterion      .720          2      .360          .219   .804
Error                     115.008         70     1.643

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 78. ANOVA Test on Student Responses for "You weren't helped by anyone" (i32 v i55) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 7.544          2     3.772         2.345   .103
Criterion                    .111          1      .111          .069   .794
Grade level * Criterion     3.514          2     1.757         1.092   .341
Error                     114.221         71     1.609

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

Table 79. ANOVA Test on Student Responses for "The work was hard" (i33 v i56) for Low Scoring Students for Report Card and FCAT

Source                    Type III SS    df    Mean Square    F        p
Grade level                 7.981          2     3.991         2.704   .070
Criterion                   2.629          1     2.629         1.781   .184
Grade level * Criterion     4.166          2     2.083         1.412   .247
Error                     249.390        169     1.476

Note. Dashes indicate not applicable. * p < .008. Grade level included three levels: elementary, middle, and high school. Criterion included two levels: report card grade and FCAT. The dependent variable was responses to the single attribution item listed above.

BIOGRAPHICAL SKETCH

Jenny Bergeron was born in Silver Spring, Maryland, on September 1, 1974. The oldest child of Raymond and Kathleen Bergeron, she grew up in Gainesville, Florida. In 1992, she graduated from Gainesville High School and enrolled at the University of Florida. She was elected to Phi Beta Kappa in her junior year and graduated from the University of Florida in 1997 with honors, earning a Bachelor of Science in psychology with a minor in French. Her research as an undergraduate focused on the effects of cocaine on behavior and learning in small animals. In 2002 she received her master's degree in educational psychology at the University of Florida, focusing on intrinsic motivation as it relates to information processing for special populations (i.e., children with Attention Deficit Hyperactivity Disorder [ADHD]). Her interests shifted in 2003 when she entered the Ph.D. program in measurement and evaluation at the University of Florida, where she is currently interested in measurement and assessment issues as they relate to validity in high-stakes testing.