PROGRESS MONITORING IN MIDDLE SCHOOL MATHEMATICS: THREE CURRICULUM-BASED MEASURES OF MATHEMATICS PROFICIENCY

By

ANGELA DENISE DOBBINS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2013
© 2013 Angela Denise Dobbins
To my mom, dad, granddad, stepmother, sister, and brother for your continued support
ACKNOWLEDGMENTS

Throughout this journey, I have received support (emotional, financial, and academic) and encouragement from a number of individuals. My mother and father have been a constant support system for me throughout my educational career. I am grateful to them for instilling in when others did not understand my decision making in relation to my career path will always be a reminder to reach for my goals despite what others may think. To Dr. Nancy Waldron, thank you for fostering my growth as a scholar practitioner in the field through grant and teaching opportunities. I would not be the practitioner that I am today without the guidance of Dr. Diana Joyce; your dedication to the field of psychology and commitment to mentoring students are greatly appreciated. I also thank Dr. James Algina for challenging me to think critically about research design and statistical analysis. I am forever grateful for your assistance during the development of this research study. Dr. Joseph Gagnon, I thank you for supporting my development as a scholar through collaborations on manuscripts to inform future practice in mathematics instruction.

This process would not have been possible without the support of my colleagues, friends, and university staff. I could not have completed the endless days and nights of writing and studying without the listening ears of Chris Raye, Shauna Miller, Camee Maddox, Erika Jones, Cathy Pasia, and Brittaney Ross. I am extremely thankful for the Office of Graduate Minority Programs faculty and staff; without you all, I would not have remained committed to finishing this goal. Thank you for ensuring that I felt connected to the University and the Gainesville community, as well as providing financial support when all other resources were unavailable. Your hard work and dedication are valued.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT

CHAPTER

1 INTRODUCTION AND REVIEW OF THE LITERATURE
    Statement of the Problem
    Review of Relevant Literature
        Educational Reform in Mathematics
        Response to Intervention and Math
        Curriculum Based Measurement
        Stages of CBM Research
            Technical adequacy
            Growth trajectory
            Instructional utility
        CBM at the Elementary Level
        CBM at the Middle School Level
    Statement of Purpose

2 RESEARCH METHODOLOGY
    Conceptual Framework
        Research Setting and Participants
        Measures
            CBM test specifications
            Test specification and item reviewers
            Guidance given to reviewers
            Criterion measure
        Data Collection Procedures
            Administration time
            Scoring and data entry
        Data Analysis

3 RESULTS
    Technical Adequacy
        Internal Consistency
        Item Level Functioning
        Criterion Validity
        Factor Analysis

4 DISCUSSION
    Technical Adequacy
        Internal Consistency
        Item Level Functioning
        Criterion Validity
        Factor Analysis
    Limitations
    Implications for Future Research

APPENDIX

A TEST SPECIFICATIONS
    A-1. Test Specifications for NO CBM
    A-2. Test Specifications for AL CBM
    A-3. Test Specifications for GE CBM
    A-4. Item Reviewer Rating Scale

B CBM PROBES

C ITEM ANALYSIS
    C-1. Number operations (NO CBM) using dichotomous scoring.
    C-2. Algebra (Alg CBM) item analysis using dichotomous scoring.
    C-3. Geometry (Geo CBM) item analysis using dichotomous scoring.

LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF TABLES

Table
2-1 Sample demographics distributed by participating school
3-1
3-2 Number operations (NO CBM) item analysis using digits correct scoring
3-3 Algebra (AL CBM) item analysis using digits correct scoring
3-4 Geometry (GE CBM) item analysis using digits correct scoring
3-5 Descriptive statistics for CBM measures and FCAT content areas
3-6 Correlation matrix for student CBM and FCAT performance
3-7 Correlation matrix using CBM factor parcels
3-8 Factor loadings for single factor model of mathematics achievement
3-9 Factor loadings for three factor model of mathematics achievement
3-10 Goodness of fit indices for single and three factor models of mathematics achievement
3-11 Factor correlations for the three factor model of mathematics achievement
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

PROGRESS MONITORING IN MIDDLE SCHOOL MATHEMATICS: THREE CURRICULUM-BASED MEASURES OF MATHEMATICS PROFICIENCY

By

Angela D. Dobbins

August 2013

Chair: Nancy Waldron
Major: School Psychology

Educational reform efforts in the area of mathematics have focused on strengthening the connection between instruction and progress monitoring practices. In order to provide a valid measure of mathematics achievement, it is necessary that assessment procedures be designed to measure the appropriate grade level curricular standards. Previous research has focused on the development of curriculum-based measures (CBM) as a method to monitor student progress towards mathematics proficiency. Numerous studies have developed and examined the technical adequacy of CBM measures for elementary level mathematics; however, few studies have focused on middle school. Given the significant curricular shift towards higher order thinking during middle school, a need exists to develop technically adequate measures for middle school mathematics. The primary aim of this study was to develop CBM measures that are representative of the national curriculum standards for 6th grade mathematics and examine the technical adequacy of each measure in assessing student performance. Two schools were recruited to participate in the study, with one 6th grade mathematics teacher from each school consenting to administer the measures to their students. Data were obtained from 135 students in 6th grade on each of the three CBM measures (NO, AL, and GE) and the state standardized assessment. Analysis procedures examined the technical adequacy of the
CBM measures through internal consistency, item analysis, and criterion validity. Factor analysis procedures were used to determine the factor structure for the three CBM measures as a measure of mathematics achievement. Results indicated that the items on each of the three measures require significant revision, as evidenced by the internal consistency coefficient (alpha) and item analysis (item difficulty and discrimination). Factor analysis results suggest that a three factor model fit the CBM data adequately; however, the observed factors were correlated. Study results provide an extension of previous research by focusing on middle school and provide implications for future research in the field.
CHAPTER 1
INTRODUCTION AND REVIEW OF THE LITERATURE

Statement of the Problem

Best practices for meeting the needs of a diverse group of learners in education involve the integration of instruction and assessment to form a data-based decision making model for service delivery within schools (Allsopp, Kyger, Lovin, Gerreston, Carson, & Ray, 2008; Shinn, 2002). To date, research studies have focused extensively on identifying and examining effective instructional strategies, interventions, and progress monitoring systems for reading (Bryant & Bryant, 2008). In contrast, empirical research on the effective use of assessment to inform educational decision making in mathematics has received limited attention, particularly in grade levels beyond the elementary level.

The need for more research focused on math skill development has been recognized based on studies showing that approximately 5-10 percent of school age children exhibit math disabilities (Fuchs et al., 2005; Fuchs, Fuchs, & Hollenbeck, 2007). Upon entering middle school, students with math disabilities perform basic addition facts at a third grade level, with growth rates of one year for every two or more years of instruction (Calhoon & Fuchs, 2003). This struggle to reach grade level expectations has been attributed to the higher level cognitive demands associated with the middle school mathematics curriculum (Ketterlin-Geller, Chard, & Fien, 2008). Middle school is the point at which there is a dramatic shift in the curricular emphasis in mathematics, with increased focus on algebraic thinking and geometric skills (Jiban & Deno, 2007; Foegen, 2008; Ketterlin-Geller, Chard, & Fien, 2008; Milgram, 2005; Witzel, 2005). Following middle school, mathematics courses become more domain specific, with students needing to integrate and extend previously learned skills to succeed.
Early identification of students at risk for failure in middle school is important because as students progress to high school, the achievement gap begins to grow substantially (Thurlow, Albus, Spicuzza, & Thompson, 1998). Results obtained from the National Assessment of Educational Progress (NAEP) indicated that in 2011, 36 percent of 8th grade students with math disabilities were at or above basic proficiency levels on standardized mathematics tests, in comparison to 78 percent of their same grade peers who did not have a math disability (National Center for Educational Statistics, 2012). With this in mind, sixth grade is an important grade level with regard to secondary math achievement because it is the point at which most students are first exposed to the aforementioned curricular shift. In addition, sixth grade is when students may begin to show early signs of difficulties in learning higher order math skills. In order to effectively identify early indicators of student achievement levels, assessment measures that are technically adequate should be administered frequently to monitor student progress towards proficiency. The method for meeting the needs of this group of students involves utilizing assessment measures that are aligned with national educational standards for sixth grade mathematics proficiency and are embedded in a data-based decision making, instructional service delivery model.

Review of Relevant Literature

Educational Reform in Mathematics

Educational reform efforts have put into place national standards to enhance curriculum, instruction, and assessment practices within the United States. The National Council of Teachers of Mathematics (NCTM, 2006) developed the Curriculum Focal Points as a set of standards for effective mathematics instruction that emphasize conceptual understanding, mathematical reasoning, and problem solving.
These focal points are recommended content emphases for each grade level that address the domain areas of number operations, algebra, and geometry. In
addition to the NCTM Curriculum Focal Points, the National Governors Association Center for Best Practices (NGA) and the Council of Chief State School Officers (CCSSO) initiated the development of the Common Core State Standards for Mathematics (CCSS, 2010). This set of standards was designed to provide a universal set of guidelines for mathematics that are closely aligned with the Curriculum Focal Points (NCTM, 2006). When integrated, NCTM (2006) and CCSS (2010) provide a detailed blueprint of the components for an effective mathematics curriculum; however, these standards provide little guidance in relation to specific service delivery models that promote the incorporation of empirically based instructional standards that can be applied to students with diverse learning needs.

Students experiencing difficulties in mathematics are a heterogeneous group, exhibiting deficits within a number of domain areas. Educational policies such as the Individuals with Disabilities Education Act (2004) and No Child Left Behind Act (2001) were enacted to ensure that schools have a framework for providing educational services to all students, regardless of achievement level. In addition to requiring that schools implement service delivery models that use evidence-based instructional practices, IDEA and NCLB require the collection and analysis of student performance data when making instructional decisions (NCLB, 2001; IDEA, 2004). Response to Intervention (RtI), now also referred to as Multi-Tiered System of Supports (MTSS), is a practical method for systematically providing evidence-based instruction and conducting assessments of student performance to address the needs of all students (Algozzine, Wang, & Violette, 2011; Crawford & Ketterlin-Geller, 2008; Johnson & Smith, 2008).
Research literature has shown that in addition to being effective in meeting the needs of a diverse group of learners, RtI is also an effective method of identifying students struggling to reach proficiency
standards such as NCTM (2006) and CCSS (2010) (Allsopp, 2009; Fuchs, Mock, Morgan, & Young, 2003).

Response to Intervention and Math

The essential design of an RtI model for mathematics consists of three or more tiers of instructional support, with the levels of intensity and individualization of instruction increasing at each tier (Bryant & Bryant, 2008). Tier I, the initial phase, ensures that all students receive high quality core instruction. Student achievement is assessed through universal screening measures conducted periodically throughout the school year. Systems level instructional changes are made when the assessment data reveal a group deficit in mathematics skills. Students unable to make adequate gains without additional instructional support are moved into Tier II and receive supplemental small group mathematics instruction. This additional instruction focuses on targeted skill areas with a suggested duration of 20-40 minutes, four times per week (Gersten & Beckman, 2009). Students from Tier II who are not making sufficient growth towards grade level math benchmarks are transitioned to the third tier of the RtI model. Tier III consists of one-on-one or small group instruction that is focused on providing high levels of exposure and practice with instructional material. For students in Tier II and Tier III, progress towards grade level math benchmarks is continuously monitored, with the frequency of monitoring increasing at each tier (Gersten & Beckman, 2009). Decisions related to instructional changes in an RtI model are based on student gains towards yearly curriculum goals, as assessed by progress monitoring measures (Burns & Gibbons, 2008; Fuchs & Fuchs, 2006; Kashima, Schleich, & Spradlin, 2009; Wallace, Espin, McMaster, Deno, & Foegen, 2007).
Because progress monitoring is an integral part of the educational decision making process within an RtI model, it is vital that the chosen form of assessment have empirical support (Burns & Gibbons, 2008; Wallace, Espin, McMaster, Deno, & Foegen, 2007).
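As a rough illustration of how gains towards a yearly goal might be summarized from repeated progress monitoring scores, the sketch below fits an ordinary least-squares line to one student's scores and reports the slope (score gain per week). The Growth trajectory discussion later in this chapter describes fitting at least three data points to a straight line; the weekly schedule, the scores, and the function name `progress_slope` are assumptions made for illustration and are not data or procedures from this study.

```python
def progress_slope(weeks, scores):
    """Ordinary least-squares slope of scores regressed on weeks.

    Hypothetical helper: returns the average score gain per week.
    """
    n = len(weeks)
    if n != len(scores):
        raise ValueError("weeks and scores must have the same length")
    if n < 3:
        # The chapter describes fitting at least three administrations.
        raise ValueError("at least three data points are required")
    mean_x = sum(weeks) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    if sxx == 0:
        raise ValueError("administrations must occur at different times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, scores))
    return sxy / sxx


# Illustrative (made-up) scores from five weekly administrations.
weeks = [1, 2, 3, 4, 5]
scores = [10, 12, 13, 15, 17]
print(f"{progress_slope(weeks, scores):.2f}")  # prints 1.70
```

A positive slope suggests growth, while a slope near zero or below zero suggests static or regressing performance. How large a slope counts as sufficient growth is a decision rule that this sketch does not attempt to supply.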
Over the past 35 years, education literature has provided support for the use of curriculum-based measurement (CBM) as a form of progress monitoring to assess student academic progress toward reaching grade level curricular expectations (Deno, 2003; Stecker, Fuchs, & Fuchs, 2005; Thurber, Shinn, & Smolkowski, 2002). Of the studies focused on mathematics, the majority examined the development and use of CBM measures at the elementary level. Considering the significant number of school age adolescents who fail to reach grade level expectations and underperform on high stakes standardized mathematics assessments, increased research in the area of monitoring student performance at grade levels beyond elementary school is necessary. By increasing the focus of mathematics research on progress monitoring at the middle and high school levels, educators will be able to identify adolescent students struggling to reach annual mathematics curriculum goals before these students fall significantly behind.

Curriculum Based Measurement

A key component of the progress monitoring process is the selection of a screening/assessment measure that suits the needs of the school system and is closely aligned with national, state, and local curriculum standards (Wallace et al., 2007). Traditional mathematics assessment measures such as unit mastery exams, also known as sub skill mastery measures (SMMs), have been used as a method for monitoring student progress on a subset of skills (Christ & Vining, 2006). SMMs are by definition narrow in content focus, with the expectation that students will master the material after instruction is provided over brief periods of time (Christ & Vining, 2006). SMMs are not designed to measure student progress across the annual mathematics curriculum, but instead measure specific skills that have been taught as a component of the annual curriculum. This separation of specific skills into sub assessments of
level curricular content throughout the year. SMM data is used to determine the trend of mathematics performance relative to a narrow skill focus, thereby serving as an indicator of subskill proficiency and not an indicator of overall mathematics proficiency (Fuchs & Deno, 1991).

Another form of assessment, curriculum-based measurement (CBM; Deno, 1985), has become one of the most highly recognized and researched tools to monitor student progress across the general curriculum. CBM is a brief (e.g., 1-11 minutes) assessment of general outcomes for a specified academic domain. The function of CBM as a general outcome measure (GOM) is to serve as a dynamic indicator of basic skills (DIBS) that are essential to school success (Shinn, 2007). In terms of being dynamic, CBM measures are designed to measure changes in student performance when administered within short periods of time (i.e., 4-6 weeks), requiring that the measures be sensitive to short term effects of instruction. As an indicator of basic skills, student performance on CBM measures provides an indication of overall performance in a particular subject area such as mathematics. CBM measures are not designed to measure all skills within mathematics; instead, CBM content is determined by selecting grade level skills that are deemed important components of the curriculum (Shinn, 2007; Stecker, Fuchs, & Fuchs, 2000). Additionally, as an indicator, CBMs are not designed to serve as a diagnostic or prescriptive tool (Shinn, 2007). More specifically, the function of CBM does not include being able to identify areas of strengths and weaknesses. Instead, the functions of CBM as a general outcome measure are: a) to follow standardized administration and scoring procedures to gather data that provide critical indicators of student achievement, and b) to assess individual student proficiency on global, long term curricular goals (Foegen & Deno, 1991). The standardized administration procedures associated with CBM
ally, the repeated use of CBM to establish a series of student performance data over time assists in formatively assessing the effectiveness of the curriculum and instructional program (Deno, 2003; Fuchs, Fuchs, & Zumeta, 2008; Shinn, 2002; Stecker, Lembke, & Foegen, 2008).

Selecting mathematics skills that are indicators of overall mathematics achievement is an important component of developing a CBM measure. Mathematics is a subject area that consists of a complex hierarchy of skills, which creates challenges in constructing CBM measures that adequately sample the broad array of domain specific skills related to mathematics proficiency (Kilpatrick, Swafford, & Findell, 2001). Fuchs (2004) identified two approaches to developing the content of mathematics CBM measures. The first approach is the robust indicators method. This method involves the selection of mathematical tasks that are broadly associated with proficiency in mathematics, which results in the measures not accurately representing the specific yearly curriculum. This feature of robust indicator CBM measures limits the ability to directly inform instructional practices because the measures are not a representative sample of the grade level curriculum. The second approach identified by Fuchs (2004) is called curriculum sampling, which involves a systematic method of creating a representative sample of mathematical tasks associated with the annual curriculum. Each administration of CBM measures using this approach consists of identified problem types represented equally across the measure, with the total score serving as an indicator of math proficiency on the yearly curriculum. An advantage of the curriculum sampling approach over the robust indicator method is the ability to generate a graphical representation of student rate of learning through the slope of the progress monitoring data points.
This graph can be used to inform instructional decisions throughout the academic year because each
progress monitoring administration includes the assessment of each skill in the yearly curriculum.

Stages of CBM Research

Research associated with constructing curriculum-based measurements for mathematics progress monitoring has been conducted, detailing the advantages and disadvantages of the robust indicator and curriculum sampling approaches (Foegen, Jiban, & Deno, 2007). Regardless of the approach taken to item selection, the empirical process of developing a CBM for use involves three research stages that align with the three key features of an effective CBM (Fuchs, 2004).

Technical adequacy

An effective CBM measure must be technically adequate, which is most often measured through the use of reliability and validity statistics. Technically adequate measures produce reliable student scores that are relatively consistent or static when administered during a fixed time period (Deno, 2003; Stecker, Lembke, & Foegen, 2008; Thurber, Shinn, & Smolkowski, 2002). The reliability of student scores produced by a CBM is measured in terms of alternate forms, test-retest, and/or internal consistency coefficients. The research methods and statistical analysis literature suggests that reliability coefficients above 0.70 are generally acceptable. However, because CBM data is used in conjunction with other student performance data to make significant educational decisions (e.g., retention, level of services, placement/programming), a reliability coefficient of at least 0.90 is recommended (Nunnally & Bernstein, 1994).

CBM measures should be valid, functioning as an adequate measure of the mathematics constructs associated with the grade level expectations. The validity of CBM measures is typically measured using a criterion measure, most often a state standardized assessment. Correlation coefficients are used to determine the significance of the relationship between
student performance on CBM measures and performance on an established mathematics criterion. The significance level and strength of the correlation provide information about the CBM measure's ability to measure an identified construct, such as mathematics proficiency at the sixth grade level. data, CBM can be used to provide frequent information regarding student acquisition of curriculum-based skills prior to state assessments. Therefore, the focus during the first stage of CBM research involves collecting data in order to determine the technical adequacy of measures, which will set the stage for the use of these measures in making vital instructional support decisions.

Growth trajectory

A second key feature of a CBM measure involves having the ability to frequently monitor student progress, which allows for the graphical representation of individual student progress over time (Espin, Scierka, Skare, & Halvorson, 1999; Fuchs, 2004; Shinn, 2002). This is done by plotting the individual data points from at least three administrations and fitting the data to a straight line. The slope of the fitted line represents the rate of progress each student is making towards grade level goals. The second stage of CBM research examines the technical features of the slopes when the measures are used as an ongoing progress monitoring system for mathematics (i.e., frequent administrations). Visual analysis of slopes is completed to ensure that an observed increase in total score is related to an increase or improvement in overall math proficiency for the yearly curriculum. Additionally, slopes that are static or decreasing over time indicate that student performance is not changing or that student performance is regressing. Research at this second stage is critical to informing the use of CBM as a part of the decision making process in an RtI service delivery model. In order to use CBM data to inform decisions,
the measures must provide slopes that are reflective of student growth over time (Fuchs & Fuchs, 2002; Shinn, 2002). Once research is completed to address these concerns, the instructional utility of using CBM measures to monitor student performance can be examined.

Instructional utility

The third feature of CBM is that it allows for the formative assessment of the effectiveness of instructional and curriculum programs through analyzing patterns in student growth levels (Deno, 2003; Shinn, 2002). When student CBM data does not indicate appropriate levels of growth, educators can determine which portions of the instruction must be adjusted to meet the needs of the students. The final stage of CBM research as identified by Fuchs (2004) consists of studies that examine whether educators can use the CBM data to improve the instructional decision making process, resulting in improvement of student performance. These studies measure whether teacher planning is enhanced through the incorporation of CBM data, and evaluate whether planning has an impact on overall student achievement.

CBM at the Elementary Level

Over the past 35 years, the body of research focusing on stages of CBM development has increased substantially, with both the robust indicator and curriculum sampling approaches utilized to select assessment items (Fuchs, Hamlett, & Fuchs, 1998; Foegen, 2000, 2007; Stecker & Fuchs, 2000). However, research that addresses CBM in mathematics has predominantly focused on the assessment of skills at the elementary level. A vast majority of the elementary studies have used CBM probes that include math tasks involving the computation of basic facts and problem solving/applications procedures (Espin, Deno, Maruyuna, & Cohen, 1989; Fuchs, Fuchs, Hamlett, Thompson, Roberts, Kupek, & Stecker, 1994; Fuchs, Fuchs, Karns, Hamlett, Dutka, & Katzaroff, 2000; Lee, Lembke, Moore, Ginsburg, & Pappas, 2007).
In particular, many utilize the Monitoring Basic Skills Progress measures of Computation and Concepts and Applications (MBSP; Fuchs, Hamlett, & Fuchs, 1998) as a progress monitoring measure. The MBSP Comp and MBSP ConApp measures were developed by sampling mathematics skills and concepts from the state of Tennessee mathematics curriculum for grades two through six (Fuchs et al., 1998). MBSP Comp is a 25 item measure that assesses a range of grade level specific fundamental computation skills such as addition, subtraction, multiplication, and/or division of whole numbers, fractions, and/or decimals. Computation skills are measured in isolation by the MBSP Comp probes, with no emphasis on applied problem solving. Items representing the various skills are presented in random order across the page. Fuchs and colleagues (1998) established administration times for the MBSP Comp that range from two to six minutes, with the longest time for students in sixth grade. Scoring of the MBSP Comp involves counting the number of correct digits written by the student in each answer, then computing the total number of correct digits completed within the allotted time.

The MBSP ConApp contains 24 level specific concepts and apply mathematical knowledge to real world problems. More specifically, MBSP ConApp assesses mathematics domains such as counting, number concepts, measurement, money, charts and graphs, and applied word problems (Fuchs, Fuchs, & Zumeta, 2008). Each MBSP ConApp probe has parallel form probes for each grade level, with each of the probe sets having the same proportion of items per skill domain. Administration time for the MBSP ConApp ranges from six to eight minutes, depending on the grade level. Scoring procedures for MBSP ConApp have consisted of computing the total correct responses (Foegen, 2008) or calculating the total number of digits correct (Fuchs, Fuchs, & Zumeta, 2005), dependent on the stage of CBM research being conducted, form of mathematics skills assessed (i.e., basic
computation versus concepts/applications), and/or type of response format (i.e., open versus closed).

Numerous research studies, reflective of the first stage of CBM research, have indicated that MBSP Comp and MBSP ConApp are valid and produce reliable measures of student mathematics achievement at the elementary level, as well as at the sixth grade level. In particular, Fuchs et al. (1994, 1998, 1999) examined the technical adequacy of the MBSP Comp and MBSP ConApp across general education and special education classrooms. Internal consistency, which is the extent to which items assess the specified domain of skills and provide consistent scores, was above 0.90 for the MBSP Comp and MBSP ConApp measures. This indicates that the items of each measure have a high level of item homogeneity, which suggests that the items are indicators for the same content domain (i.e., math computation and math application). Criterion validity measures such as state mandated standardized assessments were used to determine the relationship between MBSP Comp and MBSP ConApp performance and criterion scores. Validity coefficients for the two measures were between 0.50 and 0.90, indicating a high level of variability in the relationship with state mandated standardized assessments. In particular, criterion validity coefficients were 0.52 for students receiving special education services when scores were compared to those of the Stanford Achievement Test in Math (Fuchs et al., 1994). When compared with another state assessment, special education students' scores on the MBSP measures had a 0.80 correlation with the criterion measure (Fuchs et al., 1994). This may be an indication that the mathematical domain skills measured in the MBSP are not equivalent to those of the Stanford Achievement Test in Mathematics. However, overall research studies provide support for the technical adequacy of the MBSP Comp and MBSP
ConApp as progress monitoring measures of mathematics achievement for students at the elementary level.

In relation to the second stage of CBM research, Fuchs et al. (1993, 1994) examined the growth indicators of the MBSP Comp and MBSP ConApp progress monitoring measures for general education students in first through sixth grades by computing a mean weekly slope of student data. Findings show that weekly growth rates of 0.25 to 0.75 digits were obtained across the grade levels. Fuchs et al. (1999) provide suggestions for setting growth goals of 0.30 and 0.45 digits per week for math computation skills in first through third and fourth through fifth grades, respectively. Concept and application skills growth goals of 0.4, 0.58, 0.69, 0.19, and 0.12 digits per week for grades two through six, respectively, were found to be suitable and attainable (Fuchs et al., 1999).

Fuchs and colleagues' (1999) recommendation of growth goals for mathematics skills was primarily based on general education populations, which raises the question of whether these goals are suitable for students receiving special education services (Foegen, 2008; Foegen & Deno, 2001; Shapiro, Edwards, & Zigmond, 2005). Shapiro, Edwards, and Zigmond (2005) conducted a mathematics progress monitoring project, sampling 120 students in first to sixth grade who were receiving special education services. Results of the study indicated that 66 percent of the sample met or exceeded the growth goals in computation and 37 percent in concepts/applications, indicating that goals for concepts/applications may be too high for students with learning disabilities.

In addition to MBSP Comp and MBSP ConApp, Lee et al. (2007) developed and evaluated the technical adequacy of six one minute CBM measures for kindergarten and first grade: Count Out Loud (120 items), Quantity Discrimination (63 items), Number Identification (84 items), Missing Number (54 items), Next Number (84 items), and Number Facts (60 items)
with the number one. Quantity Discrimination assessed their ability to discriminate between two numbers between 0 and 20 by naming the number with the higher value. Student ability to orally name numbers from 0 to 100 was measured by the Number Identification CBM, while Missing Number evaluated their ability to identify a missing number in a sequence. Somewhat similar to the Missing Number CBM, Next Number required students to verbally say the number that comes next when provided with a number between 0 and 100. The final measure, Number Facts, required students to answer, within one minute, early numeracy facts including basic addition and subtraction using digits less than ten. Items were scored by counting the number of correctly provided answers/responses.

Technical adequacy of the six measures developed by Lee et al. (2007) was evaluated using internal consistency and criterion validity estimates. For the measures, internal consistency estimates ranged from 0.87 to 0.98, indicating that items contained in each measure are homogeneous: items are likely measuring a similar mathematics construct. Criterion validity estimates ranged from 0.53 to 0.68, with the Number Identification measure producing the highest correlation with the criterion measure. These findings suggest that performance on the six CBM measures is not a strong predictor of performance on a standardized mathematics assessment.

Like many other math studies to date, Fuchs et al. (1993, 1994, 1999), Lee et al. (2007), and Shapiro et al. (2005) used student samples that primarily consisted of elementary students. One grade level was represented from middle school (sixth grade) in studies conducted by Fuchs and colleagues (1993, 1994, 1999) and Shapiro et al. (2005). Additionally, the number of students sampled from sixth grade was, on average, relatively small, which could have an impact
on the interpretation and generalizability of the research findings to the sixth grade population. Sixth grade is an important grade level because of the dramatic shift in curricular focus to include higher order thinking skills. Research is needed in this area to develop progress monitoring measures that can be incorporated with instruction so that students struggling to learn more advanced math skills receive the appropriate supports. The research at the elementary level provides a framework for developing technically adequate measures of mathematics proficiency at the middle school level with results that can be generalized to the target population.

CBM at the Middle School Level

Much of the discussion around middle school mathematics CBM research has focused on the lack of measures designed specifically to assess skills associated with national middle school mathematics curriculum standards (Foegen & Deno, 2001; Foegen, 2007; Jiban & Deno, 2007). Although the MBSP Comp and MBSP ConApp measures have been found effective in measuring student progress at the sixth grade level, these measures were not developed to be aligned with universal standards for mathematics proficiency, but rather were aligned with state level standards. Dependent upon the state standards, the skills measured by MBSP Comp and MBSP ConApp may not measure the higher order thinking skills associated with more advanced mathematical tasks. Potentially focusing on lower level math skills and tasks creates numerous concerns related to appropriate assessment of grade level proficiency. The National Council of Teachers of Mathematics (NCTM, 2006) Curriculum Focal Points for mathematics instruction and the Common Core State Standards for Mathematics (CCSS, 2010) emphasize the importance of middle school mathematics standards that promote higher order mathematics skills and cognitive processes.
More specifically, NCTM (2006) and CCSS (2010) emphasize the need for increased levels of conceptual knowledge rather than focusing on declarative and/or procedural knowledge. Conceptual understanding as it pertains to mathematics is viewed as a prerequisite to effectively applying procedures and operations in problem solving circumstances (Moss & Case, 1999; NCTM, 2006). By definition, conceptual understanding is the connection of knowledge and relationships, which are developed through reflective learning, not rote memorization. In middle school mathematics, conceptual understanding is important because of the hierarchy and interconnectivity of the curricular domains (Helwig, Anderson, & Tindal, 2002). With the increased emphasis on the importance of algebraic thinking and conceptual knowledge at the middle school level, it is important that CBM measures assess these higher order thinking skills in relation to mathematics (Gersten & Beckman, 2009; Lembke & Stecker, 2007; Woodard & Montague, 2002).

Helwig and colleagues (2002) examined the use of conceptual based CBM measures in predicting middle school student performance on a statewide mathematics test. The 11 items were selected from a sample of field tested items that measured conceptual knowledge at an eighth grade level. The measure was administered to 171 students in eighth grade sampled from eight different schools, with 81 of the students identified with a learning disability. Although the measure was untimed, the authors noted that most students completed the measure within ten minutes. The results indicated that for general education students the Pearson product moment correlation (r) between CBM and statewide test scores was 0.80, but r = 0.61 for students receiving special education services. In addition to these findings, a floor effect was present, which further limits the interpretation of the study findings. Thirty three percent of the sample did not respond correctly to any of the 11 items. A possible cause of this floor effect may be related to the level of difficulty of the items, considering that 47 percent of the sample had a learning disability in a content area (i.e., mathematics or reading).
More specifically, across the eight schools from which students were sampled, the percentage of students meeting grade level mathematics
benchmarks ranged from 49 percent to 78 percent. Helwig and colleagues (2002) did not collect detailed demographic information for individual students; therefore, it is not certain whether the sample consisted of a large proportion of students performing below grade level math expectations. If the student sample had a higher percentage of individuals identified with a math disability, the floor effect could be attributed to item difficulty. Another potential cause of the floor effect could pertain to CBM misalignment with core curriculum instructional standards within the sample schools. The CBM items may have measured skills that were not a part of the instructional curriculum, which would explain why performance was so low. These issues must be considered and addressed before a measure could be used to support instructional decisions.

Foegen and colleagues (2000, 2001, 2008, 2011) have conducted a series of studies (Stage I-II) to investigate the development and use of math CBM measures for middle school. Foegen (2000) and Foegen and Deno (2001) examined the technical adequacy of a 40 item the best estimate answer for addition, subtraction, multiplication, and division problems. This mathematical task was the mathematical skill that is widely appli for this measure was multiple choice, with students required to choose the best estimate out of three options. Students were asked to complete as many of the problems as possible in one minute. Scoring was completed by totaling correct responses, followed by using an adjustment procedure to account for guessing (Foegen, 2000). Internal consistency and test retest reliability coefficients were above 0.80, indicating that student total scores were consistent across administrations.
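The guessing adjustment mentioned above is not spelled out in this section. A minimal sketch is given below, assuming the classical correction for guessing formula, R - W / (k - 1), where R is the number of correct responses, W the number of wrong responses, and k the number of response options (three for the estimation measure). The function name and the example counts are hypothetical, and the formula may differ from the procedure actually used by Foegen (2000).

```python
def corrected_score(num_correct: int, num_wrong: int, num_options: int = 3) -> float:
    """Classical correction for guessing: R - W / (k - 1).

    Assumption: items left unanswered are not counted as wrong.
    """
    if num_options < 2:
        raise ValueError("need at least two response options")
    return num_correct - num_wrong / (num_options - 1)


# Hypothetical student: 18 correct, 6 wrong, remaining items not attempted.
print(corrected_score(18, 6))  # 15.0
```

Under this assumed formula, a student who guesses at random among three options is expected to gain nothing on average, because each wrong answer removes half a point.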
The previously described estimation measure was also used in a study by Foegen (2008), who investigated the technical adequacy of four author developed math progress monitoring measures. The student sample consisted of 563 students in grades six to eight, of which 5.9 percent were receiving special education services. The measures used in the study were Basic Facts, Estimation, Complex Quantity Discrimination, and Missing Number. The Basic Facts measure consisted of 80 items with a time limit of one minute. Skills assessed by this measure facts with whole numbers. Total digits correct were used as the scoring method for this open response measure. The 44 item Complex Quantity Discrimination measure required students to determine the relationship between two pairs of quantities by selecting the symbols for greater than (>), less than (<), and equal to (=) within one minute. Foegen (2008) designed the Missing Number three numbers and one blank indicating a missing element in the sequence. For the Complex Quantity Discrimination measure, total correct responses were used to calculate overall student performance, while on the Missing Number measure, total digits correct were used to determine student scores. used as criterion measures for each of the four measures in this study (Foegen, 2008).

The item analysis findings of the study illustrated an absence of floor and ceiling effects for the measures. The Estimation measure produced criterion validity coefficients between 0.50 and 0.60 for grade six, but coefficients from 0.26 to 0.53 for grades seven and eight. Growth rates were lowest for Missing Number, ranging from 0.09 to 0.16 digits correct per week. This indicates that the measure
may not have a sufficient level of sensitivity to student growth, a key element of an effective CBM measure. The Complex Quantity Discrimination measure produced reliability coefficients above 0.70 and validity coefficients of at least 0.55 with all criterion measures, except teacher ratings.

Foegen (2008) provided insight into the technical adequacy of CBM measures for middle school mathematics; however, issues were identified in relation to the selection of skill domains and criterion validity. The Estimation, Complex Quantity Discrimination, Basic Facts, and Missing Number measures were not developed in alignment with middle school mathematics curricular standards or guidelines. The Estimation measure in particular was chosen based on the lls are a component of the middle school mathematics curriculum. Not ensuring that task domains were aligned with curriculum standards likely impacted the level of criterion validity with the state standardized assessment. The criterion validity coefficients had high levels of variability across grade levels, with coefficients ranging from 0.26 to 0.87, with the lowest coefficient representative of the seventh and eighth grade portions of the sample and the highest with the sixth grade (Foegen, 2008). This suggests that the CBM measures were not assessing the same skills as the state standardized assessments. More specifically, the level of correlation between CBM scores and the criterion measure (state standardized test) decreased with increases in grade level. CBM measures that do not provide strong indicators of student performance on other measures of mathematics proficiency make it difficult to justify use as a formative assessment to assist in instructional decision making.
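The criterion validity coefficients discussed above are correlations between CBM scores and scores on a criterion test, computed separately for each grade. The following is a minimal sketch of that computation using the Pearson product moment correlation. All scores are made up for illustration and are not data from Foegen (2008); the names `pearson_r` and `scores_by_grade` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: grade -> (CBM scores, criterion test scores) for the same students.
scores_by_grade = {
    6: ([12, 15, 9, 20, 17, 11], [228, 240, 215, 262, 249, 230]),
    7: ([14, 10, 18, 16, 12, 19], [235, 241, 238, 252, 229, 244]),
}

# One validity coefficient per grade, so grade-level differences are visible.
validity = {
    grade: pearson_r(cbm, criterion)
    for grade, (cbm, criterion) in scores_by_grade.items()
}
for grade in sorted(validity):
    print(f"Grade {grade}: r = {validity[grade]:.2f}")
```

Computing the coefficient within each grade, rather than across the pooled sample, is what allows a pattern such as weaker correlations in higher grades to be detected.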
Statement of Purpose

The present study involved the development of three curriculum based measures for mathematics at the sixth grade level that assessed three concept domains identified by the NCTM Curriculum Focal Points (Algebra, Number Operations, & Geometry) and Common Core State
Standards (2010). Item analysis procedures were utilized to examine the technical features of each measure, in addition to factor analysis procedures to identify an appropriate factor model to represent the CBM data. The primary research question for this study was: Do CBM math measures that are aligned with NCTM (2006) and CCSS (2010) standards in the areas of Number Operations, Algebra, and Geometry provide a technically adequate mathematics achievement? The secondary research questions answered through this study included:

1. To what extent is the content represented in the items of each measure homogeneous?
2. For each of the three CBM measures (Number Operations [NO], Algebra [AL], and Geometry [GE]), which items indicate individual differences in mathematics achievement through item difficulty and item discrimination?
3. To what extent are scores obtained on the three measures related to those obtained on a state standardized mathematics assessment?
4. To what extent do the NO CBM, AL CBM, and GE CBM represent a single factor model of mathematics achievement?
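The first two secondary questions refer to item homogeneity, item difficulty, and item discrimination. A minimal sketch of how such statistics are commonly computed for dichotomously scored items (1 = correct, 0 = incorrect) is given below. It assumes classical test theory definitions: difficulty as the proportion of students answering correctly, discrimination as the corrected item-total correlation, and homogeneity as the KR-20 coefficient. These choices, the function names, and the response matrix are illustrative assumptions and are not taken from the analyses reported in this study.

```python
import math


def item_difficulty(responses):
    """Proportion of students answering each item correctly."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


def item_discrimination(responses, item):
    """Corrected item-total correlation: item score vs. total on the other items."""
    item_scores = [row[item] for row in responses]
    rest_totals = [sum(row) - row[item] for row in responses]
    return pearson_r(item_scores, rest_totals)


def kr20(responses):
    """KR-20 internal consistency for dichotomous items (population variance)."""
    n_students = len(responses)
    n_items = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("KR-20 is undefined when total scores have no variance")
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Made-up responses: rows are students, columns are items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(item_difficulty(responses))  # [0.8, 0.6, 0.4, 0.2]
print(round(kr20(responses), 2))   # 0.8
print([round(item_discrimination(responses, j), 2) for j in range(4)])
```

The corrected item-total correlation excludes the item itself from the total so that an item is not correlated with its own score, which would inflate the estimate on short measures.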
CHAPTER 2 RESEARCH METHODOLOGY

Conceptual Framework

A defining characteristic of curriculum based measurement is that each measure consists of a small number of items that represent a broad construct (i.e., subject area). The brevity of CBM measures allows for short administration, thus reducing the time demands on classroom teachers and students. Despite the brief nature of measures, CBM has the ability to provide a valid and reliable indicator of student performance, which can be used to inform decision making. CBM data is most effective when used as a formative, rather than summative, measure of student achievement (Deno, 2003). Formative assessment using CBM measures consists of collecting student data during multiple time points throughout the school year. This data is examined following each administration, with instructional decisions made in response to student level of progress. Due to the inferences drawn about student academic achievement and instructional needs, developers of CBM measures must justify that the scores obtained from these measures can be used for these formative purposes (Crocker & Algina, 2008). This is accomplished through a systematic approach to test development, which according to Crocker and Algina (2008) includes the following steps:

1. Identify the primary purpose for which the test scores will be used.
2. Identify behaviors that represent the construct or define the domain.
3. Prepare a set of test specifications, delineating the proportion of items that should focus on each type of behavior identified in Step 2.
4. Construct an initial pool of items.
5. Have items reviewed (and revised as necessary) by expert panel.
6. Conduct field test of items on large sample representative of population targeted.
7.
Determine item properties (validity and reliability).

This study examined the technical features of three mathematics curriculum based measures of skills associated with the three curricular domains specified for sixth grade by the Curriculum Focal Points of NCTM (2006), which are Number Operations, Algebra, and
Geometry. The CCSS (2010) breaks down the sixth grade mathematics curriculum into five areas, which combine to represent the three domain areas of NCTM Focal Points (2006). These domains represent the key content areas that are essential to general mathematics proficiency at the sixth grade level. Although some of the skills associated with each domain may overlap with one another, each domain has specific skills that are unique. Developing separate measures for the three domains allowed for the technical features of CBM to be maintained. More specifically, three separate measures allow for valid and reliable measurement of each domain by increasing the number of items representing skills within each domain.

The purpose of this study was to build upon the existing CBM research for middle school mathematics by designing technically adequate CBM measures that are aligned with ability to assess student progress in developing proficient grade level mathematics skills was n students who may need additional instruction or intervention. Additionally, student performance on each of the measures was compared to performance on a state standardized mathematics assessment.

Research Setting and Participants

Prior to beginning the study, a proposal to conduct this study was submitted to the University of Florida Institutional Review Board (IRB). Following acceptance by the university IRB committee, permission to conduct the study was obtained at the district administrative level, with each school principal/director provided with a detailed description of the study. After administrative approval was obtained, sixth grade math teachers within each school were contacted and provided with a written description of the purpose and components of the study. Using a convenience sampling approach, two schools in North Central Florida were recruited to participate in the present study. Although convenience sampling was used, the two
schools were also selected in an effort to diversify the overall student sample. The first school (A), a university affiliated public school, had an annual enrollment of approximately 1150 students in grades K-12. Fifty two percent of the student population was classified as ethnic minorities and forty one percent qualified for free or reduced lunch status. In terms of mathematics achievement, 77 percent of enrolled students passed the 2012 mathematics standards measured by the state standardized assessment. The second school (B), a public middle school, had an annual enrollment of 700 students in grades 6-8, with 72 percent of students representing ethnic minority groups and 50 percent qualifying for free or reduced lunch. Sixty percent of the students enrolled at School B met proficiency on the 2012 mathematics standards.

Across the two schools, two sixth grade mathematics teachers agreed to participate in the study and administer each measure in their mathematics classes. The CBM measures were administered to a total of 135 sixth graders, with gender nearly equally represented (52% male and 48% female), and student ethnicity largely African American (58%) and Caucasian (23%). The student sample consisted primarily of students without a documented learning disability; 90 percent of students did not have a learning disability. Detailed total sample and school specific demographic information is represented in Table 2-1.

Measures

To guide the identification of appropriate domain skills for each of the three CBM measures using a curriculum sampling approach to development, the NCTM Focal Points (2006) Common Core State Standards (CCSS; 2010) Core domain skills were identified based on overlap between national and state standard specifications for sixth grade general curriculum. The researcher identified domain skills emphasized across the standards as core skills representative of the sixth grade mathematics curriculum. Based on these identified skills, a table
of test specifications was created for each CBM measure, with sample items for each represented skill (see Appendix A-1 to A-3 for detailed test specification tables).

CBM test specifications

According to NCTM, proficiency in number operations is broadly defined as sixth grade students being able to (a) demonstrate an ability to solve rate and ratio problems and (b) develop an understanding of and fluency with the multiplying and dividing of fractions and decimals (2006). The Number Operations CBM measure (NO CBM) consisted of 20 items, with the specific mathematical tasks associated with the Number Operations/Number Systems domain adopted from the NCTM Focal Points, CCSS guidelines, and Next Generation Sunshine State Standards (NGSSS) for sixth grade mathematics. In particular, the NO CBM assessed skills such as multiplying two fractions, using ratio concepts to describe relationships, and dividing numbers with multiple decimal places. The proportion of items per skill was equivalent for each skill (i.e., 4 items per skill).

The Algebra CBM measure (AL CBM) consisted of 20 items that broadly measure sixth grad mathematical tasks measured by the AL CBM were drawn from the Algebra domain of the NCTM Focal Points and the Expressions and Equations domain of CCSS and NGSSS guidelines. Some of the skills of the AL CBM included having the ability to solve algebraic expressions with one unknown number, use the distributive property to simplify algebraic expressions, and use variables to represent a relationship between two quantities. The ratio of items per skill for the AL CBM was 5:1, with each of the key skills represented by an equal number of items.

The final measure, GE CBM, consisted of tasks that encompass the geometry skills associated with sixth grade mathematics standards as identified by NCTM (2006), CCSS (2010), and NGSSS. This measure was composed of 16 items related to solving problems involving area,
perimeter, and volume, with an equal number of items per skill (i.e., 4 items per skill). The specific geometric shapes that are appropriate for sixth grade level mathematics were drawn from the NCTM Focal Points and CCSS guidelines for sixth grade math curriculum.

The items for each measure were developed through an examination of the NCTM Focal Points (2006) and CCSS (2010), which provided specific examples of each domain area skill for sixth grade mathematics. Items were modeled after these examples through usage of similar mathematical language, content, and visual representation.

Test specification and item reviewers

Expert review of test items is a vital component of test development because it provides support for the content validation of the measure (Crocker & Algina, 2008). To validate the skills and sample items selected, four reviewers were recruited from the two schools included in the sample. Based on recommendations defined by Downing and Haladyna (1997) for content expert reviewers, these individuals were selected because of their experience in teaching middle school mathematics and knowledge of national and state mathematics standards. Specific background information is provided about each of the item reviewers, as well as specific information about the instructional sequence for sixth grade mathematics.

Reviewer A, with over nine years of teaching experience, had three years of middle school math teaching experience. Reviewer A held the following degrees: Bachelor of Arts in Education (BAE) with a specialization in elementary education, Master of Arts in Education (MAE) in K-12 special education, and an Education Specialist degree (Ed.S.) with a major in education leadership. Reviewer B had a Bachelor of Science with a major in secondary math and science teaching, Master of Education (M.Ed.) with a major in educational leadership, and at the time of the study had completed coursework towards an Ed.S. degree in educational leadership.
Additionally, Reviewer B had served on numerous local and state mathematics standards writing
committees, engaged in conference presentations related to mathematics instruction and cur team With a Bachelor of Science with a major in management and Master of Education in secondary science, Reviewer C had experience teaching sixth and seventh grade mathematics and served on the middle school mathematics instructional team. Reviewer D was a current sixth grade mathematics teacher with a Bachelor of Arts in education and five years of teaching experience in middle school mathematics.

A semi structured interview was conducted with three of the reviewers to gain general information about the mathematics curriculum sequence and places of content emphasis in relation to the researcher identified core skill areas. Difficulties were encountered with scheduling the interview with Reviewer D; therefore, information was only available from Reviewers A, B, and C. Collectively, reviewers indicated that the content emphasis detailed within the table of test specifications for each of the three measures was an accurate representation of the areas of curriculum focus within sixth grade mathematics. However, it was reported that geometry instruction is typically the final content domain covered during the academic year for sixth grade mathematics. The timing of the instructional focus on geometry often coincides with state assessment preparations and administration, resulting in fewer instructional days to expose students to geometry content.

Guidance given to reviewers

Reviewers examined each item of the three measures and determined if the item matched the specified skills within the table of specifications. Each rater was provided with a rating form (see Appendix A-4 for rating form) and table of test specifications that included sample measure items. Dichotomous ratings were assigned to each item indicating a match with specified skills. The
that there was not a need to revise items based on alignment with skill specifications. general clarity, formatting, wording, and level of complexity of each item. Reviewers were instructed to evaluate each item, solve the problems, and provide suggestions for edits. Primarily, reviewer feedback was associated with formatting and spacing for equations and geometric shapes. General grammatical errors were also indicated. Specific item revisions were made based on feedback provided by the reviewers.

Criterion measure

School records were used to obtain student scores on the mathematics section of the Florida Comprehensive Assessment Test 2.0 (FCAT 2.0; FL Department of Education, 2011). The mathematics section of the FCAT 2.0 for sixth grade measures student performance in three content areas: fractions, ratio, proportional relationships, and statistics; expressions and equations; and geometry/measurement, which align with domain areas of the NCTM Focal Points (2006) and CCSS (2010). Test administration occurred in a group format, over the course of two 70 minute sessions. Response formats for sixth grade FCAT 2.0 items included multiple choice and gridded response. These items were computer scored using the number of items correct, with score reports generated for each student.

The present study used two forms of student math scores: FCAT 2.0 Developmental Scale Score (DSS) and Content Area Score. Student annual progress in mathematics is represented by the FCAT 2.0 Developmental Scale Score (DSS), which is a scale score with a range of 170-284, with a higher scale score indicating higher performance (FL Department of Education, 2011). In addition to the FCAT 2.0 DSS, content area scores are also computer generated for each student. Each of the three content areas for the FCAT 2.0 mathematics has a corresponding score, which represents the total raw
score points (i.e., items correct) within each content area (FL Department of Education, 2011). The number of items per content area may vary yearly, as well as across grade levels. This limits the use of the content area scores in making comparisons across grade levels; however, the scores can provide information when performing single grade level analysis. The number of items per content area for the Spring 2012 administration of the FCAT 2.0 mathematics for sixth grade was as follows:

1. Fractions, Ratio, Proportional Relationships, and Statistics: 18 items
2. Expressions and Equations: 17 items
3. Geometry and Measurement: 9 items

Data Collection Procedures

Each classroom teacher was responsible for administering the three curriculum-based measures at the beginning of a designated math class period during May 2012. A training session was provided to instruct the teachers on administering each measure using a standardized administration protocol. Each classroom teacher received an administration packet for each of the three CBM probes that included the following: (a) scripted instructions for administration, (b) time limits for each measure, and (c) instructions for the collection and storage of completed measures. During the training session, the researcher modeled these procedures and provided each teacher an opportunity to practice administration and ask questions. The researcher provided each teacher with email and phone contact information in case additional questions or concerns arose during the administration period.

Group administration of the measures occurred during one class session during the school day, with the measures given in a predetermined order. No makeup sessions were given for students absent on the data collection day. On the administration day, each student received a multi-page probe packet containing each of the three CBM probes (see Appendix B for CBM probes
name and a designated identification number at the top corner of this and subsequent pages. This identification number was used for data analysis files and provided a method for de-identifying student data prior to analysis.

On the designated CBM administration day, the classroom teachers distributed the CBM probe packets to each student, instructing students to write their names in the appropriate area on the cover page. Once each student had written their name on the cover page, the teachers followed the administration script for the specific CBM probe. When prompted by the classroom teacher, students completed the designated CBM probe within the allotted administration time. As stated in the administration instructions, the teacher was to monitor student progress and ensure that students started and stopped working on each measure at the appropriate times. Students were instructed to use the provided space under each item to solve problems. The use of calculators during administration was prohibited.

Administration time

No consistent pattern between the number of CBM items and administration time could be identified across the literature, with studies allowing between 1 and 11 minutes for completion and the number of items ranging from 20 to 80 (Foegen, 2003, 2008; Fuchs et al., 1993, 1994; Lee et al., 2007). Analysis of previously developed mathematics CBM probes indicated that item complexity has an impact on the selection of administration time. For mathematics CBM probes that assess basic computation skills such as addition, subtraction, multiplication, and division, administration times typically range from one to three minutes. With mathematics tasks such as solving algebraic expressions and performing geometric functions, administration times increased to a maximum of 11 minutes. In addition to item complexity, item response format has a role in determining administration time for mathematics CBM probes.
For probes that have multiple choice response formats, lower administration times
are used, while open response formats often correspond to a higher administration time (Foegen, 2003, 2008). Due to the moderate cognitive difficulty of the concepts represented on each of the measures, a time limit of 11 minutes was implemented. Similar time limits for sixth grade mathematics CBM probes of comparable complexity and response pattern as the proposed mea [...] 2008; Foegen, Olson, & Impecoven-Lind, 2008).

Scoring and data entry

Following the conclusion of the administration of the three probes, each teacher retained the cover page for each student packet and returned the remaining pages of each packet (i.e., CBM probes) to the researcher. An identification number was on the top corner of the probes in each student packet, as well as on the cover page. Once the teachers detached the cover page, the [...] file during scoring and data entry. This method was used to maintain the confidentiality of student information.

The researcher provided each teacher with one password-protected Microsoft Excel data file for data entry. The file contained a list of the randomized identification numbers listed on the CBM probe packets and columns for student names, demographic variables (gender, disability status, and ethnicity), and FCAT 2.0 scores. The classroom teachers were responsible for entering student names and demographic information into the file with the corresponding identification number. Because the FCAT 2.0 was administered in April 2012, results were not available until June 2012. Once these data were received, an administrative designee at each school entered the FCAT 2.0 mathematics DSS, in addition to the three content area scores. A designee was used to complete the FCAT 2.0 score entry because participating teachers were no longer on contract during the month of June. During this time, the author was available to assist
with data entry questions or issues; however, additional assistance was not requested. Following the FCAT data entry, the author obtained the password-protected data file with only random identification numbers associated with the CBM, FCAT, and demographic data. Teachers were instructed to remove all identifying information (i.e., student names) from the files prior to giving them to the author for data scoring and entry.

All probes contained open response items. In order to obtain a measure of student performance on the CBM measures, two scoring methods were originally used during the data analysis process: dichotomous scoring (correct vs. incorrect), which counts the number of items correct per probe, and digits correct scoring, which counts the number of digits correct in each item answer. In dichotomous scoring, items answered correctly [...] performed by using the answer key to each item to assign and calculate the number of digits within an item. Components of an item answer that wer [...] and decimals. The place value of each digit in student answers had to be correct to be counted toward the total digits correct. Commas and symbols representing monetary units (e.g., dollar sign) and distance/length (e.g., feet, inches) were not counted as digits.

The researcher scored all probes and responses, followed by a reliability check by an independent scorer. A trained graduate student scored 33 percent of the measures to ensure reliability in scoring. Inter-rater agreement was calculated at 99 percent using the following formula: number of agreements / (number of agreements + number of disagreements) x 100 (Frick & Semmel, 1978). After agreement was reached, the CBM scores were entered into the data file. Data entry for the
measures was verified by having the scores entered into a second file and crosschecked for consistency.

A review of the CBM mathematics literature was conducted to inform the appropriate representation of student performance on each measure, such that the present results are generalizable to Stage I CBM development research. The literature indicates that the conventional approach to scoring mathematics curriculum-based measures involves calculating the total digits correct (Kelley, Hosp, & Howell, 2008; Foegen, Olson, & Impecoven-Lind, 2008; Lei, Wu, DiPerna, & Morgan, 2009; Shinn, 1989). A treatment effect study conducted by Fuchs, Fuchs, Hamlett, Thompson, Roberts, Kubek, and Stecker (1994) provided empirical support for the use of the digits correct scoring method. In this study, when compared to dichotomous scoring, the digits correct approach produced significant between-group treatment effects. Subsequent studies proposed that, based on these findings, digits correct scoring has a higher level of sensitivity to change or growth in student mathematics performance (Foegen, Olson, & Impecoven-Lind, 2008; Lei, Wu, DiPerna, & Morgan, 2009). Although limited empirical support exists for the digits correct scoring method, it has become the conventional procedure for examining the technical adequacy of mathematics CBMs, particularly those that use open response formats for measures of problem solving ability (Foegen, 2008). Therefore, in keeping with the trends within the field and allowing for the generalizability of results to previous studies, the digits correct scoring method was used in reporting the present findings. Results using dichotomous scoring procedures can be found in Appendix C.

Data Analysis

The purpose of this study was to develop three technically adequate CBM measures for sixth grade mathematics using a curriculum sampling approach to item selection.
These measures were aligned with national mathematics curriculum standards and evaluated by item
reviewers as a secondary measure of content validity. The primary research question was: Do CBM math measures that are aligned with NCTM (2006) and CCSS (2010) standards provide a reliable and valid measure of student achievement in mathematics? In order to evaluate this question, the internal structure of each measure was examined. To do so, item analysis procedures were used to examine the internal consistency of the measures, in addition to the item discrimination, item difficulty, and validity coefficients (Crocker & Algina, 2008). In addition, factor analysis procedures were used to determine if the three CBM measures represent an overall mathematics proficiency construct.

Item difficulty indices were calculated to determine the proportion of students that correctly answered each item. During the data scoring process, two methods were used: dichotomous scoring (i.e., correct or incorrect) and total digits correct within an item. For dichotomous scoring, the mean for each item is the proportion of students that correctly answered the specific item. Using total digits correct scoring, the proportion of students answering the item correctly is equivalent to the frequency of students obtaining the maximum digits correct for that item divided by one hundred. Difficulty proportions or indices that are low indicate that the item is very difficult for the sample of students and may need revision.

In order to evaluate the ability of the measures to indicate individual differences in mathematics achievement, item discrimination coefficients were calculated. To obtain item discrimination coefficients for each item, the total score on each measure is used to operationally define [...] performing students have a high probability of answering correctly and low performing students have a low probability of answering correctly. In addition to item analysis procedures, criterion
validity measures were examined to determine the relationship between CBM performance and performance on a state standardized mathematics assessment.

Factor analysis was used to evaluate the pattern of correlations among groups of items across each measure. NCTM (2006) and CCSS (2010) identify Number Operations, Algebra, and Geometry as skill domains in the sixth grade mathematics curriculum, which substantiates using a factor analysis procedure to support the theory that items on each measure are highly correlated with a one factor model. Literature on the acceptable sample size for factor analysis procedures recommends at minimum 100 participants per measure 10 items (Everitt, 1975). Following this recommendation, the sample size for this study should be at least 200 students. The availability of students on the administration day resulted in a sample size of 135 students, which is below the recommended level for factor analysis. In studies when an adequate sample size is not [...] Widaman, 1995). Due to the sample size being below the acceptable level, parceling was used for the present study. For each of the three CBM probes, a subset of the total items was selected, resulting in the creation of parcels that represent the observed variable. It was hypothesized that items within each parcel loaded onto one primary factor (i.e., overall mathematics proficiency). Factor loadings and goodness of fit indices were examined to determine the appropriate factor model to represent the CBM data.
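The inter-rater agreement formula cited in the scoring procedures (Frick & Semmel, 1978) can be illustrated with a short Python sketch. The function name and the example counts are hypothetical; the study reports only the resulting agreement of 99 percent, not the underlying numbers of agreements and disagreements.

```python
def percent_agreement(agreements: int, disagreements: int) -> float:
    """Agreements / (agreements + disagreements) x 100."""
    total = agreements + disagreements
    if total == 0:
        raise ValueError("at least one scored response is required")
    # Multiply before dividing so that whole-number results stay exact.
    return 100 * agreements / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = percent_agreement(agreements=297, disagreements=3)
print(f"Inter-rater agreement: {example:.0f} percent")
```

In this illustration, 297 agreements out of 300 double-scored responses correspond to the 99 percent level reported for the study.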
Table 2-1. Sample demographics distributed by participating school

School                   Gender            Ethnicity                    Learning Disability
School A (n = 60)        32 (53%) male     14 (23%) African American    56 (93%) No
                         28 (47%) female   29 (48%) Caucasian           4 (7%) Yes
                                           10 (16%) Hispanic
                                           5 (8%) Multi-Ethnic
                                           2 (3%) Asian
School B (n = 75)        38 (51%) male     64 (85%) African American    66 (88%) No
                         37 (49%) female   2 (3%) Caucasian             9 (12%) Yes
                                           2 (3%) Hispanic
                                           6 (8%) Multi-Ethnic
                                           1 (1%) Asian
Total Sample (n = 135)   70 (52%) male     78 (58%) African American    122 (90%) No
                         65 (48%) female   31 (23%) Caucasian           13 (10%) Yes
                                           12 (9%) Hispanic
                                           11 (8%) Multi-Ethnic
                                           3 (2%) Asian
CHAPTER 3
RESULTS

The purpose of this study was to examine the technical adequacy of three mathematics progress monitoring measures designed to be used for sixth grade. More specifically, the goals of the present study were to (a) evaluate the internal consistency of the NO, AL, and GE CBM measures, (b) examine item level functioning through the computation of item difficulty and item discrimination indices, (c) explore relationships between performance on CBM measures and a state standardized mathematics assessment (i.e., criterion validity), and (d) explore the factor structure of the NO, AL, and GE CBM measures.

The CBM development portion of this study was guided by test development procedures outlined by Crocker and Algina (2008), which included the identification of key skills and test items (i.e., test specifications), a structured panel review of content skills and items, and administration of measures to the identified population. Content area skills and specific items for each CBM were identified based on representation of both NCTM (2006) and CCSS (2010) domain standards for mathematics in sixth grade. Teachers with middle school mathematics experience reviewed these skills and items for appropriateness in measuring sixth grade mathematics proficiency. To conduct the administration of the CBM measures, two local schools were recruited to participate in the study, with two sixth grade mathematics teachers consenting to participate in administering the measures to all students within their classes. CBM data from 135 sixth grade students were scored and used during analysis procedures.

Using the digits correct scoring method, statistical analyses at the item level for each measure were conducted to determine the total number of responses, mean scores, standard deviations, item difficulty, and item discrimination. Means and standard deviations were calculated using the proportion of digits correct per item. At the scale level, analysis procedures
[...] measurement. Factor analysis was used to determine whether the three CBMs reflect a single underlying construct (i.e., overall mathematics achievement).

Technical Adequacy

Internal Consistency

In order to address research question one, the degree of relationship between items on each measure was examined to obtain the internal consistency correlation coefficient. [...] was examined for each measure. Coefficient alpha ranges from zero to one, with values closer to one indicating a higher [...] below 0.70 should be used with caution and may require item revision (Nunnally, 1978). However, [...] literature recommends that coefficients be at minimum equal to 0.90, with a coefficient of 0.95 being desirable (Nunnally & Bernstein, 1994). Using the latter criteria, internal consistency coefficients for all three measures were 0.83, which is moderately below the acceptable range (see Table 3-1). As a part of the item analysis procedures conducted in IBM SPSS Statistics for Windows Version 21.0, the changes to internal consistency when items are removed were minimal, with no substantial increases or decreases noted (see Tables 3-2 to 3-7).

Item Level Functioning

Item analyses for the three CBM measures were conducted using IBM SPSS Statistics for Windows Version 21.0. In order to address the second research question, two item level statistics from the item analyses were examined: item difficulty and item discrimination.
Item difficulty, the proportion of students that correctly answered an item, was determined by dividing the number of students that correctly answered the item by the total number of students who answered the item. Item difficulty proportions range from 0.00 (none of the students answered the item correctly) to 1.00 (all students answered the item correctly). Due to a lack of CBM-related studies that use item difficulty to examine item level functioning, a review of general test development research was conducted to inform the identification of items potentially needing revision. Test development research related to open response measures like CBM indicates that item difficulty indices should range from 0.30 to 0.70 (Allen & Yen, 2002). Items below this range are hard for most students and those above it are easy for most students, which limits their ability to function as appropriate measures of the varying levels of student content knowledge (Allen & Yen, 2002).

In a tiered service delivery model such as RtI, it is anticipated that certain percentages of students will need additional interventions beyond the core curriculum instruction to reach proficiency (Burns & Gibbons, 2008; Bryant & Bryant, 2008; Bryant et al., 2008; Fuchs & Fuchs, 2006; Hosp & Madyun, 2007). More specifically, within a grade level, 15 percent of the students will need Tier 2 interventions and five percent will require intensive intervention through Tier 3, as described in previous sections. Curriculum-based measures can serve as a method for differentiating between students based on performance on grade level curricular standards, in addition to identifying those that need additional instructional support or interventions (Fuchs & Fuchs, 2001; Gersten & Beckman, 2009; Lembke & Stecker, 2007). In relation to item analysis, the ability of an item to identify individual differences among responders is called item discrimination.
This level of item analysis discriminates between students based on level of knowledge in the content area being assessed. More specifically,
within item analysis procedures, the item total correlation is the relationship between item scores and total measure scores. Correlations range from -1 to 1, with values closer to 1 indicating higher levels of discrimination. Suggested cutoffs for item total correlations indicate that items with correlation coefficients below 0.30 require further evaluation (Nunnally & Bernstein, 1994).

Number operations

In the Number Operations measure (NO CBM), students were asked to solve 20 problems designed to measure student skill level in multiplying fractions, using ratio concepts to describe relationships, and dividing numbers with multiple decimal places. The NO CBM descriptive statistics for the digits correct item scoring method are presented in Table 3-2. Descriptive statistics computed using the proportion of total digits correct resulted in item mean proportion scores of 0.01 (Item 16) to 0.81 (Item 8). Item standard deviations ranged from 0.09 (Item 16) to 0.89 (Item 8), indicating a higher level of dispersion of student responses for Item 8. Items 3, 4, 6, 7, 9, and 13-20 had item difficulty indices below the cutoff of 0.30, indicating that a large proportion of the student sample missed these items. Items 16-20 had discrimination indices below the threshold of 0.30.

Algebra

On the Algebra measure (AL CBM), students completed 20 items designed to measure student ability to solve algebraic expressions, use the distributive property to simplify expressions, and use variables to represent a relationship between two quantities. Using the digits correct scoring method for the AL CBM, the highest mean proportion of digits correct (0.82) was produced by responses on Item 2, while Item 19 had the lowest mean score of 0.02. In relation to the dispersion of student scores, standard deviations ranged from 0.11 (Item 19) to 0.50 (Item 11). Table 3-3 contains item difficulty information for the AL CBM.
Student answers on Items 1, 4 7, 9, 13 20 produced indices below the 0.30 lower bound cutoff and Items 2 and 6
were above the 0.70 upper bound cutoff. Item total correlation coefficients for Items 9, 12, and 19 were below the 0.30 cutoff for item discrimination.

Geometry

The Geometry measure (GE CBM) consisted of 16 items that measured skills related to finding the area, perimeter, and volume of geometric shapes, consistent with sixth grade mathematics standards. Item mean proportion of digits correct ranged from 0.08 to 0.70, with Items 8 and 15 having the lowest mean and Item 3 obtaining the highest mean (see Table 3-4). Of the sixteen items on the GE CBM, only four (Items 2, 3, 12, and 14) produced item difficulty indices within the acceptable range of 0.30 to 0.70, indicating that most of the sample experienced difficulties on this measure of mathematics skills. In addition, scores on Items 8, 12, and 14-16 all resulted in item discrimination indices below the 0.30 cutoff.

Criterion Validity

The development process for the three CBM measures involved identifying key skills from national mathematics curriculum standards (NCTM, 2006; CCSS, 2010). FCAT 2.0 (FL DOE, 2011), the state standardized assessment used as a criterion measure for the study, was also developed to measure student performance on grade level mathematics content, as identified in [...] state standardized assessment data, CBM can be used to provide frequent information regarding student acquisition of curriculum-based skills prior to state assessments. Therefore, it was important to examine the relationship between total scores on each measure and those obtained on the Florida Comprehensive Assessment Test (i.e., criterion validity).

The mathematics portion of FCAT 2.0 for sixth grade measures performance in three content areas: fractions, ratio, proportional relationships, statistics; expressions and equations; and geometry/measurement. The skills assessed by each of the content areas align with those of the three CBM measures created in this study. Descriptive statistics for both FCAT 2.0 content
areas and the three CBM measures were obtained using the total scores for each assessment measure (see Table 3-5). Prior to measuring criterion validity, correlations were computed to determine the relationship between the CBM measures, as well as correlations between the FCAT 2.0 content area measures. Correlation coefficients in Table 3-6 indicate that relationships between the three CBM measures ranged from 0.74 to 0.79. Correlation coefficients for the three FCAT 2.0 content area measures ranged from 0.65 to 0.81, also indicating a moderate relationship across content areas.

The relationships between total scores for the three content areas and those of the three progress monitoring measures were examined to address research question three. Criterion validity results indicate that the three CBM probes (NO CBM, AL CBM, and GE CBM) were significantly related to the appropriate content area scores of the FCAT 2.0. The NO CBM total scores and the Fractions, Ratio, Proportional Relationships, and Statistics content area had a significant, positive correlation (r = 0.78, p = 0.00). Expressions and Equations and AL CBM scores, as well as Geometry/Measurement and GE CBM scores, were positively correlated (r = 0.68, p = 0.00; r = 0.70, p = 0.00, respectively). The positive relationship denotes that as scores on one measure increase, so do those on the state standardized assessment.

Factor Analysis

In order to address research question four, factor analysis procedures were used to evaluate the relationship between the three curriculum-based measures as representations of mathematics achievement. Each measure was designed to assess student knowledge in the broad construct areas of number operations, algebra, and geometry. In addition, NCTM (2006) and CCSS (2010) identify Number Operations, Algebra, and Geometry as domain areas in the sixth grade mathematics curriculum, which substantiates developing a factor model of mathematics
achievement that incorporates observed variables representative of Number Operations, Algebra, and Geometry skills.

For each of the three CBM probes, subsets of the total items were combined into parcels. Item parceling is a factor analysis procedure that involves combining individual test items that are assumed to measure the same construct (Kishton & Widaman, 1994). Four parcels were created for each measure, with items assigned to each parcel using the item total correlation coefficients. Items were distributed across the parcels starting with the highest coefficient and alternating parcel assignment until the lowest coefficient was reached.

Factor analysis procedures were conducted using a one factor and a three factor model. The following goodness of fit indices were examined: root mean square error of approximation (RMSEA), comparative fit index (CFI), Tucker-Lewis index (TLI), and standardized root mean square residual (SRMR). Recommended guidelines are that RMSEA be from 0.05 to 0.08, CFI and TLI greater than 0.95, and SRMR less than 0.05 for adequate model fit (Hu & Bentler, 1999).

Based on the NCTM and CCSS for sixth grade mathematics and the correlation between CBM total scores (see Table 3-6), it was hypothesized that once parceled, the correlations would indicate a single overall factor of mathematics achievement. In determining the number of factors to retain during the factor analysis, the commonly used criterion is to retain factors with eigenvalues (i.e., variance of the factors) greater than one (Floyd & Widaman, 1995). In this case, a single factor had an eigenvalue above one. The correlations between the CBM parcels and the extracted factor, which were produced by the correlation matrix in Table 3-7, are reported in Table 3-8.
These correlations, also known as factor loadings, display the relationship between the factor and item parcels, with coefficients ranging from 0.65 to 0.82. The single factor model that was retained during the analysis procedures resulted in goodness of fit indices suggesting that the
single factor model does not adequately fit the CBM data (RMSEA = 0.11; CFI = 0.90; TLI = 0.88). An additional factor model was examined to determine whether a three factor model representing the three different domain areas would result in a model that is a better representation of the CBM data. Table 3-9 contains the factor loadings for the parcels using three factors. Model fit indices for the three factor model indicate that the model better accounts for the data (RMSEA = 0.08; CFI = 0.94; TLI = 0.93). Although the three factor model adequately fits the data, the estimated correlation matrix in Table 3-11 indicates that the three factors are highly correlated. More specifically, the three domain areas (number operations, algebra, and geometry) are indiscriminate, as evidenced by estimated factor correlations ranging from 0.83 to 0.93.
Table 3-1. [...]

Measures   Coefficient alpha
NO CBM     0.83
AL CBM     0.83
GE CBM     0.83

Note: NO = Number Operations, AL = Algebra, GE = Geometry.
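As an illustration of the internal consistency statistic summarized in Table 3-1, the following Python sketch computes coefficient alpha from a students-by-items score matrix using the standard formula. The study obtained its coefficients from SPSS; the function name and the small data set below are hypothetical and are not drawn from the study data.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of per-student rows of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two students")
    # Sample variance of each item (column) across students.
    item_variances = [
        variance([row[j] for row in scores]) for j in range(n_items)
    ]
    # Sample variance of the students' total scores.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical dichotomous scores: five students (rows) by four items.
hypothetical_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(hypothetical_scores)
print(f"coefficient alpha = {alpha:.2f}")
```

The same function applies to digits correct scores, because it depends only on the item variances and the variance of the total score.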
Table 3-2. Number operations (NO CBM) item analysis using digits correct scoring

Item     Item mean   Item standard   Item             Item         Alpha if
                     deviation       discrimination   difficulty   item deleted
NO 1     0.41        0.27            0.51             0.47         0.81
NO 2     0.52        0.49            0.38             0.49         0.82
NO 3     0.44        0.34            0.40             0.16         0.83
NO 4     0.24        0.40            0.42             0.19         0.82
NO 5     0.33        0.33            0.44             0.45         0.82
NO 6     0.23        0.37            0.54             0.17         0.81
NO 7     0.56        0.85            0.50             0.24         0.81
NO 8     0.81        0.89            0.53             0.32         0.81
NO 9     0.26        0.44            0.44             0.25         0.82
NO 10    0.28        0.27            0.63             0.54         0.81
NO 11    0.50        0.50            0.50             0.50         0.82
NO 12    0.42        0.49            0.54             0.40         0.81
NO 13    0.11        0.27            0.53             0.06         0.82
NO 14    0.13        0.28            0.54             0.05         0.81
NO 15    0.07        0.20            0.34             0.01         0.82
NO 16    0.01        0.09            0.13             0.01         0.83
NO 17    0.03        0.14            0.23             0.01         0.83
NO 18    0.02        0.15            0.27             0.02         0.83
NO 19    0.06        0.24            0.25             0.06         0.83
NO 20    0.03        0.12            0.29             0.01         0.82

Note: [...] = Item difficulty less than 0.30 or greater than 0.70. [...] = Item discrimination less than 0.30.
Table 3-3. Algebra (AL CBM) item analysis using digits correct scoring

Item     Item mean   Item standard   Item             Item         Alpha if
                     deviation       discrimination   difficulty   item deleted
AL 1     0.21        0.30            0.47             0.07         0.82
AL 2     0.82        0.35            0.45             0.78         0.82
AL 3     0.20        0.25            0.50             0.41         0.82
AL 4     0.11        0.30            0.35             0.10         0.82
AL 5     0.39        0.48            0.52             0.37         0.81
AL 6     0.79        0.40            0.38             0.78         0.82
AL 7     0.18        0.25            0.52             0.01         0.81
AL 8     0.72        0.43            0.31             0.69         0.82
AL 9     0.15        0.27            0.29             0.02         0.84
AL 10    0.14        0.32            0.44             0.10         0.82
AL 11    0.52        0.50            0.53             0.51         0.82
AL 12    0.03        0.15            0.25             0.02         0.83
AL 13    0.28        0.45            0.47             0.28         0.82
AL 14    0.50        0.49            0.59             0.48         0.81
AL 15    0.10        0.25            0.45             0.01         0.82
AL 16    0.40        0.48            0.50             0.37         0.81
AL 17    0.27        0.43            0.55             0.23         0.81
AL 18    0.15        0.33            0.46             0.11         0.82
AL 19    0.02        0.11            0.24             0.01         0.83
AL 20    0.05        0.15            0.37             0.01         0.82

Note: [...] = Item difficulty less than 0.30 or greater than 0.70. [...] = Item discrimination less than 0.30.
Table 3-4. Geometry (GE CBM) item analysis using digits correct scoring

Item     Item mean   Item standard   Item             Item         Alpha if
                     deviation       discrimination   difficulty   item deleted
GE 1     0.30        0.43            0.65             0.24         0.81
GE 2     0.52        0.46            0.55             0.42         0.82
GE 3     0.70        0.46            0.50             0.69         0.84
GE 4     0.23        0.39            0.60             0.13         0.82
GE 5     0.25        0.38            0.44             0.16         0.83
GE 6     0.12        0.31            0.51             0.09         0.82
GE 7     0.24        0.37            0.74             0.16         0.80
GE 8     0.08        0.24            0.27             0.04         0.84
GE 9     0.30        0.35            0.52             0.13         0.83
GE 10    0.18        0.34            0.51             0.11         0.83
GE 11    0.14        0.32            0.64             0.09         0.81
GE 12    0.46        0.50            0.24             0.45         0.84
GE 13    0.11        0.27            0.45             0.07         0.83
GE 14    0.37        0.46            0.24             0.33         0.84
GE 15    0.08        0.24            0.24             0.05         0.84
GE 16    0.20        0.33            0.24             0.10         0.84

Note: [...] = Item difficulty below 0.30 or above 0.70. [...] = Item discrimination below 0.30.
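The two item level statistics reported in Tables 3-2 to 3-4 can be sketched in Python as follows. The helper names and the small raw data set are hypothetical. The discrimination function assumes the corrected form of the item total correlation (the item is removed from the total before correlating), which may differ from the exact SPSS output used in the study. The final lines apply the 0.30 and 0.70 cutoffs to the GE CBM values transcribed from Table 3-4.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_difficulty(item_scores, max_digits):
    """Proportion of students earning the maximum digits correct."""
    return sum(1 for s in item_scores if s == max_digits) / len(item_scores)


def item_total_correlation(scores, item_index):
    """Correlate one item with the total of the remaining items (assumed form)."""
    item = [row[item_index] for row in scores]
    rest = [sum(row) - row[item_index] for row in scores]
    return pearson_r(item, rest)


# Hypothetical digits correct scores: five students by three items.
hypothetical_digits = [[2, 3, 1], [2, 2, 1], [1, 3, 0], [2, 1, 0], [0, 0, 0]]
first_item = [row[0] for row in hypothetical_digits]
print(item_difficulty(first_item, max_digits=2))
print(round(item_total_correlation(hypothetical_digits, 0), 2))

# GE CBM values transcribed from Table 3-4 (Items 1 to 16).
ge_difficulty = [0.24, 0.42, 0.69, 0.13, 0.16, 0.09, 0.16, 0.04,
                 0.13, 0.11, 0.09, 0.45, 0.07, 0.33, 0.05, 0.10]
ge_discrimination = [0.65, 0.55, 0.50, 0.60, 0.44, 0.51, 0.74, 0.27,
                     0.52, 0.51, 0.64, 0.24, 0.45, 0.24, 0.24, 0.24]

acceptable_difficulty = [
    i + 1 for i, p in enumerate(ge_difficulty) if 0.30 <= p <= 0.70
]
low_discrimination = [
    i + 1 for i, r in enumerate(ge_discrimination) if r < 0.30
]
print("Difficulty within 0.30 to 0.70:", acceptable_difficulty)
print("Discrimination below 0.30:", low_discrimination)
```

Applied to the Table 3-4 values, the cutoffs return Items 2, 3, 12, and 14 for acceptable difficulty and Items 8, 12, 14, 15, and 16 for low discrimination, which agrees with the items identified in the Geometry results.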
Table 3-5. Descriptive statistics for CBM measures and FCAT content areas

Measure    Mean     Standard deviation
NO CBM     12.00    8.23
AL CBM     14.46    9.62
GE CBM     13.32    11.23
NO FCAT    8.74     4.23
AL FCAT    10.00    3.75
GE FCAT    3.92     2.41

Note: NO = Number Operations, AL = Algebra, GE = Geometry, FCAT = Florida Comprehensive Assessment Test
Table 3-6. Correlation matrix for student CBM and FCAT performance

Measure    NO CBM   AL CBM   GE CBM   NO FCAT   AL FCAT   GE FCAT
NO CBM     1.00
AL CBM     0.79     1.00
GE CBM     0.74     0.75     1.00
NO FCAT    0.78     0.73     0.76     1.00
AL FCAT    0.70     0.68     0.66     0.81      1.00
GE FCAT    0.64     0.64     0.70     0.73      0.65      1.00

Note: NO = Number Operations, AL = Algebra, GE = Geometry, FCAT = Florida Comprehensive Assessment Test
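The ranges and criterion validity coefficients cited in the text can be read directly from the lower triangle of Table 3-6. The Python sketch below shows one way to do so; the variable and function names are hypothetical, and the values are transcribed from the table rather than recomputed from student data.

```python
labels = ["NO CBM", "AL CBM", "GE CBM", "NO FCAT", "AL FCAT", "GE FCAT"]

# Lower triangle of Table 3-6, one row per measure.
lower_triangle = [
    [1.00],
    [0.79, 1.00],
    [0.74, 0.75, 1.00],
    [0.78, 0.73, 0.76, 1.00],
    [0.70, 0.68, 0.66, 0.81, 1.00],
    [0.64, 0.64, 0.70, 0.73, 0.65, 1.00],
]


def build_lookup(names, triangle):
    """Expand a lower triangle into a symmetric (row, column) lookup."""
    lookup = {}
    for i, row in enumerate(triangle):
        for j, r in enumerate(row):
            lookup[(names[i], names[j])] = r
            lookup[(names[j], names[i])] = r
    return lookup


corr = build_lookup(labels, lower_triangle)


def pairwise_range(names):
    """Smallest and largest correlation among distinct pairs of measures."""
    values = [corr[(a, b)] for i, a in enumerate(names) for b in names[i + 1:]]
    return min(values), max(values)


cbm_range = pairwise_range(labels[:3])
fcat_range = pairwise_range(labels[3:])
# Criterion validity: each CBM paired with its matching FCAT content area.
criterion = {d: corr[(f"{d} CBM", f"{d} FCAT")] for d in ("NO", "AL", "GE")}

print("CBM intercorrelations:", cbm_range)
print("FCAT intercorrelations:", fcat_range)
print("Criterion validity:", criterion)
```

The lookup reproduces the 0.74 to 0.79 range among the CBM measures, the 0.65 to 0.81 range among the FCAT content areas, and the criterion validity coefficients of 0.78, 0.68, and 0.70.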
Table 3-7. Correlation matrix using CBM factor parcels

Parcel   NOP1   NOP2   NOP3   NOP4   ALP1   ALP2   ALP3   ALP4   GEP1   GEP2   GEP3   GEP4
NOP1     1.00
NOP2     0.57   1.00
NOP3     0.58   0.61   1.00
NOP4     0.49   0.60   0.63   1.00
ALP1     0.57   0.66   0.60   0.59   1.00
ALP2     0.49   0.53   0.59   0.59   0.68   1.00
ALP3     0.49   0.47   0.61   0.55   0.62   0.57   1.00
ALP4     0.39   0.45   0.55   0.50   0.58   0.46   0.54   1.00
GEP1     0.42   0.65   0.60   0.39   0.65   0.60   0.54   0.53   1.00
GEP2     0.41   0.55   0.51   0.44   0.61   0.53   0.56   0.47   0.74   1.00
GEP3     0.48   0.62   0.55   0.52   0.55   0.55   0.48   0.49   0.63   0.70   1.00
GEP4     0.42   0.60   0.55   0.58   0.54   0.52   0.44   0.38   0.60   0.60   0.59   1.00

Note: NO = Number Operations, AL = Algebra, GE = Geometry
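The parcel assignment rule described in the Factor Analysis section (items ranked by item total correlation and dealt across four parcels from highest to lowest) can be sketched in Python. Two details are assumptions rather than reported procedures: "alternating" is interpreted here as a simple round-robin, and tied coefficients keep their original item order. The coefficients are the GE CBM item discrimination values from Table 3-4, and the resulting parcels are illustrative only; they are not claimed to match the parcels used in the study.

```python
def assign_parcels(item_total_correlations, n_parcels=4):
    """Deal items into parcels round-robin, from highest to lowest correlation."""
    # sorted() is stable, so tied items stay in their original order.
    ranked = sorted(
        item_total_correlations,
        key=item_total_correlations.get,
        reverse=True,
    )
    parcels = [[] for _ in range(n_parcels)]
    for position, item in enumerate(ranked):
        parcels[position % n_parcels].append(item)
    return parcels


# GE CBM item discrimination values transcribed from Table 3-4.
ge_item_total = {
    "GE 1": 0.65, "GE 2": 0.55, "GE 3": 0.50, "GE 4": 0.60,
    "GE 5": 0.44, "GE 6": 0.51, "GE 7": 0.74, "GE 8": 0.27,
    "GE 9": 0.52, "GE 10": 0.51, "GE 11": 0.64, "GE 12": 0.24,
    "GE 13": 0.45, "GE 14": 0.24, "GE 15": 0.24, "GE 16": 0.24,
}

ge_parcels = assign_parcels(ge_item_total)
for number, parcel in enumerate(ge_parcels, start=1):
    print(f"Parcel {number}: {', '.join(parcel)}")
```

Dealing items in this way spreads stronger and weaker items across the parcels, so that each parcel contains a similar mix of item total correlations.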
Table 3-8. Factor loadings for single factor model of mathematics achievement

Parcel   Factor 1
NOP1     0.65
NOP2     0.78
NOP3     0.77
NOP4     0.71
ALP1     0.82
ALP2     0.71
ALP3     0.75
ALP4     0.65
GEP1     0.79
GEP2     0.76
GEP3     0.76
GEP4     0.72

Note: NO = Number Operations, AL = Algebra, GE = Geometry
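One way to see how well a single factor reproduces the parcel correlations is to compare observed correlations with those implied by the loadings. The following Python sketch is illustrative and rests on assumptions not stated in the source: that the loadings in Table 3-8 are standardized and that residuals are uncorrelated, in which case the model-implied correlation between two parcels is the product of their loadings. The three observed values are taken from Table 3-7; the function name `implied` is hypothetical.

```python
# Single factor loadings from Table 3-8 (assumed to be standardized).
loadings = {
    "NOP1": 0.65, "NOP2": 0.78, "NOP3": 0.77, "NOP4": 0.71,
    "ALP1": 0.82, "ALP2": 0.71, "ALP3": 0.75, "ALP4": 0.65,
    "GEP1": 0.79, "GEP2": 0.76, "GEP3": 0.76, "GEP4": 0.72,
}

# A few observed parcel correlations from Table 3-7.
observed = {
    ("NOP1", "NOP2"): 0.57,
    ("GEP1", "GEP2"): 0.74,
    ("NOP1", "GEP1"): 0.42,
}

def implied(a, b):
    """Correlation implied by a one-factor model with uncorrelated residuals."""
    return loadings[a] * loadings[b]

# Positive residual: the model underpredicts the observed correlation.
residuals = {pair: obs - implied(*pair) for pair, obs in observed.items()}

for pair, res in residuals.items():
    print(pair, "implied", round(implied(*pair), 3), "residual", round(res, 3))
```

For these three pairs, and under the stated assumptions, the single factor model underpredicts the correlation between the two geometry parcels and overpredicts the correlation between a number operations parcel and a geometry parcel.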
Table 3-9. Factor loadings for three factor model of mathematics achievement

Parcel   NO factor   AL factor   GE factor
NOP1     0.69
NOP2     0.79
NOP3     0.81
NOP4     0.75
ALP1                 0.85
ALP2                 0.77
ALP3                 0.74
ALP4                 0.67
GEP1                             0.84
GEP2                             0.84
GEP3                             0.79
GEP4                             0.73

Note: NO = Number Operations, AL = Algebra, GE = Geometry
Table 3-10. Goodness of fit indices for single and three factor models of mathematics achievement

Index    Single factor model   Three factor model
RMSEA    0.11                  0.08
CFI      0.90                  0.94
TLI      0.88                  0.93
SRMR     0.05                  0.04
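The comparison in Table 3-10 can be read index by index. The following Python sketch is illustrative only; it relies on the standard direction of each index (lower values indicate better fit for RMSEA and SRMR, higher values for CFI and TLI), which is an assumption added here and not stated in the table. No cutoff values are applied.

```python
# (single factor model, three factor model) values from Table 3-10.
fit = {
    "RMSEA": (0.11, 0.08),
    "CFI": (0.90, 0.94),
    "TLI": (0.88, 0.93),
    "SRMR": (0.05, 0.04),
}

# Assumed direction of each index: lower is better for these two.
LOWER_IS_BETTER = {"RMSEA", "SRMR"}

favored = {}
for index, (single, three) in fit.items():
    if index in LOWER_IS_BETTER:
        three_is_better = three < single
    else:
        three_is_better = three > single
    favored[index] = "three factor" if three_is_better else "single factor"

print(favored)
```

Every index in the table moves in the direction of better fit for the three factor model, although this comparison alone does not establish that either model meets any particular standard of fit.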
Table 3-11. Factor correlations for the three factor model of mathematics achievement

Factor       NO factor   AL factor   GE factor
NO factor    1.00
AL factor    0.93        1.00
GE factor    0.83        0.86        1.00

Note: NO = Number Operations, AL = Algebra, GE = Geometry
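The size of the factor correlations in Table 3-11 is easier to judge when expressed as shared variance, the square of each correlation. The following Python sketch is a worked illustration using the tabled values; the variable names are hypothetical.

```python
# Factor correlations from Table 3-11.
factor_r = {
    ("NO", "AL"): 0.93,
    ("NO", "GE"): 0.83,
    ("AL", "GE"): 0.86,
}

# Proportion of variance shared by each pair of factors (r squared).
shared_variance = {pair: r ** 2 for pair, r in factor_r.items()}

for pair, share in shared_variance.items():
    print(pair, f"{share:.1%}")
```

On this reading the factors share roughly 69% to 86% of their variance, which is consistent with the conclusion in Chapter 4 that the three factors are not clearly distinct.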
CHAPTER 4
DISCUSSION

Research indicates that students entering middle school experience significant difficulties with reaching grade level mathematics expectations, which has been attributed to a shift in curricular emphasis that requires a higher level of mathematical thinking in the areas of algebra and geometry (Calhoon & Fuchs, 2003; Jiban & Deno, 2007; Ketterlin-Geller, Chard, & Fien, 2008). Despite the substantial mathematics difficulties that students experience at the middle school level, mathematics CBM literature has primarily focused on the development of measures for use at the elementary level (Espin, Deno, Maruyuna, & Cohen, 1989; Fuchs, Fuchs, Hamlett, Thompson, Roberts, Kupek, & Stecker, 1994; Fuchs, Fuchs, Karns, Hamlett, Dutka, & Katzaroff, 2000; Lee, Lembke, Moore, Ginsburg, & Pappas, 2007). Of the studies focused on middle school CBM development, few have addressed the need to align the measures with national curriculum standards such as NCTM (2006) and CCSS (2010), which is an important component of development (Wallace et al., 2007).

One of the primary functions of CBM is to serve as an indicator of overall proficiency in an academic area, in this case sixth grade mathematics (Hosp, Hosp, & Howell, 2007). As a performance indicator, CBM allows classroom teachers to make instructional decisions based on student areas of need (Fuchs, 2004). The development of technically adequate CBM measures is important because of the nature of the decisions that are made pertaining to intervention services and supports. Previous research included findings related to reliability and validity coefficients; however, to date, only one study (of an elementary level CBM) has examined the item level functioning of measures using item analysis statistics such as item difficulty and item discrimination (Lee, Lembke, Moore, Ginsburg, & Pappas, 2007).
In the past, decisions related to item revision were based on the presence of item floor or ceiling effects, identified by examining item frequency distributions (Foegen et al., 2000, 2001, 2008; Helwig et al., 2002). This method of item level analysis does not account for items that do not produce floor or ceiling effects but are substantially hard or easy for the study sample, which is important in the development and revision of measure items.

The purpose of this study was to expand the research in middle school mathematics by examining the technical adequacy of three CBM measures for sixth grade mathematics aligned with national curricular standards: Number Operations (NO), Algebra (AL), and Geometry (GE). This study examined the internal consistency, criterion validity, item level functioning using item analysis (i.e. item difficulty and discrimination), and the factor structure of the three CBM measures. This chapter provides an interpretation of the results obtained through these analysis procedures, an integration of findings with relevant literature, and an examination of key study limitations to inform future research.

Technical Adequacy

Internal Consistency

Analysis procedures were conducted to examine the internal consistency of three sixth grade CBM measures: Number Operations (NO), Algebra (AL), and Geometry (GE). Internal n and educational programming/placement, it is recommended that coefficient alpha be between 0.90 and 0.95 (Nunnally & Bernstein, 1994). Internal consistency for all three CBM measures was 0.83, which was lower than the coefficients reported in previous studies, such as those conducted by Fuchs and colleagues (1994, 1998, 1999), which produced coefficients of 0.90, and by Lee et al. (2007), whose coefficients ranged from 0.87 to 0.98. The findings of the present study suggest potential inconsistencies across items in measuring content representative of a defined mathematics domain area. Results from the item level
analysis procedures, discussed in the next section, provide specific details pertaining to inconsistent performance on items representing similar skill areas, which likely impacted the level of internal consistency for the NO, AL, and GE CBM measures. In addition, there is overall inconsistency in performance on all items, as evidenced by the variability in item means (see Tables 3-2 to 3-4). As mentioned earlier, the proportion of digits correct was calculated per item and used to determine the mean proportion of digits correct. The high variability in mean proportion of digits correct also suggests that, overall, homogeneity of items within each measure was low.

Item Level Functioning

Item analysis involves the examination of item difficulty and item discrimination to determine the level of functioning of items within an assessment measure. For item difficulty, indices ranging from 0.30 to 0.70 are considered acceptable (Allen & Yen, 2002). Items with indices above 0.70 are considered very easy and those below 0.30 are considered very difficult. The results of the present study indicated that for each CBM measure, the majority of items produced difficulty indices outside of the acceptable range. For the NO CBM, thirteen items had indices that were below 0.30, indicating that these items were very hard for the students within the sample. Student responses on eleven of the AL CBM items resulted in difficulty indices below the 0.30 threshold, while two items (Items 2 and 6) were above the 0.70 difficulty threshold. Of the sixteen items on the GE CBM, twelve items were outside of the acceptable range for item difficulty. These items were well below the lower end of the cutoff, indicating that for the GE CBM, most of the items were very difficult for the student sample. Comparing the difficulty levels of each measure, the results indicate that a large proportion of items for each measure were very hard.
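The item statistics discussed in this section (item mean, item discrimination as an item total correlation, coefficient alpha, and alpha if item deleted) can be computed from a students by items score matrix. The following Python sketch is a minimal illustration and not the procedure used in this study: the score matrix is hypothetical, population variances are used throughout, and discrimination is computed in its corrected form (each item against the total of the remaining items), which is an assumption.

```python
from statistics import mean, pvariance

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cronbach_alpha(scores):
    """Coefficient alpha for a students-by-items matrix of item scores."""
    k = len(scores[0])
    items = list(zip(*scores))              # one tuple of scores per item
    totals = [sum(row) for row in scores]   # total score per student
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

def item_statistics(scores):
    """Per-item mean, corrected item total correlation, and alpha if deleted."""
    results = []
    for j in range(len(scores[0])):
        item = [row[j] for row in scores]
        rest = [[v for i, v in enumerate(row) if i != j] for row in scores]
        rest_total = [sum(r) for r in rest]
        results.append({
            "mean": mean(item),
            "discrimination": pearson(item, rest_total),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return results

# Hypothetical proportion-of-digits-correct scores: 5 students x 4 items.
scores = [
    [1.0, 1.0, 0.5, 1.0],
    [1.0, 0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

alpha = cronbach_alpha(scores)
stats = item_statistics(scores)
print(round(alpha, 3))
for number, row in enumerate(stats, start=1):
    print(number, {name: round(value, 2) for name, value in row.items()})
```

Alpha if item deleted is simply coefficient alpha recomputed without the item, so an item whose removal raises alpha is one that is inconsistent with the rest of the measure.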
The item level statistics were examined to determine the presence of patterned results. This examination identified patterns in relation to item content and time allowed for completion. The majority of the items on the NO CBM that were very difficult (i.e. produced low difficulty indices) measured skills related to rate/ratio relationships and computation involving fractions. Within this skill area, students were more likely to incorrectly answer items that required computation of fractions without a common denominator (e.g. Items 4, 6, and 9). In addition, the two items that contained decimals (Items 3 and 14) were hard for most students, as evidenced by difficulty indices below the 0.30 standard. Of the seven items with acceptable difficulty indices, three were representative of skills associated with measurement of central tendency and variability of a set of data (i.e. mean, mode, and range). The numerals used within these items were below 35, each series containing at least eight numbers. The NO CBM contained another central tendency and variability item (Item 15); however, most students missed this item. On this item, students had to calculate the mean of a series of test grades. When compared with a similar item (Item 8), Item 15 contained substantially higher numerals, ranging from 45 to 89. This difference in represented numerals may have impacted student performance due to the higher level of item complexity. Item 15 required a substantial amount of double digit addition with sizeable numbers, in comparison to the numbers represented on Item 8. This suggests that numeral combinations have an impact on student ability to successfully answer items, which should be considered in future research involving the development of measures of mathematics proficiency.

On the AL CBM, items associated with the following skills proved to be difficult for students: (1) applying the distributive property of operations to simplify algebraic expressions, and (2) solving algebraic expressions that use letters and numbers.
Students performed higher on items that involved the evaluation of algebraic expressions with specific values for variables, with difficulty indices within the acceptable range for 75 percent (3 of 4) of the items. Performance on items representing the ability to solve problems using equations, such as those in Items 2, 5, 10, 16, and 17, was inconsistent across the student sample. Two of these items (Items 5 and 16) produced difficulty indices within the acceptable range; however, the remaining three did not. These results, along with analysis of the item content, suggest that the grammatical structure and level of computation required contributed to the inconsistency in item difficulty indices. This inconsistency may have also impacted the internal consistency of the AL CBM measure.

Of the four items on the GE CBM that were within the acceptable range for item rectangular prism (Items 2, 12, and 14). Students experienced difficulties with all other skill areas represented on the GE CBM, which may be attributed to the instructional focus in the area of geometry. In other words, the focus of geometry instruction may have been on finding volume, rather than the circumference and area of geometric shapes. This would imply that students receive limited exposure to the geometry content, which would have a direct impact on their performance on the GE CBM.

All items required students to provide written responses, with a time limit of eleven minutes. For the NO CBM and AL CBM, item difficulty indices began to decrease markedly (i.e. items became more difficult) near the end of the measures. This provided some indication that as students approached the end of the measure, they were less likely to answer the item correctly. Item scoring methods did not differentiate between answers that were incorrect and those that were not attempted; therefore, it was not possible to determine the specific impact that the time limit had on student answers near the end of each measure. However, the data trends
near the end of the measures support the hypothesis that administration time impacted overall student performance. Similar CBM research by Foegen (2008) addressed such concerns related to administration time by conducting studies to compare student performance using varying time limits. Administration times were selected by examining the impact of time on reliability and validity coefficients. Significant increases in technical adequacy with increases in administration time were interpreted as an indicator of an appropriate administration time for a measure.

Item discrimination indices, often described as item total correlations, were also examined as a part of the item analysis procedures. The literature indicates that acceptable correlation coefficients are above 0.30 (Nunnally & Bernstein, 1994). The following items had poor discrimination abilities: NO CBM items 16-20; AL CBM items 9, 12, and 19; and GE CBM items 8, 12, and 14-16. Although most of the measure items had acceptable discrimination abilities, these findings must be compared to those related to item difficulty. Item discrimination is a measure of the relationship between performance on an individual item and overall performance or total score. Item discrimination provides a method for analyzing the ability of an item to differentiate between students based on level of knowledge. The findings of this study indicate that although items may have sufficient discrimination abilities, overall student performance on these items might be low. More specifically, items on a measure may differentiate between students' levels of knowledge; however, in general, student performance on the items was poor.

Criterion Validity

As educational reform efforts have evolved to address the growing concerns related to student mathematics achievement, states have adopted standardized assessments to serve as accountability measures of student achievement and school staff (i.e.
teachers and administrators) performance (Kelley, Hosp, & Howell, 2008). CBM measures that are aligned
with state and national curriculum standards can function as a method of predicting student performance, in addition to serving as measures of accountability for school administration and staff (Kelley, Hosp, & Howell, 2008). The criterion measure for this study, the state of Florida standardized mathematics assessment (FCAT 2.0), was developed in accordance with NCTM standards (Florida Department of Education, 2011). In contrast to the CBM measures, the response process for FCAT 2.0 consisted of gridded response and multiple choice, with an administration time of 140 minutes.

Correlations between the CBM measures and FCAT 2.0 ranged from 0.68 to 0.78, indicating a moderate relationship between student performance on each assessment measure. For criterion validity, the correlations obtained in previous CBM development studies at the elementary level ranged from 0.50 to 0.90 (Fuchs et al., 1994, 1998, 1999; Lee et al., 2007), while correlations for studies at the middle school level have been lower, ranging from 0.30 to 0.79 (Foegen & Olsen, 2007). Criterion validity coefficients from past studies indicate that differences in the range of coefficients exist between elementary and middle school levels. This suggests that increases in the cognitive demands of content may impact the correlation between CBM and standardized assessment measures. The results from the present study were within the range of previous findings for criterion validity; however, item response pattern and administration time may have impacted the criterion validity coefficients for the present study.

The item response pattern of the NO, AL, and GE CBM measures was open response, while the FCAT 2.0 included multiple choice and gridded response methods. Multiple answer, with the chance of selecting the appropriate answer without fully understanding the content within each item. The nature of multiple choice items allows for a greater chance of
getting the problem correct in comparison to open response items, likely resulting in higher performance scores. As previously mentioned, administration time for the CBM measures was 11 minutes, while the FCAT 2.0 has a limit of 140 minutes for 44 items. The difference in administration times results in substantial differences in the amount of time per item that students have to solve each problem. This, combined with the differences in response pattern, could enhance student performance on the FCAT 2.0 when compared with the CBM measures.

Factor Analysis

Factor analysis procedures were used to determine the extent to which the three CBM measures represent a single construct of mathematics achievement. Four item parcels were created for each of the CBM measures as a method of conducting the factor analysis. Based on the goodness of fit indices and factor loadings, the single factor model did not adequately fit the CBM data (see Tables 3-8 and 3-10). Because the CBM measures represent different mathematical domain areas (i.e. Number Operations, Algebra, and Geometry), it was determined that exploring a three factor model structure was appropriate. Factor loadings and goodness of fit indices indicated that the three factor model fits the CBM data moderately well; however, correlations between the factors were relatively large, ranging from 0.83 to 0.93 (see Table 3-11). Based on the NCTM (2006) and CCSS (2010) identification of the three domain areas associated with sixth grade mathematics achievement, the three factor model was not consistent with the original factor structure hypothesis. NCTM (2006) and CCSS (2010) identify number operations, algebra, and geometry as separate domain areas that represent the sixth grade curriculum, which suggests that although these areas are different, they represent one common factor: sixth grade mathematics achievement. These findings indicate that although the three
factor model fits the data better than the single factor model, the factors are not distinct, as evidenced by strong factor correlations. More specifically, findings suggest that the three CBM measures are comprised of some unique skills; however, performance on these skills is correlated with performance on the other measures. The three factor structure provides support for the NCTM (2006) and CCSS (2010) identification of three separate domain areas, in addition to the creation of three separate CBM measures that evaluate each of these areas. However, the factor correlation results call into question the necessity of distinct measures for each domain. In addition, the pattern of factor correlations was consistent with correlations between FCAT 2.0 content areas, which ranged from 0.65 to 0.81. The FCAT 2.0 is comprised of items representative of the three domain areas; however, it is a single assessment measure of mathematics achievement.

Previous CBM development research for middle school has not examined the factor structure representative of all measures; instead, researchers have consistently developed and evaluated only the technical adequacy of separate, domain or skill specific measures. Therefore, the strong level of factor correlations and the limited research on the factor structure of CBM measures provide support for future research that compares the technical adequacy of single assessment measures and domain specific measures.

Limitations

It is important to examine the limitations of the study methodology to determine the generalizability of study findings and inform the direction of future research. As a component of the construct validity of the measures, the domain skill and item reviewers were solicited from the two selected schools. Students' learning potential and performance on assessment measures such as CBM may be affected by the level of instructional exposure to the content areas and the quality of instruction provided.
Although the teachers were selected to participate based on their
expertise and experience with middle school mathematics, there was not a formal criterion for selection. In addition, limited information was available on the effec instructional practices across the three domain areas, each of which may have impacted the item review process and student performance.

size for factor analysis procedures; however, it is suggested that samples should consist of 100 participants per 10 items (Everitt, 1975). In this case, the suggested sample size would be at least 200 students. The current study had a sample size of only 135 students due to the availability of mathematics classrooms and teachers willing to participate in the study. Having a larger sample size may have resulted in estimates of factor loadings that were more precise and reliable. In addition to sample size, the schools included in the study were located in North Central Florida and were selected based on convenience, therefore limiting the sample to a subpopulation of the state of Florida. This geographical limitation impacts the generalizability of the findings to other states or countries.

The administration of the CBM measures took place on one school day during the end of the academic year, a time when student motivation is often low. Motivation levels, a potential CBM administration as evidenced by the decline in student performance on items at the end of each measure. Although administration was completed on one day to meet the requests of classroom teachers, this may have caused student fatigue, potentially impacting the level of student performance.

Implications for Future Research

The present study contributes to CBM development research for middle school mathematics by creating measures aligned with national mathematics curricular standards and examining the technical adequacy of each measure. Findings from the present study show that at
the item level, student performance was low across a substantial proportion of the items on each measure, as evidenced by the item difficulty indices. In an effort to address this finding, future research might consist of a systematic review of each item to determine components requiring revision, such as item content, formatting, and phrasing. Due to the complex nature of determining the specific areas of revision needed for each item, numerous rounds of test specification and item review may be necessary to identify items for further piloting with students.

In addition to examining item content for areas needing revision, future research may seek to incorporate more detailed observation and analysis of the instructional sequence across each classroom. In the present study, item reviewers indicated that all skills represented on the test specifications were key content skills within their classrooms; however, student performance was low on a substantial number of items. In order to expand on the findings of the present study, future research might gather information on the frequency and duration of instruction for each mathematics domain and skill to inform the interpretation of item analysis results.

The administration time of 11 minutes may have impacted student performance, as evidenced by the decline in student performance near the end of each measure. Future research might examine the appropriateness of the 11 minute administration time by using methods such as those performed by Foegen, Olson, and Impecoven-Lind (2008) in a study examining the use of Algebra CBM at the secondary school level. Based on previous CBM research, a baseline administration time of three minutes was used for the measures. After the initial three minutes Students were instructed to continue this method of marking, but instead of three minutes students were given an additional one minute each time. Statistical analysis procedures were
conducted to examine the impact of time on the technical adequacy of each measure. By incorporating a similar strategy, future research may determine the impact of administration time on student performance to provide empirical support for an appropriate administration time for each measure.

Results from the factor analysis procedures indicated that although the three factor model representing the content domains of Number Operations, Algebra, and Geometry adequately fit the data, the factors were highly correlated. This suggests that each content domain has some unique skill components; however, performance in one area is strongly related to that in the other areas. Based on these findings, future research may examine the use of one CBM measure for sixth grade mathematics that incorporates skills from each of the three content domains. Findings could be compared to those obtained using three separate measures to determine if significant differences in results are obtained.
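The administration time comparison raised in this chapter (11 minutes for each CBM measure against 140 minutes for the 44 FCAT 2.0 items) can be expressed as time available per item. The following Python sketch is a worked illustration only. The item counts of 20 for the NO CBM and AL CBM and 16 for the GE CBM are taken from the probes in Appendix B and from Table 3-4; the function name is hypothetical.

```python
def seconds_per_item(minutes, n_items):
    """Average time available per item, in seconds."""
    return minutes * 60 / n_items

fcat = seconds_per_item(140, 44)     # FCAT 2.0: 140 minutes, 44 items
cbm_20 = seconds_per_item(11, 20)    # NO CBM and AL CBM: 11 minutes, 20 items
cbm_16 = seconds_per_item(11, 16)    # GE CBM: 11 minutes, 16 items

print(f"FCAT 2.0: {fcat:.0f} s per item")
print(f"NO/AL CBM: {cbm_20:.0f} s per item")
print(f"GE CBM: {cbm_16:.2f} s per item")
print(f"FCAT 2.0 allows about {fcat / cbm_20:.1f} times as long per item as the 20-item probes")
```

These figures are averages and say nothing about how students actually distributed their time, which is one reason the timed-interval method described above would be informative.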
APPENDIX A
TEST SPECIFICATIONS

A-1. Test Specifications for NO CBM

Skill: Apply and extend previous understandings of multiplication and division to fluently multiply and divide fractions by fractions.
NGSSS: M.A.6.A.1.3, M.A.6.A.1.2
CCSS: 6.NS.1
Sample problems:
1. Solve: (2/4)/(1/3)
2. A recipe for brownies calls for 2/3 cups of sugar. You want to make two batches of brownies. How much sugar will you need?
3. Matt wants to make punch for his class party. The punch recipe calls for cup of orange juice per serving. How much orange juice does Matt need to make 16 servings of punch?
4. How much candy would each person get if 4 people share a lb. bag of candy? Each person will get an equal amount of candy.

Skill: Understand the concept of a unit rate and use rate language in the context of a ratio relationship. Interpret and compare rates and fractions.
NGSSS: M.A.6.A.2.2
CCSS: 6.RP.1, 6.RP.2
Sample problems:
1. Misty found two recipes for cookies. One calls for a cup of semi-sweet chocolate chips, while the other recipe calls for 2/3 a cup more of chocolate chips. How many cups of chocolate chips does the 2nd recipe need?
2. farm?
3. Your class is going on a field trip to the zoo. Your school requires 1 adult for every 13 students to go on the trip. If there are 117 students on the trip, how many adults should go also?
4. A soup recipe makes 96 cups of chili to serve 72 people. You want to serve 12 people. How many cups of chili should you make?

Skill: Connect ratio and rate to multiplication and division
NGSSS: M.A.6.A.2.1
CCSS: 6.RP.3
Sample problems:
1. Natalie and Tom paid $24 for 3 t-shirts. If each shirt is the same price, how much did each shirt cost?
2. Three movie tickets cost $25.00. How much would 7 movie tickets cost?
3. If it took 3 hours to plant 4 trees, then at that rate, how many trees can be planted in 36 hours?
4. Wal-Mart has an 8 pack of paper towels for $7.82 or a 12 pack for $10.44. What is the difference in price per roll?

Skill: Add, subtract, multiply, and divide multi digit numbers using standard algorithm
NGSSS: M.A.6.A.3.5
CCSS: 6.NS.2, 6.NS.3
Sample problems:
1. Solve 8 3 3+1 =
2. Solve 8.1 8.2 2.0 =
3. Solve: (4 +1.2) 3 2
4. 5 ( 1) 4 =

Skill: Compute measures of central tendency and variability
NGSSS: M.A.6.S.6.1, M.A.6.S.6.2
CCSS: 6.SP.2, 6.SP.3
Sample problems:
1. What is the mode of the following data set: 12 4 11 12 4 8 12
2. Lynn needs to calculate average test scores for one of her students. The student has the following test scores: 45 70
3. What is the range of the following data set: 9 13 5 8 11 7 3 5 4
4. What is the mean of the following data set: 11 7 33 11 23 11
A-2. Test Specifications for AL CBM

Skill: Evaluate and solve expressions that use letters for numbers
NGSSS: M.A.6.A.3.2, M.A.6.A.3.4
CCSS: 6.EE.1, 6.EE.2
Sample problems:
1. 24 y=12
2. 2z+ 14 3z =0
3. 3 b +2.4 = 4.5
4. 2.7 b = 54
5. 2z+14 3z=0

Skill: Apply the properties of operations to generate equivalent expressions
NGSSS: M.A.6.A.3.5
CCSS: 6.EE.3
Sample problems:
1. Simplify 3(2 +x) 4x
2. Simplify: 12 (3z + 1)
3. Simplify: + 12a (3 6)
4. Simplify: 2 (a+2b) 6b a
5. Simplify: (2x 4) 5 + (6 3)

Skill: Solve real world problems using equations in the form of x+p=q and px=q
NGSSS: MA.6.A.3.2
CCSS: 6.EE.7
Sample problems:
1. Roger is 13 years old. His sister, Mary, is three times his age. How old is Mary?
2. Misty is the age of her brother Larry. Mary is 33. How old is Larry?
3. A lighting fixture with a support and 4 track lights weighs 3.4 pounds. The support weighs 0.6 pound. What is the weight (w) of each track light?
4. Meghan weighs three times as much as her little sister, Amy. Amy is 85 lbs. How much does Meghan weigh?
5. A pine tree is 2 feet taller than 10 times the height of an oak tree. If an oak tree is 15 ft., how tall is the pine tree?

Skill: Evaluate expressions with specific values for variables
NGSSS: M.A.6.A.3.2
CCSS: 6.EE.2
Sample problems:
1. Evaluate 2z+ 3 w 2 when z=2 and w=3
2. Evaluate: w = 5 v +20 when w is 25, what is the value of v?
3. Evaluate: 15 x = a (5+10) +10 x when a is 2, what is the value of x?
4. Evaluate: ( t +2b) (10 b 3 t ) when t is 3 and b is 2.
5. Evaluate: 10+ 5 k when k= 4
A-3. Test Specifications for GE CBM

Skill: Find the area of triangles, polygons, quadrilaterals, and circles.
NGSSS: M.A.6.G.4.2, M.A.6.G.4.3
CCSS: 6.G.1
Sample problems:
1. What is the area of the right triangle with side lengths: 3ft, 4ft, and 5ft.?
2. Kathy and Matt want to put down new tile on the bathroom floor. What is the surface area of the floor?
3. Find the area of the circle with a diameter of 12. Use 3.14 for π.
[Figure labels: 3, 5, 7, 9, 4]

1. If the area is 96cm 3 what is the length of x?

Skill: Find the circumference of circles
NGSSS: M.A.6.G.4.1
CCSS: 6.EE.2
Sample problems:
1. What is the circumference of a circle with a diameter of 18?
2. Dexter wants to create a globe just like the one his teacher has in science class. He needs to know the circumference of her globe. Based on the figure below, what have?
3. A pet store has a circular fish bowl with a radius of 8cm. What is the circumference of the fish bowl?
4. Find the circumference.
A-4. Item Reviewer Rating Scale

Number Operations CBM Rating Form

Using the Table of skills and CBM probe items, rate by checking

Apply and extend previous understandings of multiplication and division to fluently multiply and divide fractions by fractions.
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Understand the concept of a unit rate and use rate language in the context of a ratio relationship. Interpret and compare rates and fractions.
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Connect ratio and rate to multiplication and division
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Add, subtract, multiply, and divide multi digit numbers using standard algorithm
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Compute measures of central tendency and variability
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Algebra CBM Rating Form

Evaluate and solve expressions that use letters for numbers
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No
Item 5   Yes   No

Apply the properties of operations to generate equivalent expressions
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No
Item 5   Yes   No

Solve real world problems using equations in the form of x+p=q and px=q
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No
Item 5   Yes   No

Evaluate expressions with specific values for variables
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No
Item 5   Yes   No

Geometry CBM Rating Form

probe item accurately measures

Find the area of triangles, polygons, quadrilaterals, and circles.
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Find the circumference of circles
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Find the volume of a rectangular prism using the formula: Volume = length x width x height
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No

Find the perimeter of two dimensional, complex figures
Item 1   Yes   No
Item 2   Yes   No
Item 3   Yes   No
Item 4   Yes   No
APPENDIX B
CBM PROBES

Fill in your information below. Wait for further instructions.

First Name: ______________________________
Last Name: ______________________________
Age: _______________
Check one: Male _____________
Teacher Name: __________________________
Class Period: __________________
ID #____________

Directions: Complete as many of the math problems as you can before I say stop. Work across the page.

Problems for Number Operations CBM Probe (NO CBM):

1). A recipe for brownies calls for cups of sugar. You want to make two batches of brownies. How much sugar will you need?
2). What is the range of the following data: 9, 13, 5, 8, 11, 7, 3, 5, 4
3). Solve:
4). Four people want to share a bag of candy equally. The bag weighs lb. How much candy will each person receive?
5). Solve:
6). Misty found two recipes for cookies. One calls for cup of semi-sweet chocolate chips, while the other recipe calls for cup more of chocolate chips. How many cups of chocolate chips does the 2nd recipe need?
7). It took 3 hours to plant 4 trees. If you are planting trees at the same rate, how many trees would you plant in 36 hours?
8). What is the mean of the following data: 11, 7, 33, 11, 23, 11
9). Matt wants to make punch for his class party. The punch recipe calls for cup of orange juice per serving. How much orange juice does Matt need to make 16 servings?
10). Natalie and Tom paid $24 for 3 t-shirts. How much did each t-shirt cost?
11). Solve:
12). What is the mode of the following data: 12, 4, 6, 11, 12, 4, 8, 12
13). A soup recipe makes 96 cups of chili to serve 72 people. You want to serve 12 people. How many cups of chili should you make?
14). Solve:
15). Lynn needs to calculate the mean test score for one of her students. The student has the following test scores: 45, 70, 55, 10, 78, 89 mean test score?
16). There are 77 total The ratio of female cows and male cows is 7:4. How many male cows are at
17). Wal-Mart has an 8 pack of paper towels for $7.82. How much would a 12 pack cost?
18). Your class is going on a field trip to the zoo. Your school requires that 1 adult for every 13 students go on the trip. If there are 117 students on the trip, how many adults must go?
19). Solve: 16 (2) 2 =
20). Three movie tickets cost $25.00. How much would 7 movie tickets cost?

******* STOP!!!!!!
ID #____________

Directions: Complete as many of the math problems as you can before I say stop. Work across the page.

Sample Problems for Algebra CBM Probe (AL CBM):

1). Simplify:
2). Roger is 13 years old. His sister, Mary, is three times his age. How old is Mary?
3). Solve: when =2 and =1
4). Solve: 3b + 3 = 4
5). Misty is the age of her brother Larry. Misty is 34. How old is Larry?
6). Solve:
7). Simplify:
8). Solve: when k = 6
9). Simplify:
10). A lighting fixture with a support and 4 track lights weighs 3.4 pounds. The support weighs 0.4 pound. Find the weight w of each track light.
11). Solve: When w is 25, what is the value of v?
12). Solve: 2z + 15 + 3z = 0
13). Evaluate: When a is 2, what is the value of ?
14). Solve: 2b = 54
15). Simplify:
16). Meghan weighs three times as much as her little sister, Amy. Amy is 85 lbs. How much does Meghan weigh?
17). A pine tree is 2 feet taller than 10 times the height of an oak tree. If an oak tree is 15 ft, how tall is the pine tree?
18). Solve: when t is 3 and b is 2
19). Solve: 3z + 14 2z = 0
20). Simplify:

******** STOP!!!!!
ID #____________

Directions: Complete as many of the math problems as you can before I say stop. Work across the page.

Problems for Geometry CBM Probe (Geo CBM):

1). What is the circumference of a circle with a diameter of 18? C = πd
2). A family wants to fill their new swimming pool with water. The pool is 20 ft long, 16 ft wide, and 6 ft deep. Calculate the volume of the pool to determine how much water will be needed to fill the pool to the top. V = lwh
3). Kathy and Matt want to put down new tile on the bathroom floor. The length is 7 ft and height is 9 ft. What is the surface area of the floor? A = bh
[Figure labels: 7, 9]
4). Find the area of the circle. Use 3.14 for π. Diameter is 12. A = πr²
5). Find the perimeter. P = sum of all sides
6). The length (l) of the tank is 4 ft and the width (w) is 3 ft. The volume of the tank is 94 ft³. What is the height (h) of the tank? V = lwh
[Figure label: 12]
7). A pet store has a circular fish bowl with a radius of 8 cm. What is the circumference of the fish bowl? Use 3.14 for π. C = πd
8). Calculate the perimeter of the shaded region. P = sum of all sides
9). Find the perimeter. P = sum of all sides
10). Find the area. A = lw
11). Dexter wants to create a globe just like the one his teacher has in history class. He needs to know the circumference of her globe. Based on the figure below, what should be the Use 3.14 for π. C = πd
12). Find the volume. V = lwh; l = 6, w = 2, h = 2
[Figure label: 9 in.]
13). Find the circumference. Use 3.14 for π. C = πd
14). Find the volume. V = lwh
15). If the area is 96 cm², what is the length of x? A =
16). Find the perimeter. P = sum of all sides
APPENDIX C
ITEM ANALYSIS

C-1. Number operations (NO CBM) using dichotomous scoring.

Item     Item standard deviation   Item discrimination   Item difficulty   Alpha if item deleted
NO 1     0.50                      0.52                  0.47              0.82
NO 2     0.50                      0.34                  0.49              0.83
NO 3     0.37                      0.48                  0.16              0.82
NO 4     0.39                      0.31                  0.19              0.83
NO 5     0.50                      0.41                  0.45              0.82
NO 6     0.38                      0.50                  0.17              0.82
NO 7     0.43                      0.56                  0.24              0.81
NO 8     0.47                      0.53                  0.32              0.82
NO 9     0.44                      0.48                  0.25              0.82
NO 10    0.50                      0.67                  0.54              0.81
NO 11    0.50                      0.53                  0.50              0.82
NO 12    0.49                      0.60                  0.40              0.81
NO 13    0.24                      0.47                  0.06              0.82
NO 14    0.22                      0.33                  0.05              0.83
NO 15    0.12                      0.19                  0.01              0.83
NO 16    0.00                      -                     0.00              -
NO 17    0.09                      0.01                  0.01              0.83
NO 18    0.15                      0.24                  0.02              0.83
NO 19    0.24                      0.13                  0.06              0.83
NO 20    0.09                      0.26                  0.01              0.83
C-2. Algebra (Alg CBM) item analysis using dichotomous scoring.

Item     Item standard deviation   Item discrimination   Item difficulty   Alpha if item deleted
Al 1     0.25                      0.29                  0.07              0.81
Al 2     0.41                      0.43                  0.78              0.81
Al 3     0.49                      0.51                  0.41              0.80
Al 4     0.30                      0.31                  0.10              0.81
Al 5     0.48                      0.51                  0.37              0.80
Al 6     0.41                      0.40                  0.78              0.81
Al 7     0.09                      0.27                  0.01              0.82
Al 8     0.47                      0.37                  0.69              0.81
Al 9     0.15                      0.10                  0.02              0.82
Al 10    0.31                      0.39                  0.10              0.81
Al 11    0.50                      0.59                  0.51              0.80
Al 12    0.15                      0.25                  0.02              0.82
Al 13    0.45                      0.50                  0.28              0.80
Al 14    0.50                      0.60                  0.48              0.80
Al 15    0.09                      0.27                  0.01              0.82
Al 16    0.48                      0.44                  0.37              0.81
Al 17    0.42                      0.53                  0.23              0.80
Al 18    0.32                      0.41                  0.11              0.81
Al 19    0.09                      0.27                  0.01              0.82
Al 20    0.09                      0.27                  0.01              0.82
C-3. Geometry (Geo CBM) item analysis using dichotomous scoring.

Item     Item standard deviation   Item discrimination   Item difficulty   Alpha if item deleted
Geo 1    0.43                      0.44                  0.24              0.77
Geo 2    0.49                      0.44                  0.42              0.77
Geo 3    0.46                      0.45                  0.69              0.77
Geo 4    0.33                      0.46                  0.13              0.77
Geo 5    0.37                      0.42                  0.16              0.77
Geo 6    0.29                      0.45                  0.09              0.77
Geo 7    0.36                      0.57                  0.16              0.76
Geo 8    0.21                      0.30                  0.04              0.78
Geo 9    0.33                      0.50                  0.13              0.78
Geo 10   0.32                      0.34                  0.11              0.78
Geo 11   0.29                      0.60                  0.09              0.76
Geo 12   0.50                      0.23                  0.45              0.79
Geo 13   0.25                      0.32                  0.07              0.78
Geo 14   0.47                      0.27                  0.33              0.79
Geo 15   0.22                      0.29                  0.05              0.78
Geo 16   0.30                      0.33                  0.10              0.78
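The item statistics reported in this appendix can be illustrated with a short computational sketch. The code below is not the analysis procedure used in this study; it is a minimal example that assumes dichotomous (0/1) scores arranged with one row per student, treats item difficulty as the proportion of students answering correctly, and treats item discrimination as the corrected item-total correlation (the correlation between an item and the total of the remaining items). That definition of discrimination, and the function names `item_analysis` and `cronbach_alpha`, are assumptions made for illustration only.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    # Population variance (divide by n); for a 0/1 item this equals p * (1 - p).
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation(x, y):
    # Pearson correlation; returns None when either variable has zero variance.
    vx, vy = variance(x), variance(y)
    if vx == 0 or vy == 0:
        return None
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / sqrt(vx * vy)


def cronbach_alpha(rows):
    # rows: one list of 0/1 item scores per student.
    k = len(rows[0])
    total_var = variance([sum(r) for r in rows])
    if k < 2 or total_var == 0:
        return None
    item_var_sum = sum(variance(col) for col in zip(*rows))
    return k / (k - 1) * (1 - item_var_sum / total_var)


def item_analysis(rows):
    # Returns one dictionary of statistics per item (column).
    results = []
    for j in range(len(rows[0])):
        item = [r[j] for r in rows]
        rest = [r[:j] + r[j + 1:] for r in rows]  # scores with item j removed
        rest_totals = [sum(r) for r in rest]
        results.append({
            "sd": sqrt(variance(item)),
            "discrimination": correlation(item, rest_totals),  # assumed definition
            "difficulty": mean(item),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return results


if __name__ == "__main__":
    # Hypothetical scores for five students on three items (not study data).
    example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
    for number, stats in enumerate(item_analysis(example), start=1):
        print(number, stats)
```

Under dichotomous scoring, the standard deviation of an item is determined by its difficulty p as the square root of p(1 - p). For example, a difficulty of 0.47 gives a standard deviation of about 0.50, which agrees with the first row of the number operations table. An item that no student answers correctly has zero variance, so its correlation with the remaining items is undefined; the sketch returns None in that case.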
BIOGRAPHICAL SKETCH

Angela Denise Dobbins was born in Moundville, Alabama. She received a Bachelor of Arts in psychology from Vanderbilt University in 2005, Master of Education with a major in human development counseling from Vanderbilt University in 2007, and Master of Education with a major in school psychology from the University of Florida in 2012. She was awarded the Doctoral Enhancement Scholarship and Doctoral Retention Scholarship from 2008 to 2013. She served as Research Assistant in the Department of Special Education, School Psychology, and Early Childhood Studies at the University of Florida and the Department of Special Education at Vanderbilt University.