THE PROJECTION OF SOCIAL TRENDS USING TIME SERIES INDICATORS:
METHODOLOGY AND APPLICATION IN EDUCATIONAL PLANNING

By
JANE COUNIHAN NELSON

A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1977

Copyright 1977 by Jane Counihan Nelson

ACKNOWLEDGMENTS

I wish to express my sincere appreciation to my Supervisory Committee: to Dr. Michael Y. Nunnery, chairman, for his candor and understanding whether acting as counselor or critic; to Dr. Phillip A. Clark and Dr. Gordon D. Lawrence, for providing valuable suggestions and support.

A special thank you goes to Dr. Arthur J. Lewis, director of the DOE social forecasting project, for his leadership, receptivity to new ideas, and especially for his confidence in me; and to my colleagues on the project, Dr. Robert S. Soar and Ms. Linda Troup, for their substantial contribution to the conceptual framework presented in this study.

For carefully reviewing the statistical portions of this manuscript, I am indebted to my friend, Dr. Azza S. Guertin.

I am especially grateful to my husband Edward for his love and encouragement and for sharing with me his enthusiasm for scientific inquiry.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER I INTRODUCTION
  Background and Significance of the Study
    The Social Context of Education
    The Futures Perspective in Educational Planning
    Forecasting Trends in Social Variables
    The Need for Research
  The Problem
  Delimitations and Limitations
  Definition of Terms
  Procedures
    The Selection and Operational Definition of Variables
    Collection of Time Series Indicator Data
    Comparison of Extrapolative Methods Using Time Series Indicators
    Development of Implications for Educational Planning

CHAPTER II RATIONALE FOR SELECTION OF VARIABLES/TIME SERIES INDICATORS
  The Social Indicator Movement
    Historical Development
    Definition and Use of Social Indicators
    Data Base for Social Indicators
    Educational Implications
  Selection of Variables/Time Series Indicators
    The Variables
    Bronfenbrenner's Ecology of Education Model
    Operational Definition of Variables as Time Series Indicators

CHAPTER III RATIONALE FOR EXTRAPOLATIVE METHODS SELECTED FOR COMPARISON
  Overview of Extrapolative Forecasting Methods
    Economic and Business Forecasting
    Technological Forecasting
    Educational Forecasting
    Extrapolative Methods in Other Areas
  Applicability of Reviewed Extrapolative Methods for Study
    The Pattern of the Data
    The Class of Model
  Description of Methods to be Compared
    The General Linear Model
    The Assumptions of the Linear Model
    Criteria for Comparison of Methods
    Method 1: Simple Linear Regression
    Method 2: Loglinear Regression
    Method 3: Polynomial Regression

CHAPTER IV COMPARISON OF EXTRAPOLATIVE METHODS USING SELECTED TIME SERIES INDICATORS
  Presentation of Results
    Indicator 1
    Indicator 2
    Indicator 3
    Indicator 4
    Indicator 5
    Indicator 6
    Indicator 7
    Indicator 8

CHAPTER V DISCUSSION
  The Variables
    Selection
    Bronfenbrenner's Ecology of Education Model
    Operational Definition of Variables
  The Extrapolative Methods
    Statistical Considerations
    Practical Considerations

CHAPTER VI SUMMARY, CONCLUSIONS, AND IMPLICATIONS OF STUDY
  Summary
    The Variables
    The Methods
    Results
  Conclusions
  Suggestions for Future Research
  Implications for Planners and Policy Makers

APPENDIX
REFERENCE NOTES
REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table 1   Time Series Indicators of Social Variables Affecting Outcomes of Education
Table 2   Indicator 1: Summary Statistics for Prediction Equations by Method
Table 3   Indicator 1: Observed Y's and Predicted Y's by Method
Table 4   Indicator 2: Summary Statistics for Prediction Equations by Method
Table 5   Indicator 2: Observed Y's and Predicted Y's by Method
Table 6   Indicator 3: Summary Statistics for Prediction Equations by Method
Table 7   Indicator 3: Observed Y's and Predicted Y's by Method
Table 8   Indicator 4: Summary Statistics for Prediction Equations by Method
Table 9   Indicator 4: Observed Y's and Predicted Y's by Method
Table 10  Indicator 5: Summary Statistics for Prediction Equations by Method
Table 11  Indicator 5: Observed Y's and Predicted Y's by Method
Table 12  Indicator 6: Summary Statistics for Prediction Equations by Method
Table 13  Indicator 6: Observed Y's and Predicted Y's by Method
Table 14  Indicator 7: Summary Statistics for Prediction Equations by Method
Table 15  Indicator 7: Observed Y's and Predicted Y's by Method
Table 16  Indicator 8: Summary Statistics for Prediction Equations by Method
Table 17  Indicator 8: Observed Y's and Predicted Y's by Method
Table 18  Indicator 1: ANOVA Summary Tables by Method
Table 19  Indicator 2: ANOVA Summary Tables by Method
Table 20  Indicator 3: ANOVA Summary Tables by Method
Table 21  Indicator 4: ANOVA Summary Tables by Method
Table 22  Indicator 5: ANOVA Summary Tables by Method
Table 23  Indicator 6: ANOVA Summary Tables by Method
Table 24  Indicator 7: ANOVA Summary Tables by Method
Table 25  Indicator 8: ANOVA Summary Tables by Method

LIST OF FIGURES

Figure 1  Bronfenbrenner's ecological structure of the educational environment. (Based upon Bronfenbrenner's [1976] description, pp. 5-6)
Figure 2  Indicator 1: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 3  Indicator 2: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 4  Indicator 3: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 5  Indicator 4: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 6  Indicator 5: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 7  Indicator 6: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 8  Indicator 7: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)
Figure 9  Indicator 8: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation)

Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

THE PROJECTION OF SOCIAL TRENDS USING TIME SERIES INDICATORS:
METHODOLOGY AND APPLICATION IN EDUCATIONAL PLANNING

By
Jane Counihan Nelson
December 1977

Chairman: Michael Y. Nunnery
Major Department: Educational Administration

Educational planners and policy makers need adequate information about the societal context of education to make appropriate decisions about the future role and function of education. Some of this information may be provided through the use of conceptually sound social and educational variables operationally defined as time series indicators coupled with an empirically sound basis for forecasting future trends in such indicators.
As evidence of the need for developing such a social forecasting framework for education, states including Florida have provided grants for that purpose. This study was one aspect of such a grant.

The problem in this study was (a) to select, using Bronfenbrenner's ecology of education model, and operationally define at least 10 variables that research has shown to be related to the outcomes of education; (b) to use these variables operationally defined as time series indicators in the comparison of three purely extrapolative forecasting methods; and (c) to derive implications for the use of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative methods for educational planning.

The study was conducted in the following phases:

1. Using Bronfenbrenner's ecology of education model, 10 variables that research has shown to be related to the outcomes of education were selected and, where possible, were operationally defined as state and/or national time series indicators. Data were collected for these indicators; eight which met the criteria established in this study were used in the comparison of extrapolative techniques.

2. Three purely extrapolative techniques derived from the general linear model were compared according to statistical criteria and practical considerations derived from the literature in statistics, economics, time series analysis, and forecasting methodology. The methods were (a) linear regression, (b) curvilinear regression (quadratic and cubic forms), and (c) loglinear regression (dependent variable undergoes logarithmic transformation). Each method was applied to each time series indicator. Time in years was used as the independent variable; the annual measure of the indicator was treated as the dependent variable. Each data set was divided into thirds; two-thirds of the data points were used to establish the prediction equation. This equation was used to predict the remaining third of the data points.
Predicted values were compared with actual values.

3. Implications for the use in educational planning of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative techniques were discussed.

Results of the method comparison were (a) no method was a superior predictor for all indicators; (b) each method was a superior predictor for at least one indicator; and (c) the summary statistics for the original regression were not consistently related to the accuracy of the extrapolated values.

The following conclusions appear to be warranted by the results of this study:

1. The Bronfenbrenner model is a useful framework for considering the numerous factors impinging upon the learner.

2. Time series indicators provide a means to compare trends in an indicator over time or to compare different groups in relation to a specific indicator.

3. The general linear model is appropriate for the analysis and extrapolation of the selected time series indicators used in this study.

4. Each method is appropriate for use with some indicators but not with others. Measures of "best fit" such as r² and the standard error of estimate are not reliable criteria for the selection of an extrapolative method. A combination of strategies such as graphic representation of original and predicted data, analysis of residuals, and knowledge of the social phenomena being studied may provide guidance as to the most appropriate method for a particular indicator.

CHAPTER I
INTRODUCTION

Background and Significance of the Study

The Social Context of Education

Educators have become increasingly cognizant of the myriad forces in society impinging upon various facets of the educational process. The influence of a number of these forces upon educational purposes, outcomes, and resources has been analyzed from several social science perspectives (Boocock, 1976; Henry, 1961; Gordon, 1974).
Keppel (in Thomas & Larson, 1976) acknowledged one of the reasons for this continuing interest in societal trends by educational planners and policymakers:

    the impetus for change in educational institutions, from the preschool through the university, is more likely to derive from changes in the wider society than from forces within the institutions. (Foreword)

Additionally, Keppel noted that "educational policy must be formed in concert with other aspects of public policy and program development" (Foreword).

Bronfenbrenner (1976) proposed an ecological structure of the educational environment which must be taken into account if "any progress in the scientific study of educational systems and processes" (p. 5) is to be made. Bronfenbrenner stated:

    Whether and how people learn is a function of sets of forces, or systems, at two levels:
    a. The first comprises the relations between characteristics of learners and their surroundings in which they live out their lives (e.g., home, school, peer group, work place, neighborhood, community).
    b. The second encompasses the relations and interconnections that exist between these environments. (p. 5)

Building on Lewin's theory of topological territories and employing a terminology adapted from Brim (1975), Bronfenbrenner further elaborated that the construct environment can be "conceived topologically as a nested arrangement of structures, each contained within the next" (p. 5).

    1) A microsystem is an immediate setting containing the learner ...
    2) The mesosystem comprises the interrelationships among the major settings containing the learner at a particular point in his or her life--a system of microsystems.
    3) The exosystem is an extension of the mesosystem embracing the concrete social structures, both formal and informal, that impinge upon or encompass the immediate settings containing the learner and, thereby, influence and even determine or delimit what goes on there.
    These structures include the major institutions of society, both deliberately structured and spontaneously evolving, as they operate at the local community level ...
    4) Macrosystems are the overarching institutions of the culture or subculture, such as the economic, social, educational, legal and political systems, of which local micro-, meso-, and exosystems are the concrete manifestations. (pp. 5-6)

(See Figure 1 for a representation of these ideas.)

Figure 1. Bronfenbrenner's ecological structure of the educational environment. (Based upon Bronfenbrenner's [1976] description, pp. 5-6.)

The Futures* Perspective in Educational Planning

While Bronfenbrenner proposed his ecological structure primarily as a framework for learning research efforts, that is, for examining relationships among variables associated with learning, others (Harman, 1976; Webster, 1976) have discussed the societal context of education as a framework for future-oriented educational planning. Indeed, this emphasis on future awareness has evolved into a significant movement within education referred to as educational futurism (Hencley & Yates, 1974; Pulliam & Bowman, 1974), educational futures (Marien & Ziegler, 1972), or alternative futures perspective (Webster, 1976).

The primary purpose of futures research or future studies is "to help policy makers choose wisely--in terms of their purposes and values--among alternative courses of action that are open to leadership at a given time" (Shane, 1973, p. 1). According to Webster (1976), this requires

    that we attend to alternatives--to alternative assumptions, ends and means.
    It requires us to examine alternative plausible futures that might be rendered more or less possible by our planning and action; to identify unintended as well as intended consequences for others of achieving the goals that seem desirable to us; to analyze alternative strategies and tactics for achieving any desired future; and to anticipate the variety of potential consequences of our strategies, tactics, and short-run planning. Perhaps, most fundamentally it asks of us that we look hard at our basic premises about the nature of man and the world and consider implications and alternatives for the future. (p. 2)

*"Futures" refers to the number of different possible views of what is ahead in subsequent time periods for society and, thus, for education.

Webster also noted:

    the futures perspective implies that we not just attend to alternatives in and for education, but also consider the societal context in more comprehensive fashion than is usual in educational planning. (p. 2)

In order to assist decision makers in the selection of alternatives which have positive future consequences for society, educational planners at both national and state levels must take into account those societal forces which affect not only the outcomes of education, but also the purposes of education, and the human and material resources available to the educational process. To do this, however, the planner must delineate the societal factors or variables to be included in the planning process and develop a sound rationale based on research and theory for such inclusion. Then trends--past, present, and future--in these variables may be examined in order to derive implications for educational planning and policy.
Forecasting Trends in Social Variables

Available to the educational planner in this undertaking are a number of predictive and heuristic devices to explore alternative futures which have been developed by government, industry, nonprofit organizations, and futures consulting groups. These forecasting techniques can be categorized into exploratory forecasting methods and normative forecasting methods:

    Exploratory forecasting methods start from the present situation and its preceding history, and attempt to project future developments. Normative forecasts, on the contrary, start with some desired or postulated future situation, and work backwards to derive feasible routes for the transition from the present to the desired future. (Martino, 1976, p. 4)

Exploratory forecasting methods, all of which are based upon extrapolation of some kind, include (a) purely extrapolative methods, (b) explanatory methods, and (c) auxiliary methods. Since forecasting of social phenomena is still in a highly intuitive developmental phase, there is a growing interest in examining those exploratory methods considered to be purely extrapolative, which are based upon time series data representing social and educational variables. These time series data, often called time series indicators, are defined measurements made at specified intervals over a period of time. By extrapolating identified patterns in the time series data into the future, planners may compare present, past, and future states of that indicator. Thus, a projection of future societal trends can provide the impetus to examine present policy and to analyze the consequences of contemplated changes. This approach need not be only "preventive" forecasting, in the sense used by Ziegler (1972) of preventing undesirable forecasts. It may also be extended to examine all consequences of action or intervention, intended or not.
Purely extrapolative methods, when combined with auxiliary methods such as trend-impact analysis, cross-impact matrices, or scenarios, can provide a vehicle for exploring the relationships among identified future patterns in society.

While the use of purely extrapolative methods with time series data is fairly well defined in technological and economic areas, their application to social forecasting has not been the focus of significant definitive study. Indeed Harrison (1976) emphasized the need for such research, specifically the consideration of "each method in terms of some aspect of the social process it would likely be applied to" (p. 13). For, as Harrison explained, while some problems in regression and time series analysis which remain unresolved are currently the concern of statisticians and mathematicians, "it appears that resolution might best lie in terms of investigation in concrete application cases" (p. 14).

    In social forecasting there is a great need in almost all the known extrapolative methods for an explicit statement of the algorithmic, theoretical, and empirical weaknesses or sensitivities of such procedures. Such a discussion, as noted, would be more meaningful if carried on in the context of an analysis of some specific aspect or aspects of social process. (Harrison, 1976, p. 17)

Only through empirical study of the performance of various extrapolative methods applied to particular social phenomena will a basis for selection of appropriate and accurate techniques be formulated.
The Need for Research

Since there are no widely accepted planning models incorporating quantitative data on social variables, the educational planner who wants to utilize such information is confronted with a number of questions related to (a) the identification of social variables to be included, (b) the operational definition of social variables in terms of time series indicators, (c) the selection of a purely extrapolative technique which will yield the most accurate forecast for a specific indicator, and (d) the utilization of these forecasts in the planning process. Answers require futures research which is derived from a conceptually sound framework and is pursued with methodological vigor. As evidence of the importance of such investigation to the educational planner, the State of Florida through the Office of Strategy Planning in the Department of Education funded in 1976 a social forecasting project (STAR Project No. R5175) at the University of Florida for the second year. The study described herein was part of that effort to forecast social trends affecting education in Florida.

To summarize: Educational planners and policy makers need adequate information to make appropriate decisions about the role and function of education in creating improved quality of life for citizens of the future. The State of Florida, in funding STAR Project No. R5175 of which this study is a part, acknowledged that need. Part of this information may be provided through the use of conceptually sound social and educational variables operationally defined as time series indicators coupled with an empirically sound basis for forecasting future states of such indicators.
The Problem

The problem in this study was (a) to select, using Bronfenbrenner's ecology of education model, and operationally define at least 10 variables that research has shown to be related to the outcomes of education; (b) to use these variables operationally defined as time series indicators in the comparison of three purely extrapolative forecasting methods; and (c) to derive implications for the use of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative methods for educational planning.

Delimitations and Limitations

The Bronfenbrenner ecology of education model was used primarily as a framework for the selection of social and educational variables and was not evaluated itself in this study.

Ten variables (e.g., socioeconomic status of family, peer group characteristics) were selected to be operationally defined, where possible, in terms of national and/or state level time series indicators (e.g., median family income, juvenile crime rates). Of these identified indicators, eight which met the following criteria were used in the comparison of extrapolative techniques: (a) the indicator was readily available, (b) the data were available for a 10-year or greater time span, and (c) the indicator was a reasonably reliable and valid measure of one aspect of the social or educational variable that it represented.

It should be noted that the selection of the eight indicators used in this study was in many cases influenced more by data availability than the logic or appropriateness of the indicator to represent a specific social variable. Thus, the eight indicators are examples of the type of data that might be employed to operationally define the variables; utilization in a specific planning situation would require evaluation of the appropriateness of the indicators presented in this study and the addition and/or substitution of other indicators.
In this study only the variables related to the outcomes of education were used. As previously noted, this study was part of a larger social forecasting and educational planning effort which also included the status of education 1976-77, social trends affecting the purposes of education, and social trends affecting the resources for education.

While the literature in mathematics, statistics, and economics was reviewed and considered in preparation for the selection and use of the three extrapolative techniques (linear, loglinear, and curvilinear regression), there was no attempt to present the comparison of these techniques in the detail desired by these disciplines. Rather the comparison was made in such a way as to be most relevant to the planner in education.

There was no attempt to write or adapt computer programs for various techniques. Instead, an effort was made to identify and utilize computer programs and statistical packages which had already been adapted for use at the Northeast Regional Data Center's computer facilities.

Additionally, the projection of specific trends per se was not of interest in this study. Rather the focus of this study was the development of the conceptual framework and methodology for such projection.

Also, there has not been any attempt to forecast educational outcomes from the operationally defined social and educational variables. The present work may be considered an initial step in determining the feasibility of developing such a mathematical forecasting model.

Definition of Terms

Extrapolative forecasting.

    The procedure consists of identifying an underlying historical trend or cycle in social processes that can be extrapolated by means as varied as multiple regression analysis, time series analysis, envelope curve fitting, three-mode factor analysis, correlational analysis, averages, or any other method that takes current and historical data as the principal basis for estimating future states in a given variable.
    (Harrison, 1976, p. 3)

Indicator, educational.

    Educational indicators are statistics that enable interested publics to know the status of education at a particular moment in time with respect to some selected variables, to make comparisons in that status over time and to project future status. Indicators are time-series statistics that permit a study of trends and change in education. (Gooler, 1976, p. 11)

Indicator, social. "The operational definition or part of the operational definition of any one of the concepts central to the generation of an information system descriptive of the social system" (Carlisle, 1972, p. 25); "time series that allow comparisons over an extended period which permit one to grasp long-term trends as well as unusually sharp fluctuations" (Sheldon & Freeman, 1970, p. 97); "a statistic of direct normative interest which facilitates concise, comprehensive and balanced judgments about the condition of major aspects of a society" (U.S. Department of Health, Education, & Welfare, 1970, p. 97).

Outcomes of education. Those measures of performance, such as achievement test scores, or utilization, such as employment rates, which appear to be the result of participation in the formal educational process.

Regression, linear. Most common type of regression in which the objective is to locate the best-fitting straight line through a scattergram based on interval-level variables (Nie, Hull, Jenkins, Steinbrenner, & Bent, 1975, p. 278).

Regression, loglinear. As used in this study, a least squares regression method in which a geometric straight line is located through a scattergram plotted on semilogarithmic paper; also called exponential curve or trend curve.

Regression, polynomial or curvilinear. Regression method for fitting a curve to a set of data using the criterion of least squares distances (Nie et al., 1975, p. 278).

Time series. "A set of observations generated sequentially in time" (Box & Jenkins, 1970, p. 23).
Procedures

The study proceeded in the following phases: (a) using Bronfenbrenner's ecology of education model, 10 variables that research has shown to be related to the outcomes of education were selected and, where possible, were operationally defined as time series indicators; (b) data were collected for these time series indicators, eight of which were used in the comparison of the selected extrapolative techniques; (c) using the selected time series indicators, three purely extrapolative techniques were compared according to statistical criteria and practical considerations derived from the literature; and (d) implications for the use in educational planning of an ecological model such as Bronfenbrenner's, time series social and educational indicators, and selected extrapolative techniques were derived.

The Selection and Operational Definition of Variables

The work by Collazo, Lewis, and Thomas (1977), completed during the first year of STAR Project No. R5175, on forecasting selected educational outcomes from social variables was utilized. Since the variables selected by these investigators were derived from a review of the research literature and were acknowledged to be appropriate for the stated social forecasting purposes by a panel of experts in various disciplines, they appeared to fulfill the requirements of this study. Additionally, each of the 10 variables selected for use was described and classified according to Bronfenbrenner's ecology of education model.

For each variable an attempt was made to identify one or more types of time series indicators which might logically represent the variable. For some variables several indicators were identified, while for others, no indicator could logically be identified or no time series data were available for the indicator at the time of the study. This phase of the study is explained further in Chapter II.
Collection of Time Series Indicator Data

Sources of needed time series data at both the national and state levels were identified in several ways. The expanding literature on social trends (e.g., U.S. Department of Health, Education, & Welfare, 1970) and specifically the literature on these social trends operationalized as social indicators (e.g., Executive Office of the President, Office of Management & Budget, 1973) was reviewed. Furthermore, examination of initial efforts in using time series indicators related to education by the Office of Technology Assessment for the United States Congress (Coates, Note 1) and several state departments of education (e.g., Oregon, Pennsylvania, and Florida) yielded additional sources. Published sources of data such as U.S. Census Reports and Florida Statistical Abstracts were consulted. When data did not appear to be available in suitable form or for desired time periods, inquiries and requests were directed to appropriate sources. Any apparent limitations in the data, such as known measurement error due to sampling technique, were noted.

After data collection was completed, eight indicators which met the criteria outlined in a previous section were selected for inclusion in the next phase of the study.

Comparison of Extrapolative Methods Using Time Series Indicators

The following steps were involved in this phase of the study: (a) initial identification and testing of methods using data similar in form to selected indicators, (b) reconsideration and testing of additional available methods, (c) selection of three methods to be used for comparative extrapolations, (d) derivation of specific criteria and practical considerations from the literature, (e) application of three methods to each data set, (f) extrapolation of the identified trend into the future using the equation generated in (e), and (g) comparison of actual versus predicted values of indicators.
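Steps (e) through (g) can be illustrated with a minimal sketch in Python. This is not the SPSS procedure used in the study. It assumes, as described later in this chapter, that the earlier two-thirds of a series is used to fit the equation and the final third is held back for comparison. The function names, the made-up series, and the use of mean absolute error as a summary are all assumptions introduced here for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def holdout_comparison(years, values):
    """Fit on the first two-thirds of the series, extrapolate over the rest.

    Returns the predicted values for the held-back third and the
    errors (actual minus predicted) for the same points.
    """
    cut = (2 * len(years)) // 3              # size of the fitting portion
    a, b = fit_line(years[:cut], values[:cut])
    predicted = [a + b * x for x in years[cut:]]
    errors = [actual - pred for actual, pred in zip(values[cut:], predicted)]
    return predicted, errors


# Hypothetical, made-up series (not data from the study): an exactly
# linear indicator observed annually for 15 years.
years = list(range(1960, 1975))
values = [10.0 + 0.5 * (y - 1960) for y in years]

predicted, errors = holdout_comparison(years, values)
# Mean absolute error is one possible summary; it is an assumption here,
# not necessarily a criterion used in the study.
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
```

With a real indicator the errors would not vanish, and their size and pattern over the held-back years would be what is compared across methods.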
From a preliminary review of the literature in statistics, economics, time series analysis, and forecasting methodology, the following four methods were tentatively identified for comparison: (a) linear regression (computer program by Nie et al., 1975), (b) curvilinear or polynomial regression (computer program by Nie et al., 1975), (c) Box-Jenkins time series analysis (computer program by Cooper, Note 2), and (d) FIT curve-fitting with weighted data (computer program by Stover, Note 3).

An initial analysis of the methods using trial sets of data, combined with a visual analysis of the general form of the data to be used, revealed that two of the methods under consideration were inappropriate. The Box-Jenkins procedure, while an extremely powerful tool for time series analysis of data which are characterized by seasonal or cyclic variation (usually resulting in autocorrelation of observations and residuals), did not seem suitable for the social indicator data collected. (Should subsequent tests reveal autocorrelation and hence a violation of the assumptions of the linear model, Box-Jenkins could then be appropriately employed.) The FIT curve-fitting procedure utilizing a weighted data principle was rejected because the computer program required extensive modification to yield necessary comparative statistics and reliable output. In addition, theoretical justification for the weighting formula and data transformations employed was unavailable. Thus, two of the four methods tentatively considered were rejected.

Since the comparison phase was to involve three methods, the literature was again searched for other appropriate methods. The most promising of these was a curve-fitting technique which utilizes an exponential function to describe a constant growth rate. This method, called log-linear regression in this study, can be described in terms of the general linear model and solved by least squares procedures when the dependent variable undergoes logarithmic transformation.
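The transformation just described can be sketched as follows, assuming an exponential trend of the form y = A * exp(b * t), so that ln(y) = ln(A) + b * t is linear in time and can be fitted by ordinary least squares. The sketch is in Python rather than the SPSS routines used in the study, and the function names and the 5 percent growth series are hypothetical values chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_loglinear(ts, ys):
    """Fit y = A * exp(b*t) by regressing ln(y) on t.

    All y values must be positive, since the logarithm is taken.
    Returns (A, b).
    """
    ln_a, b = fit_line(ts, [math.log(y) for y in ys])
    return math.exp(ln_a), b


def predict_loglinear(A, b, t):
    """Extrapolate the fitted exponential trend to time t."""
    return A * math.exp(b * t)


# Hypothetical series (not data from the study): constant 5% annual growth.
ts = list(range(10))
ys = [100.0 * 1.05 ** t for t in ts]

A, b = fit_loglinear(ts, ys)
annual_growth_rate = math.exp(b) - 1.0   # recovers the constant growth rate
forecast = predict_loglinear(A, b, 12)   # extrapolation beyond the data
```

Because the least squares criterion is applied to ln(y) rather than to y, the fitted curve minimizes proportional rather than absolute deviations, which is worth keeping in mind when its fit is compared with that of an untransformed linear regression.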
Since social phenomena sometimes exhibit what appears to be a constant growth rate, log-linear regression seemed to be an appropriate method to include in this study.

The three methods finally selected for comparison were (a) linear regression (without data transformation), (b) curvilinear or polynomial regression, and (c) log-linear regression. The mathematical properties of each are presented in Chapter III. All three approaches to trend extrapolation were executed by using variations of SPSS subprograms SCATTERGRAM and REGRESSION and that system's data transformation capabilities (Nie et al., 1975).

Each of the three methods was applied to each of the eight selected time series indicators. Time in years was used as the independent variable; the annual measure or index of the indicator was treated as the dependent or response variable. Each data set was divided into thirds; two-thirds of the data points were used to establish the prediction equation. This equation was then used to predict the remaining third of the data points. Predicted values were then compared with actual values.

Thus, in this phase of the study three prediction equations (one for each method) were generated for each of the eight time series indicators. Statistical criteria derived from the literature were used to evaluate the "goodness of fit" of the regression line derived from the prediction equation to the data. The distribution of error (residuals) about the regression line was also examined to determine if the data satisfied the assumptions of the statistical model. Results of the method comparison phase are reported in Chapter IV.

Development of Implications for Educational Planning

In Chapter V methodological strategies involved in the selection and operational definition of variables are analyzed in terms of viability for future use.
Results of the technique comparison phase are analyzed according to the statistical criteria and practical considerations derived from the literature in forecasting methodology and statistics.

In Chapter VI a summary of the study and conclusions warranted by the results of the study are presented. Future directions for research suggested by the results of this study are discussed. Additionally, implications for the use in educational planning of an ecological model such as Bronfenbrenner's, time series social and educational indicators, and selected extrapolative methods are discussed.

CHAPTER II
RATIONALE FOR SELECTION OF VARIABLES/TIME SERIES INDICATORS

In the previous chapter, the need for educational planners and policy makers to have an awareness of the societal context of education was emphasized. To this end the Bronfenbrenner ecology of education model was proposed as a framework for the selection of social variables which affect the outcomes of the educational process. The selected social variables may then be operationalized as time series indicators; trends in these indicators can be identified and extrapolated into the future. Such information might then be incorporated into a planning model in order to assist planners and policy makers in making informed decisions about the role and function of education in the future.

In order to place the use of time series indicators described in this study into perspective, in the first section of the present chapter social indicators are discussed in relation to their historical development, definition and use, and data base. Educational applications of indicators are briefly noted. In the second section the social variables selected for use in this study are presented in relation to the Bronfenbrenner model. These variables are then operationally defined as time series indicators, and the eight indicators selected for use in the comparison of the three extrapolative methods are listed.
The Social Indicator Movement

Historical Development

Interest in societal trends by policy planners is not of recent origin in the United States. Indeed, in 1933 a presidential task force reported on social trends in a comprehensive work documenting social change in the United States (President's Research Committee on Social Trends, 1933). The development of indicators, or measures, of social change, however, did not receive the sustained governmental support that was provided for indicators of the economic process. Thus, while the development of economic statistics during the 1930's and 1940's provided "a solid basis for economic analysis and economic reporting which eventually resulted in the establishment of the Council of Economic Advisors and the Economic Report" (U.S. Department of Health, Education, & Welfare, 1970, p. v), comparable development of social indicators was not undertaken.

In the 1960's a renewed interest in statistics describing the social condition became apparent. Impetus for the development of social indicators was provided by social scientists in various disciplines, government policy makers, and business leaders in the private sector (Brooks, 1972, p. 1). While this early effort was not well defined as to membership, organization, or objectives, the participants in the social indicator movement "sensed great needs and opportunities for change, [and] celebrated shared but necessarily ambiguous symbols" (Sheldon & Parke, 1975, p. 693).

The following examples were drawn from the many manifestations of interest in the development of social indicators during the period from 1965 through 1975 (over 1000 items were listed in a bibliography issued in late 1972 by Wilcox, Brooks, Beal, & Klonglan):

1. The Russell Sage Foundation commissioned in 1965 and published in 1968 an independent study, Indicators of Social Change: Concepts and Measurements, on a number of aspects of structural change in society (Sheldon & Moore, 1968).

2.
A study on ways to measure the impact of massive scientific and technological change on society (Bauer, 1966) was prepared by the American Academy of Arts and Sciences for the National Aeronautics and Space Administration. This work, Social Indicators, was an overview of the task of developing indicators as part of a feedback mechanism documenting social change.

3. President Johnson in March of 1966 directed the Secretary of Health, Education and Welfare "to develop the necessary social statistics and indicators to supplement those prepared by the Bureau of Labor Statistics and the Council of Economic Advisors" (U.S. Department of Health, Education, & Welfare, 1970, p. iii). The result of this directive, Toward a Social Report, issued in 1969, was considered "a preliminary step toward the evolution of a regular system of social reporting" (U.S. Department of Health, Education, & Welfare, 1970, p. iii).

4. The Social Science Research Council established in 1972 the Center for Coordination of Research on Social Indicators, whose objective is "to enhance the contribution of social science research to the development of a broad range of indicators of social change" (World Future Society, 1977, p. 97).

5. The appearance of the landmark government publication, Social Indicators 1973 (Executive Office of the President: Office of Management & Budget, 1973), was heralded as a significant attempt to provide a collection of social statistics describing quality of life in the United States. This work, which was scheduled to be updated every three years, was a compilation of statistics on eight major areas of social interest: health, public safety, education, employment, income, housing, leisure and recreation, and population. (It may be noted that it is often impossible to strictly categorize what is social and what is economic, as almost all aspects of life are the result of interaction between social and economic forces.)
The idea of systematic collection and use of social indicators has not always met a favorable reception. Bezold (Note 4) and Shostak (Note 5) documented the unsuccessful efforts, beginning in 1967, by Walter F. Mondale to earn congressional approval of a far-reaching plan for new government use of the applied social sciences. Mondale's blueprint for better collection and use of social intelligence involved two statutorily mandated additions to the Executive Office of the President: (a) a Council of Social Advisors (CSA) comparable to the Council of Economic Advisors established in 1946, and (b) an annual Social Report of the President prepared by the CSA to parallel the annual Economic Report of the President. While aspects of Mondale's plan may be satisfied by such efforts as Social Indicators 1973 and several congressional provisions which required the development and application of social science techniques to the study of present and future national problems (Shostak, Note 5), the comprehensive nature of the Mondale plan is absent. The future of such governmental efforts at social accounting was in 1977 uncertain.

Interest in social indicators has not been confined to the United States (Johnson, 1975; Sheldon & Parke, 1975). In 1973, for example, the Organization for Economic Cooperation and Development (OECD), to which the United States also belongs, issued a list of social concerns shared by many member countries. The identification of concerns was a first step in the development of a set of social indicators designed explicitly to reveal, with validity, the level of well-being for each social concern in the list and to monitor changes in those levels over time. (OECD, 1973, p.
4)

Additionally, international organizations such as the Conference of European Statisticians, the United Nations Research Institute for Social Development, and the United Nations Educational, Scientific, and Cultural Organization have been actively concerned with social indicators (Sheldon & Parke, 1975). Efforts to develop social indicators have been initiated in countries such as France, Great Britain, West Germany, Canada, Japan, Norway, Sweden, and Denmark (Brooks, 1972; Johnson, 1975).

Definition and Use of Social Indicators

The social indicator movement has been characterized by ambiguity of definition and purpose due, in part, to the heterogeneous nature of participants with their own backgrounds, skills, and interests, and also to the necessary stages of evolution that such a movement experiences. These problems in definition and purpose of social indicators have been discussed by a number of critics (Land, 1971; Little, 1975; Plessas & Fein, 1972; Sheldon & Freeman, 1970; Sheldon & Land, 1972).

Attempts have been made to resolve a number of these problems. Land (1971), for example, proposed the following social science-oriented definition of social indicators:

    social indicators refer to social statistics that (1) are components in a social system model (including sociopsychological, economic, demographic, and ecological) or of some particular segment or process thereof, (2) can be collected and analyzed at various times and accumulated into a time series, and (3) can be aggregated or disaggregated to levels appropriate to the specifications of the model. . . . The important point is that the criterion for classifying a social statistic as a social indicator is its informative value which derives from its empirically verified nexus in a conceptualization of a social process. (p. 323)

Part of the confusion over definition is the result of disagreement over purposes, or uses, of social indicators.
These purposes, or uses, have been considered under a number of overlapping, sometimes synonymous, headings: (a) descriptive reporting, (b) policy planning, (c) social accounting, (d) program evaluation, (e) social modeling, (f) social forecasting, and (g) social engineering. While the ultimate objective of guiding social policy is rarely disputed, the form of this guidance is still debated. Social scientists are more likely to be concerned with the analysis and prediction of social change, while public administrators and legislators are often more concerned with uses of indicators related to public program evaluation and agency goal setting. Sheldon and Parke (1975), in acknowledging these concerns, said:

    It is apparent that many different types of work go on under the rubric of social indicators. What is important is that the field be seen as an arena for long-term development, as an effort of social scientists to push forward developments in concepts and in methodology that promise payoffs to both science and public policy. (p. 698)

To underscore this point, Sheldon and Parke (1975) selected an observation by Duncan:

    The value of improved measures of social change . . . is not that they necessarily resolve theoretical issues concerning social dynamics or settle pragmatic issues of social policy, but that they may permit those issues to be argued more productively. (p. 698)

Data Base for Social Indicators

Various efforts have been undertaken to improve the data base for social indicators. Among the efforts in the early 1970's were basic surveys on crime and education as well as replications of previous social science studies and surveys (Sheldon & Parke, 1975).

Most social statistics, available primarily from government sources, are objective in nature; that is, they measure the frequency of occurrence of an attribute or commodity in the population.
Numbers of births, deaths, marriages, years of schooling, and percent of occupied housing with television sets could thus be considered objective measures. (Some would disagree, however, with the objectivity of these measures; see Andrews & Withey, 1976, p. 5.) Several researchers (Andrews & Withey, 1976; Campbell, Converse, & Rogers, 1976) have attempted to measure people's perceptions of their well-being, their quality of life. Such measures collected on a regular basis are expected to be valuable supplements to the usual objective quality of life indicators. (For examples of the latter, see Liu, 1976; Thompson, 1976b, 1977.)

Creation of a social indicator data base is not without conceptual and methodological problems. Various aspects of the social measurement problem have been acknowledged in the literature (see, for example, de Neufville, 1975, pp. 175-179; Etzioni & Lehman, 1969; Social Measurement, 1972). While detailed discussion of measurement dysfunction (in the terminology of Etzioni & Lehman, 1969) is beyond the scope of this study, the following observation might be kept in mind:

    Increased investment, intellectual as well as financial, no doubt can go a long way to increase the efficacy of social measurements and to reduce much of the likelihood of dysfunctions. But, in the final analysis, these problems can never be eliminated entirely. Here, the client of systematic measurement and accounting should be alerted to the limitations of social indicators, both to make his use of them more sophisticated and to prevent him from ultimately rejecting the idea of social accounting when he encounters its limitations. (Etzioni & Lehman, 1969, p. 62)

Educational Implications

Educational indicators, a subset of social indicators, have traditionally been measures of the educational system's inputs and outputs stated in such terms as numbers of teachers, per pupil expenditures, and achievement test scores.
There have been attempts, however, to broaden this base of educational statistics to include both objective and subjective indicators under the categories of access, aspirations, achievement, impact, and resources (Gooler, 1976, p. 15). There have also been attempts to link indicators of social processes (e.g., divorce rates, voting rates) to educational goals and thus to establish accountability measures, albeit remote, external to the educational system (Clemmer, Fairbanks, Hall, Impara, & Nelson, 1974; Collazo, Lewis, & Thomas, Note 6; Grady, 1974). The use and abuse of indicators in an educational setting, however, remained in 1977 a matter of debate (Impara, Note 7) and cautious optimism (Hall, Note 8). Hopefully, investigations of the problem, such as that described in this study, will provide some guidance as to the most promising applications of social indicators to education.

Selection of Variables/Time Series Indicators

The Variables

In the first year (Sept. 1975-June 1976) of Florida Department of Education STAR Project R5175 on social forecasting for educational planning, trends in five indicators of educational outcomes were forecast. In order to do this, it was necessary to identify variables that influence the outcomes of education. Through a review of the research and theoretical literature, a number of social variables were identified. This list was refined by an interdisciplinary panel of experts at the University of Florida to the following 10 variables: (a) socioeconomic status; (b) family expectations, attitudes, and aspirations; (c) student's self concept; (d) student's general ability; (e) student's sense of fate control; (f) student's attitudes and motivation; (g) peer group characteristics; (h) teacher expectations; (i) teacher behavior in the classroom; and (j) administrative leadership style. Collazo et al.
(1977) said that only variables (a) and (d) received strong support from research; a number of the other variables, while "identified as important in the theoretical literature . . . had inconclusive support from research" (p. 298). (See Collazo, Lewis, & Thomas, Note 9, for a review of the research literature on variables affecting educational outcomes.)

The panel of experts was further utilized to forecast the future trends of these variables and their effect on specified performance and utilization measures of the outcomes of education. Cross-impact analysis, a computer-assisted modification of the Delphi forecasting technique, was then used by the panel to generate the future trends in the five outcome indicators.

The framework for looking at the future established during these first year project activities is utilized in the present study. Previous forecasting activities were based primarily on the subjective judgment of panel participants. In this study, however, the feasibility of using time series data, where available, as the basis for forecasting future trends in the 10 variables affecting educational outcomes is examined. In addition, the use of a model containing the selected variables is considered.

Bronfenbrenner's Ecology of Education Model

In the previous section, the 10 variables affecting educational outcomes which were derived from the research literature were presented. How can these variables be put into perspective as social forces influencing what the student learns?

The Bronfenbrenner (1976) model which was presented in Chapter I (pp. 1-3) is a multidimensional ecological structure of the educational environment. At the center of the interacting meso-, exo-, and macrosystems is the microsystem, "the immediate setting containing the learner" (Bronfenbrenner, 1976, p. 5).
The mesosystem is actually a system of microsystems; that is, it "comprises the interrelationships among the major settings containing the learner at a particular point in his or her life" (Bronfenbrenner, 1976, p. 5). Some of the social variables that were identified previously could be considered as part of the mesosystem. The home, for example, is represented by socioeconomic status and family expectations, attitudes, and aspirations; the peer group by peer group characteristics; and the school by teacher expectations, teacher behavior in the classroom, and administrative leadership style. The other variables (student's self concept, student's general ability, student's sense of fate control, and student's attitudes and motivation) are all directly related to the learner.

Bronfenbrenner (1976) proposed that learning is a function of (a) the dynamic relationship between characteristics of the learners and their various surroundings (mesosystem) and (b) the interaction between these various environments (e.g., home, school, peer group). The Bronfenbrenner ecology of education model thus appears to provide the necessary framework to support use of the presently identified variables and to generate directions for future forecasting research.

Operational Definition of Variables as Time Series Indicators

In previous sections 10 variables affecting educational outcomes were presented and then classified according to the Bronfenbrenner ecology of education model. In order to identify trends in these variables and to extrapolate these trends into the future, it was necessary to operationally define these variables as time series measures, or indicators. Since some of these variables were expressed in general terms, it seemed necessary to try to represent each by a number of measures and thus avoid "fractional measurement," which is often a concern when operationally defining a social concept (Etzioni & Lehman, 1969).
Several problems became apparent in operationalizing the variables:

1. A number of indicators were identified for the variables (a) socioeconomic status; (b) family expectations, attitudes, and aspirations; and (c) peer group characteristics. For some indicators, however, data were not collected annually; for others, measures were not comparable over time due to a different basis for measurement.

2. For the variables related to the school and student characteristics (except student attitudes and motivation), no time series data which met the criteria for selection were available.

3. Operational definitions were in many cases influenced by the availability of indicators rather than the logic or appropriateness of the indicator to measure the social concept it represented.

The social variables, examples of indicators that might be used to operationally define these variables, and sources of the available time series data are presented in Table 1. The following eight indicators which met the criteria established for this study (see p. 9) were selected for use with the three extrapolative methods described in Chapter III:

1. Median family income in the United States expressed in 1971 constant dollars.

2. Number of families in the United States headed by women expressed as a percentage of total families.

3. Number of wives in the labor force expressed as a percentage of total wives in the United States.

4. Number of marriages in Florida expressed as rate per 1,000 population in Florida.

5. Number of dissolutions of marriage in Florida expressed as rate per 1,000 population in Florida.

6. Number of resident live births in Florida expressed as rate per 1,000 population in Florida.

7. Number of 3 to 5 year olds enrolled in nursery school and kindergarten expressed as percentage of total children 3 to 5 years old in the United States.

8. Number of children involved in divorce or annulment expressed as rate per 1,000 children under 18 years old in the United States.
While rates or percentages are used for forecasting purposes, the magnitude of the actual numbers should be kept in mind before interpretation of an identified trend is attempted. Furthermore, it is necessary to remember that since the population base increased over the decades covered by the data, a stable rate or percentage still represents larger absolute numbers of the phenomenon. The indicators selected are aggregated to either the state or national level; the appropriate level of aggregation would, of course, depend upon the specific planning activity. These indicators could be disaggregated by race, age, region, or sex (where appropriate) for comparative analysis, and indeed this feature is a necessary characteristic in many of the definitions of social indicators (e.g., see definition by Land, 1971, presented earlier in this chapter).

Table 1
Time Series Indicators of Social Variables Affecting Outcomes of Education

Columns: Social variables; Indicator(a); Available time series data: Years(b), State, U.S.; Source

Social variables: Socioeconomic status of family.
Indicators: Median family income; Employment rate, total labor force; Unemployment rate; Husband-wife families with two workers or more; Female-headed families.
Years: 1947-1971; 1947-1976; 1947-1975; 1950-1975; 1950, 1955-1957; 1940, 47,50 55,60 65,70 75.
State: Fla.
U.S. (X) and source: X, U.S. Dept. of Commerce, Bureau of the Census in Social Indicators 1973; X, U.S. Dept. of Labor, Bureau of Labor Statistics in Employment and Earnings, Dec. 1976; X, U.S. Dept. of Labor, Bureau of Labor Statistics in Employment and Earnings, Dec. 1976; Fla. Dept. of Commerce, Division of Employment Security Research and Statistics; X, U.S. Dept. of Labor, Bureau of Labor Statistics in Special Report 189; X, U.S. Dept. of Labor, Bureau of Labor Statistics in Special Report 190.

Social variables: Family expectations, attitudes, aspirations.
Indicators: Labor force participation rates of wives; Children living in poverty; Marriage rate; Divorce rate; Birth rate; Births to unwed mothers, by age 10-18.
Years: 1950, 55-75; 1959, 62,65 68,71 75,76; 1930, 40,50 60,63 75; 1930, 40,50 60,63 75; 1964-1975; 1957-1975.
State: Fla.; Fla.; Fla.; Fla.
U.S. (X) and source: X, U.S. Dept. of Labor, Bureau of Labor Statistics in Special Report 189; X, U.S. Dept. of Commerce, Bureau of the Census and 1976 Survey of Income and Education; State of Florida, Dept. of Health and Rehabilitative Services, Division of Health in Florida Statistical Abstract 1976 (Thompson, 1976a); X, U.S. Dept. of Commerce, Bureau of the Census in Florida Statistical Abstract 1976; Public Health Statistics Section, Fla. Dept. of Health & Rehabilitative Services.

Social variables: Peer group characteristics; Student's self concept.
Indicators: [Births to unwed mothers] By race; By age & race; Public attitude toward education; 3 to 5 year olds enrolled in nursery school and kindergarten; Children under 18 involved in divorce; Suspected offenders for four violent crimes, by age; NA(c).
Years: 1930, 40,50, 56-75; 1956-1975; 1969-1976; 1964-1975; 1953-1975; 1958-1972.
State: Fla.; Fla.
U.S. (X) and source: Public Health Statistics Section, Fla. Dept. of Health & Rehabilitative Services; Public Health Statistics Section, Fla. Dept. of Health & Rehabilitative Services; X, Gallup Poll (published annually in Phi Delta Kappan, 1969-77); X, HEW, National Center for Educational Statistics in Advisory Committee on Child Development (1976) study; X, 1953-67: Ferriss (1970), 1968-75: Bane (1977); X, see Social Indicators 1973 for list of sources.

Student's general ability: NA.
Student's sense of fate control: NA.
Student's attitudes toward education and motivation for achievement:
    High school graduation rate; 1901-1975, biennial; U.S. (X); HEW, National Center for Educational Statistics in The Condition of Education 1977, Vol. 3, Part 1.
    School retention rate, 5th grade to high school graduation; 1924-32 to 1967-75; U.S. (X); HEW, National Center for Educational Statistics in Digest of Education Statistics (1976 edition).
Teacher expectations: NA.
Teacher behavior in classroom: NA.
Administrative leadership style: NA.

(a) Indicators presented are examples of time series data that might be used to operationally define the accompanying variable; a number of other indicators representing data aggregated to local, state, or national levels (depending upon purpose) could be added and/or substituted.
(b) Years presented are those found in the course of this study and do not imply that data are in any way limited to these years.
(c) NA: Time series indicator not available for variable.

CHAPTER III
RATIONALE FOR EXTRAPOLATIVE METHODS SELECTED FOR COMPARISON

One of the purposes of this study was to compare three purely extrapolative methods which could be used with social indicator data of the type described in Chapter II to forecast future values of those indicators. In order to select methods which were appropriate for this purpose, both the general forecasting literature and forecasting applications of extrapolative techniques in specific areas were reviewed. Detailed descriptions of each technique as well as statistical assumptions, sensitivity, and evaluative criteria were derived primarily from the literature in economic statistics and regression analysis.

The following sections provide (a) an overview of the extrapolative methods used in forecasting, (b) evaluation of the applicability of these methods for the purpose of this study, and (c) a description of the three methods selected for comparison, including equations, parameters to be estimated, assumptions, and criteria to be used in the comparison of the three methods.
Overview of Extrapolative Forecasting Methods

An extrapolative forecasting method is a procedure for (a) identifying an underlying historical trend or cycle in time series data, and (b) estimating future states of a variable based on current and historical observations/measures of that variable (Harrison, 1976). Extrapolation provides a "surprise free" projection of the future, but not necessarily a future which is a bigger and better (or worse) version of the present. Martino (1976) noted that

    some extrapolation methods allow the forecaster to identify policy variables which are subject to manipulation and which allow the decision-maker to alter the future away from today's pattern of events. (p. 4)

In the social realm extrapolation of trends may at least allow the planner or policy maker to make enlightened decisions to prepare for the future.

Economic and Business Forecasting

Because of the impetus in the 1930's and 1940's to describe and forecast the economic condition, many extrapolative methods were developed with economic applications in mind. Greenwald (1963, p. 187) classified methods for determining economic trends into (a) nonmathematical methods such as freehand curve fitting, first-order differences, semi-averages, selected points, and weighted and unweighted moving averages; and (b) mathematical methods such as least squares, moments, maximum likelihood, and others. In general, only the mathematical methods, which include a widely diverse array of complex curve-fitting techniques, seem to be relied upon for forecasting purposes, while the nonmathematical methods are used for preliminary analysis of the shape of the time series data. (For descriptions of these methods, see Greenwald, 1963; Mayes & Mayes, 1976; Mendenhall & Reinmuth, 1971; Neiswanger, 1956; Tuttle, 1957.)

Approaches to governmental/national economic forecasting (e.g., Theil, 1966) often reach a relatively high level of mathematical and theoretical sophistication.
This appears to be the result of decades of development, of applying method in light of theory, and developing both in turn. It is also the result of substantial investment of financial and manpower resources by both government and industry.

The value of extrapolative forecasting to individual decision-makers in business has become apparent (Makridakis, Hodgsdon, & Wheelwright, 1974). Indeed, companies of all sizes are compelled to make forecasts for a number of variables which affect them. Makridakis et al. (1974) have noted, however, that

    as with the development of most management science techniques, the application of these [extrapolative forecasting] methods has lagged behind their theoretical formulation and verification. (p. 153)

Thus, the authors observed that while the need for forecasting methods is recognized by managers in business, few are familiar enough with the numerous techniques available and their characteristics to select the one most appropriate for a given situation. To help meet this need, Makridakis et al. have developed an interactive forecasting system (called Interactive Forecasting [SIBYL/RUNNER]) which allows a number of factors to be considered in the selection of a forecasting technique for a given set of data. Although the system has been well tested in teaching situations, it has not had extensive application in actual business settings. Quantitative techniques available in the Interactive Forecasting (SIBYL/RUNNER) system fall under the general headings of smoothing, decomposition, control, regression, and other techniques. The techniques considered under those headings are clearly explained in a subsequent work of two of the authors (Wheelwright & Makridakis, 1977).
Technological Forecasting

Martino (1973b; 1976) described the extrapolative methods most commonly used in technological forecasting in relation to the shape of their fitted curves: (a) growth curve, an S-shaped curve, which requires the setting of an upper limit; (b) trend curve, an exponential function which takes the form of a straight line when logarithmic transformation of the data is undertaken. Martino (1973b) illustrated the use of the growth curve with data on lowest temperature achieved in the laboratory by artificial means and the trend curve with data on productivity in the aircraft industry.

It should be noted that both the growth curve and the trend curve applied to technological change by Martino (1973b) are highly versatile approaches with applications in a number of disciplines. Both methods are derived from the least squares formula for a straight line. The growth curve is a modified exponential, that is, it represents a variable which changes at a changing rate; the trend curve is a geometric straight line which represents a variable which changes at a constant rate (Neiswanger, 1956).

Educational Forecasting

Uses of extrapolative methods in education have generally been limited to projections of expenditures, school enrollments, and the number of instructional staff, high school graduates, and earned degrees. While many states and school districts have developed their own models, especially for projections of enrollments, the National Center for Education Statistics (U.S. Department of Health, Education, & Welfare, 1977c), in developing projections of education statistics to 1985-86, relied on regression methods wherever a trend could be established. Specifically, either arithmetic straight lines or logistic growth curves, depending upon the nature of data, were fitted by the method of least squares.
The following was noted, however:

    For both the straight line and logistic growth curve, the fitted curve often lies considerably above or below the last observed point, resulting in an unusual rise or drop from the last actual observation. To avoid this and give face validity to the projections, the fitted curve was used only to establish the last point, and a new curve was drawn through the last observed ratio and the end point on the fitted curve. (U.S. Department of Health, Education, & Welfare, 1977c, p. 92)

Brown (1974) summarized the use of trend analysis methods in education and noted their potential applications in educational administration. The four extrapolative methods that he critiqued were (a) arithmetic straight line extrapolation, (b) time series analysis (really a simplified version of the Box-Jenkins technique), (c) the S-shaped growth curve, and (d) cohort analysis (actually the trend curve described by Martino in the previous section). The examples selected by Brown do not reveal the versatility of the methods illustrated; he did, however, provide a comprehensive review of literature describing applications in other fields. A number of methodological concerns raised by Brown were considered in this study.

In a critique of selected futures prediction techniques that might be employed by educational planners, Folk (1976) observed that exponential trend line and arithmetic straight line projections appear to be the most commonly used extrapolative techniques. This author provided a number of useful measures for evaluating statistically derived regression lines.

The educational applications just described are basically attempts to project inputs such as money, pupils, or teachers to the educational system or outputs (graduates, degrees earned) of that system. No attempt to extrapolate the future status of variables which affect these student-related inputs or outputs was discovered in the literature search.
Extrapolative Methods in Other Areas

Several areas have developed highly specialized extrapolative methods in making forecasts of the future. Population, employment, and unemployment projections, for example, are usually based on fairly complex models which incorporate a number of factors. These particular applications are not reviewed here due to their highly specialized purposes and functions.

Applicability of Reviewed Extrapolative Methods for Study

In evaluating the applicability of the previously reviewed extrapolative methods for projecting future states of the time series indicators selected for use in this study, several points needed to be considered. Chief among these were (a) the underlying pattern of the data that can be recognized and (b) the type or class of model desired (from Wheelwright & Makridakis, 1977). Both of these will be briefly considered in relation to this study.

The Pattern of the Data

From graphical representations of each indicator, the data for each appeared to be characterized by a trend which either increased or decreased with time. Some also appeared to contain cyclical patterns and random fluctuations. It seemed as if major trends might follow the form of a straight line or curve with one or two bends.

The Class of Model

Wheelwright and Makridakis (1977) distinguished four classes or categories of models:

1. The time series model "always assumes that some pattern or combination of patterns is recurring over time" (p. 22).

2. The causal model assumes "that the value of a certain variable is a function of several other variables" (p. 23).

3. The statistical model comprises a number of forecasting techniques; it

    uses the language and procedures of statistical analysis to identify patterns in the variables being forecast and in making statements about the reliability of these forecasts. (p. 23)

4. The nonstatistical model includes "all models that do not follow the general rules of statistical analysis and probability" (p. 24).
Of course, some techniques can be classified into more than one of the four types of models. It appeared that the statistical model, with its well-defined properties and replicable procedures, would be an appropriate starting point for predicting the long-term trends in the selected time series data.

The review of the literature revealed several techniques, denoted by the form of their curves, which are sensitive to long-term trends in the data and which are classified under the statistical model: (a) the arithmetic straight line, (b) the S-shaped growth or logistic curve, (c) the trend or exponential curve, (d) the polynomial curve. All of these techniques are regression techniques solved by least squares procedures. Techniques (b) through (d) require data transformations to satisfy the basic linear model used in regression.

The growth or logistic curve was eliminated from comparison because this technique necessitates the setting of limits which might bias the results of the study due to its ex post facto nature. The remaining three techniques were considered to be appropriate for use in the comparison phase of this study.

Description of Methods to be Compared

Since the three techniques selected for comparison are intrinsically linear in their parameters (Draper & Smith, 1966), the general linear model denoted by the simple or bivariate regression equation is presented first. Additionally, estimation of the parameters of the equation by least squares procedures, the assumptions of the model, and criteria for evaluation and comparison of the three methods are discussed. Each technique is then described in relation to the general linear model.

The General Linear Model

In the comparison of methods using selected time series indicators, time in years is considered the independent variable and the indicator is considered the dependent or response variable.
Thus, if time is denoted by X, and the indicator is denoted by Y, a functional relationship in the form

    $Y = f(X)$

might be stated. However, since most social relationships are stochastic (probabilistic) rather than deterministic in nature, a more appropriate form might be

    $Y = f(X) + e$,

where e represents error, a measure of the unknown factors.

When the relationship between the two variables, time (X) and the indicator (Y), is assumed to be linear (that is, represented by a straight line), the equation becomes

    $Y = \beta_0 + \beta_1 X$;

and because many social relationships are stochastic for particular values of the variables, this equation is actually

    $Y = \beta_0 + \beta_1 X + \varepsilon$.

Since the population parameters $\beta_0$ and $\beta_1$ are not known unless all possible occurrences of X and Y are known, the available data are used to provide estimates $b_0$ and $b_1$ of $\beta_0$ and $\beta_1$ as in the following regression equation,

    $\hat{Y} = b_0 + b_1 X + e$

(where $\hat{Y}$ denotes predicted values of Y).

The constant $b_0$ (the intercept) and the regression coefficient $b_1$ (the slope of the regression line) can be determined by ordinary least squares procedure, "so called because it estimates . . . in such a way that the sum of squared residuals, $\sum e_i^2$, is as small as possible" (Mayes & Mayes, 1976, p. 112). (For detailed treatment of simple regression and least squares estimation of $\beta_0$ and $\beta_1$, see, for example: Draper & Smith, 1966; Kerlinger & Pedhazur, 1973; Mayes & Mayes, 1976; Mendenhall, Ott, & Larson, 1974; Mendenhall & Reinmuth, 1971; Runyon & Haber, 1967.)

The Assumptions of the Linear Model

Draper and Smith (1966) noted that

    In many aspects of statistics it is necessary to assume a mathematical model to make progress. It might be well to emphasize that what we are usually doing is to consider or tentatively entertain our model. (p. 8)

Thus, when the general linear model is employed as it is in this study, it becomes necessary to examine the assumptions upon which the model is based and to judge whether the model is in fact appropriate for the data. Assumptions for the general linear model include the following:

1. The regression equation $\hat{Y} = b_0 + b_1 X + e$ is a better predictor of Y than $\hat{Y} = \bar{Y}$ ($b_1 \neq 0$).

2. The regression equation accounts for a significant portion of the variation in Y, that is, the relationship between X and Y described by the equation is not the result of chance.

3. The error term $\varepsilon$ has a mean value equal to zero and variance equal to $\sigma^2$; it is an independent random variable which is normally distributed.

If the first two assumptions are not met, then the model is not a good predictor for those data. If the third assumption is not met, then it is not appropriate to interpret the results statistically, that is, in terms of the probability distribution of the random error $\varepsilon$.

It is possible to test Assumption 1 and Assumption 2 by the F statistic. Assumption 3 is best evaluated by plotting the residuals and examining the pattern of the deviations from the regression line (Anscombe, 1973; Anscombe & Tukey, 1963; Draper & Smith, 1966). Independence of the errors (Assumption 3[e]) may be tested by the Durbin-Watson test for serial correlation (Durbin & Watson, 1950; Durbin & Watson, 1951; Mayes & Mayes, 1976; Wheelwright & Makridakis, 1977).

Criteria for Comparison of Methods

The following questions were derived from the literature to guide the comparison of methods:

1. Do the data satisfy the assumptions of the model? (See previous section.)

2. How well does the regression line fit the data from which it was derived (the two-thirds of the data points used to generate the prediction equation)? Tufte (1974, pp. 69-70) listed four measures of quality of fit:

    a. the N residuals: $Y_i - \hat{Y}_i$

    b. the residual variation: $s_{y.x}^2 = \dfrac{\sum (Y_i - \hat{Y}_i)^2}{N - k^* - 1}$ (or the square root of the residual variation, $S_{y.x}$, called the unbiased standard error of estimate).

       *k refers to the number of X terms in the regression equation.

    c. the ratio of explained to total variation: $r^2 = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

    d. the standard error of the estimate of the slope: $S_{b_1} = \dfrac{S_{y.x}}{\sqrt{\sum (X_i - \bar{X})^2}}$

Thus, for each set of data, the methods are compared according to these four measures. The observed and predicted values of Y are also reported in tabular form; both observed and predicted values are plotted for visual comparison as recommended by Anscombe (1973).

3. How well does the extrapolated line fit the data (the one-third of the data points that were not used to generate the prediction equation)? The residual variation around the extrapolated line, which is an indicator of the accuracy of the forecasting technique, may be expressed by its square root, the standard error for the extrapolated values. As in (2), the observed and extrapolated values are reported in tabular form; both observed and extrapolated values are plotted for visual comparison.

Neiswanger (1956, p. 534) cautioned against accepting only mathematical tests of "goodness of fit" as proof that the mathematical expression is appropriate for the trend in the data. Other considerations such as the "reasonableness of the extrapolated values which the trend may yield" (p. 534) and "the extent to which this statistical manifestation of growth is supported by other evidence" (p. 534) should be kept in mind. Thus, the calculation of a trend is more than a mathematical analysis in curve fitting; it is essentially a problem of analysis of the phenomena represented by the data (Neiswanger, 1956).

It should also be noted that while the standard error of estimate gives an overall measure of error around the regression line, it may not be appropriate for computing confidence intervals for a specific forecast value.
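The second and third criteria can be sketched in a few lines of code. The example below is only an illustration, not the procedure used in the study: it fits a straight line to the first 17 observed values of Indicator 1 (as reported in Table 3), computes the four fit measures listed above, and then computes the root mean squared deviation of the 8 held-out values from the extrapolated line (using N rather than N - k - 1 as the divisor, following the definition given in Chapter IV). The coding of X as year minus 1900 and all function names are assumptions made for this sketch.

```python
import math

# Observed values of Indicator 1 (median family income, 1971 constant
# dollars) as reported in Table 3 of this study.
YEARS = list(range(1947, 1972))
OBSERVED = [5483, 5367, 5278, 5594, 5783, 5939, 6433, 6288, 6693,
            7122, 7138, 7126, 7524, 7688, 7765, 7975, 8267,   # 1947-1963
            8579, 8932, 9360, 9683, 10049, 10423, 10289, 10285]  # 1964-1971
N_FIT = 17  # about two-thirds of the points; the remaining 8 are held out

# Assumption: X is coded as year minus 1900.
X = [year - 1900 for year in YEARS]


def least_squares_line(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_measures(x, y, b0, b1, k=1):
    """Four measures of quality of fit; k is the number of X terms."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    y_hat = [b0 + b1 * xi for xi in x]
    residuals = [yi - yh for yi, yh in zip(y, y_hat)]
    sse = sum(e * e for e in residuals)
    s_yx = math.sqrt(sse / (n - k - 1))           # standard error of estimate
    r2 = (sum((yh - y_bar) ** 2 for yh in y_hat)  # explained / total variation
          / sum((yi - y_bar) ** 2 for yi in y))
    s_b1 = s_yx / math.sqrt(sum((xi - x_bar) ** 2 for xi in x))
    return {"residuals": residuals, "s_yx": s_yx, "r2": r2, "s_b1": s_b1}


def extrapolation_error(x, y, b0, b1):
    """Root mean squared deviation of held-out Y from the extrapolated line."""
    return math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2
                         for xi, yi in zip(x, y)) / len(x))


b0, b1 = least_squares_line(X[:N_FIT], OBSERVED[:N_FIT])
fit = fit_measures(X[:N_FIT], OBSERVED[:N_FIT], b0, b1)
s_ext = extrapolation_error(X[N_FIT:], OBSERVED[N_FIT:], b0, b1)

print(f"Y-hat = {b0:.2f} + {b1:.2f} X")
print(f"r2 = {fit['r2']:.5f}, S(y.x) = {fit['s_yx']:.2f}, "
      f"S(b1) = {fit['s_b1']:.2f}, S(ext) = {s_ext:.0f}")
```

Fitting on the early years and scoring on the later ones keeps the two criteria separate: a method can fit the data it was derived from closely and still extrapolate poorly.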
The standard error of estimate is inadequate for this purpose because the further an X is from $\bar{X}$, the larger is the error that may be expected when predicting Y from the regression line. Draper and Smith (1966) noted:

    We might expect to make our "best" predictions in the "middle" of our observed range of X and would expect our predictions to be less good away from the "middle." (p. 22)

Therefore, the confidence limits for the true value of Y for a given X are two curved lines about the regression line. The limits change as the position of X changes. Hence the following equation was provided by Wheelwright and Makridakis (1977, p. 82) for computing the standard error of forecast ($SE_f$):

    $SE_f = \sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{N - 2}} \; \sqrt{1 + \dfrac{1}{N} + \dfrac{(X - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$

for a specific forecast value, where X is the value of the independent variable for which the forecast is made.

Method 1: Simple Linear Regression

The equation $Y = f(X)$ describes a natural functional relationship between X and Y. If this functional relationship can be expressed by a straight line on arithmetic paper, the linear, first-order regression equation

    $\hat{Y} = b_0 + b_1 X + e$

may be appropriate. The natural linear function is used when an absolute amount of change in Y per unit of X is hypothesized.

Method 2: Loglinear Regression

Occasionally when time series data are plotted on an arithmetic scale the scatter of points falls more in a curve than in a straight line, with the curve rising or decreasing more rapidly as X increases. These same data when plotted on a semilogarithmic scale will produce a straight line. The relationship between X and Y may then be described by

    $\log Y = f(X)$  or  $Y = ab^X$,

the exponential form of the logarithmic relationship between X and Y. The exponential function is used when there is thought to be a constant rate of change in Y per unit absolute change in X. Thus, for each year (X), Y changes by a constant percentage (rather than by an absolute amount as in Method 1).

It is possible to fit the exponential function to the general linear model by transforming the values of Y to log Y.
Thus $Y = ab^X$ becomes

    $\log Y = \log a + X \log b$

or

    $\log \hat{Y} = \log b_0 + X \log b_1 + e$.

As in the case of the natural number straight line, the method of least squares is used to estimate the parameters necessary for computing the logarithmic (or geometric) straight line. Tuttle (1957, p. 431) noted, therefore, that the log $\hat{Y}$'s are fitted to the log Y's, not the $\hat{Y}$ to the Y's, by the least squares criterion. Thus, Tuttle (1957, p. 432) recommended that the standard error of estimate be computed from the antilogs of the log $\hat{Y}$ values. If $S_{y.x}$ was computed as the root mean square of the unexplained variation, "it would be in terms of the deviations of the logarithms of the $Y_c$'s [$\hat{Y}$'s] from the logarithms of the Y's" (Tuttle, 1957, p. 432). The $S_{y.x}$ would not be comparable to those obtained from untransformed data as in Method 1.

Similarly, Seidman (1976) has observed that in comparing linear and loglinear models, $R^2$ may not be a sufficient criterion of choice. This is because the $R^2$ represents "the proportion of variance of the logarithm of Y explained by the regression: log Y, not Y, is the dependent variable" (Seidman, 1976, p. 463). Therefore, Seidman recommended using the antilogs of the predicted values of log Y "in a regression explaining variability in Y" (p. 463). This $R^2$ may then be used for comparison purposes. The examples given by Seidman (1976) were based on logarithmic transformations of both dependent and independent variables, but the same observation may be made when only the dependent variable is transformed. Seidman's reservation about $R^2$ has been considered in this study.

If an exponential curve appears to fit the data, it is often desirable to find the annual rate of change c. This can be derived from the regression coefficient $b_1$ (the fitted slope of log Y on X) according to the following equation:

    $b_1 = \log(1 + c)$, so that the rate of change $c = \operatorname{antilog}(b_1) - 1$.

The result should then be expressed as a percent (Mayes & Mayes, 1976, p. 94; Nie et al., 1975, p. 370).
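As a hedged illustration of Method 2, the sketch below regresses the common logarithm of Y on X for the 17 fitting-set values of Indicator 1 (taken from Table 3), converts the predictions back to the original scale by taking antilogs, recomputes the standard error of estimate from those antilogs as Tuttle recommended, and derives the annual rate of change from the slope. The coding of X as year minus 1900 and the function names are assumptions made for the example.

```python
import math

# Fitting set for Indicator 1 (1947-1963) as reported in Table 3.
# Assumption: X is coded as year minus 1900.
X = list(range(47, 64))
Y = [5483, 5367, 5278, 5594, 5783, 5939, 6433, 6288, 6693,
     7122, 7138, 7126, 7524, 7688, 7765, 7975, 8267]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares straight line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def annual_rate_of_change(slope):
    """Rate c such that slope = log10(1 + c)."""
    return 10 ** slope - 1


# Fit the straight line to log10(Y), not to Y itself.
log_y = [math.log10(v) for v in Y]
log_b0, log_b1 = least_squares_line(X, log_y)

# Antilogs of the predicted log values put the predictions on the Y scale.
y_hat = [10 ** (log_b0 + log_b1 * xi) for xi in X]

# Standard error of estimate recomputed on the original scale, so that it
# can be compared with the value obtained from Method 1.
n = len(Y)
sse = sum((yi - yh) ** 2 for yi, yh in zip(Y, y_hat))
s_yx_original_scale = math.sqrt(sse / (n - 2))

rate = annual_rate_of_change(log_b1)

print(f"log Y-hat = {log_b0:.5f} + {log_b1:.5f} X")
print(f"annual rate of change = {100 * rate:.2f} percent")
print(f"S(y.x) on original scale = {s_yx_original_scale:.2f}")
```

Any logarithm base could be substituted as long as the same base is used for the antilog; the predicted Y values and the rate of change would be unaffected.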
The common or Briggs logarithm, used in the Y transformation in this study, is the power to which 10 must be raised to equal the number (see Neiswanger, 1956, p. 210; Tufte, 1974, p. 108). Natural logs or logs to the base 2 could also have been used to obtain the same results (Snedecor, 1956, pp. 450-451).

Method 3: Polynomial Regression

In Method 1, the equation which expresses a straight line relationship between X and Y is

    $\hat{Y} = b_0 + b_1 X + e$,

which is a linear (in the b's) first-order (in X) regression equation. When this functional relationship between X and Y can be expressed as a solid, or unbroken, curve on arithmetic paper, the linear, second-order (or quadratic) regression equation

    $\hat{Y} = b_0 + b_1 X + b_2 X^2 + e$

may be appropriate. When the relationship can be expressed as a curved line with two bends on arithmetic paper, the linear, third-order (or cubic) regression equation

    $\hat{Y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3 + e$

may be used.

According to Kerlinger and Pedhazur (1973, p. 209), the highest order a polynomial equation may take is equal to N - 1, where N is the number of distinct values in the independent variable. However,

    since one of the goals of scientific research is parsimony, our interest is not in the predictive power of the highest degree polynomial equation possible, but rather in the highest degree polynomial equation necessary to describe a set of data. (Kerlinger & Pedhazur, 1973, p. 209)

Another reason for a parsimonious approach to polynomial curve fitting is that for each order added to the equation, a degree of freedom is lost. This is especially important when the number of observations is small, as it is in this study (observations range from 8 to 20 in each of the eight sets of data). Also, higher order polynomial curves may possess statistical significance but be devoid of practical significance. Accordingly, only the quadratic and cubic forms of the polynomial regression equation are considered.
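The parsimony argument can be made concrete with a small sketch. The example below fits first-, second-, and third-order polynomials to the 17 fitting-set values of Indicator 1 (from Table 3) and applies the ratio given in Chapter IV for testing the increase in $r^2$ contributed by the last order entered. It is an illustration under stated assumptions rather than the procedure used in the study: X is coded as year minus 1900, X is centered before powers are taken (which leaves $r^2$ unchanged but keeps the normal equations well conditioned), and the helper names are hypothetical.

```python
# Fitting set for Indicator 1 (1947-1963) as reported in Table 3.
# Assumption: X is coded as year minus 1900.
X = list(range(47, 64))
Y = [5483, 5367, 5278, 5594, 5783, 5939, 6433, 6288, 6693,
     7122, 7138, 7126, 7524, 7688, 7765, 7975, 8267]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def polynomial_r2(x, y, degree):
    """r2 of the least squares polynomial of the given degree in X."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    z = [xi - x_bar for xi in x]                       # centered X
    cols = [[zi ** p for zi in z] for p in range(degree + 1)]
    xtx = [[sum(ci * cj for ci, cj in zip(cols[i], cols[j]))
            for j in range(degree + 1)] for i in range(degree + 1)]
    xty = [sum(ci * yi for ci, yi in zip(cols[i], y))
           for i in range(degree + 1)]
    b = solve(xtx, xty)                                # normal equations
    y_hat = [sum(b[p] * cols[p][i] for p in range(degree + 1))
             for i in range(n)]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


def f_for_increase(r2_with, r2_without, n, k):
    """F ratio (1 and n - k - 1 df) for the r2 added by the kth-order term."""
    return (r2_with - r2_without) / ((1 - r2_with) / (n - k - 1))


r2 = {k: polynomial_r2(X, Y, k) for k in (1, 2, 3)}
f_quadratic = f_for_increase(r2[2], r2[1], len(Y), 2)
f_cubic = f_for_increase(r2[3], r2[2], len(Y), 3)

for k in (1, 2, 3):
    print(f"order {k}: r2 = {r2[k]:.5f}")
print(f"F for quadratic increase = {f_quadratic:.2f}")
print(f"F for cubic increase = {f_cubic:.2f}")
```

Each added order can only raise $r^2$, so the F ratio for the increase, not the total $r^2$, is what indicates whether the extra term is worth the degree of freedom it costs.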
In the polynomial regression the independent variable, X (time), is treated as a categorical variable and is raised to a certain power. In the quadratic equation, each value of X is squared to create a new vector of the squared X's, $X^2$. Similarly, in the cubic equation, each value of X is cubed to create an additional vector of the cubed X's, $X^3$. Thus, the resulting equation can be solved by a stepwise multiple regression procedure, in which at each step of the analysis the $R^2$ is tested to see if the higher-degree polynomial accounts for a significant proportion of the variance. While a least squares solution is used in this study, the values of the unknowns may also be found by orthogonal polynomials (see Draper & Smith, 1966, pp. 150-155; Greenwald, 1963, pp. 204-209; Kerlinger & Pedhazur, 1973, pp. 214-216).

Neiswanger (1956, pp. 529-532) noted that the second-degree and third-degree parabolas provide greater flexibility in fitting a line to a set of data, for the parabolas allow a trend to change direction. Whether or not the flexibility of the parabolic function enhances the predictability of extrapolated Y values, however, is not certain and is examined in this study.

CHAPTER IV
COMPARISON OF EXTRAPOLATIVE METHODS USING SELECTED TIME SERIES INDICATORS

In Chapter II a rationale for the selection of social variables operationally defined as time series indicators was provided. The following eight time series indicators were selected for use in the method comparison phase of this study:

1. Median family income in the United States expressed as 1971 constant dollars.

2. Number of families in the United States headed by women expressed as a percentage of total families.

3. Number of wives in the labor force expressed as a percentage of total wives in the United States.

4. Number of marriages in Florida expressed as rate per 1,000 population in Florida.

5. Number of divorces in Florida expressed as rate per 1,000 population in Florida.

6. Number of resident live births in Florida expressed as rate per 1,000 population in Florida.

7. Number of 3 to 5 year olds enrolled in nursery school and kindergarten expressed as percentage of total children 3 to 5 years old in the United States.

8. Number of children involved in divorce or annulment expressed as rate per 1,000 children under 18 years old in the United States.

A rationale for the three extrapolative methods selected for comparison in this study was presented in Chapter III. The three methods are simple linear regression, loglinear regression, and polynomial regression (specifically the quadratic and cubic forms).

The following questions derived from the literature were proposed in Chapter III to guide the comparison of methods:

1. Do the data satisfy the assumptions of the general linear model?

2. How well does the regression line fit the data from which it was derived (the two-thirds of the data points used to generate the prediction equation)?

3. How well does the extrapolated line fit the data (the one-third of the data points that were not used to generate the prediction equation)?

To answer these questions in terms of each method and to facilitate comparison among the three methods, the results obtained from applying each of the methods to each of the eight indicator data sets are presented in the following manner:

1. The fit of the regression line to the observed data is indicated by $r^2$ and the unbiased standard error of estimate $S_{y.x}$. For the simple linear and loglinear regression methods the amount of variance accounted for by the regression line is tested by the F statistic (the F value is the same as that obtained by squaring the ratio of $b_1$ to its standard error, $SE_{b_1}$). For the quadratic and cubic forms of polynomial regression, both the $r^2$ including all orders entered to that step ($r^2_{y.12}$ or $r^2_{y.123}$) and the increase in $r^2$ attributable to the last order entered in the regression ($r^2_{y(2.1)}$ or $r^2_{y(3.12)}$) are tested with the F statistic.* (Of course, squaring the ratios of the partial regression coefficients $b_2$ in the quadratic form and $b_3$ in the cubic form to their respective standard errors will also yield the same F value for the increase in $r^2$.)

2. The fit of the extrapolated line to the data is indicated numerically by the standard error for the extrapolated values, $S_{ext(y.x)}$. This measure reflects the average deviation of the extrapolated values from the observed values of $Y_i$; thus,

    $S_{ext(y.x)} = \sqrt{\dfrac{\sum (Y_i - \hat{Y}_i)^2}{N}}$, where the $\hat{Y}_i$ are the extrapolated values.

(Note that this equation is not the "unbiased" form used in computing $S_{y.x}$.)

3. All observed and predicted values of Y are reported in tabular and graphic form.

4. The residuals around the regression line were examined for serial correlation by the Durbin-Watson d statistic, which is noted only when serial correlation is confirmed or questionable. Additionally, the standardized residuals were plotted against the sequence of cases and also against standardized $\hat{Y}$ values. Such visual inspection of the data is discussed as necessary to support the interpretation of results in Chapter V.

*Actually the increase in $r^2$ is tested according to the following ratio:

    $F = \dfrac{(r^2 \text{ with } k\text{th-order term}) - (r^2 \text{ without } k\text{th-order term})}{(1 - r^2 \text{ with } k\text{th-order term}) / (N - k - 1)}$

Total $r^2$ is tested according to the following ratio:

    $F = \dfrac{SS_{\text{regression}} / k}{SS_{\text{residual}} / (N - k - 1)}$

Presentation of Results

Indicator 1

The mean and standard deviation for the Y values used to generate the regression equations are 6674 and 987, respectively. The following regression equations were used to derive $\hat{Y}$:

    Linear      $\hat{Y} = -3933.29 + 192.87X$
    Quadratic   $\hat{Y} = 106.37 + 44.79X + 1.35X^2$
    Cubic       $\hat{Y} = 135025.33 + (-7385.17)X + 137.08X^2 + (-.82)X^3$
    Loglinear   $\log \hat{Y} = 3.12346 + .01266X$

The goodness of fit of the regression lines derived from these equations is indicated by $r^2$, $r^2$ change, and $S_{y.x}$ in Table 2. An ANOVA summary table is presented in Table 18 in the Appendix.
The overall F's for all methods are significant (p<.01); the increases in r2 due to the higher order polynomials are not significant, however. Table 2 Indicator 1: Summary Statisticsfor Prediction Equations by Method r2 r2 change F df S yx Linear .97440 571.04** 1,15 163.02 Quadratica .97531 276.47** 2,14 165.75 .00090 .51 1,14 Cubica .98137 228.25** 3,13 149.40 .00606 .42 1,13 Loglinearb log Y .97021 488.54** 1,15 log.01157 antilog Y .98726 164.92 Note. Indicator 1 is median family income expressed in 1971 constant dollars. aBoth quadratic and cubic forms of the polynomial regres sion are presented. bBoth r2 and S have been recomputed using antilogs of y x 2 2 the log Y; much of the difference between r2(log Y) and r2 (antilog Y) may be due to rounding. **P<.01 The average errors for the extrapolated Y values (Sext(y.x)) according to method employed are (a) linear, 669; (b) quadratic, 487; (c) cubic, 1981; and (d) loglinear, 283. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented in Table 3. These data are graphically presented in Figure 2. Thus, there is very little difference in the total r2 for the methods; the quadratic and cubic forms added little to the r2 already provided by the linear component. The S y.x for the cubic form is smaller (149.40) than for the other methods. When the lines are extrapolated beyond the original values, however, the cubic form is clearly the "worst" fit with a Sext(y.x) of 1981 and the loglinear method the "best" with a Sext(y.x) of 283. Whether the exponential curve would continue to be a superior predictor is a matter of conjecture. Indicator 2 The mean and standard deviation for the Y values used to generate the regression equations are 10.4 and .77, re spectively. 
The following regression equations were used to derive Y:

Linear       Y = 9.87860 + .02789 X
Quadratic    Y = 11.08794 - .19035 X + .00626 X²
Cubic        Y = 11.45737 - .35750 X + .01954 X² - .00027 X³
Loglinear    log Y = .99387 + .00118 X

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 4. An ANOVA summary table is presented in Table 19 in the Appendix. The overall F's for the quadratic and cubic forms of the polynomial regression are significant (p<.05); however, only the increase in r² due to the quadratic is significant (p<.01).

Table 3
Indicator 1: Observed Y's and Predicted Y's by Method

                            Predicted Y's by Method
Year    Observed Y's   Linear     Quadratic^a   Cubic^a   Loglinear
Original regression^b
1947     5,483          5,131^c    5,185         5,323     5,231
1948     5,367          5,324      5,358         5,392     5,386
1949     5,278          5,517      5,533         5,499     5,545
1950     5,594          5,710      5,711         5,637     5,709
1951     5,783          5,903      5,892         5,803     5,878
1952     5,939          6,096      6,076         5,992     6,052
1953     6,433          6,289      6,262         6,197     6,231
1954     6,288          6,481      6,450         6,416     6,415
1955     6,693          6,674      6,642         6,642     6,605
1956     7,122          6,867      6,836         6,871     6,800
1957     7,138          7,060      7,033         7,097     7,002
1958     7,126          7,253      7,233         7,317     7,209
1959     7,524          7,446      7,435         7,524     7,422
1960     7,688          7,639      7,640         7,714     7,642
1961     7,765          7,831      7,848         7,882     7,868
1962     7,975          8,024      8,058         8,023     8,100
1963     8,267          8,217      8,271         8,133     8,340
Extrapolation^d
1964     8,579          8,410      8,487         8,205     8,584
1965     8,932          8,603      8,705         8,235     8,838
1966     9,360          8,796      8,926         8,220     9,100
1967     9,683          8,989      9,150         8,152     9,369
1968    10,049          9,182      9,377         8,028     9,646
1969    10,423          9,374      9,606         7,843     9,931
1970    10,289          9,567      9,838         7,591    10,225
1971    10,285          9,760     10,072         7,268    10,527

Note. Indicator 1 is median family income expressed in 1971 constant dollars.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's in terms of 1971 constant dollars are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 2. Indicator 1: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

The average errors for the extrapolated Y values (Sext(y.x)) according to the method employed are (a) linear, 1.5; (b) quadratic, .35; (c) cubic, 1.1; and (d) loglinear, 1.5. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 5. These data are graphically presented in Figure 3.

Thus, the cubic form of the polynomial accounts for the most variance in Y (89%) and has the smallest Sy.x (.33). The quadratic form, accounting for 81% of the variance in Y, has a Sy.x of .39; visual inspection of the second-degree curve reveals that this curve may, in fact, more closely fit observed values for the latter portion of the regression line than the cubic form. The linear and loglinear methods provide no better estimate of Y than does the mean of Y; indeed, the standard error of estimate approximates the standard deviation of the observed Y's. When the lines are extended beyond the original values, the quadratic provides the superior fit (Sext(y.x) = .35); the fit of the cubic, linear, and loglinear methods to the observed values is poor, with residuals becoming larger for successive years.

Table 4
Indicator 2: Summary Statistics for Prediction Equations by Method

Method          r²        r² change   F           df      Sy.x
Linear          .16405                  1.18      1,6     .76
Quadratic^a     .81417                 10.95*     2,5     .39
                          .65012       17.49**    1,5
Cubic^a         .89255                 11.08*     3,4     .33
                          .07838        2.92      1,4
Loglinear^b
  log Y         .16905                  1.22      1,6     log .03175
  antilog Y     .16307                                    .78

Note. Indicator 2 is number of families in the United States headed by women expressed as a percentage of total families.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
*p<.05 **p<.01

Table 5
Indicator 2: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Loglinear
Original regression^b
1940(1)     11.2            9.9^c    10.9          11.1       9.9
1947(8)      9.5           10.1      10.0           9.7      10.1
1950(11)     9.4           10.2       9.7           9.5      10.2
1955(16)    10.1           10.3       9.6           9.6      10.3
1960(21)    10.0           10.5       9.9          10.1      10.4
1965(26)    10.5           10.6      10.4          10.6      10.6
1970(31)    10.9           10.7      11.2          11.1      10.7
1971(32)    11.5           10.8      11.4          11.2      10.8
Extrapolation^d
1972(33)    11.6           10.8      11.6          11.2      10.8
1973(34)    12.1           10.8      11.9          11.3      10.8
1974(35)    12.4           10.9      12.1          11.3      10.8
1975(36)    13.0           10.9      12.4          11.3      10.9

Note. Indicator 2 is number of families in the United States headed by women expressed as a percentage of total families.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 3. Indicator 2: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

Indicator 3

The mean and standard deviation for the Y values used to generate the regression equations are 32.0 and 3.8, respectively. The following regression equations were used to derive Y:

Linear       Y = 23.27325 + .74603 X
Quadratic    Y = 23.38774 + .71893 X + .00126 X²
Cubic        Y = 22.60710 + 1.18842 X - .05693 X² + .00194 X³
Loglinear    log Y = 1.37971 + .01047 X
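The four Indicator 3 prediction equations can be checked by evaluating them at the coded years. The following is a minimal sketch, not part of the original study; it assumes that a coefficient printed in parentheses is negative and that "log" denotes the base-10 logarithm. The names `EQUATIONS`, `LOGLINEAR`, `polynomial`, and `predict` are illustrative only.

```python
# Coefficients transcribed from the Indicator 3 prediction equations,
# lowest power first. Assumptions: a coefficient printed in parentheses
# is negative, and "log" means the base-10 logarithm.
EQUATIONS = {
    "linear": (23.27325, 0.74603),
    "quadratic": (23.38774, 0.71893, 0.00126),
    "cubic": (22.60710, 1.18842, -0.05693, 0.00194),
}
LOGLINEAR = (1.37971, 0.01047)  # yields log10 of the predicted Y


def polynomial(coefficients, x):
    """Evaluate b0 + b1*x + b2*x**2 + ... at x."""
    return sum(b * x**power for power, b in enumerate(coefficients))


def predict(x):
    """Return the four predicted Y's for coded year x, rounded to one place."""
    predictions = {name: polynomial(b, x) for name, b in EQUATIONS.items()}
    predictions["loglinear"] = 10 ** polynomial(LOGLINEAR, x)
    return {name: round(value, 1) for name, value in predictions.items()}


# X = 1 is 1950 (first fitted year); X = 26 is 1975 (last extrapolated year).
for x in (1, 26):
    print(x, predict(x))
```

Under these assumptions the values for X = 1 and X = 26 agree with the 1950 and 1975 rows of Table 7, which supports reading the parenthesized cubic coefficient as negative.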
The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 6. An ANOVA summary table is presented in Table 20 in the Appendix. The overall F's for all methods are significant (p<.01); however, the increases in r² due to the higher order polynomials are not significant.

The average errors for the extrapolated Y values (Sext(y.x)) according to the method employed are (a) linear, 1.4; (b) quadratic, 1.2; (c) cubic, 2.7; and (d) loglinear, .6. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 7. These data are graphically represented in Figure 4.

Thus, there is very little difference in the total r² for the methods; the quadratic and cubic forms added an insignificant amount to the r² already provided by the linear component. The Sy.x for the cubic form is only slightly better (.43) than for the other methods (.49 to .54).

Table 6
Indicator 3: Summary Statistics for Prediction Equations by Method

Method          r²        r² change   F           df      Sy.x
Linear          .98451                826.05**    1,13    .49
Quadratic^a     .98460                383.49**    2,12    .50
                          .00009         .07      1,12
Cubic^a         .98953                346.51**    3,11    .43
                          .00493        5.18      1,11
Loglinear^b
  log Y         .98085                665.69**    1,13    log .00760
  antilog Y     .99999                                    .54

Note. Indicator 3 is the number of wives in the labor force expressed as a percentage of total wives in the United States.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
**p<.01
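The F values reported for each r² change in Table 6 can be reproduced from the r² column alone. The sketch below assumes the usual hierarchical-regression F ratio (gain in r² per added term, divided by the unexplained proportion of variance per residual degree of freedom); the source does not state the formula in this section, and the function name `f_for_r2_change` is illustrative.

```python
def f_for_r2_change(r2_full, r2_reduced, terms_added, residual_df):
    """F ratio for the gain in r2 when terms are added to a regression.

    Assumed formula (standard hierarchical test, not quoted from the source):
    F = ((r2_full - r2_reduced) / terms_added) / ((1 - r2_full) / residual_df)
    """
    gain_per_term = (r2_full - r2_reduced) / terms_added
    unexplained_per_df = (1.0 - r2_full) / residual_df
    return gain_per_term / unexplained_per_df


# r2 values from Table 6; residual df = n - k - 1 with n = 15 observations.
R2_LINEAR, R2_QUADRATIC, R2_CUBIC = 0.98451, 0.98460, 0.98953

f_quadratic = f_for_r2_change(R2_QUADRATIC, R2_LINEAR, 1, 12)
f_cubic = f_for_r2_change(R2_CUBIC, R2_QUADRATIC, 1, 11)
print(round(f_quadratic, 2), round(f_cubic, 2))
```

With the Table 6 entries this gives .07 for the quadratic component and 5.18 for the cubic component, matching the tabled values on 1,12 and 1,11 degrees of freedom.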
Table 7
Indicator 3: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Loglinear
Original regression^b
1950(1)     23.8           24.0^c    24.1          23.7      24.6
1955(6)     27.7           27.7      27.8          28.1      27.7
1956(7)     29.0           28.5      28.5          28.8      28.4
1957(8)     29.6           29.2      29.2          29.5      29.1
1958(9)     30.2           30.0      30.0          30.1      29.8
1959(10)    30.9           30.7      30.7          30.7      30.5
1960(11)    30.5           31.5      31.4          31.4      31.3
1961(12)    32.7           32.2      32.2          32.0      32.0
1962(13)    32.7           33.0      33.0          32.7      32.8
1963(14)    33.7           33.7      33.7          33.4      33.6
1964(15)    34.4           34.5      34.5          34.2      34.4
1965(16)    34.7           35.2      35.2          35.0      35.3
1966(17)    35.4           36.0      36.0          35.9      36.1
1967(18)    36.8           36.7      36.7          36.9      37.0
1968(19)    38.3           37.4      37.5          38.0      37.9
Extrapolation^d
1969(20)    39.6           38.2      38.3          39.1      38.8
1970(21)    40.8           38.9      39.0          40.4      39.8
1971(22)    40.8           39.7      39.8          41.9      40.7
1972(23)    41.5           40.4      40.6          43.4      41.7
1973(24)    42.2           41.2      41.4          45.2      42.8
1974(25)    43.0           41.9      42.1          47.0      43.8
1975(26)    44.4           42.7      42.9          49.1      44.9

Note. Indicator 3 is the number of wives in the labor force expressed as a percentage of total wives in the United States.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 4. Indicator 3: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

When the lines are extrapolated beyond the original values, however, the cubic form is clearly the "worst" fit with a Sext(y.x) of 2.7 while the loglinear form is clearly the "best" fit with a Sext(y.x) of .6.

Indicator 4

The mean and standard deviation for the Y values used to generate the regression equations are 9.8 and 2.6, respectively.
The following regression equations were used to derive Y:

Linear       Y = 13.76707 - .13216 X
Quadratic    Y = 14.16426 - .19720 X + .00148 X²
Cubic        Y = 10.90554 + 1.15549 X - .07961 X² + .00125 X³
Loglinear    log Y = 1.12873 - .00496 X

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 8. An ANOVA summary table is presented in Table 21 in the Appendix. The overall F statistic for the cubic form of the polynomial regression is significant (p<.01); the increase in r² due to the third degree polynomial is also significant (p<.01). The F statistic for both the linear and loglinear methods is significant (p<.05).

The average errors for the extrapolated Y values (Sext(y.x)) according to method employed are (a) linear, 2.9; (b) quadratic, 2.5; (c) cubic, 4.1; and (d) loglinear, 2.7. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 9. These data are graphically represented in Figure 5.

Table 8
Indicator 4: Summary Statistics for Prediction Equations by Method

Method          r²                r² change   F           df      Sy.x
Linear          .42085                          7.27*     1,10    2.07
Quadratic^a     .42592                          3.34      2,9     2.17
                                  .00507         .08      1,9
Cubic^a         .90338                         24.93**    3,8     .94
                                  .47746       39.53**    1,8
Loglinear^b
  log Y         .42298                          7.33*     1,10    log .07715
  antilog Y     .36243/.42922^c                                   2.06

Note. Indicator 4 is number of marriages in Florida expressed as rate per 1,000 population in Florida.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
^c Two methods of computing r² using antilogs of Y yielded different results.
*p<.05 **p<.01

Table 9
Indicator 4: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Loglinear
Original regression^b
1930(1)     11.6           13.6^c    14.0          12.0      13.3
1940(11)    17.1           12.3      12.2          15.7      11.9
1950(21)     9.8           11.0      10.7          11.7      10.6
1960(31)     7.9            9.7       9.5           7.6       9.4
1963(34)     7.7            9.3       9.2           7.5       9.1
1964(35)     7.7            9.1       9.1           7.6       9.0
1965(36)     8.3            9.0       9.0           7.9       8.9
1966(37)     8.5            8.9       8.9           8.2       8.8
1967(38)     9.0            8.7       8.8           8.7       8.7
1968(39)     9.6            8.6       8.7           9.3       8.6
1969(40)     9.8            8.5       8.6          10.0       8.5
1970(41)    10.1            8.3       8.6          10.9       8.4
Extrapolation^d
1971(42)    10.5            8.2       8.5          11.6       8.3
1972(43)    11.0            8.1       8.4          12.8       8.2
1973(44)    11.4            8.0       8.4          14.1       8.1
1974(45)    11.0            7.8       8.3          15.6       8.0
1975(46)    10.1            7.7       8.2          17.3       8.0

Note. Indicator 4 is number of marriages in Florida expressed as rate per 1,000 population.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 5. Indicator 4: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

Thus, it would appear that for the original regression the cubic form of the polynomial best fits the observed Y's. This method accounts for 90% of the variance in Y with less than half of the average error of the other methods. When the lines are extrapolated beyond the original values, however, the cubic form has the largest average error (Sext(y.x) = 4.1). The quadratic form of the polynomial is, in fact, the best predictor (Sext(y.x) = 2.5) of the methods compared.
Actually, for this set of data the mean (9.8) of the observed values of Y used in the original regression would have been the best predictor of the future values of Y.

Indicator 5

The mean and standard deviation for the Y values used to generate the regression equations are 4.6 and 1.0, respectively. The following regression equations were used to derive Y:

Linear       Y = 4.10332 + .01692 X
Quadratic    Y = 3.16091 + .17126 X - .00350 X²
Cubic        Y = 1.59416 + .82162 X - .04249 X² + .00060 X³
Loglinear    log Y = .56596 + .00288 X

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 10. An ANOVA summary table is presented in Table 22 in the Appendix. Only the overall F for the cubic form of the polynomial regression is significant (p<.05); the F value for the increase in r² for the cubic component is also significant (p<.01).

Table 10
Indicator 5: Summary Statistics for Prediction Equations by Method

Method          r²        r² change   F           df      Sy.x
Linear          .04360                   .46      1,10    1.06
Quadratic^a     .22401                  1.30      2,9     1.00
                          .18041        2.09      1,9
Cubic^a         .92134                 31.23**    3,8     .34
                          .69733       70.92**    1,8
Loglinear^b
  log Y         .12007                  1.36      1,10    log .10388
  antilog Y     .12779                                    1.07

Note. Indicator 5 is number of dissolutions of marriage in Florida expressed as rate per 1,000 population.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
**p<.01

The average errors for the extrapolated Y values (Sext(y.x)) according to the method employed are (a) linear, 2.2; (b) quadratic, 3.1; (c) cubic, .5; and (d) loglinear, 2.1. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 11. These data are graphically represented in Figure 6.
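The average extrapolation errors quoted for Indicator 5 can be approximated from the rounded entries of Table 11. The sketch below assumes that Sext(y.x) is the square root of the mean squared difference between observed and predicted Y's over the extrapolated years; that definition is not restated in this section and is an assumption here, as is the function name `average_extrapolation_error`.

```python
from math import sqrt

# Extrapolated years 1971-1975 for Indicator 5, copied from Table 11.
OBSERVED = [6.1, 6.9, 7.1, 7.2, 7.5]
PREDICTED = {
    "linear": [4.8, 4.8, 4.8, 4.9, 4.9],
    "quadratic": [4.2, 4.1, 3.9, 3.8, 3.6],
    "cubic": [5.6, 6.1, 6.6, 7.2, 7.9],
    "loglinear": [4.9, 4.9, 4.9, 5.0, 5.0],
}


def average_extrapolation_error(observed, predicted):
    """Root mean squared residual over the extrapolated points.

    Assumed definition of Sext(y.x); the divisor is the number of points.
    """
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


errors = {
    method: round(average_extrapolation_error(OBSERVED, values), 1)
    for method, values in PREDICTED.items()
}
print(errors)
```

Under this assumed definition the rounded results are 2.2, 3.1, .5, and 2.1 for the linear, quadratic, cubic, and loglinear methods, the same as the values reported in the text.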
For this set of data, the cubic form of the polynomial regression is a superior predictor of the observed Y's. This method accounts for 92% of the variance in Y with a Sy.x of .34; the Sext(y.x) is .5, considerably less than the other three methods with average error ranging from 2.1 to 3.1. The quadratic form is definitely the least appropriate method for this set of data since the curve bends in an opposite direction to the observed Y values (see Figure 6).

Table 11
Indicator 5: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Loglinear
Original regression^b
1930(1)     2.5            4.1^c     3.3           2.4       3.7
1940(11)    5.8            4.3       4.6           6.3       4.0
1950(21)    6.4            4.5       5.2           5.7       4.2
1960(31)    3.9            4.6       5.1           4.2       4.5
1963(34)    4.1            4.7       4.9           4.1       4.6
1964(35)    4.1            4.7       4.9           4.2       4.6
1965(36)    4.2            4.7       4.8           4.3       4.7
1966(37)    4.2            4.7       4.7           4.4       4.7
1967(38)    4.6            4.7       4.6           4.6       4.7
1968(39)    4.9            4.8       4.5           4.8       4.8
1969(40)    5.2            4.8       4.4           5.1       4.8
1970(41)    5.5            4.8       4.2           5.4       4.8
Extrapolation^d
1971(42)    6.1            4.8       4.2           5.6       4.9
1972(43)    6.9            4.8       4.1           6.1       4.9
1973(44)    7.1            4.8       3.9           6.6       4.9
1974(45)    7.2            4.9       3.8           7.2       5.0
1975(46)    7.5            4.9       3.6           7.9       5.0

Note. Indicator 5 is number of dissolutions of marriage in Florida expressed as rate per 1,000 population.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 6. Indicator 5: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

Indicator 6

The mean for the Y values used to generate the regression equations is 16.9; the standard deviation, 1.3. The following regression equations were used to derive Y:

Linear       Y = 18.56785 - .36786 X
Quadratic    Y = 21.30892 - 2.01250 X + .18274 X²
Cubic        Y = 22.82142 - 3.58611 X + .59524 X² - .03056 X³
Loglinear    log Y = 1.26761 - .00900 X

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 12. An ANOVA summary table is presented in Table 23 in the Appendix. The overall F's for the quadratic and cubic forms of polynomial regression are significant (p<.01); only the increase in r² due to the quadratic component is significant (p<.01), however.

The average errors for the extrapolated Y values (Sext(y.x)) according to the method employed are (a) linear, 1.2; (b) quadratic, 7.8; (c) cubic, 1.5; and (d) loglinear, 1.4. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 13. These data are graphically represented in Figure 7.

Thus, while the quadratic and cubic forms of polynomial regression best fit the observed Y's for the original regression, they do not continue to be superior predictors. In fact, the quadratic form has a Sext(y.x) of 7.8 while the Sext(y.x) for the other three methods ranges from 1.2 to 1.5. No method is clearly the best predictor of Y when values are extrapolated beyond the original regression.

It should be noted that the Durbin-Watson d for the linear and loglinear methods approaches the lower limits of d, and the possibility of serial correlation of the residuals cannot be overlooked. Because of the small number of observations involved in this data set (N = 8), interpretation of the Durbin-Watson d is more suggestive than conclusive.

Table 12
Indicator 6: Summary Statistics for Prediction Equations by Method

Method          r²        r² change   F           df      Sy.x
Linear          .46628                  5.24      1,6     1.04
Quadratic^a     .92655                 31.54**    2,5     .42
                          .46026       31.33**    1,5
Cubic^a         .97205                 46.37**    3,4     .29
                          .04550        6.51      1,4
Loglinear^b
  log Y         .45957                  5.10      1,6     log .02582
  antilog Y     .41427                                    1.03

Note. Indicator 6 is number of resident live births in Florida expressed as rate per 1,000 population.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
**p<.01

Table 13
Indicator 6: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Loglinear
Original regression^b
1964(1)     19.7           18.2^c    19.5          19.8      18.1
1965(2)     17.9           17.8      18.0          17.8      17.8
1966(3)     16.8           17.5      16.9          16.6      17.4
1967(4)     15.9           17.1      16.2          16.0      17.0
1968(5)     15.7           16.7      15.8          16.0      16.7
1969(6)     16.1           16.4      15.8          16.1      16.4
1970(7)     16.8           16.0      16.1          16.4      16.0
1971(8)     16.4           15.6      16.9          16.6      15.7
Extrapolation^d
1972(9)     14.8           15.3      18.0          16.5      15.4
1973(10)    13.7           14.9      19.5          15.9      15.1
1974(11)    13.4           14.5      22.1          14.7      14.7
1975(12)    12.5           14.1      23.5          12.7      14.4

Note. Indicator 6 is number of resident live births in Florida expressed as rate per 1,000 population.
^a Both quadratic and cubic forms of polynomial regression are presented.
^b The regression line is derived from 2/3 of the known data points.
^c Predicted Y's are rounded to number of places in original data.
^d Values are extrapolated beyond the data points used to generate the regression equation.

Figure 7. Indicator 6: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).

Indicator 7

The mean and standard deviation for the Y values used to generate the regression equations are 32.2 and 4.8, respectively. The following regression equations were used to derive Y:

Linear       Y = 23.42857 + 1.95476 X
Quadratic    Y = 23.58928 + 1.85833 X + .01071 X²
Cubic        Y = 23.26430 + 2.19645 X - .07792 X² + .00657 X³
Loglinear    log Y = 1.38414 + .02662 X
The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and Sy.x in Table 14. An ANOVA summary table is presented in Table 24 in the Appendix. The overall F's for all methods are significant (p<.01); however, the increases in r² due to the higher order polynomials are not significant.

Table 14
Indicator 7: Summary Statistics for Prediction Equations by Method

Method          r²        r² change   F            df      Sy.x
Linear          .99560                1358.03**    1,6     .34
Quadratic^a     .99572                 581.74**    2,5     .37
                          .00012          .14      1,5
Cubic^a         .99588                 322.27**    3,4     .41
                          .00016          .15      1,4
Loglinear^b
  log Y         .99333                 893.91**    1,6     log .00577
  antilog Y     .99999                                     .43

Note. Indicator 7 is number of 3 to 5 year olds enrolled in nursery school and kindergarten expressed as percentage of total children 3 to 5 years old in the United States.
^a Both quadratic and cubic forms of the polynomial regression are presented.
^b Both r² and Sy.x have been recomputed using antilogs of the log Y; much of the difference between r²(log Y) and r²(antilog Y) may be due to rounding.
**p<.01

The average errors for the extrapolated Y values (Sext(y.x)) according to the method employed are (a) linear, 1.4; (b) quadratic, 1.3; (c) cubic, 1.8; and (d) loglinear, 2.4. Observed and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 15. These data are graphically represented in Figure 8.

There is very little difference in the predictive value of the methods for the original regression. Each method accounts for 99% of the variance in Y; the range of the Sy.x for all methods is from .34 to .43. When the lines are extrapolated beyond the original