The projection of social trends using time series indicators : methodology and application in educational planning

Material Information

Title:
The projection of social trends using time series indicators : methodology and application in educational planning
Physical Description:
xiii, 124 leaves : ill. ; 28 cm.
Language:
English
Creator:
Nelson, Jane Counihan, 1941-
Publication Date:
1977

Subjects

Subjects / Keywords:
Educational planning -- Mathematical models   ( lcsh )
Time-series analysis   ( lcsh )
Social indicators   ( lcsh )
Educational Administration and Supervision thesis Ph. D   ( lcsh )
Dissertations, Academic -- Educational Administration and Supervision -- UF   ( lcsh )
Genre:
bibliography   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis--University of Florida.
Bibliography:
Bibliography: leaves 117-123.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Jane Counihan Nelson.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 022249428
oclc - 04290057
System ID:
AA00022214:00001

Table of Contents
    Title Page
    Acknowledgement
    Table of Contents
    List of Tables
    List of Figures
    Abstract
    Chapter 1. Introduction
    Chapter 2. Rationale for selection of variables/time series indicators
    Chapter 3. Rationale for extrapolative methods selected for comparison
    Chapter 4. Comparison of extrapolative methods using selected time series indicators
    Chapter 5. Discussion
    Chapter 6. Summary, conclusions, and implications of study
    Appendix
    Reference notes
    References
    Biographical sketch
Full Text











THE PROJECTION OF SOCIAL TRENDS USING TIME
SERIES INDICATORS: METHODOLOGY AND
APPLICATION IN EDUCATIONAL PLANNING








By

JANE COUNIHAN NELSON


A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY









UNIVERSITY OF FLORIDA


1977






























Copyright 1977

by

Jane Counihan Nelson













ACKNOWLEDGMENTS


I wish to express my sincere appreciation to my Supervisory Committee: to Dr. Michael Y. Nunnery, chairman, for his candor and understanding whether acting as counselor or critic; to Dr. Phillip A. Clark and Dr. Gordon D. Lawrence, for providing valuable suggestions and support.

A special thank you goes to Dr. Arthur J. Lewis, director of the DOE social forecasting project, for his leadership, receptivity to new ideas, and especially for his confidence in me; and to my colleagues on the project, Dr. Robert S. Soar and Ms. Linda Troup, for their substantial contribution to the conceptual framework presented in this study.

For carefully reviewing the statistical portions of this manuscript, I am indebted to my friend, Dr. Azza S. Guertin.

I am especially grateful to my husband Edward for his love and encouragement and for sharing with me his enthusiasm for scientific inquiry.
















TABLE OF CONTENTS

                                                                  Page

ACKNOWLEDGMENTS ..... iii
LIST OF TABLES ..... vii
LIST OF FIGURES ..... ix
ABSTRACT ..... x

CHAPTER I  INTRODUCTION ..... 1
    Background and Significance of the Study ..... 1
    The Social Context of Education ..... 1
    The Futures Perspective in Educational Planning ..... 4
    Forecasting Trends in Social Variables ..... 5
    The Need for Research ..... 7
    The Problem ..... 9
    Delimitations and Limitations ..... 9
    Definition of Terms ..... 11
    Procedures ..... 13
    The Selection and Operational Definition of Variables ..... 13
    Collection of Time Series Indicator Data ..... 14
    Comparison of Extrapolative Methods Using Time Series Indicators ..... 15
    Development of Implications for Educational Planning ..... 18

CHAPTER II  RATIONALE FOR SELECTION OF VARIABLES/TIME SERIES INDICATORS ..... 19
    The Social Indicator Movement ..... 20
    Historical Development ..... 20
    Definition and Use of Social Indicators ..... 24
    Data Base for Social Indicators ..... 26
    Educational Implications ..... 27
    Selection of Variables/Time Series Indicators ..... 28
    The Variables ..... 28
    Bronfenbrenner's Ecology of Education Model ..... 29
    Operational Definition of Variables as Time Series Indicators ..... 31

CHAPTER III  RATIONALE FOR EXTRAPOLATIVE METHODS SELECTED FOR COMPARISON ..... 38
    Overview of Extrapolative Forecasting Methods ..... 38
    Economic and Business Forecasting ..... 39
    Technological Forecasting ..... 41
    Educational Forecasting ..... 42
    Extrapolative Methods in Other Areas ..... 43
    Applicability of Reviewed Extrapolative Methods for Study ..... 44
    The Pattern of the Data ..... 44
    The Class of Model ..... 44
    Description of Methods to be Compared ..... 46
    The General Linear Model ..... 46
    The Assumptions of the Linear Model ..... 48
    Criteria for Comparison of Methods ..... 49
    Method 1: Simple Linear Regression ..... 52
    Method 2: Log-linear Regression ..... 52
    Method 3: Polynomial Regression ..... 54

CHAPTER IV  COMPARISON OF EXTRAPOLATIVE METHODS USING SELECTED TIME SERIES INDICATORS ..... 57
    Presentation of Results ..... 60
    Indicator 1 ..... 60
    Indicator 2 ..... 62
    Indicator 3 ..... 69
    Indicator 4 ..... 73
    Indicator 5 ..... 77
    Indicator 6 ..... 79
    Indicator 7 ..... 86
    Indicator 8 ..... 90

CHAPTER V  DISCUSSION ..... 95
    The Variables ..... 95
    Selection ..... 95
    Bronfenbrenner's Ecology of Education Model ..... 96
    Operational Definition of Variables ..... 97
    The Extrapolative Methods ..... 98
    Statistical Considerations ..... 98
    Practical Considerations ..... 101

CHAPTER VI  SUMMARY, CONCLUSIONS, AND IMPLICATIONS OF STUDY ..... 102
    Summary ..... 102
    The Variables ..... 103
    The Methods ..... 103
    Results ..... 104
    Conclusions ..... 104
    Suggestions for Future Research ..... 105
    Implications for Planners and Policy Makers ..... 106

APPENDIX ..... 109
REFERENCE NOTES ..... 117
REFERENCES ..... 118
BIOGRAPHICAL SKETCH ..... 124














LIST OF TABLES

                                                                  Page

Table 1   Time Series Indicators of Social Variables Affecting Outcomes of Education ..... 33
Table 2   Indicator 1: Summary Statistics for Prediction Equations by Method ..... 61
Table 3   Indicator 1: Observed Y's and Predicted Y's by Method ..... 63
Table 4   Indicator 2: Summary Statistics for Prediction Equations by Method ..... 66
Table 5   Indicator 2: Observed Y's and Predicted Y's by Method ..... 67
Table 6   Indicator 3: Summary Statistics for Prediction Equations by Method ..... 70
Table 7   Indicator 3: Observed Y's and Predicted Y's by Method ..... 71
Table 8   Indicator 4: Summary Statistics for Prediction Equations by Method ..... 74
Table 9   Indicator 4: Observed Y's and Predicted Y's by Method ..... 75
Table 10  Indicator 5: Summary Statistics for Prediction Equations by Method ..... 78
Table 11  Indicator 5: Observed Y's and Predicted Y's by Method ..... 80
Table 12  Indicator 6: Summary Statistics for Prediction Equations by Method ..... 83
Table 13  Indicator 6: Observed Y's and Predicted Y's by Method ..... 84
Table 14  Indicator 7: Summary Statistics for Prediction Equations by Method ..... 87
Table 15  Indicator 7: Observed Y's and Predicted Y's by Method ..... 88
Table 16  Indicator 8: Summary Statistics for Prediction Equations by Method ..... 91
Table 17  Indicator 8: Observed Y's and Predicted Y's by Method ..... 92
Table 18  Indicator 1: ANOVA Summary Tables by Method ..... 109
Table 19  Indicator 2: ANOVA Summary Tables by Method ..... 110
Table 20  Indicator 3: ANOVA Summary Tables by Method
Table 21  Indicator 4: ANOVA Summary Tables by Method ..... 112
Table 22  Indicator 5: ANOVA Summary Tables by Method ..... 113
Table 23  Indicator 6: ANOVA Summary Tables by Method ..... 114
Table 24  Indicator 7: ANOVA Summary Tables by Method ..... 115
Table 25  Indicator 8: ANOVA Summary Tables by Method ..... 116












LIST OF FIGURES

                                                                  Page

Figure 1  Bronfenbrenner's ecological structure of the educational environment. (Based upon Bronfenbrenner's [1976] description, pp. 5-6)
Figure 2  Indicator 1: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 64
Figure 3  Indicator 2: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 68
Figure 4  Indicator 3: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 72
Figure 5  Indicator 4: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 76
Figure 6  Indicator 5: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 81
Figure 7  Indicator 6: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 85
Figure 8  Indicator 7: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 89
Figure 9  Indicator 8: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation) ..... 93








Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



THE PROJECTION OF SOCIAL TRENDS USING TIME
SERIES INDICATORS: METHODOLOGY AND
APPLICATION IN EDUCATIONAL PLANNING

By

Jane Counihan Nelson

December 1977

Chairman: Michael Y. Nunnery
Major Department: Educational Administration

Educational planners and policy makers need adequate information about the societal context of education to make appropriate decisions about the future role and function of education. Some of this information may be provided through the use of conceptually sound social and educational variables operationally defined as time series indicators coupled with an empirically sound basis for forecasting future trends in such indicators. As evidence of the need for developing such a social forecasting framework for education, states including Florida have provided grants for that purpose. This study was one aspect of such a grant.

The problem in this study was (a) to select, using Bronfenbrenner's ecology of education model, and operationally define at least 10 variables that research has shown to be related to the outcomes of education; (b) to use these variables operationally defined as time series indicators in the comparison of three purely extrapolative forecasting methods; and (c) to derive implications for the use of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative methods for educational planning.

The study was conducted in the following phases:

1. Using Bronfenbrenner's ecology of education model, 10 variables that research has shown to be related to the outcomes of education were selected and, where possible, were operationally defined as state and/or national time series indicators. Data were collected for these indicators; eight which met the criteria established in this study were used in the comparison of extrapolative techniques.

2. Three purely extrapolative techniques derived from the general linear model were compared according to statistical criteria and practical considerations derived from the literature in statistics, economics, time series analysis, and forecasting methodology. The methods were (a) linear regression, (b) curvilinear regression (quadratic and cubic forms), and (c) log-linear regression (dependent variable undergoes logarithmic transformation). Each method was applied to each time series indicator. Time in years was used as the independent variable; the annual measure of the indicator was treated as the dependent variable. Each data set was divided into thirds; two-thirds of the data points were used to establish the prediction equation. This equation was used to predict the remaining third of the data points. Predicted values were compared with actual values.








3. Implications for the use in educational planning of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative techniques were discussed.
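
The comparison procedure described in phase 2 can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions and not the computation used in the study: the function names, the synthetic series, and the use of mean absolute error to compare predicted with actual values are all hypothetical choices made for the example.

```python
import numpy as np

def split_two_thirds(t, y):
    """Split chronologically: first two-thirds to fit, last third to check."""
    n_fit = (2 * len(t)) // 3
    return (t[:n_fit], y[:n_fit]), (t[n_fit:], y[n_fit:])

def fit_polynomial(t, y, degree):
    """Least-squares polynomial in time; degree 1 is simple linear regression."""
    coeffs = np.polyfit(t, y, degree)
    return lambda t_new: np.polyval(coeffs, t_new)

def fit_log_linear(t, y):
    """Regress ln(y) on time, then transform predictions back to the original scale."""
    coeffs = np.polyfit(t, np.log(y), 1)
    return lambda t_new: np.exp(np.polyval(coeffs, t_new))

def compare_methods(years, values):
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    # Measure time from the first year so the polynomial fits stay well conditioned.
    t = years - years[0]
    (t_fit, y_fit), (t_new, y_new) = split_two_thirds(t, values)
    models = {
        "linear": fit_polynomial(t_fit, y_fit, 1),
        "quadratic": fit_polynomial(t_fit, y_fit, 2),
        "cubic": fit_polynomial(t_fit, y_fit, 3),
        "log-linear": fit_log_linear(t_fit, y_fit),
    }
    # Assumed comparison measure: mean absolute error on the held-out third.
    return {name: float(np.mean(np.abs(predict(t_new) - y_new)))
            for name, predict in models.items()}

# Hypothetical indicator that grows 5 percent a year (not data from the study).
years = np.arange(1960, 1975)
values = 100.0 * 1.05 ** (years - 1960)
errors = compare_methods(years, values)
best = min(errors, key=errors.get)
print(best, errors)
```

Because the synthetic series is exactly exponential, the log-linear form recovers it and has the smallest held-out error here; with real indicators the ranking would be expected to differ from series to series.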

Results of the method comparison were (a) no method was a superior predictor for all indicators; (b) each method was a superior predictor for at least one indicator; and (c) the summary statistics for the original regression were not consistently related to the accuracy of the extrapolated values.

The following conclusions appear to be warranted by the results of this study:

1. The Bronfenbrenner model is a useful framework for considering the numerous factors impinging upon the learner.

2. Time series indicators provide a means to compare trends in an indicator over time or to compare different groups in relation to a specific indicator.

3. The general linear model is appropriate for the analysis and extrapolation of the selected time series indicators used in this study.

4. Each method is appropriate for use with some indicators but not with others. Measures of "best fit" such as r² and the standard error of estimate are not reliable criteria for the selection of an extrapolative method. A combination of strategies such as graphic representation of original and predicted data, analysis of residuals, and knowledge of the social phenomena being studied may provide guidance as to the most appropriate method for a particular indicator.
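
The distinction drawn in conclusion 4, between fit statistics for the original regression and accuracy of the extrapolated values, can be illustrated with a short Python sketch. The series below is contrived for the purpose (a trend that rises for ten years and then levels off) and is not data from the study; the function name and the use of mean absolute error on the held-out points are likewise assumptions of the example.

```python
import numpy as np

def fit_statistics(t, y, degree=1):
    """Return r^2, the standard error of estimate, and the fitted coefficients."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(t, y, degree)
    residuals = y - np.polyval(coeffs, t)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Degrees of freedom: n observations minus (degree + 1) estimated coefficients.
    dof = len(t) - (degree + 1)
    see = (ss_res / dof) ** 0.5
    return r2, see, coeffs

# Hypothetical series: rises 3 units a year for ten years, then stays at 77.
t = np.arange(15, dtype=float)
y = np.where(t < 10, 50.0 + 3.0 * t, 77.0)
t_fit, y_fit = t[:10], y[:10]   # two-thirds used to establish the equation
t_new, y_new = t[10:], y[10:]   # remaining third used to check the extrapolation

r2, see, coeffs = fit_statistics(t_fit, y_fit, degree=1)
holdout_error = float(np.mean(np.abs(np.polyval(coeffs, t_new) - y_new)))
print(f"r^2 = {r2:.3f}, SEE = {see:.3f}, holdout MAE = {holdout_error:.1f}")
```

In this contrived case the straight line reproduces the first ten points almost exactly, yet its extrapolated values miss the held-out points by 9 units on average, which is why the fit statistics alone would be a poor guide to the choice of method.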
















CHAPTER I

INTRODUCTION


Background and Significance of the Study


The Social Context of Education

Educators have become increasingly cognizant of the myriad forces in society impinging upon various facets of the educational process. The influence of a number of these forces upon educational purposes, outcomes, and resources has been analyzed from several social science perspectives (Boocock, 1976; Henry, 1961; Gordon, 1974). Keppel (in Thomas & Larson, 1976) acknowledged one of the reasons for this continuing interest in societal trends by educational planners and policy-makers:

the impetus for change in educational
institutions, from the preschool through
the university, is more likely to derive
from changes in the wider society than from
forces within the institutions. (Foreword)

Additionally, Keppel noted that "educational policy must be formed in concert with other aspects of public policy and program development" (Foreword).

Bronfenbrenner (1976) proposed an ecological structure of the educational environment which must be taken into account if "any progress in the scientific study of educational systems and processes" (p. 5) is to be made. Bronfenbrenner stated:








Whether and how people learn is a function
of sets of forces, or systems, at two levels:
a. The first comprises the relations between
characteristics of learners and their sur-
roundings in which they live out their
lives (e.g., home, school, peer group,
work place, neighborhood, community).
b. The second encompasses the relations
and interconnections that exist between
these environments. (p. 5)

Building on Lewin's theory of topological territories and employing a terminology adapted from Brim (1975), Bronfenbrenner further elaborated that the construct environment can be "conceived topologically as a nested arrangement of structures, each contained within the next" (p. 5).

1) A micro-system is an immediate setting
containing the learner ..
2) The meso-system comprises the inter-
relationships among the major settings
containing the learner at a particular
point in his or her life a system
of micro-systems.
3) The exo-system is an extension of the
meso-system embracing the concrete social
structures, both formal and informal, that
impinge upon or encompass the immediate
settings containing the learner and, there-
by, influence and even determine or delimit
what goes on there. These structures in-
clude the major institutions of society,
both deliberately structured and spontane-
ously evolving, as they operate at the
local community level ...
4) Macro-systems are the overarching insti-
tutions of the culture or subculture,
such as the economic, social, educational,
legal and political systems, of which
local micro-, meso-, and exo-systems are
the concrete manifestations. (pp. 5-6)

(See Figure 1 for a representation of
these ideas.)























































Figure 1. Bronfenbrenner's ecological structure of the educational environment. (Based upon Bronfenbrenner's [1976] description, pp. 5-6.)








The Futures* Perspective in Educational Planning

While Bronfenbrenner proposed his ecological structure primarily as a framework for learning research efforts, that is, for examining relationships among variables associated with learning, others (Harman, 1976; Webster, 1976) have discussed the societal context of education as a framework for future-oriented educational planning. Indeed, this emphasis on future awareness has evolved into a significant movement within education referred to as educational futurism (Hencley & Yates, 1974; Pulliam & Bowman, 1974), educational futures (Marien & Ziegler, 1972), or alternative futures perspective (Webster, 1976). The primary purpose of futures research or future studies is "to help policy makers choose wisely--in terms of their purposes and values--among alternative courses of action that are open to leadership at a given time" (Shane, 1973, p. 1). According to Webster (1976), this requires that

we attend to alternatives--to alternative
assumptions, ends and means. It requires
us to examine alternative plausible futures
that might be rendered more or less possible
by our planning and action; to identify un-
intended as well as intended consequences
for others of achieving the goals that seem
desirable to us; to analyze alternative stra-
tegies and tactics for achieving any desired
future; and to anticipate the variety of
potential consequences of our strategies,
tactics, and short-run planning. Perhaps,
most fundamentally it asks of us that we
look hard at our basic premises about the
nature of man and the world and consider
implications and alternatives for the
future. (p. 2)
*"Futures" refers to the number of different possible
views of what is ahead in subsequent time periods for society
and, thus, for education.







Webster also noted:

the futures perspective implies that we not
just attend to alternatives in and for educa-
tion, but also consider the societal context
in more comprehensive fashion than is usual
in educational planning. (p. 2)

In order to assist decision makers in the selection of alternatives which have positive future consequences for society, educational planners at both national and state levels must take into account those societal forces which affect not only the outcomes of education, but also the purposes of education, and the human and material resources available to the educational process. To do this, however, the planner must delineate the societal factors or variables to be included in the planning process and develop a sound rationale based on research and theory for such inclusion. Then trends--past, present, and future--in these variables may be examined in order to derive implications for educational planning and policy.


Forecasting Trends in Social Variables

Available to the educational planner in this undertaking are a number of predictive and heuristic devices to explore alternative futures which have been developed by government, industry, non-profit organizations, and futures consulting groups. These forecasting techniques can be categorized into exploratory forecasting methods and normative forecasting methods:

Exploratory forecasting methods start from
the present situation and its preceding
history, and attempt to project future
developments. Normative forecasts, on the
contrary, start with some desired or pos-
tulated future situation, and work back-
wards to derive feasible routes for the
transition from the present to the desired
future. (Martino, 1976, p. 4)

Exploratory forecasting methods, all of which are based upon extrapolation of some kind, include (a) purely extrapolative methods, (b) explanatory methods, and (c) auxiliary methods. Since forecasting of social phenomena is still in a highly intuitive developmental phase, there is a growing interest in examining those exploratory methods considered to be purely extrapolative, which are based upon time series data representing social and educational variables. These time series data, often called time series indicators, are defined measurements made at specified intervals over a period of time. By extrapolating identified patterns in the time series data into the future, planners may compare present, past, and future states of that indicator. Thus, a projection of future societal trends can provide the impetus to examine present policy and to analyze the consequences of contemplated changes. This approach need not be only "preventive" forecasting, in the sense used by Ziegler (1972) of preventing undesirable forecasts. It may also be extended to examine all consequences of action or intervention, intended or not. Purely extrapolative methods, when combined with auxiliary methods such as trend-impact analysis, cross-impact matrices, or scenarios, can provide a vehicle for exploring the relationships among identified future patterns in society.








While the use of purely extrapolative methods with time series data is fairly well defined in technological and economic areas, their application to social forecasting has not been the focus of significant definitive study. Indeed, Harrison (1976) emphasized the need for such research, specifically the consideration of "each method in terms of some aspect of the social process it would likely be applied to" (p. 13). For, as Harrison explained, while some problems in regression and time series analysis which remain unresolved are currently the concern of statisticians and mathematicians, "it appears that resolution might best lie in terms of investigation in concrete application cases" (p. 14).

In social forecasting there is a great need
in almost all the known extrapolative methods
for an explicit statement of the algorithmic,
theoretical, and empirical weaknesses or
sensitivities of such procedures. Such a
discussion, as noted, would be more mean-
ingful if carried on in the context of an
analysis of some specific aspect or aspects
of social process. (Harrison, 1976, p. 17)

Only through empirical study of the performance of various extrapolative methods applied to particular social phenomena will a basis for selection of appropriate and accurate techniques be formulated.


The Need for Research

Since there are no widely accepted planning models incorporating quantitative data on social variables, the educational planner who wants to utilize such information is confronted with a number of questions related to (a) the identification of social variables to be included, (b) the operational definition of social variables in terms of time series indicators, (c) the selection of a purely extrapolative technique which will yield the most accurate forecast for a specific indicator, and (d) the utilization of these forecasts in the planning process. Answers require futures research which is derived from a conceptually sound framework and is pursued with methodological vigor. As evidence of the importance of such investigation to the educational planner, the State of Florida through the Office of Strategy Planning in the Department of Education funded in 1976 a social forecasting project (STAR Project No. R5-175) at the University of Florida for the second year. The study described herein was part of that effort to forecast social trends affecting education in Florida.

To summarize: Educational planners and policy makers need adequate information to make appropriate decisions about the role and function of education in creating improved quality of life for citizens of the future. The State of Florida, in funding STAR Project No. R5-175 of which this study is a part, acknowledged that need. Part of this information may be provided through the use of conceptually sound social and educational variables operationally defined as time series indicators coupled with an empirically sound basis for forecasting future states of such indicators.








The Problem

The problem in this study was (a) to select, using Bronfenbrenner's ecology of education model, and operationally define at least 10 variables that research has shown to be related to the outcomes of education; (b) to use these variables operationally defined as time series indicators in the comparison of three purely extrapolative forecasting methods; and (c) to derive implications for the use of an ecological model such as Bronfenbrenner's, time series indicators, and selected extrapolative methods for educational planning.


Delimitations and Limitations


The Bronfenbrenner ecology of education model was used primarily as a framework for the selection of social and educational variables and was not evaluated itself in this study. Ten variables (e.g., socio-economic status of family, peer group characteristics) were selected to be operationally defined, where possible, in terms of national and/or state level time series indicators (e.g., median family income, juvenile crime rates). Of these identified indicators, eight which met the following criteria were used in the comparison of extrapolative techniques: (a) the indicator was readily available, (b) the data were available for a 10-year or greater time span, and (c) the indicator was a reasonably reliable and valid measure of one aspect of the social or educational variable that it represented. It should be noted that the selection of the eight indicators used in this study was in many cases influenced more by data availability than by the logic or appropriateness of the indicator to represent a specific social variable. Thus, the eight indicators are examples of the type of data that might be employed to operationally define the variables; utilization in a specific planning situation would require evaluation of the appropriateness of the indicators presented in this study and the addition and/or substitution of other indicators.

In this study only the variables related to the outcomes of education were used. As previously noted, this study was part of a larger social forecasting and educational planning effort which also included the status of education 1976-77, social trends affecting the purposes of education, and social trends affecting the resources for education.

While the literature in mathematics, statistics, and economics was reviewed and considered in preparation for the selection and use of the three extrapolative techniques (linear, log-linear, and curvilinear regression), there was no attempt to present the comparison of these techniques in the detail desired by these disciplines. Rather, the comparison was made in such a way as to be most relevant to the planner in education.

There was no attempt to write or adapt computer programs for various techniques. Instead, an effort was made to identify and utilize computer programs and statistical packages which had already been adapted for use at the North East Regional Data Center's computer facilities.








Additionally, the projection of specific trends per se

was not of interest in this study. Rather, the focus of this

study was the development of the conceptual framework and

methodology for such projection. Also, there has not been

any attempt to forecast educational outcomes from the

operationally defined social and educational variables. The

present work may be considered an initial step in determining

the feasibility of developing such a mathematical forecasting

model.


Definition of Terms


Extrapolative forecasting.

The procedure consists of identifying an
underlying historical trend or cycle in
social processes that can be extrapo-
lated by means as varied as multiple
regression analysis, time series analy-
sis, envelope curve fitting, three-mode
factor analysis, correlational analysis,
averages, or any other method that takes
current and historical data as the prin-
cipal basis for estimating future states
in a given variable. (Harrison, 1976,
p. 3)

Indicator, educational.

Educational indicators are statistics
that enable interested publics to know
the status of education at a particular
moment in time with respect to some
selected variables, to make comparisons
in that status over time and to project
future status. Indicators are time-series
statistics that permit a study of trends
and change in education. (Gooler, 1976,
p. 11)

Indicator, social. "The operational definition or part

of the operational definition of any one of the concepts








central to the generation of an information system descrip-

tive of the social system" (Carlisle, 1972, p. 25); "time-

series that allow comparisons over an extended period which

permit one to grasp long-term trends as well as unusually

sharp fluctuations" (Sheldon & Freeman, 1970, p. 97); "a

statistic of direct normative interest which facilitates

concise, comprehensive and balanced judgments about the con-

dition of major aspects of a society" (U.S. Department of

Health, Education, & Welfare, 1970, p. 97).

Outcomes of education. Those measures of performance,

such as achievement test scores, or utilization, such as

employment rates, which appear to be the result of partici-

pation in the formal educational process.

Regression, linear. Most common type of regression in

which the objective is to locate the best-fitting straight

line through a scattergram based on interval-level variables

(Nie, Hull, Jenkins, Steinbrenner, & Bent, 1975, p. 278).

Regression, log-linear. As used in this study, a least

squares regression method in which a geometric straight line

is located through a scattergram plotted on semi-logarithmic

paper; also called exponential curve or trend curve.

Regression, polynomial or curvilinear. Regression

method for fitting a curve to a set of data using the cri-

terion of least squares distances (Nie et al., 1975, p. 278).

Time series. "A set of observations generated sequen-

tially in time" (Box & Jenkins, 1970, p. 23).








Procedures

The study proceeded in the following phases: (a) using

Bronfenbrenner's ecology of education model, 10 variables

that research has shown to be related to the outcomes of

education were selected and, where possible, were operational-

ly defined as time series indicators; (b) data were collected

for these time series indicators, eight of which were used

in the comparison of the selected extrapolative techniques;

(c) using the selected time series indicators, three purely

extrapolative techniques were compared according to statis-

tical criteria and practical considerations derived from

the literature; and (d) implications for the use in educa-

tional planning of an ecological model such as Bronfen-

brenner's, time series social and educational indicators,

and selected extrapolative techniques were derived.


The Selection and Operational Definition of Variables

The work by Collazo, Lewis, and Thomas (1977), completed

during the first year of STAR Project No. R5-175, on fore-

casting selected educational outcomes from social variables

was utilized. Since the variables selected by these inves-

tigators were derived from a review of the research litera-

ture and were acknowledged to be appropriate for the stated

social forecasting purposes by a panel of experts in various

disciplines, they appeared to fulfill the requirements of

this study. Additionally, each of the 10 variables selected








for use was described and classified according to Bronfen-

brenner's ecology of education model.

For each variable an attempt was made to identify one

or more types of time series indicators which might logically

represent the variable. For some variables several indica-

tors were identified, while for others, no indicator could

logically be identified or no time series data were avail-

able for the indicator at the time of the study. This phase

of the study is explained further in Chapter II.


Collection of Time Series Indicator Data

Sources of needed time series data at both the national

and state level were identified in several ways. The expanding

literature on social trends (e.g., U.S. Department of

Health, Education, & Welfare, 1970) and specifically the

literature on these social trends operationalized as social

indicators (e.g., Executive Office of the President, Office

of Management & Budget, 1973) was reviewed. Furthermore,

examination of initial efforts in using time series indica-

tors related to education by the Office of Technology Assess-

ment for the United States Congress (Coates, Note 1) and

several state departments of education (e.g., Oregon, Penn-

sylvania, & Florida) yielded additional sources. Published

sources of data such as U.S. Census Reports and Florida

Statistical Abstracts were consulted. When data did not

appear to be available in suitable form or for desired time

periods, inquiries and requests were directed to appropriate

sources. Any apparent limitations in the data such as known








measurement error due to sampling technique were noted.

After data collection was completed, eight indicators which

met the criteria outlined in a previous section were selected

for inclusion in the next phase of the study.

Comparison of Extrapolative Methods Using Time Series

Indicators

The following steps were involved in this phase of the

study: (a) initial identification and testing of methods

using data similar in form to selected indicators, (b) recon-

sideration and testing of additional available methods, (c)

selection of three methods to be used for comparative extra-

polations, (d) derivation of specific criteria and practical

considerations from the literature, (e) application of three

methods to each data set, (f) extrapolation of identified

trend into future using equation generated in (e), and (g)

comparison of actual versus predicted values of indicators.

From a preliminary review of the literature in statis-

tics, economics, time series analysis, and forecasting

methodology, the following four methods were tentatively

identified for comparison: (a) linear regression (computer

program by Nie et al., 1975), (b) curvilinear or polynomial

regression (computer program by Nie et al., 1975), (c) Box-

Jenkins time series analysis (computer program by Cooper,

Note 2), and (d) FIT curve-fitting with weighted data (com-

puter program by Stover, Note 3).

An initial analysis of the methods using trial sets of

data combined with a visual analysis of the general form of








the data to be used revealed that two of the methods under

consideration were inappropriate. The Box-Jenkins procedure,

while an extremely powerful tool for time series analysis of

data which are characterized by seasonal or cyclic variation

(usually resulting in autocorrelation of observations and

residuals), did not seem suitable for the social indicator

data collected. (Should subsequent tests reveal autocorrela-

tion and hence a violation of the assumptions of the linear

model, Box-Jenkins could then be appropriately employed.)

The FIT curve-fitting procedure utilizing a weighted data

principle was rejected because the computer program required

extensive modification to yield necessary comparative statis-

tics and reliable output. Theoretical justification for the

weighting formula and data transformations employed was

unavailable.

Thus, two of the four methods tentatively considered

were rejected. Since the comparison phase was to involve

three methods, the literature was again searched for other

appropriate methods. The most promising of these was a curve

fitting technique which utilizes an exponential function to

describe a constant growth rate. This method, called log-

linear regression in this study, can be described in terms

of the general linear model and solved by least squares pro-

cedures when the dependent variable undergoes logarithmic

transformation. Since social phenomena sometimes exhibit

what appears to be a constant growth rate, log-linear regres-

sion seemed to be an appropriate method to include in this

study.
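The logarithmic transformation described above can be written out step by step. This is a sketch only; the symbols are illustrative assumptions and not necessarily the notation of Chapter III: y_t is the indicator value in year t, a is the starting level, r is the constant annual growth rate, and \varepsilon_t is an error term.

```latex
\begin{align*}
y_t &= a\,(1+r)^{t}\,e^{\varepsilon_t}
  && \text{(constant growth rate } r\text{)} \\
\ln y_t &= \ln a + t\,\ln(1+r) + \varepsilon_t
  && \text{(logarithm of both sides)} \\
\ln y_t &= \beta_0 + \beta_1 t + \varepsilon_t,
  \qquad \beta_0 = \ln a,\quad \beta_1 = \ln(1+r)
  && \text{(general linear model in } t\text{)} \\
\hat{y}_t &= \exp\!\bigl(\hat{\beta}_0 + \hat{\beta}_1 t\bigr),
  \qquad \hat{r} = e^{\hat{\beta}_1} - 1
  && \text{(back-transformed prediction and growth rate)}
\end{align*}
```

Because the line is fitted to ln y_t, least squares minimizes squared error in logarithmic units, and the indicator values must all be positive.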








The three methods finally selected for comparison were

(a) linear regression (without data transformation), (b)

curvilinear or polynomial regression, and (c) log-linear

regression. The mathematical properties of each are presented

in Chapter III. All three approaches to trend extrapolation

were executed by using variations of SPSS subprograms

SCATTERGRAM and REGRESSION and that system's data

transformation capabilities (Nie et al., 1975).

Each of the three methods was applied to each of the

eight selected time series indicators. Time in years was

used as the independent variable; the annual measure or

index of the indicator was treated as the dependent or

response variable. Each data set was divided into thirds;

two-thirds of the data points were used to establish the

prediction equation. This equation was then used to predict

the remaining third of the data points. Predicted values

were then compared with actual values.
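The split-sample procedure just described might be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: it assumes the earliest two-thirds of the observations form the fitting set, it covers only the linear and log-linear methods (the polynomial fit is omitted for brevity), and the series, the function names fit_line and holdout_compare, and all values are made up.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def holdout_compare(years, values):
    """Fit on the first two-thirds of the series, predict the last third.

    Returns one tuple per held-out year:
    (year, actual, linear prediction, log-linear prediction).
    """
    cut = (2 * len(years)) // 3          # assumption: earliest points fit the model
    fit_x, fit_y = years[:cut], values[:cut]
    test_x, test_y = years[cut:], values[cut:]

    a, b = fit_line(fit_x, fit_y)                            # linear
    la, lb = fit_line(fit_x, [math.log(v) for v in fit_y])   # log-linear

    rows = []
    for x, actual in zip(test_x, test_y):
        linear_pred = a + b * x
        loglin_pred = math.exp(la + lb * x)   # back-transform from log units
        rows.append((x, actual, linear_pred, loglin_pred))
    return rows

# Made-up series (not data from the study): 12 annual values growing 5% a year.
years = list(range(1964, 1976))
values = [10.0 * 1.05 ** (y - 1964) for y in years]

for year, actual, lin, loglin in holdout_compare(years, values):
    print(f"{year}  actual={actual:.2f}  linear={lin:.2f}  log-linear={loglin:.2f}")
```

With a constant-growth series such as this one, the log-linear predictions track the held-out values closely while the straight line falls below them, which is the kind of difference the comparison of actual and predicted values is meant to expose.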

Thus, in this phase of the study three prediction

equations (one for each method) were generated for each of

the eight time series indicators. Statistical criteria de-

rived from the literature were used to evaluate the "good-

ness of fit" of the regression line derived from the pre-

diction equation to the data. The distribution of error

(residuals) about the regression line was also examined to

determine if the data satisfied the assumptions of the sta-

tistical model. Results of the method comparison phase are

reported in Chapter IV.
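Two checks of the kind described above might be computed as follows. This passage does not name the specific statistics used, so the coefficient of determination and the Durbin-Watson statistic are stand-ins chosen for illustration; the function names and the small data set are hypothetical.

```python
def r_squared(actual, fitted):
    """Share of variance in the actual values accounted for by the fit."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

def durbin_watson(residuals):
    """Durbin-Watson statistic for first-order autocorrelation of residuals.

    Values near 2 suggest little autocorrelation; values toward 0 suggest
    positive and values toward 4 suggest negative autocorrelation.
    """
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Made-up values for illustration only.
actual = [2.0, 4.1, 5.9, 8.2, 9.8]
fitted = [2.0, 4.0, 6.0, 8.0, 10.0]
residuals = [y - f for y, f in zip(actual, fitted)]

print("R squared:", round(r_squared(actual, fitted), 4))
print("Durbin-Watson:", round(durbin_watson(residuals), 2))
```

A high R squared alone does not show that the model's assumptions hold; residuals that alternate in sign or drift in one direction would show up in the second statistic even when the fit looks close.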









Development of Implications for Educational Planning

In Chapter V methodological strategies involved in the

selection and operational definition of variables are analyzed

in terms of viability for future use. Results of the

technique comparison phase are analyzed according to the

statistical criteria and practical considerations derived

from the literature in forecasting methodology and statis-

tics. In Chapter VI a summary of the study and conclusions

warranted by the results of the study are presented. Future

directions for research suggested by the results of this

study are discussed. Additionally, implications for the

use in educational planning of an ecological model such as

Bronfenbrenner's, time series social and educational in-

dicators, and selected extrapolative methods are discussed.













CHAPTER II

RATIONALE FOR SELECTION OF VARIABLES/
TIME SERIES INDICATORS


In the previous chapter, the need for educational

planners and policy makers to have an awareness of the

societal context of education was emphasized. To this end

the Bronfenbrenner ecology of education model was proposed

as a framework for the selection of social variables which

affect the outcomes of the educational process. The selected

social variables may then be operationalized as time series

indicators; trends in these indicators can be identified

and extrapolated into the future. Such information might

then be incorporated into a planning model in order to assist

planners and policy makers in making informed decisions about

the role and function of education in the future.

In order to place the use of time series indicators

described in this study into perspective, in the first

section of the present chapter social indicators are dis-

cussed in relation to their historical development, defini-

tion and use, and data base. Educational applications of

indicators are briefly noted. In the second section the

social variables selected for use in this study are presented

in relation to the Bronfenbrenner model. These variables

are then operationally defined as time series indicators,








and the eight indicators selected for use in the comparison

of the three extrapolative methods are listed.


The Social Indicator Movement


Historical Development

Interest in societal trends by policy planners is not

of recent origin in the United States. Indeed, in 1933 a

presidential task force reported on social trends in a com-

prehensive work documenting social change in the United States

(President's Research Committee on Social Trends, 1933). The

development of indicators, or measures, of social change,

however, did not receive the sustained governmental support

that was provided for indicators of the economic process.

Thus, while the development of economic statistics during the

1930's and 1940's provided "a solid basis for economic

analysis and economic reporting which eventually resulted in

the establishment of the Council of Economic Advisors and

the Economic Report" (U.S. Department of Health, Education,

& Welfare, 1970, p. v), comparable development of social in-

dicators was not undertaken.

In the 1960's a renewed interest in statistics describing

the social condition became apparent. Impetus for the de-

velopment of social indicators was provided by social

scientists in various disciplines, government policy makers,

and business leaders in the private sector (Brooks, 1972, p.

1). While this early effort was not well defined as to

membership, organization, or objectives, the participants in








the social indicator movement "sensed great needs and oppor-

tunities for change, [and] celebrated shared but necessarily

ambiguous symbols" (Sheldon & Parke, 1975, p. 693).

The following examples were drawn from the many mani-

festations of interest in the development of social indica-

tors during the period from 1965 through 1975 (over 1000 items

were listed in a bibliography issued in late 1972 by Wilcox,

Brooks, Beal, & Klonglan):

1. The Russell Sage Foundation commissioned in 1965

and published in 1968 an independent study, Indicators of

Social Change: Concepts and Measurements, on a number of

aspects of structural change in society (Sheldon & Moore,

1968).

2. A study on ways to measure the impact of massive

scientific and technological change on society (Bauer, 1966)

was prepared by the American Academy of Arts and Sciences

for the National Aeronautics and Space Administration. This

work, Social Indicators, was an overview of the task of

developing indicators as part of a feedback mechanism docu-

menting social change.

3. President Johnson in March of 1966 directed the

Secretary of Health, Education and Welfare "to develop the

necessary social statistics and indicators to supplement

those prepared by the Bureau of Labor Statistics and the

Council of Economic Advisors" (U.S. Department of Health,

Education, & Welfare, 1970, p. iii). The result of this

directive, Toward a Social Report issued in 1969, was








considered "a preliminary step toward the evolution of a

regular system of social reporting" (U.S. Department of

Health, Education, & Welfare, 1970, p. iii).

4. The Social Science Research Council established

in 1972 the Center for Coordination of Research on Social

Indicators, whose objective is "to enhance the contribution

of social science research to the development of a broad

range of indicators of social change" (World Future Society,

1977, p. 97).

5. The appearance of the landmark government publication,

Social Indicators 1973 (Executive Office of the President:

Office of Management & Budget, 1973), was heralded as a

significant attempt to provide a collection of social sta-

tistics describing quality of life in the United States.

This work, which was scheduled to be up-dated every three

years, was a compilation of statistics on eight major areas

of social interest: health, public safety, education, em-

ployment, income, housing, leisure and recreation, and popu-

lation. (It may be noted that it is often impossible to

strictly categorize what is social and what is economic, as

almost all aspects of life are the result of interaction be-

tween social and economic forces.)

The idea of systematic collection and use of social

indicators has not always met a favorable reception. Bezold

(Note 4) and Shostak (Note 5) documented the unsuccessful

efforts, beginning in 1967, by Walter F. Mondale to earn

congressional approval of a far-reaching plan for new








government use of the applied social sciences. Mondale's

blueprint for better collection and use of social intelli-

gence involved two statutorily-mandated additions to the

Executive Office of the President: (a) a Council of Social

Advisors (CSA) comparable to the Council of Economic Advisors

established in 1946, and (b) an annual Social Report of the

President prepared by the CSA to parallel the annual Economic

Report to the President. While aspects of Mondale's plan may

be satisfied by such efforts as Social Indicators 1973 and

several congressional provisions which required the develop-

ment and application of social science techniques to the

study of present and future national problems (Shostak, Note

5), the comprehensive nature of the Mondale plan is absent.

The future of such governmental efforts at social accounting

was in 1977 uncertain.

Interest in social indicators has not been confined to

the United States (Johnson, 1975; Sheldon & Parke, 1975).

In 1973, for example, the Organization for Economic Coopera-

tion and Development (OECD) to which the United States also

belongs issued a list of social concerns shared by many

member countries. The identification of concerns was a first

step in the development of

a set of social indicators designed
explicitly to reveal, with validity,
the level of well-being for each social
concern in the list and to monitor
changes in those levels over time.
(OECD, 1973, p. 4)

Additionally, international organizations such as the

Conference of European Statisticians, the United Nations








Research Institute for Social Development, and the United

Nations Educational, Scientific, and Cultural Organization

have been actively concerned with social indicators (Sheldon

& Parke, 1975). Efforts to develop social indicators have

been initiated in countries such as France, Great Britain,

West Germany, Canada, Japan, Norway, Sweden, and Denmark

(Brooks, 1972; Johnson, 1975).


Definition and Use of Social Indicators

The social indicator movement has been characterized

by ambiguity of definition and purpose due, in part, to the

heterogeneous nature of participants with their own back-

grounds, skills, and interests, and also, to the necessary

stages of evolution that such a movement experiences. These

problems in definition and purpose of social indicators have

been discussed by a number of critics (Land, 1971; Little,

1975; Plessas & Fein, 1972; Sheldon & Freeman, 1970; Sheldon

& Land, 1972).

Attempts have been made to resolve a number of these

problems. Land (1971), for example, proposed the following

social science-oriented definition of social indicators:

social indicators refer to social statistics
that (1) are components in a
social system model (including sociopsy-
chological, economic, demographic, and
ecological) or of some particular segment
or process thereof, (2) can be collected
and analyzed at various times and accumu-
lated into a time series, and (3) can be
aggregated or disaggregated to levels
appropriate to the specifications of the
model. . . . The important point is that
the criterion for classifying a social








statistic as a social indicator is its
informative value which derives from its
empirically verified nexus in a con-
ceptualization of a social process.
(p. 323)

Part of the confusion over definition is the result

of disagreement over purposes, or uses, of social indicators.

These purposes, or uses, have been considered under a number

of overlapping, sometimes synonymous, headings: (a) descriptive

reporting, (b) policy planning, (c) social accounting,

(d) program evaluation, (e) social modeling, (f) social fore-

casting, and (g) social engineering. While the ultimate ob-

jective of guiding social policy is rarely disputed, the

form of this guidance is still debated. Social scientists

are more likely to be concerned with the analysis and pre-

diction of social change, while public administrators and

legislators are often more concerned with uses of indicators

related to public program evaluation and agency goal setting.

Sheldon and Parke (1975), in acknowledging these concerns,

said:

It is apparent that many different
types of work go on under the rubric
of social indicators. What is impor-
tant is that the field be seen as an
arena for long-term development, as
an effort of social scientists to push
forward developments in concepts and
in methodology that promise payoffs
to both science and public policy.
(p. 698)

To underscore this point, Sheldon and Parke (1975) selected

an observation by Duncan:

The value of improved measures of
social change . . . is not that they
necessarily resolve theoretical issues








concerning social dynamics or settle
pragmatic issues of social policy,
but that they may permit those issues
to be argued more productively.
(p. 698)


Data Base for Social Indicators

Various efforts have been undertaken to improve the

data base for social indicators. Among the efforts in the

early 1970's were basic surveys on crime and education as

well as replications of previous social science studies and

surveys (Sheldon & Parke, 1975).

Most social statistics, available primarily from government

sources, are objective in nature; that is, they measure

the frequency of occurrence of an attribute or commodity in

the population. Numbers of births, deaths, marriages, years

of schooling, and percent of occupied housing with television

sets could thus be considered objective measures. (Some would

disagree, however, with the objectivity of these measures;

see Andrews & Withey, 1976, p. 5.)

Several researchers (Andrews & Withey, 1976; Campbell,

Converse, & Rogers, 1976) have attempted to measure people's

perceptions of their well-being, their quality of life. Such

measures collected on a regular basis are expected to be

valuable supplements to the usual objective quality of life

indicators. (For examples of the latter, see Liu, 1976;

Thompson, 1976b, 1977.)

Creation of a social indicator data base is not without

conceptual and methodological problems. Various aspects of








the social measurement problem have been acknowledged in the

literature (see, for example, de Neufville, 1975, pp. 175-179;

Etzioni & Lehman, 1969; Social Measurement, 1972). While de-

tailed discussion of measurement dysfunction (in the termi-

nology of Etzioni & Lehman, 1969) is beyond the scope of this

study, the following observation might be kept in mind:

Increased investment, intellectual as well
as financial, no doubt can go a long way to
increase the efficacy of social measurements
and to reduce much of the likelihood of
dysfunctions. But, in the final analysis,
these problems can never be eliminated en-
tirely. Here, the client of systematic
measurement and accounting should be alerted
to the limitations of social indicators,
both to make his use of them more sophisti-
cated and to prevent him from ultimately
rejecting the idea of social accounting when
he encounters its limitations. (Etzioni &
Lehman, 1969, p. 62)


Educational Implications

Educational indicators, a subset of social indicators,

have traditionally been measures of the educational system's

inputs and outputs stated in such terms as numbers of tea-

chers, per pupil expenditures, and achievement test scores.

There have been attempts, however, to broaden this base of

educational statistics to include both objective and sub-

jective indicators under the categories of access, aspirations,

achievement, impact, and resources (Gooler, 1976, p. 15).

There have also been attempts to link indicators of social

processes (e.g., divorce rates, voting rates) to educational

goals and thus to establish accountability measures, albeit

remote, external to the educational system (Clemmer, Fairbanks,








Hall, Impara, & Nelson, 1974; Collazo, Lewis, & Thomas, Note

6; Grady, 1974). The use and abuse of indicators in an edu-

cational setting, however, remained in 1977 a matter of debate

(Impara, Note 7) and cautious optimism (Hall, Note 8). Hope-

fully, investigations of the problem, such as that described

in this study, will provide some guidance as to the most

promising applications of social indicators to education.


Selection of Variables/Time Series Indicators


The Variables

In the first year (Sept. 1975-June 1976) of Florida

Department of Education STAR Project R5-175 on social fore-

casting for educational planning, trends in five indicators

of educational outcomes were forecast. In order to do this,

it was necessary to identify variables that influence the

outcomes of education. Through a review of the research and

theoretical literature, a number of social variables were

identified. This list was refined by an interdisciplinary

panel of experts at the University of Florida to the

following 10 variables: (a) socio-economic status; (b) family

expectations, attitudes, and aspirations; (c) student's self-

concept; (d) student's general ability; (e) student's sense

of fate control; (f) student's attitudes and motivation; (g)

peer group characteristics; (h) teacher expectations; (i)

teacher behavior in the classroom; and (j) administrative

leadership style. Collazo et al. (1977) said that only the

variables (a) and (d) received strong support from research;








a number of the other variables, while "identified as important

in the theoretical literature . . . had inconclusive

support from research" (p. 298). (See Collazo, Lewis, &

Thomas, Note 9, for a review of the research literature on

variables affecting educational outcomes.)

The panel of experts was further utilized to forecast

the future trends of these variables and their effect on

specified performance and utilization measures of the out-

comes of education. Cross-impact analysis, a computer assist-

ed modification of the Delphi forecasting technique, was then

used by the panel to generate the future trends in the five

outcome indicators.

The framework for looking at the future established

during these first year project activities is utilized in

the present study. Previous forecasting activities were based

primarily on the subjective judgment of panel participants.

In this study, however, the feasibility of using time series

data, where available, as the basis for forecasting future

trends in the 10 variables affecting educational outcomes is

examined. In addition, the use of a model containing the

selected variables is considered.


Bronfenbrenner's Ecology of Education Model

In the previous section, the 10 variables affecting

educational outcomes which were derived from the research

literature were presented. How can these variables be put

into perspective as social forces influencing what the stu-

dent learns?








The Bronfenbrenner (1976) model which was presented in

Chapter I (pp. 1-3) is a multi-dimensional ecological struc-

ture of the educational environment. At the center of the

interacting meso-, exo- and macro-systems is the micro-system,

"the immediate setting containing the learner" (Bronfenbrenner,

1976, p. 5). The meso-system is actually a system of micro-

systems; that is, it "comprises the inter-relationships among

the major settings containing the learner at a particular

point in his or her life" (Bronfenbrenner, 1976, p. 5). Some

of the social variables that were identified previously could

be considered as part of the meso-system. The home, for ex-

ample, is represented by socioeconomic status and family

expectations, attitudes, and aspirations; the peer group by

peer group characteristics; and the school by teacher expec-

tations, teacher behavior in the classroom, and administra-

tive leadership style. The other variables: student's self-

concept, student's general ability, student's sense of fate

control, and student's attitudes and motivation are all di-

rectly related to the learner.

Bronfenbrenner (1976) proposed that learning is a func-

tion of (a) the dynamic relationship between characteristics

of the learners and their various surroundings (meso-system)

and (b) the interaction between these various environments

(e.g., home, school, peer group). The Bronfenbrenner ecology

of education model thus appears to provide the necessary

framework to support use of the presently identified variables

and to generate directions for future forecasting research.







Operational Definition of Variables as Time Series Indicators

In previous sections 10 variables affecting educational

outcomes were presented and then classified according to the

Bronfenbrenner ecology of education model. In order to iden-

tify trends in these variables and to extrapolate these trends

into the future, it was necessary to operationally define

these variables as time series measures, or indicators. Since

some of these variables were expressed in general terms, it

seemed necessary to try to represent each by a number of

measures and thus avoid "fractional measurement" which is

often a concern when operationally defining a social concept

(Etzioni & Lehman, 1969).

Several problems became apparent in operationalizing

the variables:

1. A number of indicators were identified for the

variables (a) socioeconomic status; (b) family expectations,

attitudes, and aspirations; and (c) peer group characteristics.

For some indicators, however, data were not collected annually;

for others, measures were not comparable over time due to a

different basis for measurement.

2. For the variables related to the school and student

characteristics (except student attitudes and motivation),

no time series data which met the criteria for selection were

available.

3. Operational definitions were in many cases influenced

by the availability of indicators rather than the logic or

appropriateness of the indicator to measure the social

concept it represented.








The social variables, examples of indicators that might

be used to operationally define these variables, and sources

of the available time series data are presented in Table 1.

The following eight indicators which met the criteria estab-

lished for this study (see p. 9) were selected for use with

the three extrapolative methods described in Chapter III:

1. Median family income in the United States ex-

pressed in 1971 constant dollars.

2. Number of families in the United States headed by

women expressed as a percentage of total families.

3. Number of wives in the labor force expressed as

a percentage of total wives in the United States.

4. Number of marriages in Florida expressed as rate

per 1,000 population in Florida.

5. Number of dissolutions of marriage in Florida ex-

pressed as rate per 1,000 population in Florida.

6. Number of resident live births in Florida ex-

pressed as rate per 1,000 population in Florida.

7. Number of 3 to 5 year olds enrolled in nursery

school and kindergarten expressed as percentage of total

children 3 to 5 years old in the United States.

8. Number of children involved in divorce or annulment

expressed as rate per 1,000 children under 18 years old in

the United States.

While rates or percentages are used for forecasting

purposes, the magnitude of the actual numbers should be kept

in mind before interpretation of an identified trend is









Table 1

Time Series Indicators of Social Variables Affecting Outcomes of Education

                                     Available time series data
Social variables  Indicator^a        Years^b         State  U.S.  Source

Socioeconomic     Median family      1947-1971              X     U.S. Dept. of Commerce, Bureau of the
status of         income                                          Census in Social Indicators 1973
family
                  Employment         1947-1976              X     U.S. Dept. of Labor, Bureau of Labor
                  rate-total                                      Statistics in Employment and Earnings,
                  labor force                                     Dec. 1976

                  Unemployment       1947-1975              X     U.S. Dept. of Labor, Bureau of Labor
                  rate                                            Statistics in Employment and Earnings,
                                                                  Dec. 1976

                                     1950-1975       Fla.         Fla. Dept. of Commerce, Division of
                                                                  Employment Security Research and
                                                                  Statistics

                  Husband-wife       1950,                  X     U.S. Dept. of Labor, Bureau of Labor
                  families with      1955-1957                    Statistics in Special Report 189
                  two workers
                  or more

                  Female headed      1940, 47, 50,          X     U.S. Dept. of Labor, Bureau of Labor
                  families           55, 60, 65,                  Statistics in Special Report 190
                                     70-75









Table 1 continued

                                     Available time series data
Social variables  Indicator^a        Years^b         State  U.S.  Source


Labor force
participation
rates of wives

Children
living in
poverty


Family expecta-
tions,
attitudes,
aspirations


Marriage
rate


Divorce
rate


Birth rate


1950,
55-75


1959,
62,65
68,71
75,76

1930,
40,50
60,63-
75

1930,
40,50
60,63-
75

1964-
1975


Births to
unwed mothers
By age 10-18 1957-
1975


Fla.


X U.S. Dept. of Labor, Bureau of Labor
Statistics in Special Report 189


X U.S. Dept. of Commerce, Bureau of the
Census and 1976 Survey of Income and
Education


State of Florida, Dept. of Health and
Rehabilitative Services, Division of
Health in Florida Statistical Abstract
1976 (Thompson, 1976a)


Fla.


Fla.


Fla.


X U.S. Dept. of Commerce, Bureau of the
Census in Florida Statistical Abstract
1976



Public Health Statistics Section, Fla.
Dept. of Health & Rehabilitative
Services








Table 1 continued

                                     Available time series data
Social variables  Indicator^a        Years^b         State  U.S.  Source


Peer group
characteristics


Student's self-
concept


By race



By age &
race


Public atti-
tude toward
education

3 to 5 year
olds enrolled
in nursery
school and
kindergarten

Children under
18 involved in
divorce

Suspected
offenders for
four violent
crimes, by
age

NA^c


1930,
40,50,
56-75

1956-
1975


1969-
1976


1964-
1975


Fla.



Fla.


1953-
1975


1958-
1972


Public Health Statistics Section, Fla.
Dept. of Health & Rehabilitative
Services

Public Health Statistics Section, Fla.
Dept. of Health & Rehabilitative
Services

X Gallup Poll (published annually in
Phi Delta Kappan, 1969-77)


X HEW, National Center for Educational
Statistics in Advisory Committee on
Child Development (1976) study


X 1953-67:
1968-75:


Ferriss (1970)
Bane (1977)


X See Social Indicators 1973 for list
of sources









Table 1 continued

                                     Available time series data
Social variables  Indicator^a        Years^b         State  U.S.  Source

Student's general NA
ability

Student's sense   NA
of fate control

Student's         High school        1901-1975              X     HEW, National Center for Educational
attitudes toward  graduation         biennial                     Statistics in The Condition of
education and     rate                                            Education 1977, Vol. 3, Part 1
motivation for
achievement       School retention   1924-32 to             X     HEW, National Center for Educational
                  rate, 5th grade    1967-75                      Statistics in Digest of Education
                  to high school                                  Statistics (1976 edition)
                  graduation

Teacher           NA
expectations

Teacher behavior  NA
in classroom

Administrative    NA
leadership style

^a Indicators presented are examples of time series data that might be used to operationally
define the accompanying variable; a number of other indicators representing data aggregated
to local, state, or national levels (depending upon purpose) could be added and/or substituted.
^b Years presented are those found in the course of this study and do not imply that data
are in any way limited to these years.
^c NA: Time series indicator not available for variable.









attempted. Furthermore, it is necessary to remember that

since the population base increased over the decades covered

by the data, a stable rate or percentage still represents

larger absolute numbers of the phenomenon. The indicators

selected are aggregated to either the state or national level;

the appropriate level of aggregation would, of course, depend

upon the specific planning activity. These indicators could

be disaggregated by race, age, region, or sex (where appro-

priate) for comparative analysis, and indeed this feature is

a necessary characteristic in many of the definitions of

social indicators (e.g., see definition by Land, 1971, pre-

sented earlier in this chapter).













CHAPTER III

RATIONALE FOR EXTRAPOLATIVE METHODS
SELECTED FOR COMPARISON


One of the purposes of this study was to compare three

purely extrapolative methods which could be used with social

indicator data of the type described in Chapter II to fore-

cast future values of those indicators. In order to select

methods which were appropriate for this purpose, both the

general forecasting literature and forecasting applications

of extrapolative techniques in specific areas were reviewed.

Detailed descriptions of each technique as well as statisti-

cal assumptions, sensitivity, and evaluative criteria were

derived primarily from the literature in economic statistics

and regression analysis. The following sections provide (a)

an overview of the extrapolative methods used in forecasting,

(b) evaluation of the applicability of these methods for the

purpose of this study, and (c) a description of the three methods

selected for comparison, including equations, parameters to

be estimated, assumptions, and criteria to be used in the

comparison of the three methods.


Overview of Extrapolative Forecasting Methods


An extrapolative forecasting method is a procedure for

(a) identifying an underlying historical trend or cycle in









time series data, and (b) estimating future states of a

variable based on current and historical observations/mea-

sures of that variable (Harrison, 1976). Extrapolation pro-

vides a "surprise free" projection of the future, but not

necessarily a future which is a bigger and better (or worse)

version of the present. Martino (1976) noted that

some extrapolation methods allow the
forecaster to identify policy variables
which are subject to manipulation and
which allow the decision-maker to alter
the future away from today's pattern
of events. (p. 4)

In the social realm extrapolation of trends may at least

allow the planner or policy maker to make enlightened de-

cisions to prepare for the future.


Economic and Business Forecasting

Because of the impetus in the 1930's and 1940's to

describe and forecast the economic condition, many extra-

polative methods were developed with economic applications

in mind. Greenwald (1963, p. 187) classified methods for

determining economic trends into (a) non-mathematical methods

such as freehand curve fitting, first-order differences, semi-

averages, selected points, and weighted and unweighted moving

averages; and (b) mathematical methods such as least squares,

moments, maximum likelihood, and others. In general, only

the mathematical methods, which include a widely diverse

array of complex curve-fitting techniques, seem to be relied

upon for forecasting purposes while the non-mathematical

methods are used for preliminary analysis of the shape of the








time series data. (For descriptions of these methods, see

Greenwald, 1963; Mayes & Mayes, 1976; Mendenhall & Reinmuth,

1971; Neiswanger, 1956; Tuttle, 1957.)

Approaches to governmental/national economic forecasting

(e.g., Theil, 1966) often reach a relatively high level of

mathematical and theoretical sophistication. This appears

to be the result of decades of development, of applying

method in light of theory, and developing both in turn. It

is also the result of substantial investment of financial

and manpower resources by both government and industry.

The value of extrapolative forecasting to individual

decision-makers in business has become apparent (Makridakis,

Hodgsdon, & Wheelwright, 1974). Indeed, companies of all

sizes are compelled to make forecasts for a number of varia-

bles which affect them. Makridakis et al. (1974) have noted,

however, that

as with the development of most management
science techniques, the application of
these [extrapolative forecasting] methods
has lagged behind their theoretical formu-
lation and verification. (p. 153)

Thus, the authors observed that while managers in business

recognize the need for forecasting methods, few are familiar

enough with the numerous available techniques and their

characteristics to select the one most appropriate for a given

situation. To help meet this need, Makridakis

et al. have developed an interactive forecasting system

(called Interactive Forecasting [SIBYL/RUNNER]) which allows

a number of factors to be considered in the selection of a








forecasting technique for a given set of data. Although

the system has been well tested in teaching situations, it

has not had extensive application in actual business settings.

Quantitative techniques available in the Interactive Forecas-

ting (SIBYL/RUNNER) system fall under the general headings of

smoothing, decomposition, control, regression, and other

techniques. The techniques considered under those headings

are clearly explained in a subsequent work of two of the

authors (Wheelwright & Makridakis, 1977).


Technological Forecasting

Martino (1973b; 1976) described the extrapolative methods

most commonly used in technological forecasting in relation

to the shape of their fitted curves: (a) growth curve, an

S-shaped curve, which requires the setting of an upper limit;

and (b) trend curve, an exponential function which takes the form

of a straight line when logarithmic transformation of the

data is undertaken. Martino (1973b) illustrated the use of

the growth curve with data on lowest temperature achieved in

the laboratory by artificial means and the trend curve with

data on productivity in the aircraft industry.

It should be noted that both the growth curve and the

trend curve applied to technological change by Martino (1973b)

are highly versatile approaches with applications in a number

of disciplines. Both methods are derived from the least

squares formula for a straight line. The growth curve is

a modified exponential, that is, it represents a variable

which changes at a changing rate; the trend curve is a








geometric straight line which represents a variable which

changes at a constant rate (Neiswanger, 1956).


Educational Forecasting

Uses of extrapolative methods in education have generally

been limited to projections of expenditures, school enroll-

ments, and the number of instructional staff, high school

graduates, and earned degrees. While many states and school

districts have developed their own models, especially for

projections of enrollments, the National Center for Educa-

tion Statistics (U.S. Department of Health, Education, &

Welfare, 1977c) in developing projections of education statis-

tics to 1985-86 relied on regression methods wherever a trend

could be established. Specifically, either arithmetic

straight lines or logistic growth curves, depending upon the

nature of data, were fitted by the method of least squares.

The following was noted, however:

For both the straight line and logistic
growth curve, the fitted curve often lies
considerably above or below the last ob-
served point, resulting in an unusual
rise or drop from the last actual observa-
tion. To avoid this and give face validity
to the projections, the fitted curve was
used only to establish the last point,
and a new curve was drawn through the last
observed ratio and the end point on the
fitted curve. (U.S. Department of Health,
Education, & Welfare, 1977c, p. 92)
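
The adjustment described in the quotation can be illustrated with a short Python sketch. It represents only one reading of the procedure: it assumes that the "new curve" is a straight line joining the last observed point to the end point of the fitted curve, and the function name, years, and values are hypothetical rather than taken from the cited report.

```python
def adjusted_projection(last_year, last_value, end_year, fitted_end_value):
    """Straight line from the last observed point to the fitted curve's end point.

    Assumption: the 'new curve' is linear; the source does not state its form.
    """
    step = (fitted_end_value - last_value) / (end_year - last_year)
    return {year: last_value + step * (year - last_year)
            for year in range(last_year + 1, end_year + 1)}

# Hypothetical values: last observation of 10.0 in 1975, fitted end point of 20.0 in 1985.
projection = adjusted_projection(1975, 10.0, 1985, 20.0)
```

Under this reading the projection starts from the last actual observation and reaches the fitted value only in the final year, which removes the abrupt rise or drop described in the quotation.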

Brown (1974) summarized the use of trend analysis methods

in education and noted their potential applications in educa-

tional administration. The four extrapolative methods that

he critiqued were (a) arithmetic straight line extrapolation,







(b) time series analysis (really a simplified version of the

Box-Jenkins technique), (c) the S-shaped growth curve, and

(d) cohort analysis (actually the trend curve described by

Martino in the previous section). The examples selected by

Brown do not reveal the versatility of the methods illustra-

ted; he did, however, provide a comprehensive review of

literature describing applications in other fields. A number

of methodological concerns raised by Brown were considered

in this study.

In a critique of selected futures prediction techniques

that might be employed by educational planners, Folk (1976)

observed that exponential trend line and arithmetic straight

line projections appear to be the most commonly used extra-

polative techniques. This author provided a number of useful

measures for evaluating statistically derived regression

lines.

The educational applications just described are basically

attempts to project inputs such as money, pupils, or teachers

to the educational system or outputs (graduates, degrees

earned) of that system. No attempt to extrapolate the future

status of variables which affect these student-related inputs

or outputs was discovered in the literature search.


Extrapolative Methods in Other Areas

Several areas have developed highly specialized extrapo-

lative methods in making forecasts of the future. Popula-

tion, employment, and unemployment projections, for example,








are usually based on fairly complex models which incorporate

a number of factors. These particular applications are not

reviewed here due to their highly specialized purposes and

functions.


Applicability of Reviewed Extrapolative Methods
for Study

In evaluating the applicability of the previously reviewed

extrapolative methods for projecting future states of

the time series indicators selected for use in this study,

several points needed to be considered. Chief among these

were (a) the underlying pattern of the data that can be

recognized and (b) the type or class of model desired (from

Wheelwright & Makridakis, 1977). Both of these will be

briefly considered in relation to this study.


The Pattern of the Data

From graphical representations of each indicator, the

data for each appeared to be characterized by a trend which

either increased or decreased with time. Some also appeared

to contain cyclical patterns and random fluctuations. It

seemed as if major trends might follow the form of a straight

line or curve with one or two bends.


The Class of Model

Wheelwright and Makridakis (1977) distinguished four

classes or categories of models:

1. The time series model "always assumes that some

pattern or combination of patterns is recurring over time"

(p. 22).








2. The causal model assumes "that the value of a cer-

tain variable is a function of several other variables" (p. 23).

3. The statistical model comprises a number of fore-

casting techniques; it uses the language and

procedures of statistical analysis to
identify patterns in the variables being
forecast and in making statements about
the reliability of these forecasts.
(p. 23)

4. The nonstatistical model includes "all models that

do not follow the general rules of statistical analysis and

probability" (p. 24).


Of course, some techniques can be classified into more than

one of the four types of models. It appeared that the

statistical model, with its well-defined properties, and

replicable procedures, would be an appropriate starting

point for predicting the long-term trends in the selected

time series data.

The review of the literature revealed several techniques

denoted by the form of their curves which are sensitive to

long-term trends in the data and which are classified under

the statistical model: (a) the arithmetic straight line,

(b) the S-shaped growth or logistic curve, (c) the trend or

exponential curve, (d) the polynomial curve. All of these

techniques are regression techniques solved by least squares

procedures. Techniques (b) through (d) require data trans-

formations to satisfy the basic linear model used in regres-

sion. The growth or logistic curve was eliminated from com-

parison because this technique necessitates the setting of









limits which might bias the results of the study due to its

ex post facto nature. The remaining three techniques were

considered to be appropriate for use in the comparison phase

of this study.


Description of Methods to be Compared


Since the three techniques selected for comparison are

intrinsically linear in their parameters (Draper & Smith,

1966), the general linear model denoted by the simple or

bivariate regression equation is presented first. Addition-

ally, estimation of the parameters of the equation by least

squares procedures, the assumptions of the model, and criteria

for evaluation and comparison of the three methods are dis-

cussed. Each technique is then described in relation to the

general linear model.


The General Linear Model

In the comparison of methods using selected time series

indicators, time in years is considered the independent

variable and the indicator is considered the dependent or

response variable. Thus, if time is denoted by X, and the

indicator is denoted by Y, a functional relationship in the

form

Y = f(X)

might be stated. However, since most social relationships

are stochastic (probabilistic) rather than deterministic in

nature, a more appropriate form might be








Y = f(X) + e,

where e represents error, a measure of the unknown factors.

When the relationship between the two variables, time and

the indicator (Y) is assumed to be linear (that is, repre-

sented by a straight line), the equation becomes

$Y = \beta_0 + \beta_1 X$;

and because many social relationships are stochastic for

particular values of the variables, this equation is actually

$Y = \beta_0 + \beta_1 X + \varepsilon$.

Since the population parameters $\beta_0$ and $\beta_1$ are not known unless

all possible occurrences of X and Y are known, the available

data are used to provide estimates $b_0$ and $b_1$ of $\beta_0$ and $\beta_1$ as

in the following regression equation,

$\hat{Y} = b_0 + b_1 X + e$

(where $\hat{Y}$ denotes predicted values of Y).

The constant $b_0$ (the intercept) and the regression coefficient

$b_1$ (the slope of the regression line) can be determined by

ordinary least squares procedure, "so called because it

estimates . . . in such a way that the sum of squared residuals,

$\sum e_i^2$, is as small as possible" (Mayes & Mayes, 1976, p. 112).

(For detailed treatment of simple regression and least squares

estimation of $\beta_0$ and $\beta_1$, see, for example: Draper & Smith,

1966; Kerlinger & Pedhazur, 1973; Mayes & Mayes, 1976; Men-

denhall, Ott, & Larson, 1974; Mendenhall & Reinmuth, 1971;

Runyon & Haber, 1967.)
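
As an illustration of the least squares procedure just described, the following Python sketch computes the intercept and slope from the usual closed-form expressions. It is not part of the original study: the function name `fit_line` and the five data values are hypothetical, and time is assumed to be coded 1, 2, 3, and so on rather than by calendar year.

```python
def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx                 # slope of the regression line
    b0 = y_bar - b1 * x_bar        # intercept
    return b0, b1

# Hypothetical indicator values for years coded 1..5 (not data from the study).
years = [1, 2, 3, 4, 5]
indicator = [2.0, 4.1, 5.9, 8.2, 9.8]
b0, b1 = fit_line(years, indicator)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(years, indicator)]
```

With an intercept in the equation, the least squares residuals sum to zero, which provides a quick check on the arithmetic.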








The Assumptions of the Linear Model

Draper and Smith (1966) noted that

In many aspects of statistics it is
necessary to assume a mathematical
model to make progress. It might be
well to emphasize that what we are
usually doing is to consider or
tentatively entertain our model.
(p. 8)

Thus, when the general linear model is employed as it is in

this study, it becomes necessary to examine the assumptions

upon which the model is based and to judge whether the model

is in fact appropriate for the data.

Assumptions for the general linear model include the

following:

1. The regression equation

$\hat{Y} = b_0 + b_1 X + e$

is a better predictor of Y than

$\hat{Y} = \bar{Y}$ $(b_1 \neq 0)$.

2. The regression equation accounts for a significant

portion of the variation in Y, that is, the relationship

between X and Y described by the equation is not the result

of chance.

3. The error term e has a mean value equal to zero and

variance equal to $\sigma^2$; it is an independent random variable

which is normally distributed.

If the first two assumptions are not met, then the model

is not a good predictor for that data. If the third assump-

tion is not met, then it is not appropriate to interpret the

results statistically, that is, in terms of the probability








distribution of the random error e. It is possible to test

Assumption 1 and Assumption 2 by the F statistic. Assumption

3 is best evaluated by plotting the residuals and examining

the pattern of the deviations from the regression line

(Anscombe, 1973; Anscombe & Tukey, 1963; Draper & Smith,

1966). Independence of the errors (Assumption 3[e]) may be

tested by the Durbin-Watson test for serial correlation

(Durbin & Watson, 1950; Durbin & Watson, 1951; Mayes & Mayes,

1976; Wheelwright & Makridakis, 1977).
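
The Durbin-Watson statistic referred to above can be computed directly from the residuals taken in time order. The sketch below is illustrative only: the function name and the two short residual series are hypothetical, and it computes the statistic d without the tabled critical bounds needed for a formal test.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    diffs = sum((residuals[t] - residuals[t - 1]) ** 2
                for t in range(1, len(residuals)))
    return diffs / sum(e ** 2 for e in residuals)

# Hypothetical residuals in time order (not results from the study).
runs = durbin_watson([1.0, 1.0, -1.0, -1.0])         # neighbors tend to agree
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # neighbors tend to disagree
```

Values of d near 2 are consistent with uncorrelated errors; values well below 2 point to positive serial correlation and values well above 2 to negative serial correlation. Judging significance still requires the bounds tabled by Durbin and Watson.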


Criteria for Comparison of Methods

The following questions were derived from the literature

to guide the comparison of methods:

1. Do the data satisfy the assumptions of the model?

(See previous section.)

2. How well does the regression line fit the data from

which it was derived (the two-thirds of the data points used

to generate the prediction equation)? Tufte (1974, pp. 69-70)

listed four measures of quality of fit:

a. the N residuals: $Y_i - \hat{Y}_i$

b. the residual variation:

$S^2_{y \cdot x} = \frac{\sum (Y_i - \hat{Y}_i)^2}{N - k - 1}$

where k refers to the number of X terms in the regression

equation (or the square root of the residual variation,

$S_{y \cdot x}$, called the unbiased standard error of estimate).

c. the ratio of explained to total variation:

$r^2 = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

d. the standard error of the estimate of the slope:

$S_{b_1} = \frac{S_{y \cdot x}}{\sqrt{\sum (X_i - \bar{X})^2}}$

Thus, for each set of data, the methods are compared

according to these four measures. The observed and pre-

dicted values of Y are also reported in tabular form; both

observed and predicted values are plotted for visual com-

parison as recommended by Anscombe (1973).

3. How well does the extrapolated line fit the data

(the one-third of the data points that were not used to

generate the prediction equation)? The residual variation

around the extrapolated line, which is an indicator of the

accuracy of the forecasting technique, may be expressed by

its square root, the standard error for the extrapolated

values. As in (2), the observed and extrapolated values

are reported in tabular form; both observed and extrapolated

values are plotted for visual comparison.
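
Questions 2 and 3 can be illustrated together in a short Python sketch. It fits a straight line to the first two-thirds of a series, computes the four measures of fit on those points, and then measures the error of the extrapolated line on the remaining third. The function names and the nine data values are hypothetical, and the divisor used for the extrapolation error (the number of withheld points) is an assumption, since the text does not state it.

```python
import math

def fit_line(x, y):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1

def compare_fit(x, y, k=1):
    """Fit on the first two-thirds of the points; test on the remaining third."""
    n_fit = (2 * len(x)) // 3
    fx, fy = x[:n_fit], y[:n_fit]
    b0, b1 = fit_line(fx, fy)
    x_bar, y_bar = sum(fx) / n_fit, sum(fy) / n_fit
    fitted = [b0 + b1 * xi for xi in fx]
    residuals = [yi - fi for yi, fi in zip(fy, fitted)]             # measure (a)
    s2 = sum(e ** 2 for e in residuals) / (n_fit - k - 1)           # measure (b)
    r2 = (sum((fi - y_bar) ** 2 for fi in fitted)
          / sum((yi - y_bar) ** 2 for yi in fy))                    # measure (c)
    se_b1 = math.sqrt(s2 / sum((xi - x_bar) ** 2 for xi in fx))     # measure (d)
    # Question 3: root mean squared error of the extrapolated values
    # (dividing by the number of withheld points is an assumption).
    held = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x[n_fit:], y[n_fit:])]
    se_extrap = math.sqrt(sum(held) / len(held))
    return {"residuals": residuals, "s2": s2, "r2": r2,
            "se_b1": se_b1, "se_extrap": se_extrap}

# Hypothetical series of nine annual values (not data from the study).
years = list(range(1, 10))
values = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 13.0, 14.2, 15.1]
result = compare_fit(years, values)
```

In this invented series the trend flattens after the sixth year, so the line fits the first six points closely while the extrapolation error is much larger than the standard error of estimate. A close fit to the generating data therefore does not guarantee an accurate forecast.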

Neiswanger (1956, p. 534) cautioned against accepting

only mathematical tests of "goodness of fit" as proof that

the mathematical expression is appropriate for the trend in

the data. Other considerations such as the "reasonableness

of the extrapolated values which the trend may yield" (p. 534)








and "the extent to which this statistical manifestation of

growth is supported by other evidence" (p. 534) should be kept

in mind. Thus, the calculation of a trend is more than a

mathematical analysis in curve fitting; it is essentially a

problem of analysis of the phenomena represented by the data

(Neiswanger, 1956).

It should also be noted that while the standard error of

estimate gives an overall measure of error around the regres-

sion line, it may not be appropriate for computing confidence

intervals for a specific forecast value. The reason for this

is that the further an X is from $\bar{X}$, the larger is the error

that may be expected when predicting Y from the regression

line. Draper and Smith (1966) noted:

We might expect to make our "best" pre-
dictions in the "middle" of our observed
range of X and would expect our predic-
tions to be less good away from the
"middle." (p. 22)

Therefore, the confidence limits for the true value of Y for

a given X are two curved lines about the regression line.

The limits change as the position of X changes. Hence the

following equation was provided by Wheelwright and Makridakis

(1977, p. 82) for computing the standard error of forecast

(SEf):

$SE_f = S_{y \cdot x} \sqrt{1 + \frac{1}{N} + \frac{(X_j - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$

for a specific forecast value at $X = X_j$.
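
The widening of the forecast error away from the mean of X can be shown numerically. The sketch below assumes the standard textbook form of the standard error of forecast for simple linear regression, in which the standard error of estimate is multiplied by a factor that grows with the squared distance of the forecast point from the mean of X. The function name and data are hypothetical.

```python
import math

def standard_error_of_forecast(x, y, x_new):
    """SE of a single forecast at x_new from a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))          # unbiased standard error of estimate
    return s_yx * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)

# Hypothetical indicator values for years coded 1..5 (not data from the study).
years = [1, 2, 3, 4, 5]
values = [2.0, 4.1, 5.9, 8.2, 9.8]
se_mid = standard_error_of_forecast(years, values, 3)   # at the mean of X
se_far = standard_error_of_forecast(years, values, 8)   # three years past the data
```

The second value exceeds the first, which is the numerical counterpart of the curved confidence limits described above.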









Method 1: Simple Linear Regression

The equation

Y = f(X)

describes a natural functional relationship between X and

Y. If this functional relationship can be expressed by a

straight line on arithmetic paper, the linear, first-order

regression equation

$\hat{Y} = b_0 + b_1 X + e$

may be appropriate. The natural linear function is used when

an absolute amount of change in Y per unit of X is hypothe-

sized.


Method 2: Log-linear Regression

Occasionally when time series data are plotted on an

arithmetic scale the scatter of points falls more in a curve

than in a straight line with the curve rising or decreasing

more rapidly as X increases. These same data when plotted

on a semilogarithmic scale will produce a straight line.

The relationship between X and Y may then be described by

log Y = f(X)
or
$Y = ab^X$,

the exponential form of the logarithmic relationship between

X and Y.

The exponential function is used when there is thought

to be a constant rate of change in Y per unit absolute change

in X. Thus, for each year (X), Y changes by a constant per-

centage (rather than by an absolute amount as in Method 1).








It is possible to fit the exponential function to the

general linear model by transforming the values of Y to log

Y. Thus

$Y = ab^X$ becomes

$\log Y = \log a + X \log b$,

which is linear in X, or

$\log \hat{Y} = \log b_0 + X \log b_1 + e$.

As in the case of the natural number straight line, the

method of least squares is used to estimate the parameters

necessary for computing the logarithmic (or geometric)

straight line. Tuttle (1957, p. 431) noted, therefore, that

the log $\hat{Y}$'s are fitted to the log Y's, not the $\hat{Y}$'s to the Y's,

by the least squares criterion. Thus, Tuttle (1957, p. 432)

recommended that the standard error of estimate be computed

from the antilogs of the log $\hat{Y}$ values. If $S_{y \cdot x}$ was computed

as the root mean square of the unexplained variation, "it

would be in terms of the deviations of the logarithms of the

$Y_c$'s [$\hat{Y}$'s] from the logarithms of the Y's" (Tuttle, 1957, p.

432). The $S_{y \cdot x}$ would not be comparable to those obtained

from untransformed data as in Method 1.

Similarly, Seidman (1976) has observed that in comparing

linear and log-linear models, $R^2$ may not be a sufficient

criterion of choice. This is because the $R^2$ represents "the

proportion of variance of the logarithm of Y explained by

the regression: log Y, not Y, is the dependent variable"

(Seidman, 1976, p. 463). Therefore, Seidman recommended








using the antilogs of the predicted values of log Y "in a

regression explaining variability in Y" (p. 463). This $R^2$

may then be used for comparison purposes. The examples given

by Seidman (1976) were based on logarithmic transformations of

both dependent and independent variables, but the same ob-

servation may be made when only the dependent variable is

transformed. Seidman's reservation about $R^2$ has been

considered in this study.
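
The recommendations of Tuttle and Seidman can be sketched in Python as follows. The sketch regresses the common logarithm of Y on X, takes antilogs of the fitted values, and then computes both the standard error of estimate and an $R^2$ on the original scale of Y. Reading Seidman's "regression explaining variability in Y" as the squared correlation between Y and the antilog predictions is an assumption, and the function names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares estimates (intercept, slope) for y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

def loglinear_fit(x, y):
    """Regress log10(y) on x, then judge the fit on the original scale of y."""
    a, b = fit_line(x, [math.log10(v) for v in y])      # log10(Y) = a + b*X
    y_hat = [10 ** (a + b * xi) for xi in x]            # antilogs of the fitted logs
    n = len(x)
    s_yx = math.sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / (n - 2))
    # R^2 from regressing Y on the antilog predictions equals their squared correlation.
    y_bar, f_bar = sum(y) / n, sum(y_hat) / n
    cov = sum((yi - y_bar) * (fi - f_bar) for yi, fi in zip(y, y_hat))
    r2 = cov ** 2 / (sum((yi - y_bar) ** 2 for yi in y)
                     * sum((fi - f_bar) ** 2 for fi in y_hat))
    return a, b, s_yx, r2

# Hypothetical series growing by 50 percent per period (not data from the study).
years = [0, 1, 2, 3, 4]
values = [2.0, 3.0, 4.5, 6.75, 10.125]
a, b, s_yx, r2 = loglinear_fit(years, values)
```

Because both measures are expressed in the units of Y rather than log Y, they can be set beside the corresponding measures from Method 1.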

If an exponential curve appears to fit the data, it is

often desirable to find the annual rate of change c. This

can be derived from the regression coefficient $b_1$, according

to the following equations:

$\log b_1 = \log (1 + c)$

$c = \operatorname{antilog}(\log b_1) - 1$.

The result should then be expressed as a percent (Mayes &

Mayes, 1976, p. 94; Nie et al., 1975, p. 370).
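
The conversion from the fitted slope to an annual rate of change is a single step, sketched below. The sketch assumes that common logarithms were used, so that the slope of log Y on X is the common logarithm of one plus the rate; the function name and the 5 percent example are hypothetical.

```python
import math

def annual_rate_of_change(slope_log10):
    """Convert the slope of log10(Y) on X into a proportional change per unit of X."""
    return 10 ** slope_log10 - 1

# Hypothetical slope: the value a series growing 5 percent per year would produce.
rate = annual_rate_of_change(math.log10(1.05))
percent = 100 * rate
```

A negative slope yields a negative rate, that is, a constant percentage decline per year.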

The common or Briggs logarithm, used in the Y trans-

formation in this study, is the power to which 10 must be

raised to equal the number (see Neiswanger, 1956, p. 210;

Tufte, 1974, p. 108). Natural logs or logs to the base 2

could also have been used to obtain the same results

(Snedecor, 1956, pp. 450-451).


Method 3: Polynomial Regression

In Method 1, the equation which expresses a straight

line relationship between X and Y is

$\hat{Y} = b_0 + b_1 X + e$








which is a linear (in the b's) first-order (in X) regression

equation. When this functional relationship between X and

Y can be expressed as a solid, or unbroken curve on arithmetic

paper, the linear, second-order (or quadratic) regression

equation

$\hat{Y} = b_0 + b_1 X + b_2 X^2 + e$

may be appropriate. When the relationship can be expressed

as a curved line with two bends on arithmetic paper, the

linear, third-order (or cubic) regression equation

$\hat{Y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3 + e$

may be used.

According to Kerlinger and Pedhazur (1973, p. 209), the

highest order a polynomial equation may take is equal to $N - 1$,

where N is the number of distinct values in the independent

variable. However, since one of the goals of scientific

research is parsimony,

our interest is not in the predictive
power of the highest degree polynomial
equation possible, but rather in the
highest degree polynomial equation
necessary to describe a set of data.
(Kerlinger & Pedhazur, 1973, p. 209)

Another reason for a parsimonious approach to polynomial

curve fitting is that for each order added to the equation,

a degree of freedom is lost. This is especially important

when the number of observations is small, as it is in this

study (observations range from 8 to 20 in each of the eight

sets of data). Also, higher order polynomial curves may

possess statistical significance but be devoid of practical









significance. Accordingly, only the quadratic and cubic

forms of the polynomial regression equation are considered.

In the polynomial regression the independent variable,

X (time), is treated as a categorical variable and is raised

to a certain power. In the quadratic equation, each value

of X is squared to create a new vector of the squared X's,

$X^2$. Similarly, in the cubic equation, each value of X is

cubed to create an additional vector of the cubed X's, $X^3$.

Thus, the resulting equation can be solved by a stepwise

multiple regression procedure, in which at each step of the

analysis, the $R^2$ is tested to see if the higher-degree

polynomial accounts for a significant proportion of the variance.

While a least squares solution is used in this study, the

values of the unknowns may also be found by orthogonal poly-

nomials (see Draper & Smith, 1966, pp. 150-155; Greenwald,

1963, pp. 204-209; Kerlinger & Pedhazur, 1973, pp. 214-216).
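
The stepwise procedure can be sketched in Python as follows. The sketch fits first-, second-, and third-order polynomials by least squares and computes an F ratio for the increase in $R^2$ at each step. It is an illustration under stated assumptions rather than the program used in the study: the function names and data are hypothetical, time is coded 1, 2, 3, and so on to keep the powers of X small, and the F ratio takes the usual form for an increment in $R^2$, with N − k − 1 degrees of freedom in the denominator.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (m[i][n] - tail) / m[i][i]
    return coef

def poly_fit(x, y, degree):
    """Least squares polynomial fit; returns ([b0, ..., b_degree], R squared)."""
    cols = [[xi ** p for xi in x] for p in range(degree + 1)]   # vectors 1, X, X^2, ...
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * yi for u, yi in zip(ci, y)) for ci in cols]
    coef = solve(xtx, xty)                                      # normal equations
    y_hat = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return coef, 1 - sse / sst

def f_increment(r2_full, r2_reduced, n, k_full, k_added=1):
    """F ratio for the increase in R squared due to the last k_added terms."""
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / (n - k_full - 1))

# Hypothetical series with a single bend (not data from the study).
years = list(range(1, 9))
values = [1.1, 3.9, 9.2, 15.8, 25.1, 36.2, 48.7, 64.1]
_, r2_lin = poly_fit(years, values, 1)
_, r2_quad = poly_fit(years, values, 2)
_, r2_cubic = poly_fit(years, values, 3)
f_quad = f_increment(r2_quad, r2_lin, len(years), k_full=2)
f_cubic = f_increment(r2_cubic, r2_quad, len(years), k_full=3)
```

Each F ratio would be referred to the F distribution with 1 and N − k − 1 degrees of freedom; a higher-order term is retained only if its increment is significant, in keeping with the parsimony argument above.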

Neiswanger (1956, pp. 529-532) noted that the second-

degree and third-degree parabolas provide greater flexi-

bility in fitting a line to a set of data for the parabolas

allow a trend to change direction. Whether or not the

flexibility of the parabolic function enhances the predic-

tability of extrapolated Y values, however, is not certain

and is examined in this study.














CHAPTER IV

COMPARISON OF EXTRAPOLATIVE METHODS USING
SELECTED TIME SERIES INDICATORS


In Chapter II a rationale for the selection of social

variables operationally defined as time series indicators

was provided. The following eight time series indicators

were selected for use in the method comparison phase of this

study:

1. Median family income in the United States expressed

as 1971 constant dollars.

2. Number of families in the United States headed by

women expressed as a percentage of total families.

3. Number of wives in the labor force expressed as

a percentage of total wives in the United States.

4. Number of marriages in Florida expressed as rate

per 1,000 population in Florida.

5. Number of divorces in Florida expressed as rate

per 1,000 population in Florida.

6. Number of resident live births in Florida expressed

as rate per 1,000 population in Florida.

7. Number of 3 to 5 year olds enrolled in nursery

school and kindergarten expressed as percentage of total

children 3 to 5 years old in the United States.

8. Number of children involved in divorce or annulment








expressed as rate per 1,000 children under 18 years old in

the United States.

A rationale for the three extrapolative methods selected

for comparison in this study was presented in Chapter III.

The three methods are simple linear regression, log-linear

regression, and polynomial regression (specifically the

quadratic and cubic forms).

The following questions derived from the literature were

proposed in Chapter III to guide the comparison of methods:

1. Do the data satisfy the assumptions of the general

linear model?

2. How well does the regression line fit the data from

which it was derived (the two-thirds of the data points used

to generate the prediction equation)?

3. How well does the extrapolated line fit the data

(the one-third of the data points that were not used to

generate the prediction equation)?
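
The division of each series into a fitting portion and a held-out portion, implied by questions 2 and 3, might be sketched as below. The source states only the two-thirds/one-third proportions; the rounding rule used here (the first ceil(2n/3) points are used for fitting) is an assumption, and the function name `split_series` is hypothetical.

```python
def split_series(values):
    """Split a time-ordered series into a fitting part and a holdout part.

    Assumption: the fitting part is the first ceil(2n/3) points; the
    remaining points are reserved for checking the extrapolated line.
    """
    n_fit = (2 * len(values) + 2) // 3  # integer ceiling of 2n/3
    return values[:n_fit], values[n_fit:]

years = list(range(1947, 1972))  # the 25 annual points of Indicator 1
fit_years, holdout_years = split_series(years)
print(len(fit_years), len(holdout_years))
print(fit_years[-1], holdout_years[0])
```

The split is made by position in time rather than at random because the question of interest is how well each line predicts values beyond the period used to derive it.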

To answer these questions in terms of each method and

to facilitate comparison among the three methods, the results

obtained from applying each of the methods to each of the

eight indicator data sets are presented in the following

manner:

1. The fit of the regression line to the observed data

is indicated by r² and the unbiased standard error of

estimate, S_y.x. For the simple linear and log-linear regres-

sion methods the amount of variance accounted for by the

regression line is tested by the F statistic (the F value is

the same as the square of the value obtained by dividing b1

by SE_b1). For the quadratic and cubic forms of polynomial

regression, both the r² including all orders entered to that

step (r²_y.12 or r²_y.123) and the increase in r² attributable

to the last order entered in the regression (r²_y(2.1) or

r²_y(3.12)) are tested with the F statistic.* (Of course,

dividing the partial regression coefficients b2 in the

quadratic form and b3 in the cubic form by their respective

standard errors and squaring the result will also yield the

same F value for the increase in r².)

2. The fit of the extrapolated line to the data is

indicated numerically by the standard error for the extra-

polated values, Sext(y.x). This measure reflects the

average deviation of the extrapolated values from the ob-

served values of Yi; thus,

$$S_{ext(y.x)} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{N\ (\text{extrapolated values})}}$$

(Note that this equation is not the "unbiased" form used in
computing S_y.x.)


*Actually the increase in r² is tested according to the
following ratio:

$$F = \frac{(r^2 \text{ with } k\text{th-order term}) - (r^2 \text{ without } k\text{th-order term})}{(1 - r^2 \text{ with } k\text{th-order term}) / (N - k - 1)}$$

Total r² is tested according to the following ratio:

$$F = \frac{SS_{\text{regression}} / k}{SS_{\text{residual}} / (N - k - 1)}$$








3. All observed and predicted values of Y are reported

in tabular and graphic form.

4. The residuals around the regression line were ex-

amined for serial correlation by the Durbin-Watson d statistic,

which is noted only when serial correlation is confirmed or

questionable. Additionally, the standardized residuals

were plotted against the sequence of cases and also against

standardized Y values. Such visual inspection of the data is

discussed as necessary to support the interpretation of re-

sults in Chapter V.
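
The summary measures listed above can be illustrated with a short sketch. The function names are hypothetical; `f_total` is written in terms of r² rather than sums of squares, which is equivalent because SS regression equals r² times the total sum of squares. The figures passed in are the Indicator 2 values reported in Tables 4 and 5.

```python
import math

def f_total(r2, n, k):
    """F for the total r2 of a model with k predictor terms fitted to n points."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def f_increment(r2_with, r2_without, n, k):
    """F (df = 1, n - k - 1) for the gain in r2 due to the kth-order term."""
    return (r2_with - r2_without) / ((1 - r2_with) / (n - k - 1))

def s_ext(observed, predicted):
    """Standard error for extrapolated values (divisor N, not the unbiased form)."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

# Indicator 2, quadratic form: N = 8 fitted points, k = 2 terms (Table 4).
print(round(f_total(0.81417, 8, 2), 2))
print(round(f_increment(0.81417, 0.16405, 8, 2), 2))

# Indicator 2, extrapolated years 1972-1975 (Table 5).
observed = [11.6, 12.1, 12.4, 13.0]
quadratic = [11.6, 11.9, 12.1, 12.4]
linear = [10.8, 10.8, 10.9, 10.9]
print(round(s_ext(observed, quadratic), 2), round(s_ext(observed, linear), 1))
```

With these inputs the sketch reproduces the tabled values of 10.95 and 17.49 for the two F ratios and the Sext(y.x) values of .35 and 1.5 reported for the quadratic and linear methods.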


Presentation of Results


Indicator 1

The mean and standard deviation for the Y values used

to generate the regression equations are 6674 and 987, re-

spectively. The following regression equations were used to

derive Ŷ:

Linear Ŷ = -3933.29 + 192.87 X

Quadratic Ŷ = 106.37 + 44.79 X + 1.35 X²

Cubic Ŷ = 135025.33 + (-7385.17)X + 137.08 X² + (-.82)X³

Log-linear log Ŷ = 3.12346 + .01266 X.

The goodness of fit of the regression lines derived from these

equations is indicated by r², r² change, and S_y.x in Table 2.

An ANOVA summary table is presented in Table 18 in the Appen-

dix. The overall F's for all methods are significant (p<.01);

the increases in r2 due to the higher order polynomials are

not significant, however.















Table 2

Indicator 1: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F           df      S_y.x

Linear           .97440               571.04**    1,15    163.02

Quadraticᵃ       .97531               276.47**    2,14    165.75
                          .00090         .51      1,14

Cubicᵃ           .98137               228.25**    3,13    149.40
                          .00606         .42      1,13

Log-linearᵇ
  log Y          .97021               488.54**    1,15    log .01157
  antilog Y      .98726                                   164.92

Note. Indicator 1 is median family income expressed in
1971 constant dollars.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
**p<.01








The average errors for the extrapolated Y values

(Sext(y.x)) according to method employed are (a) linear, 669;

(b) quadratic, 487; (c) cubic, 1981; and (d) log-linear, 283.

Observed Y's and predicted values of Y for both the original

regression and the extrapolated lines are presented in Table

3. These data are graphically presented in Figure 2.

Thus, there is very little difference in the total r2

for the methods; the quadratic and cubic forms added little

to the r² already provided by the linear component. The S_y.x

for the cubic form is smaller (149.40) than for the other

methods.

When the lines are extrapolated beyond the original

values, however, the cubic form is clearly the "worst" fit

with a Sext(y.x) of 1981 and the log-linear method the "best"

with a Sext(y.x) of 283. Whether the exponential curve would

continue to be a superior predictor is a matter of conjecture.
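
The log-linear equation reported above for Indicator 1 can be converted back to the original units by taking antilogs, as in the following sketch. Two points are assumptions inferred from the tabled values rather than stated in the text: that the logarithms are to base 10 and that X is the year minus 1900. The function name `predict_log_linear` is hypothetical.

```python
def predict_log_linear(year, a=3.12346, b=0.01266):
    """Antilog (base 10) of the fitted line log Y = a + b X.

    Assumption: X is the two-digit year (year - 1900).
    """
    x = year - 1900
    return 10 ** (a + b * x)

# A straight line in log Y implies a constant proportional change per year.
annual_growth = 10 ** 0.01266 - 1

for year in (1947, 1963, 1971):
    print(year, round(predict_log_linear(year)))
print(round(100 * annual_growth, 2))
```

Because the coefficients are rounded to five decimals, the values printed differ by a dollar or two from the log-linear column of Table 3. The slope corresponds to growth of roughly 3 percent per year, which is why the extrapolated curve continues to rise more steeply than the straight line.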


Indicator 2

The mean and standard deviation for the Y values used

to generate the regression equations are 10.4 and .77, re-

spectively. The following regression equations were used to

derive Ŷ:

Linear Ŷ = 9.87860 + .02789 X

Quadratic Ŷ = 11.08794 + (-.19035)X + .00626 X²

Cubic Ŷ = 11.45737 + (-.35750)X + .01954 X² + (-.00027)X³

Log-linear log Ŷ = .99387 + .00118 X.

The goodness of fit of the regression lines derived from

these equations is indicated by r², r² change, and S_y.x in










Table 3

Indicator 1: Observed Y's and Predicted Y's by Method

                              Predicted Y's by Method
Year    Observed Y's    Linear    Quadraticᵃ    Cubicᵃ    Log-linear

Original regressionᵇ

1947        5,483        5,131ᶜ      5,185       5,323       5,231
1948        5,367        5,324       5,358       5,392       5,386
1949        5,278        5,517       5,533       5,499       5,545
1950        5,594        5,710       5,711       5,637       5,709
1951        5,783        5,903       5,892       5,803       5,878
1952        5,939        6,096       6,076       5,992       6,052
1953        6,433        6,289       6,262       6,197       6,231
1954        6,288        6,481       6,450       6,416       6,415
1955        6,693        6,674       6,642       6,642       6,605
1956        7,122        6,867       6,836       6,871       6,800
1957        7,138        7,060       7,033       7,097       7,002
1958        7,126        7,253       7,233       7,317       7,209
1959        7,524        7,446       7,435       7,524       7,422
1960        7,688        7,639       7,640       7,714       7,642
1961        7,765        7,831       7,848       7,882       7,868
1962        7,975        8,024       8,058       8,023       8,100
1963        8,267        8,217       8,271       8,133       8,340

Extrapolationᵈ

1964        8,579        8,410       8,487       8,205       8,584
1965        8,932        8,603       8,705       8,235       8,838
1966        9,360        8,796       8,926       8,220       9,100
1967        9,683        8,989       9,150       8,152       9,369
1968       10,049        9,182       9,377       8,028       9,646
1969       10,423        9,374       9,606       7,843       9,931
1970       10,289        9,567       9,838       7,591      10,225
1971       10,285        9,760      10,072       7,268      10,527

Note. Indicator 1 is median family income expressed in
1971 constant dollars.
ᵃBoth quadratic and cubic forms of polynomial regression
are presented.
ᵇThe regression line is derived from 2/3 of the known data
points.
ᶜPredicted Y's in terms of 1971 constant dollars are rounded
to number of places in original data.
ᵈValues are extrapolated beyond the data points used to
generate the regression equation.

















Figure 2. Indicator 1: Observed Y's and predicted Y's by
method (vertical line separates values of original
regression from extrapolation).




Table 4. An ANOVA summary table is presented in Table 19 in

the Appendix. The overall F's for the quadratic and cubic

forms of the polynomial regression are significant (p<.05);

however, only the increase in r2 due to the quadratic is

significant (p<.01).

The average errors for the extrapolated Y values

(Sext(y.x)) according to the method employed are (a) linear,

1.5; (b) quadratic, .35; (c) cubic, 1.1; and (d) log-linear,

1.5. Observed Y's and predicted values of Y for both the

original regression and the extrapolated lines are presented

by method in Table 5. These data are graphically presented

in Figure 3.

Thus, the cubic form of the polynomial accounts for the

most variance in Y (89%) and has the smallest S_y.x (.33).

The quadratic form, accounting for 81% of the variance in Y,

has a S_y.x of .39; visual inspection of the second-degree

curve reveals that this curve may, in fact, more closely

fit observed values for the latter portion of the regression

line than the cubic form. The linear and log-linear methods

provide no better estimate of Y than does the mean of Y; indeed,

the standard error of estimate approximates the standard devia-

tion of the observed Y's.

When the lines are extended beyond the original values,

the quadratic provides the superior fit (Sext(y.x) = .35);

the fit of the cubic, linear, and log-linear methods to the

observed values is poor with residuals becoming larger for

successive years.










Table 4

Indicator 2: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F           df      S_y.x

Linear           .16405                 1.18      1,6     .76

Quadraticᵃ       .81417                10.95*     2,5     .39
                          .65012       17.49**    1,5

Cubicᵃ           .89255                11.08*     3,4     .33
                          .07838        2.92      1,4

Log-linearᵇ
  log Y          .16905                 1.22      1,6     log .03175
  antilog Y      .16307                                   .78

Note. Indicator 2 is number of families in the United
States headed by women expressed as a percentage of total
families.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
*p<.05
**p<.01








Table 5

Indicator 2: Observed Y's and Predicted Y's by Method


Predicted Y's by Method
Year(X) Observed Y's Linear Quadratica Cubica Log-linear

Original regressionb

1940(1) 11.2 9.9c 10.9 11.1 9.9
1947(8) 9.5 10.1 10.0 9.7 10.1
1950(11) 9.4 10.2 9.7 9.5 10.2
1955(16) 10.1 10.3 9.6 9.6 10.3
1960(21) 10.0 10.5 9.9 10.1 10.4
1965(26) 10.5 10.6 10.4 10.6 10.6
1970(31) 10.9 10.7 11.2 11.1 10.7
1971(32) 11.5 10.8 11.4 11.2 10.8

Extrapolationd

1972(33) 11.6 10.8 11.6 11.2 10.8
1973(34) 12.1 10.8 11.9 11.3 10.8
1974(35) 12.4 10.9 12.1 11.3 10.8
1975(36) 13.0 10.9 12.4 11.3 10.9

Note. Indicator 2 is number of families in the United
States headed by women expressed as a percentage of total
families.
a Both quadratic and cubic forms of polynomial regression
are presented.
bThe regression line is derived from 2/3 of the known data
points.
cPredicted Y's are rounded to number of places in original
data.
dValues are extrapolated beyond the data points used to
generate the regression equation.









Figure 3. Indicator 2: Observed Y's and predicted Y's
by method (vertical line separates values of original
regression from extrapolation).








Indicator 3

The mean and standard deviation for the Y values used to

generate the regression equations are 32.0 and 3.8, respec-

tively. The following regression equations were used to de-

rive Ŷ:

Linear Ŷ = 23.27325 + .74603 X

Quadratic Ŷ = 23.38774 + .71893 X + .00126 X²

Cubic Ŷ = 22.60710 + 1.18842 X + (-.05693)X² + .00194 X³

Log-linear log Ŷ = 1.37971 + .01047 X.

The goodness of fit of the regression lines derived from these

equations is indicated by r², r² change, and S_y.x in Table 6.

An ANOVA summary table is presented in Table 20 in the Appen-

dix. The overall F's for all methods are significant (p<.01);

however, the increases in r2 due to the higher order polyno-

mials are not significant.

The average errors for the extrapolated Y values

(Sext(y.x)) according to the method employed are (a) linear,

1.4; (b) quadratic, 1.2; (c) cubic, 2.7; and (d) log-linear,

.6. Observed Y's and predicted values of Y for both the

original regression and the extrapolated lines are presented

by method in Table 7. These data are graphically repre-

sented in Figure 4.

Thus, there is very little difference in the total r2

for the methods; the quadratic and cubic forms added an in-

significant amount to the r2 already provided by the linear

component. The S_y.x for the cubic form is only slightly

better (.43) than for the other methods (.49-.54).












Table 6

Indicator 3: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F           df      S_y.x

Linear           .98451               826.05**    1,13    .49

Quadraticᵃ       .98460               383.49**    2,12    .50
                          .00009         .07      1,12

Cubicᵃ           .98953               346.51**    3,11    .43
                          .00493        5.18      1,11

Log-linearᵇ
  log Y          .98085               665.69**    1,13    log .00760
  antilog Y      .99999                                   .54

Note. Indicator 3 is the number of wives in the labor
force expressed as a percentage of total wives in the United
States.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
**p<.01








Table 7

Indicator 3: Observed Y's and Predicted Y's by Method


Predicted Y's by Method
Year(X) Observed Y's Linear Quadratica Cubica Log-linear

Original regressionb

1950(1) 23.8 24.0c 24.1 23.7 24.6
1955(6) 27.7 27.7 27.8 28.1 27.7
1956(7) 29.0 28.5 28.5 28.8 28.4
1957(8) 29.6 29.2 29.2 29.5 29.1
1958(9) 30.2 30.0 30.0 30.1 29.8
1959(10) 30.9 30.7 30.7 30.7 30.5
1960(11) 30.5 31.5 31.4 31.4 31.3
1961(12) 32.7 32.2 32.2 32.0 32.0
1962(13) 32.7 33.0 33.0 32.7 32.8
1963(14) 33.7 33.7 33.7 33.4 33.6
1964(15) 34.4 34.5 34.5 34.2 34.4
1965(16) 34.7 35.2 35.2 35.0 35.3
1966(17) 35.4 36.0 36.0 35.9 36.1
1967(18) 36.8 36.7 36.7 36.9 37.0
1968(19) 38.3 37.4 37.5 38.0 37.9

Extrapolationd
1969(20) 39.6 38.2 38.3 39.1 38.8
1970(21) 40.8 38.9 39.0 40.4 39.8
1971(22) 40.8 39.7 39.8 41.9 40.7
1972(23) 41.5 40.4 40.6 43.4 41.7
1973(24) 42.2 41.2 41.4 45.2 42.8
1974(25) 43.0 41.9 42.1 47.0 43.8
1975(26) 44.4 42.7 42.9 49.1 44.9

Note. Indicator 3 is the number of wives in the labor
force expressed as a percentage of total wives in the United
States.
a Both quadratic and cubic forms of polynomial regression
are presented.
bThe regression line is derived from 2/3 of the known data
points.
cPredicted Y's are rounded to number of places in original
data.
dValues are extrapolated beyond the data points used to
generate the regression equation.




Figure 4. Indicator 3: Observed Y's and predicted Y's by
method (vertical line separates values of
original regression from extrapolation).








When the lines are extrapolated beyond the original

values, however, the cubic form is clearly the "worst" fit

with a Sext(y.x) of 2.7 while the log-linear form is clearly

the "best" fit with a Sext(y.x) of .6.


Indicator 4

The mean and standard deviation for the Y values used

to generate the regression equations are 9.8 and 2.6, re-

spectively. The following regression equations were used to

derive Ŷ:

Linear Ŷ = 13.76707 + (-.13216)X

Quadratic Ŷ = 14.16426 + (-.19720)X + .00148 X²

Cubic Ŷ = 10.90554 + 1.15549 X + (-.07961)X² + .00125 X³

Log-linear log Ŷ = 1.12873 + (-.00496)X.

The goodness of fit of the regression lines derived from these

equations is indicated by r², r² change, and S_y.x in Table 8.

An ANOVA summary table is presented in Table 21 in the Appen-

dix. The overall F statistic for the cubic form of the poly-

nomial regression is significant (p<.01); the increase in r²

due to the third-degree polynomial is also significant (p<.01).

The F statistic for both the linear and log-linear methods

is significant (p<.05).

The average errors for the extrapolated Y values

(Sext(y.x)) according to method employed are (a) linear, 2.9;

(b) quadratic, 2.5; (c) cubic, 4.1; and (d) log-linear, 2.7.

Observed Y's and predicted values of Y for both the original

regression and the extrapolated lines are presented by method

in Table 9. These data are graphically represented in Figure

5.











Table 8

Indicator 4: Summary Statistics for

Prediction Equations by Method


                 r²               r² change   F          df      S_y.x

Linear           .42085                        7.27*     1,10    2.07

Quadraticᵃ       .42592                        3.34      2,9     2.17
                                  .00507        .08      1,9

Cubicᵃ           .90338                       24.93**    3,8     .94
                                  .47746      39.53**    1,8

Log-linearᵇ
  log Y          .42298                        7.33*     1,10    log .07715
  antilog Y      .36243/.42922ᶜ                                  2.06

Note. Indicator 4 is number of marriages in Florida
expressed as rate per 1,000 population in Florida.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
ᶜTwo methods of computing r² using antilogs of Y yielded
different results.
*p<.05
**p<.01








Table 9

Indicator 4: Observed Y's and Predicted Y's by Method


Predicted Y's by Method
Year(X) Observed Y's Linear Quadratica Cubica Log-linear


Original regressionb

1930(1) 11.6 13.6c 14.0 12.0 13.3
1940(11) 17.1 12.3 12.2 15.7 11.9
1950(21) 9.8 11.0 10.7 11.7 10.6
1960(31) 7.9 9.7 9.5 7.6 9.4
1963(34) 7.7 9.3 9.2 7.5 9.1
1964(35) 7.7 9.1 9.1 7.6 9.0
1965(36) 8.3 9.0 9.0 7.9 8.9
1966(37) 8.5 8.9 8.9 8.2 8.8
1967(38) 9.0 8.7 8.8 8.7 8.7
1968(39) 9.6 8.6 8.7 9.3 8.6
1969(40) 9.8 8.5 8.6 10.0 8.5
1970(41) 10.1 8.3 8.6 10.9 8.4

Extrapolationd

1971(42) 10.5 8.2 8.5 11.6 8.3
1972(43) 11.0 8.1 8.4 12.8 8.2
1973(44) 11.4 8.0 8.4 14.1 8.1
1974(45) 11.0 7.8 8.3 15.6 8.0
1975(46) 10.1 7.7 8.2 17.3 8.0


Note. Indicator 4 is number of marriages in Florida ex-
pressed as rate per 1,000 population.
aBoth quadratic and cubic forms of polynomial regression
are presented.
bThe regression line is derived from 2/3 of the known data
points.
cPredicted Y's are rounded to number of places in original
data.
dValues are extrapolated beyond the data points used to
generate the regression equation.












Figure 5. Indicator 4: Observed Y's and predicted Y's by
method (vertical line separates values of
original regression from extrapolation).









Thus, it would appear that for the original regression

the cubic form of the polynomial best fits the observed Y's.

This method accounts for 90% of the variance in Y with less

than half of the average error of the other methods.

When the lines are extrapolated beyond the original

values, however, the cubic form has the largest average

error (Sext(y.x) = 4.1). The quadratic form of the polyno-

mial is, in fact, the best predictor (Sext(y.x) = 2.5) of the

methods compared. Actually, for this set of data the mean

(9.8) of the observed values of Y used in the original re-

gression would have been the best predictor of the future

values of Y.
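
The comparison with the mean can be checked directly from the extrapolated rows of Table 9. The sketch below is illustrative only; the helper name `s_ext` is hypothetical, and the error for the mean is computed here rather than taken from the text.

```python
import math

def s_ext(observed, predicted):
    """Standard error for extrapolated values (divisor N)."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)

observed = [10.5, 11.0, 11.4, 11.0, 10.1]  # 1971-1975, Table 9
predictions = {
    "linear": [8.2, 8.1, 8.0, 7.8, 7.7],
    "quadratic": [8.5, 8.4, 8.4, 8.3, 8.2],
    "cubic": [11.6, 12.8, 14.1, 15.6, 17.3],
    "log-linear": [8.3, 8.2, 8.1, 8.0, 8.0],
    # Baseline: the mean of the Y values used in the original regression.
    "mean of fitted Y": [9.8] * 5,
}

errors = {name: s_ext(observed, preds) for name, preds in predictions.items()}
best = min(errors, key=errors.get)
for name, err in errors.items():
    print(name, round(err, 1))
print("smallest error:", best)
```

On these rounded table values the constant prediction of 9.8 has an average error of about 1.1, less than half that of the quadratic form.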


Indicator 5

The mean and standard deviation for the Y values used to

generate the regression equations are 4.6 and 1.0, respec-

tively. The following regression equations were used to

derive Ŷ:

Linear Ŷ = 4.10332 + .01692 X

Quadratic Ŷ = 3.16091 + .17126 X + (-.00350)X²

Cubic Ŷ = 1.59416 + .82162 X + (-.04249)X² + .00060 X³

Log-linear log Ŷ = .56596 + .00288 X.

The goodness of fit of the regression lines derived from

these equations is indicated by r², r² change, and S_y.x in

Table 10. An ANOVA summary table is presented in Table 22

in the Appendix. Only the overall F for the cubic form of

the polynomial regression is significant (p<.05); the F











Table 10

Indicator 5: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F           df      S_y.x

Linear           .04360                  .46      1,10    1.06

Quadraticᵃ       .22401                 1.30      2,9     1.00
                          .18041        2.09      1,9

Cubicᵃ           .92134                31.23**    3,8     .34
                          .69733       70.92**    1,8

Log-linearᵇ
  log Y          .12007                 1.36      1,10    log .10388
  antilog Y      .12779                                   1.07

Note. Indicator 5 is number of dissolutions of marriage
in Florida expressed as rate per 1,000 population.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
**p<.01








value for the increase in r2 for the cubic component is also

significant (p<.01).

The average errors for the extrapolated Y values

(Sext(y.x)) according to the method employed are (a) linear,

2.2; (b) quadratic, 3.1; (c) cubic, .5; and (d) log-linear,

2.1. Observed Y's and predicted values of Y for both the

original regression and the extrapolated lines are presented

by method in Table 11. These data are graphically repre-

sented in Figure 6.

For this set of data, the cubic form of the polynomial

regression is a superior predictor of the observed Y's. This

method accounts for 92% of the variance in Y with a S_y.x of

.34; the Sext(y.x) is .5, considerably less than the other

three methods with average error ranging from 2.1 to 3.1.

The quadratic form is definitely the least appropriate method

for this set of data since the curve bends in an opposite

direction to the observed Y values (see Figure 6).
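
Why the quadratic curve bends away from the observed values can be seen from the equation itself. The sketch below evaluates the reported quadratic for Indicator 5 and locates its turning point at X = -b/(2c); the function name `quadratic` is hypothetical, and X is coded so that 1 = 1930, as in Table 11.

```python
def quadratic(x, a=3.16091, b=0.17126, c=-0.00350):
    """Reported Indicator 5 quadratic: Y = a + b X + c X^2."""
    return a + b * x + c * x ** 2

# A parabola with a negative X^2 coefficient peaks at X = -b / (2c)
# and declines for every later X.
turning_point = -0.17126 / (2 * -0.00350)
print(round(turning_point, 2))

for x in (42, 43, 44, 45, 46):  # 1971-1975
    print(1929 + x, round(quadratic(x), 1))
```

The turning point falls near X = 24.5, well inside the fitted period, so every extrapolated value lies on the descending arm of the parabola while the observed rates were rising.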


Indicator 6

The mean for the Y values used to generate the regres-

sion equations is 16.9; the standard deviation, 1.3. The

following regression equations were used to derive Ŷ:

Linear Ŷ = 18.56785 + (-.36786)X

Quadratic Ŷ = 21.30892 + (-2.01250)X + .18274 X²

Cubic Ŷ = 22.82142 + (-3.58611)X + .59524 X² + (-.03056)X³

Log-linear log Ŷ = 1.26761 + (-.00900)X.

The goodness of fit of the regression lines derived from









Table 11

Indicator 5: Observed Y's and Predicted Y's by Method


Predicted Y's by Method
Year(X) Observed Y's Linear Quadratica Cubica Log-linear

Original regression b

1930(1) 2.5 4.1c 3.3 2.4 3.7
1940(11) 5.8 4.3 4.6 6.3 4.0
1950(21) 6.4 4.5 5.2 5.7 4.2
1960(31) 3.9 4.6 5.1 4.2 4.5
1963(34) 4.1 4.7 4.9 4.1 4.6
1964(35) 4.1 4.7 4.9 4.2 4.6
1965(36) 4.2 4.7 4.8 4.3 4.7
1966(37) 4.2 4.7 4.7 4.4 4.7
1967(38) 4.6 4.7 4.6 4.6 4.7
1968(39) 4.9 4.8 4.5 4.8 4.8
1969(40) 5.2 4.8 4.4 5.1 4.8
1970(41) 5.5 4.8 4.2 5.4 4.8

Extrapolationd

1971(42) 6.1 4.8 4.2 5.6 4.9
1972(43) 6.9 4.8 4.1 6.1 4.9
1973(44) 7.1 4.8 3.9 6.6 4.9
1974(45) 7.2 4.9 3.8 7.2 5.0
1975(46) 7.5 4.9 3.6 7.9 5.0

Note. Indicator 5 is number of dissolutions of marriage
in Florida expressed as rate per 1,000 population.
aBoth quadratic and cubic forms of polynomial regression
are presented.
bThe regression line is derived from 2/3 of the known data
points.
CPredicted Y's are rounded to number of places in original
data.
d Values are extrapolated beyond the data points used to
generate the regression equation.












Figure 6. Indicator 5: Observed Y's and predicted Y's by method (vertical line
separates values of original regression from extrapolation).








these equations is indicated by r², r² change, and S_y.x in

Table 12. An ANOVA summary table is presented in Table 23

in the Appendix. The overall F's for the quadratic and cubic

forms of polynomial regression are significant (p<.01); only

the increase in r2 due to the quadratic component is signifi-

cant (p<.01), however.

The average errors for the extrapolated Y values

(Sext(y.x)) according to the method employed are (a) linear,

1.2; (b) quadratic, 7.8; (c) cubic, 1.5; and (d) log-linear,

1.4. Observed Y's and predicted values of Y for both the

original regression and the extrapolated lines are presented

by method in Table 13. These data are graphically repre-

sented in Figure 7.

Thus, while the quadratic and cubic forms of polynomial

regression best fit the observed Y's for the original regres-

sion, they do not continue to be superior predictors. In

fact, the quadratic form has a Sext(y.x) of 7.8 while the

Sext(y.x) for the other three methods ranges from 1.2 to 1.5.

No method is clearly the best predictor of Y when values are

extrapolated beyond the original regression.

It should be noted that the Durbin-Watson d for the

linear and log-linear methods approaches the lower limits

of d and the possibility of serial correlation of the resid-

uals cannot be overlooked. Because of the small number of

observations involved in this data set (N = 8), interpreta-

tion of the Durbin-Watson d is more suggestive than con-

clusive.
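
For reference, the Durbin-Watson d is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below applies it to the linear residuals for Indicator 6; because it uses the rounded predicted values of Table 13, the result should not be expected to match the d obtained in the study, which is not reported here. The function name `durbin_watson` is hypothetical.

```python
def durbin_watson(residuals):
    """d = sum((e[t] - e[t-1])^2) / sum(e[t]^2), residuals in time order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Indicator 6, original regression, 1964-1971 (Table 13, rounded values).
observed = [19.7, 17.9, 16.8, 15.9, 15.7, 16.1, 16.8, 16.4]
linear = [18.2, 17.8, 17.5, 17.1, 16.7, 16.4, 16.0, 15.6]
residuals = [y - p for y, p in zip(observed, linear)]

print(round(durbin_watson(residuals), 2))
```

Values of d near 2 indicate no first-order serial correlation, while values toward 0 indicate positive serial correlation; whether a given d is significant depends on tabled bounds for the number of observations and predictors.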











Table 12

Indicator 6: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F           df      S_y.x

Linear           .46628                 5.24      1,6     1.04

Quadraticᵃ       .92655                31.54**    2,5     .42
                          .46026       31.33**    1,5

Cubicᵃ           .97205                46.37**    3,4     .29
                          .04550        6.51      1,4

Log-linearᵇ
  log Y          .45957                 5.10      1,6     log .02582
  antilog Y      .41427                                   1.03

Note. Indicator 6 is number of resident live births in
Florida expressed as rate per 1,000 population.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
**p<.01








Table 13
Indicator 6: Observed Y's and Predicted Y's by Method


Predicted Y's by Method
Year(X) Observed Y's Linear Quadratica Cubica Log-linear

Original regression

1964(1) 19.7 18.2c 19.5 19.8 18.1
1965(2) 17.9 17.8 18.0 17.8 17.8
1966(3) 16.8 17.5 16.9 16.6 17.4
1967(4) 15.9 17.1 16.2 16.0 17.0
1968(5) 15.7 16.7 15.8 16.0 16.7
1969(6) 16.1 16.4 15.8 16.1 16.4
1970(7) 16.8 16.0 16.1 16.4 16.0
1971(8) 16.4 15.6 16.9 16.6 15.7

Extrapolationd
1972(9) 14.8 15.3 18.0 16.5 15.4
1973(10) 13.7 14.9 19.5 15.9 15.1
1974(11) 13.4 14.5 22.1 14.7 14.7
1975(12) 12.5 14.1 23.5 12.7 14.4

Note. Indicator 6 is number of resident live births in
Florida expressed as rate per 1,000 population.
aBoth quadratic and cubic forms of polynomial regression
are presented.
bThe regression line is derived from 2/3 of the known data
points.
CPredicted Y's are rounded to number of places in original
data.
dValues are extrapolated beyond the data points used to
generate the regression equation.














Figure 7. Indicator 6: Observed Y's and predicted Y's by
method (vertical line separates values of
original regression from extrapolation).









Indicator 7

The mean and standard deviation for the Y values used

to generate the regression equations are 32.2 and 4.8, re-

spectively. The following regression equations were used to

derive Ŷ:

Linear Ŷ = 23.42857 + 1.95476 X

Quadratic Ŷ = 23.58928 + 1.85833 X + .01071 X²

Cubic Ŷ = 23.26430 + 2.19645 X + (-.07792)X² + .00657 X³

Log-linear log Ŷ = 1.38414 + .02662 X.

The goodness of fit of the regression lines derived from these

equations is indicated by r², r² change, and S_y.x in Table 14.

An ANOVA summary table is presented in Table 24 in the Appen-

dix. The overall F's for all methods are significant (p<.01);

however, the increases in r2 due to the higher order polyno-

mials are not significant.

The average errors for the extrapolated Y values

(Sext(y.x)) according to the method employed are (a) linear,

1.4; (b) quadratic, 1.3; (c) cubic, 1.8; and (d) log-linear,

2.4. Observed and predicted values of Y for both the original

regression and the extrapolated lines are presented by method

in Table 15. These data are graphically represented in

Figure 8.

There is very little difference in the predictive value

of the methods for the original regression. Each method

accounts for 99% of the variance in Y; the range of the S_y.x

for all methods is from .34 to .43.

When the lines are extrapolated beyond the original












Table 14

Indicator 7: Summary Statistics for

Prediction Equations by Method


                 r²       r² change   F            df      S_y.x

Linear           .99560               1358.03**    1,6     .34

Quadraticᵃ       .99572                581.74**    2,5     .37
                          .00012          .14      1,5

Cubicᵃ           .99588                322.27**    3,4     .41
                          .00016          .15      1,4

Log-linearᵇ
  log Y          .99333                893.91**    1,6     log .00577
  antilog Y      .99999                                    .43

Note. Indicator 7 is number of 3 to 5 year olds enrolled
in nursery school and kindergarten expressed as percentage
of total children 3 to 5 years old in the United States.
ᵃBoth quadratic and cubic forms of the polynomial regres-
sion are presented.
ᵇBoth r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r²(log Y) and r²
(antilog Y) may be due to rounding.
**p<.01