
Citation 
 Permanent Link:
 http://ufdc.ufl.edu/AA00048420/00001
Material Information
 Title:
 A mathematical model for allocation of school resources to optimize a selected output, by Jackson K. McAfee
 Creator:
 McAfee, Jackson Kirby, 1939
 Publication Date:
 1972
 Language:
 English
 Physical Description:
 x, 168 leaves. : ; 28 cm.
Subjects
 Subjects / Keywords:
 Educational research ( jstor )
Mathematical independent variables ( jstor ) Mathematical models ( jstor ) Mathematical variables ( jstor ) Multiple regression ( jstor ) Production estimates ( jstor ) Production functions ( jstor ) Regression coefficients ( jstor ) School districts ( jstor ) Schools ( jstor ) Dissertations, Academic  Educational Administration and Supervision  UF ( lcsh ) Educational Administration and Supervision thesis Ed. D ( lcsh ) School management and organization  Mathematical models ( lcsh ) City of Gainesville ( local )
 Genre:
 bibliography ( marcgt )
nonfiction ( marcgt )
Notes
 Thesis:
 Thesis (Ed. D.)  University of Florida.
 Bibliography:
 Bibliography: leaves 161-167.
 General Note:
 Typescript.
 General Note:
 Vita.
Record Information
 Source Institution:
 University of Florida
 Holding Location:
 University of Florida
 Rights Management:
 The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for nonprofit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
 Resource Identifier:
 022776941 ( ALEPH )
14081861 ( OCLC )

Full Text 
A MATHEMATICAL MODEL FOR ALLOCATION OF
SCHOOL RESOURCES TO OPTIMIZE
A SELECTED OUTPUT
By
JACKSON K. McAFEE
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA IN PARTIAL
FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF EDUCATION
UNIVERSITY OF FLORIDA 1972
ACKNOWLEDGMENTS
The writer wishes to acknowledge the efforts of certain persons who provided encouragement and assistance in the preparation of this study.
The writer is particularly indebted to Dr. Michael Y. Nunnery, chairman of the supervisory committee, for his interest and many valuable suggestions, and to other members of the committee, Dr. Ralph B. Kimbrough and Dr. Irving J. Goffman.
Thanks are given to Dr. Charles M. Bridges, Jr., Dr. Donald W. Hearn, and Dr. Gene A. Barlow for time generously spent discussing with the writer various aspects of the study.
Finally, deep appreciation is extended to the writer's wife, Esther, who spent long and arduous hours in the typing of the original manuscript.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER
I INTRODUCTION
The Problem
Assumptions
Definition of Terms
Review of the Literature
Procedures
Organization of the Remainder of the Study
II DEVELOPMENT AND DESCRIPTION OF THE MATHEMATICAL MODEL
Conceptual Considerations
Mathematical Considerations
A Mathematical Model
III EMPIRICAL APPLICATION OF THE MODEL
Design of the Empirical Application
Results of the Empirical Application
IV SUMMARY, DISCUSSION AND SUGGESTIONS FOR FUTURE STUDIES
Summary
Discussion
Suggestions for Future Studies
APPENDIX
SELECTED BIBLIOGRAPHY
BIOGRAPHICAL SKETCH
LIST OF TABLES
TABLE
1 VARIABLES POSTULATED AS PREDICTORS OF LOCAL SCHOOL DISTRICT PRODUCTIVITY FOR A GIVEN STATE
2 BASIC DATA DESCRIPTION
3 INTERACTION TERMS TESTED FOR THE PRODUCTION FUNCTION
4 REGRESSION COEFFICIENTS FOR THE PRODUCTION FUNCTION (DEPENDENT VARIABLE: READING ACHIEVEMENT, X21)
5 AN EXAMINATION OF THE INTERACTION BETWEEN THE PERCENTAGE OF TEACHERS WITH LESS THAN FOUR YEARS' TRAINING (X14) AND MEDIAN TEACHING SALARY (X17)
6 REGRESSION COEFFICIENTS FOR THE BUDGET EQUATION (DEPENDENT VARIABLE: INSTRUCTIONAL EXPENDITURES PER PUPIL, X24)
7 VARIABLES MANIPULATED IN THE MATHEMATICAL PROGRAMMING ANALYSIS
8 EFFECT OF DISTRICT SIZE ON ACHIEVEMENT AND THE DISTRIBUTION OF SCHOOL RESOURCES (X14 = 5.87)
9 THE EFFECT OF INCREMENTING THE BUDGET ON ACHIEVEMENT (X22 = 9,180)
LIST OF FIGURES
FIGURE
1 A MODEL OF THE EDUCATIONAL PROCESS
2 METHOD FOR INVESTIGATING TECHNICAL EFFICIENCY
3 AN EXAMPLE OF A CONCAVE FUNCTION f
4 AN EXAMPLE OF A NONCONCAVE FUNCTION f
Abstract of Dissertation Presented to the Graduate Council
of the University of Florida in Partial Fulfillment
of the Requirements for the Degree of Doctor of Education
A MATHEMATICAL MODEL FOR ALLOCATION OF
SCHOOL RESOURCES TO OPTIMIZE A SELECTED OUTPUT
By
Jackson K. McAfee
August, 1972
Chairman: Dr. Michael Y. Nunnery
Major Department: Educational Administration
Educational decision-makers are faced with the task of allocating school resources in such a way as to maximize student outcomes. Decision-makers are constrained in their choice of plans by a limited budget and by social, political, and legal forces. The rationale for this study was a response to the need for studies that demonstrate the utility and limitations of mathematical models as aids in improving educational planning and decision-making.
The problem of the study was to develop a mathematical model to facilitate the decision-making process in selected areas of educational activity by optimizing the allocation of scarce resources and to empirically illustrate its application. In development of the model attention was given to
1. Determining the educational production function that describes the input-output relationship between selected variables;
2. Determining the optimal combination of inputs to maximize output subject to certain costs and other constraints;
3. Determining the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.
In the empirical application, data were utilized to illustrate the model. The data consisted of aggregate measures on input and output factors in 181 local school districts in a given state. Twenty-four variables were measured providing information on (1) student and community inputs, (2) median reading achievement at the sixth-grade level, and (3) selected school resources. In particular, certain teacher characteristics were considered: experience, training, and level of salary. Also, classroom size, pupil-support personnel ratio, and the size of the district were used.
Given these inputs (i.e., student, community, and school) into the educational process, stepwise multiple regression was used to estimate the parameters of the production function (reading achievement as the dependent variable) and the budget equation (per-pupil instructional expenditures as the dependent variable). The proportion of variance explained in the dependent variables was 0.79 and 0.94, respectively.
The optimization strategy employed the techniques of mathematical programming with the production function as an objective function and the budget equation as one of the constraints. Specifically, the SUMT (Sequential Unconstrained Minimization Technique) algorithm was employed to solve the mathematical programming problems. Looking at the population of school districts as a whole, the analysis revealed that by reallocating existing school resources and maintaining a minimum standard for the pupil-support personnel ratio, the predicted increase in reading achievement that would result was 0.38 standard deviations. Given the existing allocation of school resources, if the per-pupil budget was incremented by $100 so as to maximize reading achievement, the predicted increase would be 0.73 standard deviations.
Throughout the analysis it was apparent that the percentage of teachers with advanced training was the dominant effect followed by median teacher salary. The importance of salary was a function of the percentage of teachers with less than four years' training; the greater the percentage, the greater the effect of salary. When the percentage was zero, the effect of salary was minor compared to the effect of advanced training.
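The barrier idea behind SUMT can be sketched on a one-input toy problem. This is an illustration only: the quadratic output function, the price of 50, the budget of 100, the log-barrier form of the penalty, and the names `minimize_1d` and `sumt` are all assumptions, not the production function, budget equation, or program used in the study.

```python
import math


def minimize_1d(func, lo, hi, iterations=100):
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def sumt(objective, constraint, lo, hi, r=1.0, shrink=0.1, rounds=8):
    """Minimize objective(x) subject to constraint(x) > 0.

    Each round solves an unconstrained problem in which a log barrier,
    weighted by r, keeps the search inside the feasible region; r is
    then reduced so the solutions approach the constrained optimum.
    """
    x = None
    for _ in range(rounds):
        def penalized(t, r=r):
            g = constraint(t)
            if g <= 0.0:
                return math.inf  # infeasible point: reject it
            return objective(t) - r * math.log(g)

        x = minimize_1d(penalized, lo, hi)
        r *= shrink
    return x


# Hypothetical problem: output 6x - x^2, price 50 per unit, budget 100.
def negative_output(x):
    return -(6.0 * x - x * x)   # minimizing this maximizes output


def budget_slack(x):
    return 100.0 - 50.0 * x     # must stay positive


best_x = sumt(negative_output, budget_slack, lo=0.0, hi=2.0)
```

Unconstrained, this toy output peaks at x = 3, but the budget permits at most x = 2, so the successive solutions move toward the budget boundary from inside the feasible region.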
CHAPTER I
INTRODUCTION
For too long the United States has tolerated a system of compulsory universal education without pertinent public accountability. As a result critical deficiencies are now becoming evident--such as the failure to serve inner-city students effectively. . . . They (public) have a right to know how well students are learning as well as a responsibility to pay for the educational program.1
Across the land, state legislatures have sounded the clarion call of "accountability" for educational institutions. Politicians have asked for specific and quantitative information on the schools' output. Educators' requests for increased educational expenditures have been matched by legislators' demands for increased productivity.
Concurrently, educational institutions have begun efforts to respond to the demands of new technologies. Experimentation with the concepts of systems analysis and operations research has occurred in an attempt to place
1L. J. Stiles, "Assessment of Educational Productivity," Journal of Educational Research, LX (April, 1967), 1.
decision-making activities on a more rational plane. The Planning-Programming-Budgeting System (PPBS) found its place in the public sector, first at the federal level and then in state and local governmental agencies.2 "Attention is focused on choosing from many possible objectives those specific objectives to be achieved, and then choosing from alternative courses of action that plan which will accomplish the chosen objectives at the lowest possible cost, or accomplish some more optimum set of objectives at a specified cost."3 For school purposes, the core of PPBS is the program budget that reports programs to be accomplished and allocates expenditures in terms of objectives relating to student outputs rather than in terms of objects to be purchased. Two outcomes of a PPB system are cost-effectiveness and cost-benefits analyses. Cost-effectiveness is concerned with the extent to which given dollar inputs if utilized in alternative ways would increase certain educational outputs. The main requirement of cost-benefits
2Marvin C. Alkin, "Evaluating the Cost-Effectiveness of Instructional Programs," UCLA Symposium on Problems in the Evaluation of Instruction, Sponsored by the Center for the Study of Evaluation (Los Angeles: December, 1967).
3H. Thomas James, The New Cult of Efficiency and Education (Pittsburgh: University of Pittsburgh Press, 1969), p. 40.
analysis is that both input and output measures be specified in the same unit, namely dollars; then the ratio of benefits to costs for a given program is examined.4
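The distinction between the two analyses can be made concrete with a small sketch. The two programs and every dollar and gain figure below are invented for illustration, and the names `Program`, `benefit_cost_ratio`, and `gain_per_dollar` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str
    cost: float      # dollars per pupil
    benefit: float   # dollars per pupil, needed only for cost-benefits analysis
    gain: float      # achievement gain in test-score points


def benefit_cost_ratio(p):
    """Cost-benefits view: benefits and costs are both in dollars."""
    return p.benefit / p.cost


def gain_per_dollar(p):
    """Cost-effectiveness view: output stays in its own unit."""
    return p.gain / p.cost


# Invented figures for two alternative uses of the same funds.
programs = [
    Program("tutoring", cost=200.0, benefit=260.0, gain=4.0),
    Program("smaller classes", cost=500.0, benefit=550.0, gain=6.0),
]

best_by_ratio = max(programs, key=benefit_cost_ratio)
best_by_gain = max(programs, key=gain_per_dollar)
```

The cost-benefits ratio requires a dollar value for the output, whereas the cost-effectiveness comparison needs only the achievement gain and the cost.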
With new developments in technology, economists and educational researchers have increased their efforts in the study of the internal efficiency of education. These studies require measurements of both the inputs and outputs of educational systems, a difficult task. A further difficulty is that there are too many variables affecting school performance for researchers to experimentally control and manipulate them all at once. "Thus researchers have had to rely on survey-type correlation analyses and infer from them the causes of variation in educational success."5 Hence, input-output models from economics have been adapted for the study of productivity in education. This has not been done without experiencing difficulties. Usually in industry, input-output relationships are such that empirical relationships can be determined within a theoretical framework. Such is not the case in education, where the study of behavioral characteristics lacks both a theoretical and a quantitative basis.
4Alkin.
5Michigan Department of Education, Research into Correlates of School Performance, Assessment Report No. 3 (1970), p. 14.
Numerous input-output studies have taken place in education in recent years; however, most of these have not considered questions of cost-effectiveness. Of the cost-effectiveness studies, few have related their techniques to input-output analysis. Consequently, the focus of this study was the development of a model that integrates both concepts and leads to decision-making on a plane conceived by the advocates of PPBS. The demands of the age are such that if the educational enterprise is to survive, progress must be made towards the development of a theory of decision-making that effectively relates the inputs of a school system to its outputs.
The Problem
Statement of the Problem
The problem of the study was to develop a mathematical model to facilitate the decision-making process in selected areas of educational activity by optimizing the allocation of resources and to empirically illustrate its application. In development of the model attention was given to
1. Determining the production function that describes the input-output relationship between selected variables;
2. Determining the optimal combination of inputs to maximize output subject to certain costs and other constraints;
3. Determining the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.
In the empirical application, data obtained from the National Educational Finance Project were utilized to illustrate the model. On the basis of these data an educational production function was estimated and optimal allocations of resources were determined subject to certain constraints.
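Steps 2 and 3 can be sketched with a deliberately simplified stand-in: a linear production function, a single linear budget constraint, and bounds on each input. Under those assumptions the optimum is found by funding inputs in order of marginal effect per dollar. The coefficients, prices, bounds, and the names `allocate` and `predicted_output` are hypothetical; the study itself used a nonlinear programming algorithm rather than this greedy rule.

```python
def allocate(effects, prices, lower, upper, budget):
    """Maximize sum(effects[i] * x[i]) subject to
    sum(prices[i] * x[i]) <= budget and lower[i] <= x[i] <= upper[i].

    Assumes all effects and prices are positive, in which case funding
    inputs in order of effect per dollar is optimal for this linear case.
    """
    x = [float(v) for v in lower]
    remaining = budget - sum(p * v for p, v in zip(prices, x))
    if remaining < 0:
        raise ValueError("budget cannot cover the lower bounds")
    order = sorted(range(len(x)),
                   key=lambda i: effects[i] / prices[i], reverse=True)
    for i in order:
        step = min(upper[i] - x[i], remaining / prices[i])
        x[i] += step
        remaining -= step * prices[i]
    return x


def predicted_output(intercept, effects, x):
    """Value of the linear production function at allocation x."""
    return intercept + sum(b * v for b, v in zip(effects, x))


# Hypothetical inputs: marginal effects, unit prices, and bounds.
effects = [0.5, 0.2]
prices = [10.0, 2.0]
lower = [0.0, 0.0]
upper = [10.0, 20.0]

# Step 2: optimal mix under the existing budget.
base = allocate(effects, prices, lower, upper, budget=60.0)
# Step 3: optimal mix after the budget is incremented.
raised = allocate(effects, prices, lower, upper, budget=160.0)
gain = (predicted_output(0.0, effects, raised)
        - predicted_output(0.0, effects, base))
```

Re-solving the same problem at successively larger budgets traces out how the optimal mix of inputs and the predicted output respond to budget increments.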
Delimitations
The study was confined as follows:
1. To educational outcomes of a cognitive nature, that is, behaviors dealing with the recall or recognition of knowledge and development of intellectual abilities and skills;
2. To the investigation of allocative efficiency; using the dollar budget in such a way that the resources that are purchased yield the best outcome that can be attained for the given budget (i.e., to identify feasible ways of improving output within the context of the present system by using the components presently employed in the system);
3. To an investigation of school districts within a given state in the empirical application;
4. To an investigation of aggregate educational policy (what is true for the population as a whole may not be true for individual school districts).
Limitations
The study was subject to the following limitations:
1. The use of unidimensional outcomes;
2. The explanatory power of the particular input variables selected and their usefulness from a decision-making context;
3. The problems associated with imprecise measurement of the variables selected;
4. The use of a developmental design and the limitations inherent in determining a mathematical model:
a. the lack of a theoretical basis for postulating a specific model;
b. the necessity for a statistical dimension and the resulting restrictions on the complexity of the model design;
c. because social behavior is less predictable than physical phenomena, mathematical models are a less accurate approximation of reality in the social sciences;
5. The use of survey data and the inability to attach causality to estimated parameters.
Justification for the Study
Since the most commonly used tool for measuring economic relationships is multiple regression analysis, it is no surprise that as researchers have examined the internal efficiency of education, regression analysis has been the primary tool.6 Also, the economists' stress on regression coefficients rather than the proportion of variance explained is a new development in educational research.7 As the review of the literature describes, educational production functions have had too many flaws to be effectively used in the past.8 Nevertheless, an assessment of the optimality of resource allocation in education requires a knowledge of the production process.9 Benson, in considering
6Arthur S. Goldberger, Topics in Regression Analysis (New York: Macmillan, 1968), p. 1.
7Mary Jane Bowman, "Economics of Education," Review of Educational Research, XXXIX (December, 1969), 642.
8Elchanan Cohn, "Towards Rational Decision-Making in Secondary Education," Institute for Research on Human Resources (University Park: Pennsylvania State University, 1970), p. 8.
9Ibid., p. 2.
resource allocation in the education process to attain objectives, has stated that the following conditions must be met: (1) that objectives can be precisely stated; (2) units in which the objectives are achieved can be defined; (3) variables which influence the achievement of an objective can be identified and that units of these variables can be defined and priced; and (4) the relationships between input variables and objectives can be specified.10 Bowman has stated, "Proper identification of production functions is indeed necessary for sound cost-effectiveness or cost-benefits analysis in educational decision making."11
Bowles has stated, "Without knowledge of the educational production process, we are unable to estimate the relative effectiveness of each factor input per unit of cost, nor are we able to compare the relative productivity of resources devoted to different types of education or noneducational purposes." 12 Entwisle and Conviser noted that
10Charles S. Benson, The School and the Economic System (Chicago: Science Research Associates, 1966), p. 102.
11Bowman, p. 652.
12Samuel Bowles and Henry M. Levin, "More on Multicollinearity and the Effectiveness of Schools," Journal of Human Resources, III (Summer, 1968), 28.
input-output analysis deals only with the possibility of certain states of affairs. It is not suitable for making determinations as to the most desirable state of affairs, such as producing certain outputs at minimum costs. "There has developed a more general approach known as linear programming, which is appropriate for optimization problems, and input-output analysis turns out to be a special case of linear programming."13 Bowles stated that "the task of statistical and educational researchers should be directed towards estimating the structural parameters of an equation representing a learning process."14
Thus, if efficient resource allocation is to be more than a figure of speech, the previous statements must be considered. To the extent that a more effective marshaling of school resources maximizes student outputs, the attainment of the goals of education is enhanced and support for proper financing is generated. Hence, the study was justifiable for the following reasons:
13Doris R. Entwisle and Richard Conviser, "Input-Output Analysis in Education," High School Journal, LII (January, 1969), 194.
14Samuel Bowles, Planning Educational Systems for Economic Growth (Cambridge, Mass.: Harvard University Press, 1969), p. 400.
1. Refinements in the techniques that develop production functions are needed if a meaningful analysis is to occur;
2. Appropriate optimization techniques must be developed if the goals of resource allocation are to be met;
3. Effective interfacing between the production function and the budget equation must result if cost-effectiveness analysis is to be realized.
Assumptions
1. It was assumed that instruments used to measure inputs and outputs in the model were valid.
2. It was assumed that cost per unit of input could be determined for certain inputs used in the model.
3. It was assumed that a relationship existed between the inputs and outputs used in the model.
Definition of Terms
Budget equation. The relationship between the dollar budget accorded educational decision-makers and the transformation of that budget into school inputs.
Cost-effectiveness analysis. A procedure by which the costs of alternative means of achieving a stated effectiveness or, conversely, the effectiveness of alternative means for a given cost, are compared to determine that alternative or combination of alternatives that either gives the greatest expected effectiveness for a given expected cost or a given expected effectiveness for the least expected cost.
Constraint. Any procedure, practice or condition that limits the manner in which an activity or decision-making is carried out.
Decision-making. The series of actions and interactions through which questions or issues are resolved or disposed.
Interfacing. The process of using certain variables measured in the same units in both the production function and the budget equation for the purpose of optimization.
Mathematical programming. A systematic method for finding maximum and minimum values of mathematical functions subject to certain constraints.
Marginal effect. The increase in output which can be obtained by a unit increase in an input.
Multiple regression analysis. The process of estimating a variable $y$ from the knowledge of several predictor variables $x_1, x_2, \ldots, x_n$.
Optimization. The best or most favorable amounts of inputs for the purposes of maximizing outputs or minimizing costs.
Parameter. A measure computed from all observations in a population.
Production function. The relationship between an outcome of education and the human and material resources which education consumes.
Resource allocation. The process of assigning or allocating resources among the various components of an educational system.
Review of the Literature
The review summarizes past studies which were pertinent to the purposes of the present study. The main areas of concern were
1. Input-output analysis,
2. Cost-effectiveness analysis,
3. Methodological concerns.
Since methodological considerations were of prime importance, emphasis was focused on this aspect.
Input-Output Analysis
In an attempt to determine factors that are related to school performance, educators have employed various models or research paradigms. Input-output models have their origins in the work of Leontief in economics. Leontief conceptualized a transactions table for displaying relationships between economic inputs and outputs for purposes of budget analysis and planning.15 Hoffenberg applied this model to a school district budget. The assumption was that the activities which a school district carries on are interrelated, hence Hoffenberg determined the structural framework of these relationships.16 The goal of such input-output studies is to obtain estimates of the marginal effects of an increase in a given input variable.
Researchers have adapted the input-output model of Leontief in an attempt to account for the causes of variation in school performance. "Researchers employing this
15Marvin C. Alkin, "The Use of Quantitative Methods as an Aid to Decision-Making in Educational Administration," A paper read at a meeting of the American Educational Research Association (Los Angeles: 1969).
16Marvin Hoffenberg, "Application of Leontief Input-Output Analysis to School District Budgeting," Center for the Study of Evaluation: Working Paper No. 12 (Los Angeles: UCLA Graduate School of Education, 1970).
paradigm have: (1) identified a criterion of school performance as a dependent variable and measures thought to influence performance as independent variables; (2) operationally measured these variables in a sample of educational systems; (3) computed relationships between independent and dependent variables; and (4) drawn inferences from the relationships as to what factors, either singly or in combination, account for variation in school performance."17 The mathematical relationship between inputs and an output is termed a production function. When the outputs are behavioral changes in students, their relationships to inputs are termed psychological production functions.18 Such a function seeks to relate performance increments to inputs.
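Step (3) of this paradigm, computing the relationship between inputs and an output, can be sketched as an ordinary least-squares fit of a linear production function. The five observations below are invented so that the output is exactly 1 + 2(first input) + 3(second input), and the helper names `solve` and `fit` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(rows, y):
    """Least-squares coefficients [b0, b1, ..., bn] from the
    normal equations (X'X) b = X'y, with an intercept column."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented observations: two inputs per unit and one output measure.
inputs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
output = [1.0, 3.0, 4.0, 6.0, 8.0]

coefficients = fit(inputs, output)   # intercept, then one slope per input
```

Each fitted slope is an estimate of the marginal effect of its input with the other input held constant; with survey data it describes association rather than demonstrated cause.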
Three types of studies exist which seek to determine the relationships between school inputs and educational outcomes.
The first type of study attempts to link total educational expenditures to output measures. The second group includes those that have attempted to estimate the effects of various functional
17Michigan Department of Education, p. 1.
18J. Alan Thomas, The Productive School: A Systems Analysis Approach to Educational Administration (New York: Wiley & Sons, Inc., 1971), p. 12.
components of expenditure on school outcomes, and the third set of studies represents estimates of the relationships between resources as measured in physical terms and school outputs (educational production functions).19
For studies of the first type, emphasis was placed on the financial side of the ledger (i.e., more money does more things) without concurrence as to what is quality.20 Actually, quality of education was seen to be a measure of school practices. For studies of the second type, emphasis was placed on determining which group of resources, that the school purchases, gives the highest educational returns per dollar input. A problem was that these studies did not address themselves to how the resources were being used. Given that one has purchased a particular combination of physical resources, there are alternative ways of utilizing these resources to affect output.21
With the advent of systems analysis in industry and the concern with the nature of output, studies of the third
19Henry M. Levin, "The Effect of Different Levels of Expenditure on Educational Output," Economic Factors Affecting the Financing of Education (Gainesville, Florida: National Educational Finance Project, 1970), II, 185.
20Michigan Department of Education, pp. 1, 2.
21Levin, pp. 182, 188.
type have emerged. Kershaw and McKean pioneered the application of systems analysis to education. "This work outlined persuasively and clearly the values of treating education as a system amenable to quantitative analysis, capable of evaluation in terms of efficiency, and responsive to policy decisions regarding inputs and administrative arrangements."22 After this, several input-output studies followed that sought to relate various measures of student performance to school and community characteristics. Thomas, Benson, Coleman, Burkhead, and Kiesling sought to determine the factors most related to achievement. In each of these studies, the most highly related characteristics were found to be socioeconomic.23 Most of these studies used regression methods to determine the production functions. Criticisms of this methodology are discussed in a later section.
Cost-Effectiveness Studies
It is assumed that resources should be allocated so as to maximize the achievement of a given
22Jerry Miner, "Financial Support of Education," Implications for Education of Prospective Changes in Society, ed. by Edgar L. Morphet and Charles O. Ryan (Denver: Designing Education for the Future, an Eight-State Study, 1967), p. 313.
23Michigan Department of Education, p. 15.
objective or set of objectives. Cost analysis provides a means for examining the manner in which resources are used.24
Temkin stated that cost-effectiveness methods are appropriate when the decision structure is suitable for planned and incremental improvement of an existing system with a time dimension weighted heavily towards the present and immediate future.25 Thomas described five elements of cost-effectiveness analysis: (1) the objective, (2) the alternatives, (3) costs, (4) operational model, and (5) a decision rule.26 If the operational model is the production function described earlier, then one should be able to estimate the allocative pattern among inputs that will yield the largest increase in output within a limited budget. Few such studies have been carried out for educational productivity, in part, because the 'prices' of such inputs are difficult to derive.27 As a result, most input-output models used to date have not included costs.28
24Thomas, p. 41.
25Sanford Temkin, "A Comprehensive Theory of Cost-Effectiveness: Administration for Change Program" (Philadelphia: Research for Better Schools, Inc., April, 1970), p. 41.
26Thomas, p. 82.
27Levin, p. 191.
28Thomas, p. 17.
One exception is Hoffenberg, who applied an interdependency approach in an input-output framework to a school district budget. Using legal program descriptions he was able to determine
1. interprogram flows: the input distributions contributing to output,
2. direct purchases per dollar of output (what is required from each activity to give $1 of output to a designated activity),
3. direct and indirect requirements per $1 of final output from each of the activities.29
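In the standard Leontief framework, the direct purchases per dollar of output form a coefficient matrix A, and the direct and indirect requirements per $1 of final output are the entries of the inverse of (I - A). The sketch below works a two-activity case; the coefficients are invented and the function name `total_requirements_2x2` is hypothetical, so nothing here reproduces Hoffenberg's figures.

```python
def total_requirements_2x2(a):
    """Return (I - A)^-1 for a 2x2 direct-requirements matrix A.

    a[i][j] is the dollar amount activity i must supply directly
    for each $1 of output of activity j.
    """
    p, q = 1.0 - a[0][0], -a[0][1]
    r, s = -a[1][0], 1.0 - a[1][1]
    det = p * s - q * r
    if det == 0.0:
        raise ValueError("(I - A) is singular")
    return [[s / det, -q / det],
            [-r / det, p / det]]


# Invented direct purchases per dollar of output for two activities.
direct = [[0.1, 0.2],
          [0.3, 0.1]]

total = total_requirements_2x2(direct)
# total[i][j]: dollars of activity i needed, directly and indirectly,
# to deliver $1 of final output from activity j.
```

The diagonal entries exceed 1 because each activity must cover its own final output plus whatever the other activity draws from it along the way.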
It must be noted, however, that the outputs of Hoffenberg's study were school programs rather than student outputs. Nevertheless, where psychological production functions were involved, Thomas stated that input-output relationships could be measured in terms of costs per student per unit of time for a given input. As an example, to determine the direct cost per student hour of a given service would require cost figures for (1) teachers' salaries, (2) other salaries of personnel who support the service, (3) space, and (4) equipment and materials.30
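The cost-per-student-hour calculation amounts to summing the four cost components and dividing by the student hours served. A minimal sketch follows; all dollar amounts, the enrollment, the hours, and the function name `cost_per_student_hour` are invented for illustration.

```python
def cost_per_student_hour(teacher_salaries, other_salaries, space,
                          equipment_and_materials, students, hours):
    """Direct cost of a service per student hour.

    The four cost arguments are the components listed in the text;
    students * hours is the number of student hours served.
    """
    student_hours = students * hours
    if student_hours <= 0:
        raise ValueError("student hours must be positive")
    total = (teacher_salaries + other_salaries
             + space + equipment_and_materials)
    return total / student_hours


# Hypothetical service: 25 students for 160 hours each.
unit_cost = cost_per_student_hour(
    teacher_salaries=9000.0,
    other_salaries=1500.0,
    space=1000.0,
    equipment_and_materials=500.0,
    students=25,
    hours=160,
)
```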
In designing a cost-effectiveness model for elementary and secondary education, Abt made the following recommendations:
29Hoffenberg, p. 6.
30Michigan Department of Education, p. 50.
1. The relative contributions of home, peer influence, and school instruction to student achievement should be determined,
2. The relative contributions of manipulable variables in the school environment which contribute most heavily to student attitude and achievement change should be determined,
3. The coefficients and parameter settings for environmental influences on student attitude and achievement change should be determined with enough accuracy to allow useful predictions.31
Cohn discussed the use of linear programming as a means of optimizing outputs. He regarded existing production functions as being incomplete. Hence, he used suboptimization techniques where he varied at most three variables simultaneously. Consequently, the overall results were less than optimal.32 The New York State Education Department studied the feasibility of using state department data to develop multiple regression equations required for simulating the effects of alternative expenditure configurations on pupil achievement. Various
31Clark C. Abt, Design for an Elementary and Secondary Education Cost-Effectiveness Model (Cambridge, Mass.: Abt Associates, Inc., 1967), II, 142.
32Cohn.
configurations of input were considered to maximize output, but optimization techniques were not discussed.33
Methodological Considerations
For most production processes, an increase in the quantity of an input is expected to yield an increase in output. However, Ribich stated,
Observation of the effect may be difficult because the additional resource input is small relative to other changes that are occurring simultaneously. The "upshot" can be difficulty in statistically sorting out the independent influence of the additional inputs from the other influences at work, many of which are unmeasured and unmeasurable. The problem may be so severe as to submerge almost completely any evidence of output response to an input change.34
Wynne stated that most input-output studies in education have used the techniques of multiple regression analysis.35 In light of Ribich's statement, the use of multiple regression analysis must be examined with particular attention to interpretation of results.
A brief description of the method follows.
33New York State Education Department, "Technical Report of a Project to Develop Education Cost-Effectiveness Models for New York State" (Albany: Bureau of School Progress and Evaluation, March, 1970).
34Thomas I. Ribich, "The Effect of Educational Spending on Poverty Reduction," Economic Factors Affecting the Financing of Education, II, 214.
35Edward Wynne, "School Output Measures as Tools for
Change," Education and Urban Society, XII (November, 1969).
The process whereby performance on a single criterion variable is predicted from a knowledge of several predictor variables is called multiple regression. Predictors, undoubtedly, will differ in their predictive efficiency. Therefore, computational processes are directed towards determining a separate regression weight (partial regression coefficient) for each predictor, in order to achieve the best possible prediction. The result may be a regression equation of the form:
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$
where $y$ = criterion (dependent) variable, $\hat{y}$ = predicted value, and $x_i$ = predictor (independent) variable. A measure of overall efficiency in prediction is given by $R$ (multiple correlation coefficient), which is the correlation of predicted values $\hat{y}$ to actual values $y$. More importantly, $R^2$ gives us the proportion of variance of $y$ accounted for by the regression relationship.36
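The description of multiple regression given above can be illustrated with a minimal sketch. The data are randomly generated for illustration only, the function name `fit_ols` is hypothetical, and NumPy's least-squares routine is assumed as the fitting method; none of this is taken from the study itself. The sketch estimates the coefficients, forms the predicted values, and computes the proportion of variance accounted for.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bn*xn by least squares.

    Returns the coefficient vector b (b[0] is the intercept), the
    predicted values y_hat, and the proportion of variance R^2.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])   # column of ones carries b0
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    ss_res = np.sum((y - y_hat) ** 2)           # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)        # total variation of y
    return b, y_hat, 1.0 - ss_res / ss_tot

# Illustrative (made-up) data: two predictors and one criterion variable.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=50)

b, y_hat, r2 = fit_ols(X, y)
R = np.corrcoef(y_hat, y)[0, 1]   # multiple correlation coefficient
print("coefficients:", b)
print("R:", R, "R^2:", r2)
```

When an intercept is included, the square of the correlation between predicted and actual values coincides with the proportion of variance computed from the sums of squares, which is the sense in which $R^2$ is used in the text.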
The partial regression coefficient $b_i$ estimates the marginal effect upon $y$ with all other $x$'s held constant.37 However, if the sample variation of $x_i$ is small, the imprecision which attaches to its coefficient will tend to be large.38 If the chief concern is with the marginal effect, then a high correlation is needed between $y$ and $x_i$, or the confidence limits on $B_i$ (the parameter estimated by $b_i$) will be too wide.39 Next, the reliability of the instruments used to measure the $x_i$ must be considered. Regression coefficients $b_i$ will tend to overestimate $B_i$ with less than perfect reliability of the instruments used.40
36Goldberger.
37Ibid., p. 26.
But of all of the problems discussed thus far, the most serious is the problem of multicollinearity. Multicollinearity arises when some or all of the explanatory variables are so highly correlated one with another that it becomes very difficult, if not impossible, to disentangle their influences and obtain a reasonably precise estimate of their separate effects.41 Gordon stated that small variations among the correlations of a highly related set of independent variables can create large variations among their regression coefficients.42 A standard approach to this problem has been to utilize additional information as an aid in estimation. Knowledge of the ratios of regression coefficients from theory or from previous samples can be used. Even incorrect information can reduce the imprecision of estimation by using Bayesian techniques.43 Finally, one may ask if $\beta_i$ (beta weight, or standardized regression coefficient) is not a more effective statistic. Bohrnstedt stated that the regression coefficient $b_i$ is more appropriate for the study of change than $\beta_i$, since $b_i$ is relatively stable across subsamples of a population where $\beta_i$ may vary significantly as a function of the standard deviations of the variables in the sample.44
38Ibid., p. 60.
39Phillip Lyle, Regression Analysis of Production Costs and Factory Operations (New York: Hafner Publishing Co., Inc., 1957), p. 65.
40George W. Bohrnstedt, "Observations on the Measurement of Change," Sociological Methodology, ed. by Edgar F. Borgatta (San Francisco: Jossey-Bass, Inc., 1969), p. 128.
41Goldberger, p. 80.
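The effect of multicollinearity on the precision of regression coefficients can be seen in a small simulation. The sketch below is illustrative only: the data are randomly generated, the helper name `coef_std_errors` is hypothetical, and the classical ordinary least squares formula for coefficient standard errors is assumed. It compares the standard errors obtained when two predictors are nearly independent with those obtained when one predictor is almost a copy of the other.

```python
import numpy as np

def coef_std_errors(X, y):
    """Classical OLS standard errors for y ~ [1, X] (intercept first)."""
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ b
    s2 = resid @ resid / (len(y) - A.shape[1])   # residual variance
    cov = s2 * np.linalg.inv(A.T @ A)            # coefficient covariance
    return np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)

x2_indep = rng.normal(size=n)               # roughly uncorrelated with x1
x2_coll = x1 + 0.01 * rng.normal(size=n)    # almost identical to x1

def simulate(x2):
    # Same true coefficients in both cases; only the predictor overlap differs.
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + noise
    return coef_std_errors(np.column_stack([x1, x2]), y)

se_indep = simulate(x2_indep)
se_coll = simulate(x2_coll)
print("std. error of b1, independent predictors:", se_indep[1])
print("std. error of b1, collinear predictors:  ", se_coll[1])
```

With the nearly collinear pair the standard errors are far larger, so the separate effects of the two predictors cannot be estimated with any useful precision even though their combined effect is well determined.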
The application of regression theory in estimating
educational production functions has been discussed in the literature. In attempting to interface the production function to the budget equation, particular care must be placed on the interpretation of regression coefficients.
42Robert A. Gordon, "Issues in Multiple Regression," American Journal of Sociology, LXXIII (March, 1968), 162.
43Goldberger, p. 82.
44Bohrnstedt, p. 120.
The other aspect of regression analysis is the proportion of accounted variance. The Coleman report has been soundly criticized for the manner in which it sought to identify important variables on the basis of their contributions to the proportion of variability explained. As Bowles and Levin stated, using stepwise procedures, the relative contributions to variability are dependent upon the order in which the independent variables are entered into the equation. They suggested that Coleman should have used the regression coefficients.45 Even though this is a more effective procedure, if multicollinearity is severe, the regression coefficients may be woefully inaccurate.
Additional problems in interpretation are noted. Levin suggested that output measures should be value-added, that is, the change that has occurred while in school.46 Bohrnstedt raised serious questions involving the use of gain scores in measuring change.47 Nephew, in constructing an input-output model, "partialed out" the variance due to nonschool variables and then worked with residual measures of school output using multiple regression equations.48 Goldberger showed that regressing on the residuals will underestimate the regression coefficients of the remaining variables.49
45Bowles and Levin.
46Levin, p. 176.
47Bohrnstedt.
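The underestimation that results from regressing on residuals can be checked numerically. The following sketch uses made-up data with one hypothetical nonschool variable `z` and one school input `x`; the variable names and the two-step procedure shown are assumptions chosen for illustration, not a reproduction of Nephew's or Goldberger's computations. In this two-variable case the two-step coefficient equals the joint coefficient multiplied by one minus the squared correlation between `x` and `z`.

```python
import numpy as np

def slope_with_intercept(x, y):
    """OLS slope of y on a single predictor x, intercept included."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)                  # hypothetical nonschool variable
x = 0.7 * z + rng.normal(size=n)        # school input, correlated with z
y = 1.0 + 2.0 * z + 1.5 * x + rng.normal(size=n)

# Joint regression of y on [1, z, x]; keep the coefficient on x.
A = np.column_stack([np.ones(n), z, x])
b_joint = np.linalg.lstsq(A, y, rcond=None)[0][2]

# Two-step procedure: remove z from y first, then regress the residual on x.
y_resid = y - (y.mean() + slope_with_intercept(z, y) * (z - z.mean()))
b_two_step = slope_with_intercept(x, y_resid)

r2_xz = np.corrcoef(x, z)[0, 1] ** 2
print("joint coefficient on x:   ", b_joint)
print("two-step coefficient on x:", b_two_step)
print("joint * (1 - r^2_xz):     ", b_joint * (1.0 - r2_xz))
```

The shrinkage grows with the correlation between the school input and the variables partialed out first, which is the situation most likely to arise with school and community data.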
Bowles, in a comprehensive study of the educational production function, found that the function generally explains only a small percentage of the variability of school achievement using the full range of variables. A crucial deficiency is the absence of a theory of the learning process. A conclusion was that it would be useful to think of distinct educational production technologies at least for black and white and rich and poor students separately. Also, due to "ceiling effects" it would be more appropriate to work with ratios of regression coefficients rather than absolute magnitudes.50 This is related to Levin's assertion that the law of diminishing marginal returns is operative in the production function, that is, as more and more of a given input is consumed its marginal effect decreases.51
48Charles T. Nephew, Guides for the Allocation of School District Financial Resources, Ed.D. Dissertation (Buffalo: State University of New York, 1969).
49Arthur S. Goldberger, "Note on Stepwise Least Squares," Journal of the American Statistical Association, LVI (March, 1961).
50Samuel Bowles, "Educational Production Function: Final Report," Research Project Supported by the United States Office of Education (Harvard University, February, 1969), pp. 5, 13, 18.
In an attempt to refine the techniques that have been used, factor analysis has been tried as a way to group the variables meaningfully and avoid the problem of multicollinearity. The results were disappointing since variables grouped under statistical considerations were not necessarily meaningful from a decision-making context.52 Wynne suggested the use of high and low productive schools to identify factors that discriminate between them on the basis of productivity.53 Kiesling stated that it is difficult to determine an accurate production function since one cannot easily gather data in units proper to each variable.54
While considerable research has taken place in the study of input-output relationships, less has been done in the interfacing of the production function and the budget equation, as conceived by Levin, for purposes of optimizing outputs subject to budget constraints. As Levin stated, the educational establishment needs a production function and budget relationship to tie educational expenditures to educational outcomes.55 Essentially, the concern is that of optimization, but limitations on optimization in an educational context have been expressed. Tracz spoke of the lack of suitable objective functions or performance criteria as strong limitations. Also, education is a nonstationary process, which limits the usefulness of fixed coefficients in objective functions.56 Alkin identified impediments to the use of linear programming. First, there is the failure to designate precisely educational programs within educational systems and the attendant financial costs. This is one of the challenges to be met by Planning-Programming-Budgeting Systems (PPBS). Secondly, there is a lack of specificity in the designation of educational outcomes.57
51Levin, p. 177.
52New York State Education Department.
53Wynne.
54Herbert J. Kiesling, "The Study of Cost and Quality of New York School Districts: Final Report," Research supported by the United States Office of Education, Project No. 80264 (Bloomington: Indiana University, 1970), p. 66.
55Levin, p. 178.
56George S. Tracz, "An Overview of Optimal Control Theory Applied to Educational Planning," A paper read at a meeting of the American Educational Research Association (Los Angeles, 1969), p. 5.
57Alkin, "The Use of Quantitative Methods as an Aid to Decision-Making in Educational Administration," p. 6.
A more serious limitation to optimization using linear programming is the nature of fixed marginal effects when using a linear production function. If the production function and budget equation were expressed respectively as:
$$y = b_0 + b_1 x_1 + \cdots + b_n x_n$$
$$R = P_1 x_1 + P_2 x_2 + \cdots + P_n x_n$$
where $R$ = total budget, and $P_i$ = price per unit of input $x_i$, then according to Levin, the objective would be to satisfy the condition
$$\frac{Y_{x_1}}{P_{x_1}} = \frac{Y_{x_2}}{P_{x_2}} = \cdots = \frac{Y_{x_n}}{P_{x_n}}$$
(i.e., the additional output from each input relative to its price should be equal for all inputs).58 However, with constant marginal effects, questions of optimal economies of scale cannot be considered.
Generalizations from the Literature
The review of literature appears to justify the following generalizations.
1. There is a need for decision-making based upon input-output relationships.
2. The challenge for input-output studies is to determine the proper mix of inputs that will maximize output.
3. In general, input-output studies have not interfaced with cost-effectiveness studies.
4. Serious reservations regarding methodology in input-output studies have been expressed.
5. A mathematical model is appropriate for systematically determining optimal allocation strategies.
58Levin, p. 178.
Procedures
In the process of developing and applying the mathematical model, attention was given to (1) conceptual considerations, (2) mathematical considerations, and (3) an empirical illustration.
Conceptual Considerations
The process of developing a mathematical model required that one proceed in the presence of a conceptual framework. Consideration was given to a conceptualization of the process of education. The process was viewed in an input-output context. The inputs consisted of community and student factors, and school resources. The outputs were educational outcomes that are desired such as cognitive achievement, affective growth, and physical development.
The relationships between inputs and outputs were termed educational production functions.
As decision-makers seek to effectively allocate resources, their choices are constrained by various social, economic, and legal forces. Thus attention was given to an explicit description of constraints on the process of education.
Efficient resource allocation requires two types of efficiency, technical and allocative. Consideration was given to distinctions between these two types of efficiency and their implications for estimating educational production functions.
Given an understanding of the process of education as revealed by the production function and the various constraints operating on the process, the need for optimization strategies whereby decision-makers can seek to maximize certain outcomes subject to various constraints was deliberated.
Mathematical Considerations
Given a conceptual framework of the process of education, the translation to a mathematical model required consideration of several issues.
Attention was given to estimating single output equations versus multiple output equations. In using multiple regression to estimate the production function, consideration was given to (1) interpretation of regression coefficients, (2) determining goodness of fit, (3) linear versus nonlinear models, (4) reliability of measurements,
(5) the use of gain scores, (6) population data versus sample data, (7) causality and multiple regression, and
(8) experimental design versus survey design.
Given the need for optimization, the mathematical formulation of an optimization strategy was elaborated. In particular, an algorithm for solving nonlinear programming problems was discussed.
An Empirical Illustration
Given the final form of the mathematical model, steps were taken to empirically illustrate the model. Using real data, an educational production function was estimated. Subject to certain constraints, optimal combinations of inputs were determined to maximize a selected output. For successive increments in the budget constraint, the problem was solved again. The exact procedures for the empirical analysis are detailed in Chapter III.
Organization of the Remainder of the Study
In Chapter II the mathematical model is developed and described. In Chapter III an empirical application of the model is demonstrated. In Chapter IV a summary, discussion, and suggestions for future studies are given.
CHAPTER II
DEVELOPMENT AND DESCRIPTION
OF THE MATHEMATICAL MODEL
While interest centers on the development of a mathematical model which describes in a systematic fashion a strategy of optimization, it must be remembered that mathematical models per se do not evolve in the absence of a conceptual framework. Antecedent to a mathematical model is a conceptual model which is a verbalization of concepts and ideas that purport to describe the phenomenon under investigation. The attempt to translate the abstractions of a conceptual model into equivalent mathematical statements only accentuates the fact that both are man-made. Thus, the conceptual model serves only as an approximation to reality, and the mathematical model is an approximation to the conceptual model. In this sense, the conceptual model becomes an upper limit for the adequacy of the mathematical model. In short, the mathematical model cannot describe the phenomena under consideration in any better fashion than the conceptual model which precedes it.
The first part of this chapter is devoted to a discussion of conceptual considerations for a mathematical model. Secondly, mathematical considerations implicit in model development are discussed. This provides the basis for stating the methodology which must be employed to make the mathematical model operational. Finally, the complete mathematical model is presented.
Conceptual Considerations
The basic notion of a model pertains to a simplification of reality that allows the user to analyze particular aspects of a process, in this case the production of education. The conceptual framework represents the theoretical backdrop for the mathematical model. Armor has presented a model of aggregate educational processes.1 The basic structure of this model is discussed with some modifications made to allow a consideration of educational processes at the levels of student, school, district, and state. To formulate a conceptual framework whereby the mathematical model can be developed, it is necessary to consider (1) an educational production function, (2) constraints and the educational process, (3) technical and allocative efficiency, and (4) the need for optimization.
1David J. Armor, "School and Family Effects on Black and White Achievement: A Reexamination of the USOE Data," in On Equality of Educational Opportunity, ed. by Frederick Mosteller and Daniel P. Moynihan (New York: Random House, 1971), pp. 171-177.
An Educational Production Function
Any model must begin by deciding what are the appropriate goals of schooling. First and foremost is the learning of basic cognitive skills such as reading, writing, and arithmetic. The mastery of such skills is deemed necessary to acquire a level of competence sufficient to function in a technological society. Secondly, schools offer instruction in a variety of subject fields that enable students to discover interests and aptitudes relevant to future occupational choices. Third, school programs seek to establish positive affective outcomes regarding self-concept, identification with others, openness to experience, and a commitment to social institutions and the culture in which they are enshrined. Finally, schools provide opportunities for aiding the physical and psychological welfare of children with clinics, physical education, school-lunch programs, and guidance services.2
Schools attempt to achieve goals by the use of programs, staff, and facilities. No doubt, the program determines how the school will be organized, what facilities will be required, and the number and type of staff that will be employed. Each of these factors is potentially capable of manipulation by school authorities. The manner in which these factors are manipulated creates variations across schools and school districts. Concurrently, it may be assumed that these variations create different levels of effectiveness in attaining the goals described above. It is also apparent that the success schools have in working towards these goals is dependent in part upon the background and characteristics which students bring with them to school. This fact has been amply demonstrated in such research as the Equality of Educational Opportunity Survey (EEOS).3
2Ibid., p. 171.
Armor has distinguished several components of the characteristics of students which precede their entry into school and interact with the school environment throughout their school careers. First, there is the environment of the student's family. In the basic unit of society, family lifestyles, goals, attitudes, and morals have significant impact on how well the child succeeds in school. Second, the community or neighborhood in which the student lives also helps to define the student. The interpersonal relationships which a student develops with other members of the community, and the manner in which the community manifests itself in facilities, housing, and programs can serve as a debilitating or helping force for the student's growth in school. Finally, there is the peer or student-body factor. The extent of similarity between aggregated student characteristics and the goals of school programs has consequences for the success of students in schools.4
3James S. Coleman et al., Equality of Educational Opportunity (Washington, D.C.: U.S. Government Printing Office, 1966).
Thus, the school can be viewed as a collectivity, with certain inputs from the community, with various school input characteristics and with outputs in the form of the goals discussed above, namely, cognitive and affective outcomes and physical development. Figure 1 represents a schematic of the model outlined above. This schematic represents a revision of the schematic described by Armor. The solid arrows indicate assumed causal directions. The dotted arrows indicate weaker, but plausible causal directions. In this instance, it is felt that causal directions may be two-way. As an example, the quality of a school may be the factor determining what types of families are moving into or out of a community and thus influencing the nature of the community.
4Armor, p. 173.
[Figure 1: schematic of four boxes connected by arrows; the arrows are not recoverable from the text.]
Community Input Factors: family lifestyle; economic well-being; community facilities, programs, culture.
School Input Factors: programs; facilities; teacher quality; expenditures.
Student Input Factors: intellectual capacity; attitudes; peer-group norms.
Student Outputs: academic achievement; attitudes; physical development.
FIGURE 1. A MODEL OF THE EDUCATIONAL PROCESS
Hence, the model states that the outputs of an individual (ith) student at time t are a function (g) of community inputs cumulative to time t; of the quality and quantity of school inputs consumed by him cumulative to time t; of his characteristics (initial endowment and attitudes); and of the characteristics of his peers. Thus
$$A_{it} = g\left(C_i^{(t)}, S_i^{(t)}, P_i^{(t)}, I_i\right)$$
where
$A_{it}$ = vector of educational outcomes of the ith student at time t.
$C_i^{(t)}$ = vector of community inputs relevant to the ith student cumulative to time t.
$S_i^{(t)}$ = vector of school inputs relevant to the ith student cumulative to t.
$P_i^{(t)}$ = vector of peer influences cumulative to t.
$I_i$ = vector of initial endowments of the ith individual.
The above equation employs mathematical symbolism and can be called a mathematical model. The use of such symbols allows the expression of the conceptual model in a precise fashion and lessens the possibility of ambiguous interpretations.
If the vector $A$ consists of two or more components such as $A = (A_1, A_2, A_3)$ where $A_1$ = reading achievement, $A_2$ = citizenship attitudes, and $A_3$ = social consciousness, then the model generates a system of equations that are to be solved simultaneously. Alternatively, one can conceive of $A$ as a composite of educational outcomes; that is, $A$ is a single number representing an index of educational output. It is also possible for $A$ to represent a single output such as reading achievement. The last two instances describe what has been termed an educational production function. The production function attempts to reveal the relationships
between outcomes of education and the human and material resources which education consumes. Community inputs, peer influences, and initial individual endowments serve as controls in the production function since these factors are necessarily confounded with school inputs. These factors are explicitly brought into the production function, since failure to do so would obscure the true relationships of school inputs to output factors.
The production function allows one to analyze the effects of changes in inputs to the educational process. Allocating school resources properly requires discovering which resources affect outputs and estimating the effects that changes in their allocation might have. Policy-makers want information about the possible effect of hiring better trained or smarter teachers, about the influence of small pupil/teacher ratios, and so forth. In short, they want to know what will happen if they change the status quo. Thus, it would seem that in setting school policy and in long-range educational planning, knowledge of the educational production function is essential to efficient resource allocation.
Production function estimation requires a knowledge of the different factors important in producing the product and the interactions between the inputs. A typology of
variables associated with educational attainment involves the following categories: (1) student inputs, (2) community inputs, and (3) school inputs. The first two categories represent nonmanipulable inputs into the school system. School inputs can be further subdivided into: (1) teacher variables, (2) administration variables, (3) facilities,
(4) expenditures and (5) types of programs. To the extent that variables representing important determinants of achievement or their proxies are not included in the model, the adequacy of the model is diminished. The fact that a theory of learning is lacking on the educational scene increases the probability of important aspects being left out of the model. But, the key lies in the ability of the model to predict performance even when proxy variables are used. There is ample evidence that the discovery of predictive factors may be useful on practical grounds even if these factors have no causal significance. The appendix lists variables deemed desirable for production function estimation.
Dimensionality of output is a major concern. Previous input-output analyses of the educational process have generally employed some measure of academic achievement as the dependent variable. As Mosteller and Moynihan stated, "Lest it seem that academic achievement must be the only job of the schools, let us remember that studies do not find adult social achievement well predicted by academic achievement."5 Thus it would seem that other educational outputs as suggested in the conceptual model must be brought into the analysis if results are to be meaningful. This can be done in one of two ways. First, the various outputs can be entered separately into the analysis. This leads to a system of simultaneous equations as previously explained. Secondly, the estimation of the educational production process can be based upon some measure of total educational output. That is, in some way the outputs should be weighted by a common factor (utility, priority, social value) in order to obtain a total index of output.
Another issue centers on whether the measures of output employed should represent current levels of performance or represent gains made over a specified period of time. Ideally, it is desired to know what students have gained during the time they have been under instruction, how much of the gain may be reasonably attributed to the instruction, and how much to factors beyond the reach of the school. Attempts at using this approach have fallen short of success due to measurement problems.
5Frederick Mosteller and Daniel P. Moynihan, "A Pathbreaking Report," On Equality of Educational Opportunity, p. 7.
Constraints and the Educational Process
Any attempt to describe the process of education by production function estimation must recognize various forces (i.e., social, political, economic, legal) that act as constraints upon the process. This has been one of the shortcomings of previous input-output analyses of educational systems and may explain in some degree their lack of impact upon policy decisions.
The most obvious constraint is the budgetary one. Resources for education, as for all other purposes, public or private, are limited. Decision-makers must operate within the framework of a limited budget. Thus, the key policy questions they face are these: when a school changes its inputs in various ways, how much improvement in achievement scores (or other desired good) will be obtained and for how much money? Evaluation of alternative policies necessarily involves consideration of both the effects on output of different changes in the inputs to the production process and the costs of these changes.
It is also true that existing state laws, accreditation requirements, and market demands may serve to inhibit alternative choices among the use of school resources. Examples follow:
1. Pupil-teacher ratio: analysis may show this to have negligible effect, but accreditation requirements prevail.
2. Teachers' salaries: again the effect may be small relative to other variables, but disproportionate increases may be needed to attract and retain qualified teachers due to market demands.
3. Length of school year: analysis may suggest an optimal length different from that prescribed by state law.
No doubt, the process of estimating an educational production function might suggest changes in state laws and regulations with respect to education. But more importantly, by considering present constraints, educational decision-makers can affect educational outcomes now, as best they can, until laws and regulations or conditions are changed.
Technical and Allocative Efficiency
As attempts are made to allocate resources to best
effect educational outcomes, it is apparent that two types of efficiency are involved. Levin has termed them technical and allocative efficiency. Technical efficiency refers to using the physical resources of a school in such a way that more educational output is achieved than might be done under some alternative utilization. Allocative efficiency refers to using the dollar budget of the school in such a way that more output is achieved than could be attained if a different
combination of physical resources were purchased even when technical efficiency is satisfied.6
It is at this point that a major distinction occurs between the concept of an educational production function and an industrial production function. The methods that have previously been used to estimate educational production functions employ an average concept; that is, the final function represents what the average effects are for various inputs. On the other hand, an industrial production function is conceptually designed to express the maximum product obtainable from the input combination at the existing state of technical knowledge. Thus the industrial production function represents a frontier of potential attainment for given input combinations. It establishes the most technically efficient combinations of current practices. On the other hand, if it is desired only to estimate how much output on the average could be obtained for a firm in the industry with a certain set of inputs, then the average concept would obviously be the correct one to employ. But, this function cannot answer questions regarding technical efficiency. It can resolve the question of allocative efficiency at the existing state of technical knowledge.
6Henry M. Levin, "The Effect of Different Levels of Expenditure on Educational Output," p. 182.
The Need for Optimization
A number of input-output analyses have been performed in education and these studies have provided a necessary description of the educational system. However, the policy implications of such studies have been hard to realize since they generally have been devoid of cost considerations. A simple example follows to indicate the necessity for cost analysis in conjunction with the production function. Suppose two variables $x_1$, $x_2$ are found to have the following relationship to $y$:
$$y = 4x_1 + x_2.$$
It appears that $x_1$ has a greater effect on $y$ than $x_2$. Hence, researchers viewing this equation would stress the importance of $x_1$ in their conclusions. But, suppose further it is known that the prices of $x_1$ and $x_2$ are respectively $P_1 = 5$ and $P_2 = 1$. Then it becomes clear that within a limited budget, more can be accomplished by using $x_2$ rather than $x_1$. Suppose that the budget equation happens to be:
$$5x_1 + x_2 = 20.$$
This would allow us to use at most 4 units of $x_1$ when $x_2 = 0$ or 20 units of $x_2$ when $x_1 = 0$. The maximum value of $y$ would be $y = 20$ when $x_1 = 0$ and $x_2 = 20$. It is realized that this is a rather naive example, for which the solution is intuitively obvious. However, in a real-life situation, if one had a production function consisting of 20 or more variables, each having different prices, then it becomes impossible to know intuitively what the best combination of inputs would be. The point is that input-output analyses of schools and educational policy miss the mark if they are done irrespective of price considerations for the various school inputs into the educational process. Conclusions drawn from an examination of a production function may be severely revised as a result of the joint consideration of budget constraints. What is needed, then, is a systematic procedure for determining the best combination of inputs when dealing with large numbers of variables.
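The two-variable example can be worked mechanically by comparing the output obtained per dollar for each input. The sketch below is a minimal illustration under stated assumptions: a linear production function, nonnegative inputs, positive prices, and a single budget constraint with no other bounds. Under those assumptions the whole budget goes to the input with the highest ratio of coefficient to price. The function name `best_single_input` is hypothetical and is not part of the model developed in this study.

```python
def best_single_input(coefs, prices, budget):
    """Spend the whole budget on the input with the most output per dollar.

    Valid only for a linear production function y = sum(b_i * x_i) with
    x_i >= 0, positive prices, and one budget constraint
    sum(P_i * x_i) = budget. Other constraints would change the answer.
    """
    ratios = [b / p for b, p in zip(coefs, prices)]       # output per dollar
    best = max(range(len(ratios)), key=ratios.__getitem__)
    quantities = [0.0] * len(coefs)
    quantities[best] = budget / prices[best]              # buy only that input
    output = sum(b * x for b, x in zip(coefs, quantities))
    return ratios, quantities, output

# The example from the text: y = 4*x1 + x2, prices 5 and 1, budget 20.
ratios, quantities, output = best_single_input([4, 1], [5, 1], 20)
print(ratios)      # [0.8, 1.0]
print(quantities)  # [0.0, 20.0]
print(output)      # 20.0
```

Although the coefficient of the first input is four times that of the second, its output per dollar is only 0.8 against 1.0, so the budget is better spent entirely on the second input. With many inputs and additional constraints this simple ranking no longer suffices, which is the case for a systematic optimization procedure.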
Conceptual Considerations in Retrospect
In review, the process of education has been viewed in an input-output framework. It was found that in addition to school inputs, community and student inputs also affect the magnitude of educational outputs. Also, recognition must be made of various constraints that are operating at each level of the process. The most pervasive of these is the budgetary constraint. Finally, given the conceptual framework of the educational process, the task of decision-makers is to optimize educational outcomes subject to various constraints which explicitly and implicitly operate on the system.
Mathematical Considerations
In the recent past there has been an emerging interest in applying to education the methods of economic analysis. These methods have been variously called operations analysis, systems analysis, input-output analysis, cost-effectiveness analysis, and cost-benefit analysis. They are viewed as a potential means of increasing the efficiency of educational planning and decision-making. These types of analyses can be utilized in education to deal with such questions as how much of an increase in reading achievement of sixth-grade children is likely to result from any given reduction in the size of sixth-grade classes, all other things being equal. Essentially, these types of analyses allow the planner to view an educational system as a set of input-output or production relationships which can be controlled in a way that will optimize the use of scarce educational resources. The operational form of such analyses usually involves a mathematical model which describes in a quantitative sense the structural characteristics of the phenomena under investigation. In developing a mathematical model, in the present case, it is necessary to consider (1) ordinary least squares versus two-stage least squares, (2) single versus multiple outputs, (3) technical and allocative efficiency, (4) prices of the school inputs, (5) statistical difficulties in using multiple regression, and (6) strategies for optimization.
Ordinary Least Squares Versus Two-Stage Least Squares
Previously, the educational process was viewed as composed of multiple outputs. This was reflected in the equation
$$A_{it} = g\left(C_i^{(t)}, S_i^{(t)}, P_i^{(t)}, I_i\right)$$
where A_{it} is a vector of educational outcomes. Two ways have been identified for working with multiple outputs. One leads to a system of equations (production functions) to be solved simultaneously. The other consists of constructing an index of total educational output so that the educational process may be represented in a single equation. A major difficulty with the latter approach is that some outputs are likely to be causally related to other outputs. Thus, an increase in positive attitudes is likely to lead to an increase in academic achievement, which in turn will cause better attitudes, and so on. The use of ordinary least squares to estimate the parameters of a single equation with this type of feedback existing among the outputs will probably lead to biased estimates. When the above situation exists, a better approach is to employ simultaneous equations. The parameters of these equations can be estimated using two-stage least squares.7
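The bias that feedback between outputs introduces, and the way two-stage least squares removes it, can be illustrated with a small simulation. Everything in the sketch is assumed for illustration: the two structural equations, the effect sizes of 0.5, and an exogenous input w that influences attitude but not achievement directly. It is not the model estimated in this study.

```python
import random

TRUE_EFFECT = 0.5   # assumed effect of attitude on achievement


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Draw data from two assumed structural equations with feedback:
         achievement = TRUE_EFFECT * attitude + u
         attitude    = 0.5 * achievement + w + v
       where w is an exogenous input that affects attitude only."""
    rng = random.Random(seed)
    w_obs, attitude, achievement = [], [], []
    for _ in range(n):
        w, v, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced form: solve the two equations jointly for achievement.
        y = (TRUE_EFFECT * (w + v) + u) / (1 - 0.5 * TRUE_EFFECT)
        a = 0.5 * y + w + v
        w_obs.append(w)
        attitude.append(a)
        achievement.append(y)
    return w_obs, attitude, achievement


def two_stage_slope(z, x, y):
    """Stage 1: regress x on exogenous z. Stage 2: regress y on fitted x."""
    b1 = slope(z, x)
    mz, mx = mean(z), mean(x)
    x_hat = [mx + b1 * (zi - mz) for zi in z]
    return slope(x_hat, y)


w_obs, attitude, achievement = simulate()
ols = slope(attitude, achievement)
tsls = two_stage_slope(w_obs, attitude, achievement)
print("ordinary least squares:", round(ols, 3))
print("two-stage least squares:", round(tsls, 3))
```

In this assumed setup the ordinary least squares slope overstates the effect of attitude on achievement because attitude is itself partly caused by achievement, while the two-stage estimate stays close to the value built into the simulation.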
Single Versus Multiple Outputs
If one adopts the simultaneous equations approach, then each equation estimates a single output. Difficulties arise when one attempts to allocate resources so as to maximize all outputs. Most likely, not all outputs can be weighted equally; that is, one may not be willing to assign the same value or utility to each of the outputs. Thus, one is required to subjectively determine the weights that describe the utility of each output. Hence one seeks to maximize a utility function of the various outputs. It is at this point that trouble is likely to occur. Gilbert and Mosteller stated, ". . . setting many desirable maxima makes it hard to know what programs might possibly lead to achievement of these goals. A well-known mathematical theorem states that one cannot ordinarily maximize two or more variables simultaneously."8 Consequently, it would be vain to hope to maximize the outcomes of each student in several different areas. In the utility function, only a few of the outputs would be maximized, at the expense of other outputs.
7Henry M. Levin, "A New Model of School Effectiveness," in Do Teachers Make a Difference? (Washington, D.C.: U.S. Government Printing Office, 1970).
8John P. Gilbert and Frederick Mosteller, "The Urgent Need for Experimentation," On Equality of Educational Opportunity, p. 382.
The above seems to suggest that analysis might be more fruitful if it were restricted to a single output which is not a composite of other outputs. That this should be so seems to follow for additional reasons. McNamara stated,
Mathematical applications to management should stay clear of large general models and concentrate on specific problem areas. The mistake that educators have made for years involves trying to solve huge problems that do not have immediate or feasible solutions. If the general area of operations analysis is to persevere in educational management, then the appropriate approach seems to be the consistent application of these techniques to small problems.9
Such an approach recognizes that resources should be allocated to programs. As an example, given a reading program, interest centers on how resources might be allocated to maximize reading achievement.
No doubt, if the focus is on reading achievement only, analysis is likely to indicate a distribution of resources that maximizes reading achievement at the expense of other outputs. This need not be, however. As the various constraints operating on the reading program are considered, constraints can also be introduced that establish minimum standards for other outputs. For example, in the reading program, analysis may suggest that the ratio of students to counselors be increased so that more resources are available for increasing the training of classroom teachers. Such a decision would likely have a negative influence on certain affective outcomes. Thus, in the analysis, a constraint can be introduced that maintains a minimum standard for the student-counselor ratio. This standard would be determined either as a subjective judgment or as a result of previous analyses in which the appropriate affective outcomes were the outputs under consideration.
9James F. McNamara, "Mathematical Programming Models in Educational Planning," Review of Educational Research, XLI (December, 1971), 440.
Technical and Allocative Efficiency
In deciding upon the methodology to employ in estimating the production function, distinctions must be made between technical efficiency and allocative efficiency. It will be recalled that the industrial production function expresses the maximum product obtainable from the input combination at the existing state of technical knowledge. The process of using ordinary least squares to estimate the production function, on the other hand, expresses the average product obtainable from a firm in the industry. Educational production functions have generally been estimated using ordinary least squares analysis. The reason for this is that, given the lack of a comprehensive theory of learning, it is not known if all the relevant inputs have been identified. Likewise, it has not been possible to provide adequate data for some of the inputs presently identified.
Given this state of affairs, one must be content with the way things are rather than the way things should be. In this context, the average concept is the correct one to apply since it can answer questions of allocative efficiency at the existing state of technical knowledge. For purposes of optimization, then, ordinary least squares should be used for estimating the production function. If the output being considered is perceived to have a feedback effect on certain independent variables in the equation, then a system of simultaneous equations must be solved in which the affected independent variables become the dependent variables in other equations. This system is solved using two-stage least squares. The process of using ordinary least squares or two-stage least squares to estimate the parameters of one or more equations is more generally termed multiple regression analysis.
It should be mentioned, at this point, that multiple regression analysis can provide clues regarding the technical efficiency of the units observed. Each unit in the analysis is composed of a particular combination of input values and an output value. Multiple regression analysis estimates what the average output is with this combination of input factors. Figure 2 illustrates this process. Estimated output is plotted against the actual output observed. If the units of data represent the mean scores of schools on some educational outcome, then it can be argued that in a group of schools working with equivalent inputs (community, student, school), these being indicated by the estimated output, every school in the group should be able to do what the most productive schools in that group have done.
This allows questions of technical efficiency to be answered. By observing the high productive and low productive schools in a given group, a whole series of questions can be asked about the characteristics of high productive schools that make them different from the low productive schools: questions about school organization, courses offered, methods of instruction, teacher attitudes and behavior, special services, community involvement, quality of facilities, level of financial support, and the like. Results of such analyses would help to identify new factors that are related to output. Incorporation of these new factors into the production function advances the level of technical knowledge.
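[Figure 2: scatter plot of actual Grade 6 output (vertical axis) against estimated Grade 6 output based on inputs and control variables (horizontal axis), with individual schools (Schools A, B, C, and D) marked within groups of schools having similar estimated output.]
FIGURE 2. METHOD FOR INVESTIGATING TECHNICAL EFFICIENCY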
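The comparison that Figure 2 describes amounts to looking at the difference between actual and estimated output for schools whose estimated outputs are similar. A minimal sketch follows; the four schools and all of their scores are hypothetical and are not read from the figure.

```python
# (estimated output, actual output) for four hypothetical schools whose
# estimated outputs are similar, i.e. one group along the horizontal axis.
GROUP = {
    "school_1": (50.0, 56.0),
    "school_2": (51.0, 44.0),
    "school_3": (49.5, 53.0),
    "school_4": (50.5, 47.5),
}


def residuals(group):
    """Actual minus estimated output for each school."""
    return {name: actual - est for name, (est, actual) in group.items()}


def rank_by_productivity(group):
    """School names ordered from most to least productive relative to estimate."""
    res = residuals(group)
    return sorted(res, key=res.get, reverse=True)


ranking = rank_by_productivity(GROUP)
print("residuals:", residuals(GROUP))
print("ranking:", ranking)
```

Schools at the top of such a ranking would be the ones whose organization, methods, and services are examined for factors not yet in the production function.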
Determining Prices of the School Inputs
Thus far, it has been suggested that multiple regression analysis be employed for estimating the production function and that consideration be given to maximizing single outputs only. Also, as was seen earlier, one of the constraints to be considered is the budgetary constraint. Having identified the relevant school inputs in the production function, their prices must be estimated. In linear form, the budget constraint can be expressed as
$$B = \sum_{i=1}^{n} P_i x_i$$
where B = total budget
x_i = ith school input in the production function
P_i = price per unit of the ith input.
The costs of some variables are obvious and relatively easy to obtain, such as teachers' salaries and instructional expenditures per unit. For other variables, the costs appear to be intractable using traditional accounting methods. However, Cohn, Katzman, Levin and Riew,10 by using least squares techniques, were able to obtain some idea of how changing a unit of a nonpriced variable can affect total cost. Cohn determined the relationship of total per-school costs to such variables as
Z_1 = average number of college semester hours per teaching assignment,
Z_2 = average number of different subject matter assignments per high school teacher,
Z_3 = average class size,
Z_4 = number of units of subjects offered.
10Elchanan Cohn, "Economies of Scale in Iowa High School Operation," Journal of Human Resources, III (Fall, 1968); Martin T. Katzman, "Distribution and Production in a Big City Elementary School System," Yale Economic Essays, VIII (Spring, 1968); Henry M. Levin, "A Cost-Effectiveness Analysis of Teacher Selection," Journal of Human Resources, V (Winter, 1970); and John Riew, "Economies of Scale in High School Operation," Review of Economics and Statistics, XLVIII (August, 1966).
Thus it appears that costs of the various school variables can be determined with the aid of least squares analysis.
Statistical Considerations in Using Multiple Regression
Both the production function and the budget equation are to be estimated using multiple regression techniques. It is necessary, then, to discuss critically the statistical issues relevant to multiple regression analysis. Thus far, the terms multiple regression and least squares analysis have been used. The problem of predicting or estimating the value of a dependent variable given various independent variables is correctly termed multiple regression. Ordinary least squares is one technique for determining the regression equation. Since ordinary least squares will generally be employed in the estimation process, the two terms will be used interchangeably, unless otherwise noted.
Ideally, the process of estimating a regression equation should be based on a mathematical model including all of the factors believed to have an influence on the dependent variable. In the linear case, a typical regression equation has the form:
$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n + e$$
where ŷ = estimated y (dependent variable), that is, the right-hand side without the error term
x_i = independent variable
b_i = regression coefficient (i = 1, 2, . . . , n)
b_0 = constant term (for scaling purposes)
e = error term or residual (e = y - ŷ).
It is important that sufficient variation be observed in all of the independent variables that valid estimates of the regression coefficients may be made. Otherwise, existing differences among schools may not be great enough to reveal the importance of school resources to a given output. Technically, one is not allowed to extrapolate the results of a regression analysis beyond the range of observed data. At the same time, the effects of certain resources may not be apparent in the range of existing variation. Reducing the pupil/teacher ratio, for example, may make no difference until instruction can be really individualized, which might require a pupil/teacher ratio of less than 10:1. Also, suppose that output is functionally related to both community and school inputs. Further, suppose that school inputs vary only slightly in a particular sample and that home inputs vary widely. Statistically, most of the variance in output would be attributable to community factors. Thus, it becomes imperative that in any sample used to estimate the production function, the population extremes be well represented.
Interpretation of regression coefficients
The regression coefficient has the following interpretation: b_i represents the amount of change in y that can be associated with a unit change in x_i, with the remaining independent variables held constant. Since in a multiple regression analysis the remaining variables are controlled, b_i is termed a partial regression coefficient. An interesting property of the partial regression coefficient is that regardless of whether x_i is influenced by or influences other independent variables in the equation, the same regression coefficient for x_i will be obtained. The size of the regression coefficient depends on the unique contribution of the independent variable to the prediction of the dependent variable plus an apportioned amount of the influence it shares with other variables in and out of the equation.
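The difference between a partial regression coefficient and a simple regression slope can be seen in a small worked sketch. The data are hypothetical and noise-free: y is constructed as 1 + 2*x1 + 3*x2 from two correlated inputs, and the two-predictor least squares solution is written out in closed form from the centered sums of squares and cross-products.

```python
def cross(u, v):
    """Sum of cross-products of u and v about their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_slope(x, y):
    """Slope of y on x alone, ignoring every other variable."""
    return cross(x, y) / cross(x, x)


def partial_coefficients(x1, x2, y):
    """Least-squares b1 and b2 for y = b0 + b1*x1 + b2*x2."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical, correlated inputs and an output built from them exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

b1, b2 = partial_coefficients(x1, x2, y)
print("partial coefficients:", b1, b2)
print("simple slope of y on x1:", simple_slope(x1, y))
```

With x_2 held constant the coefficient of x_1 recovers the value used to build the data, whereas the simple slope of y on x_1 alone is much larger because it also absorbs the influence that x_1 shares with x_2.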
Multiple regression analysis enables one to find the relationship between a given input factor and an output, with all other factors controlled. More precisely, this means that the partial regression coefficient is a weighted average of regression coefficients obtained by fixing control variables at all possible values in the range observed and computing separate regression coefficients. Any indirect effects that the control variables may be having on y acting through x_i are isolated. Thus, it is important that variables entering the analysis are distinct, that is, they are not measuring the same factor and hence are not identical. Otherwise, controlling for them becomes tantamount to partialling the relationship out of itself. This is known as the partialling fallacy.12 The use of stepwise regression avoids the inclusion of variables that are not distinct. Controls on distinct independent variables which are unnecessary will not produce misleading results as long as we do not attempt to control on a dependent variable. The size of a regression coefficient may still be influenced by variables that have not explicitly been brought into the analysis. This is discussed in more detail in the section on causality and multiple regression.
11Robert L. Linn and Charles E. Werts, "Assumptions in Making Causal Inferences from Part Correlation, Partial Correlations, and Partial Regression Coefficients," Psychological Bulletin, LXXII (1970), 309.
Regression models and goodness of fit
By correlating the estimated y's with the actual y's, a measure of the goodness of fit of the multiple regression equation to the data is obtained. Such a correlation is termed the multiple correlation, R = R_{y.x_1 x_2 ... x_n}. R^2 gives the proportion of variability in the dependent variable explained by the regression relationship. If A is a set of original variables and B is a set of added variables, any increment to R^2_{y.A} due to the addition of B can be tested for significance by the F ratio
$$F = \frac{(R^2_{y.A,B} - R^2_{y.A}) / b}{(1 - R^2_{y.A,B}) / (n - a - b - 1)}$$
df = b and (n - a - b - 1).
R^2_{y.A,B} is the R^2 based on all a + b independent variables, where
a = number of original variables
b = number of added variables
n = number of observations.
12Robert A. Gordon, "Issues in Multiple Regression," American Journal of Sociology, LXXIII (March, 1968), 592.
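The F ratio for an increment in R^2 is simple to compute once the two R^2 values are in hand. The sketch below assumes the usual form of the test, with b and n - a - b - 1 degrees of freedom, where n is the number of observations; the numerical inputs are hypothetical.

```python
def incremental_f(r2_full, r2_original, n, a, b):
    """F ratio and degrees of freedom for adding b variables to a original ones.

    r2_full     -- R^2 with all a + b variables in the equation
    r2_original -- R^2 with only the a original variables
    n           -- number of observations
    """
    df1 = b
    df2 = n - a - b - 1
    f = ((r2_full - r2_original) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2


# Hypothetical case: 3 original variables, 2 added, 100 schools.
f_value, df1, df2 = incremental_f(r2_full=0.50, r2_original=0.40, n=100, a=3, b=2)
print("F =", round(f_value, 2), "with df =", (df1, df2))
```

The resulting F would then be compared with a tabled critical value for the stated degrees of freedom.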
When R is based on sample data, there is a tendency for it to be systematically biased upwards. This happens because the least squares process capitalizes on any sampling error that overestimates zero-order correlation coefficients. This is known as the problem of shrinkage: the tendency for R to decrease as the sample grows larger. A correction formula is given by
$$\tilde{R}^2 = 1 - (1 - R^2)\,\frac{N - 1}{N - k - 1}$$
where \tilde{R} = approximately unbiased estimator of the population R
R = multiple correlation found in a sample of size N
k = number of independent variables.
Of course, if one is dealing with population data, shrinkage is not a problem.
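The size of the shrinkage correction depends heavily on how large the sample is relative to the number of independent variables. The sketch assumes the common form of the correction with N - k - 1 in the denominator, where k counts the independent variables only; the sample sizes and R^2 value are hypothetical.

```python
def shrunken_r2(r2, n_obs, k):
    """Shrinkage-corrected R^2 for a sample of n_obs units and k predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - k - 1)


# Hypothetical: the same sample R^2 of .50 with ten predictors.
small_sample = shrunken_r2(0.50, n_obs=50, k=10)
large_sample = shrunken_r2(0.50, n_obs=500, k=10)
print("N = 50: ", round(small_sample, 3))
print("N = 500:", round(large_sample, 3))
```

With ten predictors, a sample of 50 loses a substantial part of its apparent fit, while a sample of 500 loses very little.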
Multicollinearity
Past studies that have estimated educational production functions have generally exhibited evidence of high degrees of interrelationships among the independent variables. This is known as the problem of multicollinearity. This affects the regression analysis by increasing the standard errors of the regression coefficients for the collinear variables; that is, the estimated regression coefficients exhibit more variance and in a sense are unstable. Small sampling variations among the correlations of a highly related set of independent variables can create large variations among their regression coefficients. Still, the coefficient values themselves are not biased. Attempts to avoid the effects of multicollinearity have involved one or more of the following: (1) introducing extraneous information, (2) stratifying, and (3) providing for simultaneous relationships. Another alternative would be to study an entire population rather than a sample. As an example, if one were concerned with educational policy for a given state, it would be almost as easy to conduct an analysis of data for all the districts in the state, assuming that appropriate data collection techniques were carried out by the state department of education. Since in a sample regression equation the regression coefficients are unbiased estimators, it would follow that in the population regression equation the regression coefficients would represent the true effects for the variables present.
The need for nonlinear models
Most applications of multiple regression employ the linear additive model. However, for purposes of estimating production functions, the following limitations of such a model are apparent.
1. The model does not allow for economies of scale.
2. It does not allow for interactions between inputs.
3. It implies that the marginal effect of a given input is the same regardless of the level of usage.
What the above suggests is that in estimating educational production functions, nonlinearities in the observed data most likely must be considered. The provision of any given power u of x_i, that is x_i^u, allows for u - 1 bends in the regression curve of y on x_i. In most research in the behavioral sciences, provision for more than one or two bends will rarely be necessary. A simple parabola may often give a reasonably good fit to the data, especially when it is realized that the curve may be quite flat and the data need not extend far enough to complete the bend (i.e., the parabola may be a reasonable fit within the limits of variation given in the problem). To obtain a measure of the goodness of fit to the parabola, the multiple R between y and x_1, x_1^2 is used. The difference between R^2_{y.x_1 x_1^2} and r^2_{y x_1} (assuming linearity) will give a measure of the degree to which ability to predict has been improved.
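The comparison between the linear r^2 and the multiple R^2 for a parabola can be sketched directly. The data below are hypothetical and were chosen to show diminishing returns to an input; the parabola is fitted by treating x and x^2 as two predictors and solving the two-predictor least squares equations in closed form.

```python
def cross(u, v):
    """Sum of cross-products of u and v about their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def r2_linear(x, y):
    """Squared simple correlation between y and x."""
    return cross(x, y) ** 2 / (cross(x, x) * cross(y, y))


def r2_parabola(x, y):
    """Squared multiple correlation between y and the pair (x, x^2)."""
    x_sq = [v * v for v in x]
    s11, s22, s12 = cross(x, x), cross(x_sq, x_sq), cross(x, x_sq)
    s1y, s2y = cross(x, y), cross(x_sq, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    # Explained sum of squares divided by total sum of squares.
    return (b1 * s1y + b2 * s2y) / cross(y, y)


# Hypothetical input levels and outputs with diminishing returns.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 3.9, 5.1, 5.8, 6.1, 6.0]

linear_fit = r2_linear(x, y)
parabola_fit = r2_parabola(x, y)
improvement = parabola_fit - linear_fit
print("r^2 (linear):  ", round(linear_fit, 3))
print("R^2 (parabola):", round(parabola_fit, 3))
print("improvement:   ", round(improvement, 3))
```

Because the linear model is nested in the parabola, the difference can never be negative; the question in practice is whether it is large enough to matter.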
Another type of nonlinearity is often encountered in research studies, namely, the concept of interaction. The importance of this concept is revealed by Mosteller and Moynihan in their reanalysis of the Coleman study (EEOR). It is recalled that Coleman concluded that the relative effects of school resources are small compared to community and student inputs. Mosteller and Moynihan stated:
To the simple of mind or heart, such findings might be interpreted to mean that 'schools don't make any difference'. This is absurd. Schools make a very great difference to children. Children don't think up algebra on their own. . . . But given that schools have reached their present levels of quality, the observed variation in schools was reported by EEOR to have little effect upon school achievement. This actually means [that of] a large joint effect owing to both schools and home background (including region, degree of urbanization, SES, ethnic group) little is unique to school, or homes. They vary together.13
Cohen stated that this joint effect is carried by a variable z of the form z = x_i x_j.14 This is known as the first-order interaction effect. Interaction terms that measure the joint contribution to explained variance of two or more variables can be very important. If two variables are highly intercorrelated, the first entered into the regression analysis will be assigned both its unique contribution to the explained variance and its jointly explained variance with all other variables (the interaction terms). This is precisely what happened in the Coleman report.
If one were to employ an interaction term in a simple three-variable model, the equation would take the following form:
$$x_3 = a_{13} x_1 + a_{23} x_2 + a_{123} x_1 x_2$$
How does one interpret the coefficient a_{123}? The partial derivative of x_3 with respect to x_i expresses the slope of the regression surface with other independent variables held constant (i.e., the amount of change in x_3 associated with a unit change in x_i). Thus, in the above equation:
$$\frac{\partial x_3}{\partial x_1} = a_{13} + a_{123} x_2$$
and the amount of change expected in x_3 is dependent upon the level of x_2. In an educational production function this could mean that the effect on performance of increasing the percentage of students from white-collar homes is not independent of the percentage of teachers in a school who are experienced and who have a master's degree. Consider
$$\frac{\partial}{\partial x_2}\left(\frac{\partial x_3}{\partial x_1}\right) = a_{123};$$
this expresses the rate of change of \partial x_3 / \partial x_1 when x_2 changes. In other words, this is the effect of x_2 on the effect of x_1 on x_3.
13Mosteller and Moynihan, p. 21.
14Jacob Cohen, "Multiple Regression as a General Data-Analytic System," Psychological Bulletin, LXX (1968), 436.
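The dependence of one input's effect on the level of another can be made concrete with a few lines of code. The three coefficients below are hypothetical; the sketch evaluates the slope a_13 + a_123 * x_2 at two levels of x_2 and confirms it against the change in predicted x_3 for a one-unit change in x_1.

```python
# Hypothetical coefficients for x3 = A13*x1 + A23*x2 + A123*x1*x2.
A13, A23, A123 = 0.40, 0.30, 0.05


def predict(x1, x2):
    """Predicted output x3 from the interaction model."""
    return A13 * x1 + A23 * x2 + A123 * x1 * x2


def marginal_effect_of_x1(x2):
    """Partial derivative of x3 with respect to x1 at a given level of x2."""
    return A13 + A123 * x2


effect_low = marginal_effect_of_x1(0.0)
effect_high = marginal_effect_of_x1(10.0)
unit_change = predict(3.0, 10.0) - predict(2.0, 10.0)
print("effect of x1 when x2 = 0: ", effect_low)
print("effect of x1 when x2 = 10:", effect_high)
print("change in x3 for one more unit of x1 at x2 = 10:", unit_change)
```

A positive a_123, as assumed here, corresponds to inputs that act as complements; a negative value would correspond to substitutes.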
Thus the concept of interaction consists essentially of a residual category of all types of effects that are nonadditive. It always refers to the joint effects of two or more independent variables on some particular dependent variable. Since nonadditive relationships constitute a residual category, it would be rather surprising if interaction effects could always be interpreted in a simple manner. Nevertheless, an interaction model would enable us to determine whether variables are complements (that is, act together to improve performance) or substitutes (namely, one variable has a greater effect in the absence of some other variable than it has in its presence). The implication for educational decision-making is that it would allow the use of input-background interaction in conjunction with the demographic composition of a student body to achieve desired performance levels.
Reliability of measurement
In conducting multiple regression analyses one must always be concerned with the reliability of measuring instruments. If an output is unreliably measured and if the errors are random, the increase in the standard deviation of output offsets the decrease in the correlation between output and input, leaving the regression coefficient unbiased. Given
$$y = r_{yx}\,\frac{S_y}{S_x}\,x + e,$$
the above is apparent. Though the regression coefficient is unbiased, an increase in the standard error of estimate and a corresponding decrease in the multiple R can be expected. There are formulas that allow one to correct the attenuation in correlations. If, on the other hand, random errors are made in measuring x, r_{yx} will be underestimated and S_x will be overestimated, leading to biased results. This type of problem can also be corrected but the work involved is tedious.15
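The contrast between random error in the output and random error in an input can be illustrated by simulation. All quantities in the sketch are assumed: a true slope of 2, unit-variance inputs, and unit-variance measurement error added first to y and then to x.

```python
import random

TRUE_SLOPE = 2.0   # assumed for illustration


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, seed=1):
    """Return true and error-laden versions of an input and an output."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    y_true = [TRUE_SLOPE * x + rng.gauss(0, 1) for x in x_true]
    x_noisy = [x + rng.gauss(0, 1) for x in x_true]   # unreliable input
    y_noisy = [y + rng.gauss(0, 1) for y in y_true]   # unreliable output
    return x_true, y_true, x_noisy, y_noisy


x_true, y_true, x_noisy, y_noisy = simulate()
slope_output_error = slope(x_true, y_noisy)
slope_input_error = slope(x_noisy, y_true)
print("slope with error in y only:", round(slope_output_error, 3))
print("slope with error in x only:", round(slope_input_error, 3))
```

In this setup error in the output leaves the slope near its true value, while error of equal variance in the input roughly halves it, which is the bias the text describes.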
Quite frequently the level of data treatment will call for aggregated data. If the unit of analysis is at the district level, then aggregated or mean measures on the variables would be observed. Shaycroft has given a formula for estimating the reliability of group means:
$$r_{\bar{a}\bar{a}} = 1 - \frac{(1 - r_{aa})\, S_a^2}{n\, S_{\bar{a}}^2}$$
where r_{\bar{a}\bar{a}} = reliability of group means
r_{aa} = reliability of individual responses
S_a^2 = variance of individual responses
S_{\bar{a}}^2 = variance of group means
n = number of individuals in a group.16
Suppose r_{aa} = .50 (reliability of some psychological tests), S_a^2 : S_{\bar{a}}^2 = 10 : 1, and n = 100; then r_{\bar{a}\bar{a}} = .95.
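The worked figures for the reliability of group means can be reproduced in a few lines. The sketch assumes the formula has the form 1 - (1 - r_aa) * S_a^2 / (n * S_abar^2), a reading that yields the .95 quoted in the text; the function and argument names are illustrative only.

```python
def group_mean_reliability(r_individual, var_individual, var_means, n):
    """Reliability of group means from individual-level reliability.

    r_individual   -- reliability of individual responses
    var_individual -- variance of individual responses
    var_means      -- variance of group means
    n              -- number of individuals in a group
    """
    return 1.0 - (1.0 - r_individual) * var_individual / (n * var_means)


# Values from the text: variance ratio 10 : 1 and 100 individuals per group.
test_scores = group_mean_reliability(0.50, 10.0, 1.0, 100)
# The same setting with a much less reliable individual measure.
weak_measure = group_mean_reliability(0.16, 10.0, 1.0, 100)
print(round(test_scores, 3), round(weak_measure, 3))
```

Even a measure whose individual reliability is as low as .16 gives group means with reliability above .90 under these conditions.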
15Albert Mandansky, "The Fitting of Straight Lines when Both Variables are Subject to Error," Journal of the American Statistical Association, LIV (1959), 173-205.
16Marion Shaycroft, "The Statistical Characteristics of School Means," Studies of the American High School, ed. by John C. Flanagan et al., Cooperative Research Project 176, Project Talent, University of Pittsburgh, 1962.
Actually, for most purposes, it can be seen that when n is sufficiently large (100 or more), the reliability of group means is close to 1. Consequently, when dealing with aggregated data, reliability is not seen to be a problem.
Use of gain scores
Many authorities have expressed a preference for measures of output that are "value added." If Y is a measure of output, it would be preferable to consider Y as a measure of the net gain achieved within a specified period of time. However, the reliabilities associated with gain scores are generally lower than that of either of the scores used to obtain a gain score. McAfee studied the reliabilities of gain scores using Monte Carlo simulation.17 It was found that the average reliability of gain scores with a pretest reliability of 0.90 is only 0.16. Thus, at the level of individual data treatment, the use of gain scores cannot be justified. Once again, however, if aggregated data are used, the problem of reliability is nearly resolved. Using the above example of Shaycroft's formula and replacing the individual reliability of .50 with 0.16 gives a reliability of group means of 0.92.
17Jackson K. McAfee, "Problems in Measuring Change," Unpublished paper, University of Florida, 1972.
There remains one other difficulty with the use of gain scores. In using gain models, the school effects are probably understated, since prior performance levels which are due both to school and environment are summarily subtracted out. Psychologists note that changes in a child's self-concept are usually the result of similar experiences repeated over a long period of time. Thus, increases in a child's achievement in a given year may have been the result of prior school influences operating over a period of years. It may be best to deal with levels of outcome only as an expression of the cumulative influences of schooling to that point in time.
Population data vs. sample data
The possibility of using either population or sample data has been considered. The use of sample data is often done with the goal of generalizing results to the population sampled. The use of inferential statistics in multiple regression requires a set of restrictive assumptions.
1. The variance must be homogeneous, that is, the variance of y values about the regression surface must be the same at all combinations of x values.
2. The errors (e = y - ŷ) must be independent of each other.
3. The x values must be measured with essentially no error.
4. The distribution of y values must be normal.
The use of population data requires only that assumption (3) be satisfied.
Causality and multiple regression
In assessing the results of a multiple regression analysis, distinctions must be made between associated effects and causal effects of the independent variables on the dependent variable. It is one thing to say that an increase in y of b units is associated with an increase of one unit in xi, and another to say that an increase of b units in y is caused by an increase of one unit in xi. To make a statement of causality requires further assumptions regarding the specification of variables to be included in the model.
In particular, if the regression model is used for simulation purposes, the following must hold true if regression coefficients are to be interpreted causally.
1. All variables which might affect the dependent variable are either included in the regression equation or are uncorrelated with the variables which are included. The absence of relevant influences will, in general, lead to 'specification error' (i.e., bias in the calculated regression coefficients due to incorrect specification of the structural model). This also implies that the regression residuals (e) are uncorrelated with any of the x_i. The error term represents all those unmeasured implicit factors which influence output, but are assumed or known to be uncorrelated with any of the x_i in question.
2. All x_i are measured without error. Random measurement error in output should not bias the regression coefficients.
3. Terms are included in the regression to handle any curvilinear or interactive effects.
4. The dependent variable has no effect on any of the independent variables. When it does, one must solve a system of simultaneous equations to obtain unbiased estimates of the regression coefficients, using two-stage least squares instead of ordinary least squares in the estimation process.
Given that most estimates of educational production functions have been based on cross-sectional data in a survey-type analysis, it would be improper to attribute causality to the regression coefficients, since their sizes are so sensitive to the proper specification of the equations. If relevant influences have not been brought explicitly into the equation, this can lead to overestimation of the relationships between output and input variables in the equation. Also, the independent variables used may be proxies for the causative variables. Thus, an increase in teacher salary may not be accompanied by a corresponding increase in all the other attributes for which teacher salary is serving as a proxy. It can be said that the regression coefficients are unique regardless of whatever causal interpretation is given to the correlations between independent variables that are in the equation. Suppose two independent variables are correlated. Causally, this correlation might correspond to a direct effect of x_1 on x_2 or of x_2 on x_1, or to an indirect effect of x_1 on x_2 through x_3, or to a simultaneous effect of x_3 on both x_1 and x_2. But the regression coefficient b_1, the effect of x_1 on y, will be the same whether x_1 causes x_2 or x_2 causes x_1.
Where causal inferences are desired, one should select variables that are distinct and avoid the use of indices that are thought to be measures of the "same" underlying variable. Generally, factor analysis has been employed to resolve redundancies among the indicators of a given factor. Blalock stated that factor analysis techniques and the interpretations given to the factors extracted ordinarily presuppose certain limited kinds of causal models.18 In particular, it is assumed that there are no direct causal links among the indicator variables and that these measured variables are caused by the underlying variables, making the intercorrelations among the indicators completely spurious. Factor analysis may therefore not be appropriate for census data or other types of analyses in which some of the measured variables are caused by other measured variables. But given the invariance of the regression coefficient to causal directions among measured variables, stepwise regression can prevent the inclusion of redundant information and thus select the indicator that is optimally related to the dependent variable.
Experimental design vs. survey design
It is doubtful that in estimating a production function all of the relevant influences on output can be identified or, if identified, can be properly measured. In other words, it is doubtful that all the relevant variables can be controlled. An alternative to rigid controls lies in the experimental tradition of randomization. Randomization does not control for implicit variables. Rather, it transforms the causal scheme into the form
18Hubert M. Blalock, Causal Inferences in NonExperimental Research (Chapel Hill: University of North Carolina Press, 1961), p. 168.
where v and z are the implicit influences. They continue to operate on y but do not systematically affect the estimation of the relationship between x and y as measured by the regression coefficient. They do increase sampling error.
According to Blalock, of the implicit factors that are confounded with variables xi in their influence on y, there exist two types:
1. Forcings--those variables that are impinging from the outside environment;
2. Properties--those variables that are conceived to be properties of the system at the time of observation.19
Randomization reduces the effects of property variables; however, it does not handle forcing variables. This parallels the Campbell-Stanley tradition regarding sources of external invalidity. In school effects studies the interactions of student selection and school variables, testing and school variables, and various reactive arrangements cannot be handled with randomization. Thus
19Ibid., p. 22.
randomization allows simplifying assumptions to be made about a large class of 'property variables' that can be made to operate independently of the causal variables under study. But other 'forcing variables' cannot so readily be ruled out. Even in experimental designs, simplifying assumptions must always be made if causal models are to be evaluated. In short, causal laws can never actually be demonstrated empirically. This is true even where experimentation is possible.
Thus, even though the process of experimentation is viewed with limitation, it is certainly a far more effective device for eliminating alternative plausible causal schemes than is the survey study. However, in the carrying out of large-scale studies for the estimation of educational production functions, recourse to experimentation is not likely to be feasible. This is the current state of affairs in educational research. If there is any consolation, it lies in a statement by Gilbert and Mosteller:
Our own experiences with observational studies . . .
has been that the controlled study usually does
not contradict the observational study, but it does
clarify matters and make them firmer.20
20Gilbert and Mosteller, p. 373.
Strategies for Optimization
Given that an educational production function has been estimated using multiple regression analysis, the regression coefficient assesses the effect of an independent variable on output with the proviso that other variables in the equation remain unchanged. Unfortunately, the condition that 'other things remain unchanged' cannot be met in the real world. A policy of raising teachers' salaries would probably change the monies spent on other educational resources--resources that might be necessary to the success of the student in the school. What is needed is a process that assesses the effects of variables acting in all their simultaneity, a process that specifically notes the consequences on other resources when a given resource is changed.
Essentially, interest centers on a situation in which the use and level of resources are constrained by various factors. In particular, the budget equation serves as a primary constraint in the educational process. If for a given budget, all dollars are already charted for consumption, then an increase in any resource must necessarily mean decreases in others.
At the same time, subject to various constraints, decision-makers are attempting to optimize the allocation of resources so as to maximize educational outcomes.
Optimization is the process of finding a best solution among several feasible alternatives. What is needed, then, is a systematic procedure for determining the best combination of inputs subject to certain constraints. Such a procedure is mathematical programming. If a model of the educational process describes a linear system such as
$$y = \sum_i b_i x_i$$
subject to the following constraints
$$\sum_i a_{ij} x_i \le c_j$$
then the techniques of linear programming can determine optimal solutions. The $x_i$ are the variables that can be manipulated to achieve the desired objective. $y = \sum_i b_i x_i$ is called the objective function, since the objective is to optimize y. Most frequently, nonnegativity restrictions will be placed on the $x_i$ (i.e., $x_i \ge 0$ for all i). The constants ($a_{ij}$, $b_i$, $c_j$) are termed parameters; they are factors that affect the objective function but cannot be manipulated as are the variables. Constraints not only provide regions of acceptable values of the variables but also provide a mechanism for relating the mathematical expression to real world conditions. The important contribution of this approach is that it realistically views the process of education as a system constrained by political, social and economic considerations.
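As an illustration only, the linear program described above can be sketched in Python for the case of two inputs. The sketch relies on the fact that a bounded linear program attains its optimum at a vertex of the feasible region, so it simply checks every intersection of two constraint boundaries. The function name `solve_lp_2d` and all coefficients are hypothetical and are not taken from the study's data; the feasible region is assumed to be bounded.

```python
from itertools import combinations


def solve_lp_2d(b, A, c):
    """Maximize b[0]*x0 + b[1]*x1 subject to A[j][0]*x0 + A[j][1]*x1 <= c[j]
    and x0, x1 >= 0, by checking every vertex of the feasible region.
    Assumes the feasible region is bounded (hypothetical helper)."""
    # Treat nonnegativity as two extra "<=" rows: -x0 <= 0 and -x1 <= 0.
    rows = [list(r) for r in A] + [[-1.0, 0.0], [0.0, -1.0]]
    rhs = list(c) + [0.0, 0.0]
    best_x, best_y = None, float("-inf")
    for j, k in combinations(range(len(rows)), 2):
        (a1, a2), (a3, a4) = rows[j], rows[k]
        det = a1 * a4 - a2 * a3
        if abs(det) < 1e-12:
            continue  # parallel boundaries do not meet in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x0 = (rhs[j] * a4 - a2 * rhs[k]) / det
        x1 = (a1 * rhs[k] - rhs[j] * a3) / det
        feasible = all(r[0] * x0 + r[1] * x1 <= h + 1e-9
                       for r, h in zip(rows, rhs))
        if feasible:
            y = b[0] * x0 + b[1] * x1
            if y > best_y:
                best_x, best_y = (x0, x1), y
    return best_x, best_y


if __name__ == "__main__":
    # Hypothetical data: a budget row (2*x0 + x1 <= 10) and two range limits.
    x, y = solve_lp_2d((3.0, 2.0),
                       [(2.0, 1.0), (1.0, 0.0), (0.0, 1.0)],
                       (10.0, 4.0, 6.0))
    print(x, y)
```

Vertex enumeration is practical only for very small problems; it is used here because it makes the role of the constraints visible, not because it is how a production-scale linear program would be solved.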
As stated previously, the linear approach simply ignores such phenomena as economies of scale and nonadditive interactions among input variables. Actually, the process of education might better be approached using nonlinear programming, problems in which one or more terms of either the objective function or any one of the constraints is represented by a nonlinear function. Generally, the system can be expressed as:
$$y = g(x_1, x_2, \ldots, x_n)$$
subject to
$$f_i(x_1, x_2, \ldots, x_n) \le c_i.$$
Thus, programming is the mathematical method for the analysis and computation of optimum decisions which do not violate the limitations imposed by inequality side conditions.
Associated with each programming problem is a dual problem. The process of finding the optimal values for the original problem also gives the optimal solution for the dual problem. Without discussing all of the ramifications of this concept, there is one important implication for production function analyses. In the process of maximizing the production function (objective function), a level of output at a minimum cost (budget equation) is attained. This is to say that there is no other way of allocating resources to the inputs such that this level of output
can be attained for a lower cost. Given then, that cost has been minimized at an optimal level of output, the budget can be incremented by x units and the effect observed on output.
Up to this point, consideration has been given only to maximizing output given a specific budget. But the process of incrementing the budget has broader implications for policy analysis. Specifically, educational decision-makers should follow this approach to financing educational programs:
1. What are the outcomes that are desired and at what levels?
2. What programs are necessary to achieve the outcomes that are desired?
3. What resources are required to carry out the programs?
Given this approach towards the financing of educational programs and the optimization techniques described above, decision-makers are in a position to estimate the additional resources required to bring about incremental improvements in educational outcomes. As an example, suppose the statewide average on reading achievement for all twelfth-grade students is 11.7 years. Suppose further that a production function has been established between reading achievement and various inputs into the educational process. Then educational leaders are in a position to determine the minimal cost to bring about any desired level of improvement. If the decision is made, for instance, to increase the reading level to 12.4 years, then an optimal combination of inputs can be determined that will minimize costs. The advantage of this approach is that decision-makers can have some idea of what additional resources allocated to education will buy.
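The reading example above can be made concrete with a minimal sketch, under strong simplifying assumptions: the production function is taken to be linear in each input, each input has a fixed price per unit, and each input can be raised only within a stated range. Under those assumptions the least-cost plan buys the cheapest gain first. The input names, gains, prices, and ranges below are invented for illustration and do not come from the study.

```python
def min_cost_for_gain(target_gain, inputs):
    """Least-cost way to raise output by target_gain (hypothetical helper).

    inputs: list of (name, gain_per_unit, price_per_unit, max_extra_units),
    with positive gains and prices. Returns (total_cost, plan), where plan
    maps an input name to the extra units bought. Raises ValueError when
    the target cannot be reached inside the allowed ranges.
    """
    plan, cost, remaining = {}, 0.0, float(target_gain)
    # Buy the cheapest gain first: lowest price per unit of output gained.
    for name, gain, price, room in sorted(inputs, key=lambda t: t[2] / t[1]):
        if remaining <= 1e-12:
            break
        units = min(room, remaining / gain)
        plan[name] = units
        cost += units * price
        remaining -= units * gain
    if remaining > 1e-9:
        raise ValueError("target gain is outside the feasible range")
    return cost, plan


if __name__ == "__main__":
    # Raising reading achievement from 11.7 to 12.4 years is a gain of 0.7.
    hypothetical_inputs = [
        ("advanced_training", 0.05, 8.0, 10.0),  # all numbers are assumptions
        ("salary", 0.02, 5.0, 20.0),
    ]
    print(min_cost_for_gain(0.7, hypothetical_inputs))
```

With a nonlinear production function or budget equation this greedy rule no longer applies, and a nonlinear programming routine is needed instead.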
In the process of developing a mathematical model, the idea of integrating the concepts of multivariate statistics and mathematical programming may seem to some to be a novel approach. It is one, however, that was anticipated as early as 1963. At that time Wegner stated:
Multivariate statistics is concerned with the
estimation of parameters and the determination
of the structure of relations between variables.
Mathematical programming assumes a mathematical
model of known structure and is concerned with the determination of optimal policies subject to known structural restrictions. Each topic can supplement
the other, and the union of the two disciplines would provide a more powerful tool than either
discipline could provide in itself . . . problems
requiring a combination of the two disciplines
are becoming increasingly common.21
21P. Wegner, "Relationship Between Multivariate Statistics and Mathematical Programming," Applied Statistics, XII (November, 1963), 146-150.
An Algorithm for Solving Nonlinear Programming Problems
In estimating the production function and the budget equation, multiple regression analysis was an appropriate methodology. It is necessary to determine an appropriate methodology to solve a nonlinear programming problem in which the production function is the objective function and the budget equation is one constraint. Other constraints would include the ranges of the variables since predictions cannot be extrapolated beyond the ranges observed. Nonlinear programming problems generally fall into two classes, those in which the constraints are linear and those which are not. Since the budget equation considers questions of economies of scale, there is every likelihood that the equation will include nonlinear terms. The most crucial question regarding nonlinear problems deals with the nature of the objective function. If interest centers on maximizing the objective function, then questions of concavity must be answered. A concave function is defined as a function such that a straight line segment joining any two points on the surface must lie below the surface. Figure 3 illustrates a concave function. Figure 4 illustrates a function which is not concave. The figures also illustrate the basic difference between the two functions. In Figure 4 the
function attains two local maximum values M1, M2. This means that within small neighborhoods of M1 and M2, these points are the maximum values. In this case M2 is the global maximum since it is the largest value f assumes in the region. In Figure 3, on the other hand, only one local maximum is observed. For a concave function, a local maximum is always a global maximum over a given range.
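FIGURE 3. AN EXAMPLE OF A CONCAVE FUNCTION f (axes x and y; interval a to b; points P and Q on the curve; maximum M)
FIGURE 4. AN EXAMPLE OF A NONCONCAVE FUNCTION f (axes x and y; interval a to b; starting points x1 and x2; local maxima M1 and M2)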
This illustrates the difficulty of using a nonlinear programming algorithm when the objective function is not concave. The algorithm always converges to a local optimum whether or not it is the global optimum. Thus if the algorithm is started at x1, the solution converges on M1. Starting at x2, it will converge on M2. When dealing with a nonconcave function, theoretically one would need to start the algorithm at each point in the range to be assured of attaining the global optimum. Practically, by starting the algorithm at various points in the range the global optimum can be approximately determined correct to a certain number of decimal places. The number of points used determines the accuracy of the estimates. Of course, all of this is avoided if the function is concave, for then any local optimum is a global optimum.
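The practice of starting the algorithm at various points in the range can be sketched as follows. The one-variable function `f` is a made-up example with two local maxima, and `climb` is a plain fixed-step gradient ascent standing in for whatever local algorithm is used; neither is taken from the study.

```python
def f(x):
    # Hypothetical nonconcave function with local maxima near -0.96 and 1.04.
    return -(x * x - 1.0) ** 2 + 0.3 * x


def climb(func, x, lo, hi, step=0.01, iters=2000, h=1e-6):
    """Fixed-step gradient ascent on [lo, hi] using a central difference."""
    for _ in range(iters):
        slope = (func(x + h) - func(x - h)) / (2.0 * h)
        x = min(hi, max(lo, x + step * slope))
    return x


def multistart_max(func, lo, hi, n_starts=5):
    """Run the local search from evenly spaced starts and keep the best end."""
    starts = [lo + (hi - lo) * i / (n_starts - 1) for i in range(n_starts)]
    ends = [climb(func, s, lo, hi) for s in starts]
    best = max(ends, key=func)
    return best, ends


if __name__ == "__main__":
    best, ends = multistart_max(f, -2.0, 2.0)
    print("end points:", ends)
    print("best:", best, f(best))
```

Starts on the left of the range settle on the lower peak and starts on the right settle on the higher one, so only the comparison across starts identifies the global maximum. More starting points give more assurance at the price of more computation.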
An all-purpose algorithm for solving nonlinear programming problems with nonlinear constraints is the SUMT algorithm. SUMT stands for Sequential Unconstrained Minimization Technique. The technique was perfected by Fiacco and McCormick.22 The general programming problem is to determine a vector x that solves:
$$\text{minimize } f(x) \quad \text{subject to } g_i(x) \ge 0, \quad i = 1, 2, \ldots, m,$$
where $x = (x_1, x_2, \ldots, x_n)$.
(Note: a maximization problem can always be transformed to a minimization problem by multiplying the objective function by $-1$.)
22Anthony V. Fiacco and Garth P. McCormick, "Computational Algorithm for the Sequential Unconstrained Minimization Technique for Nonlinear Programming," Management Science, X (July, 1964), 601-617.
This problem is solved by transforming it into a sequence of unconstrained minimization problems.
Define the function
$$P(x, r_1) = f(x) + r_1 \sum_{i=1}^{m} \frac{1}{g_i(x)}$$
where $r_1$ is a positive constant. Proceed from $x^0$ (starting point) to a point $x(r_1)$ that minimizes $P(x, r_1)$ in the feasible domain (set of points satisfying the constraints of the problem).
Form the new function,
$$P(x, r_2) = f(x) + r_2 \sum_{i=1}^{m} \frac{1}{g_i(x)}; \quad 0 < r_2 < r_1.$$
Starting from $x(r_1)$, determine the minimum of $P(x, r_2)$. Continuing in this manner, a sequence of points $\{x(r_k)\}$, k = 1, 2, 3, . . . , is generated that respectively minimize $\{P(x, r_k)\}$, with $r_k \to 0$ as $k \to \infty$. This sequence of P-minima converges to an optimum of the original problem; that is, $x(r_k) \to \bar{x}$, where $\bar{x}$ is the optimal vector, and $f(x(r_k)) \to f(\bar{x})$ as $r_k \to 0$.
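The sequence of unconstrained problems described above can be sketched on a one-variable problem. This is a minimal illustration, not Fiacco and McCormick's program: the objective, the two constraints, and the helper names `golden_min` and `sumt` are assumptions, and each P-function is minimized with a golden-section search over the feasible interval rather than a gradient method, a substitution made for brevity that is valid here only because P is convex on that interval.

```python
import math


def golden_min(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on (lo, hi).
    Only interior points are evaluated, so func may blow up at lo and hi."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


def sumt(f, constraints, lo, hi, r=1.0, shrink=0.1, rounds=9):
    """Minimize f subject to g(x) >= 0 for every g in constraints.
    (lo, hi) must be the interval on which every g is strictly positive.
    Returns the list of (r_k, x(r_k)) pairs."""
    def P(x, r_k):
        # Interior penalty: grows without bound as any g(x) approaches zero.
        return f(x) + r_k * sum(1.0 / g(x) for g in constraints)

    path = []
    for _ in range(rounds):
        x = golden_min(lambda t: P(t, r), lo, hi)
        path.append((r, x))
        r *= shrink  # r_k decreases toward zero
    return path


if __name__ == "__main__":
    # Hypothetical problem: minimize (x + 1)^2 subject to 1 <= x <= 5.
    path = sumt(lambda x: (x + 1.0) ** 2,
                [lambda x: x - 1.0, lambda x: 5.0 - x],
                1.0, 5.0)
    for r_k, x_k in path:
        print(r_k, x_k)
```

In this example the unconstrained minimum of the objective lies outside the feasible interval, so the minimizers x(r_k) stay strictly inside the interval and approach the boundary point x = 1 as r_k shrinks.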
The principal technique used to minimize P at step i is to evaluate the gradient of P at $x(r_{i-1})$ and then to descend the gradient a fixed distance. The negative of the gradient gives the direction of steepest descent of the surface P at a given point and thus suggests the optimal direction to proceed in order to attain the minimum value of the P function. When the original problem is not concave, it is necessary to start at several points to determine an approximation to the global optimum.
Mathematical Considerations in Retrospect
In review, a detailed discussion of mathematical considerations relative to the development of a mathematical model has taken place. As a result of this discussion, several generalizations can be made.
1. Analysis should be confined to a single output. By employing appropriate constraints, minimum standards to be achieved by other outputs can be set.
2. The use of a nonlinear model that incorporates higher degree and interaction terms should be considered. This is necessary to consider questions of economies of scale and nonadditive effects.
3. The use of gain scores at the level of the individual is not justified. Such scores are simply not reliable. Aggregated gain scores are reliable for sufficiently large n, but their use is dismissed on the grounds that it leads to underestimation of the effects of school variables.
4. The measurement of independent variables must essentially be errorless. Random error in measuring the dependent variables does not bias the regression coefficients, but does increase the standard error of estimate.
5. Where possible, one should use population data rather than sample data. The use of sample data requires additional assumptions regarding the shape of the distribution (i.e., normal) and the homogeneity of variance. Furthermore, if independent variables are interrelated, multicollinearity may affect the stability of the regression coefficients. If population data are used, the stability of regression coefficients is affected only by specification error (assuming no measurement error in the independent variables).
6. For analysis to be meaningful it is necessary that the variables be constrained by real world conditions. This implies the use of mathematical programming.
7. The production function should be estimated using cross-sectional or longitudinal data in a survey-type design. While the experimental approach is more effective in eliminating plausible alternative causal schemes, the process of randomization and its concomitant, sampling, may lead to unstable regression coefficients.
8. The experimental approach should be followed in investigating the residuals of production function analysis (i.e., the question of technical efficiency). This leads to the discovery of previously unknown implicit factors that influence output and are correlated with certain inputs. Their inclusion in the regression equation allows us to control for their effects on other inputs, which lessens specification error and improves the ability of the model to predict.
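Generalization 4 in the list above can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not an analysis of the study's data: one independent variable, a true slope of 2, normally distributed errors, and a fixed random seed. It compares the fitted slope when random error is added to the dependent variable with the fitted slope when the same amount of error is added to the independent variable.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single independent variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, beta=2.0, seed=0):
    """Return slopes fitted on clean data, noisy y, and noisy x.
    All distributions and parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    y_noisy = [yi + rng.gauss(0.0, 1.0) for yi in y]  # error in dependent variable
    x_noisy = [xi + rng.gauss(0.0, 1.0) for xi in x]  # error in independent variable
    return ols_slope(x, y), ols_slope(x, y_noisy), ols_slope(x_noisy, y)


if __name__ == "__main__":
    clean, noisy_y, noisy_x = simulate()
    print("clean:", clean)
    print("error in y:", noisy_y)
    print("error in x:", noisy_x)
```

With these settings the slope fitted on the noisy dependent variable stays close to the true value, while the slope fitted on the noisy independent variable is pulled toward zero, in this case to roughly half the true value because the error variance equals the variance of the true variable.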
A Mathematical Model
Given a conceptual framework for the process of education, a mathematical model is presented which translates these conceptualizations into statements capable of quantitative analysis. Given the educational production function:
$$A_{it} = g(C_i^{(t)}, S_i^{(t)}, P_i^{(t)}, \ldots)$$
for a given set of community, peer-group, and individual inputs, the above can be transformed to
$$A_{it} = g(k, S_i^{(t)})$$
where k is a constant. Thus for a given set of school inputs, the problem is to maximize
$$A_{it} = g(k, S_i^{(t)})$$
subject to
$$f_i(k, S_i^{(t)}) \le B_i.$$
The production function is estimated by using multiple regression analysis, and the entire system is solved by methods of mathematical programming.
CHAPTER III
EMPIRICAL APPLICATION
OF THE MODEL
In light of the prevailing emphasis on modeling the educational process, there is an emerging literature that advocates the use of mathematical models as a means to increasing the efficiency of educational planning and decision-making. Too often, however, the intent is simply to focus on the advantages of applying models rather than to provide empirical research that illustrates the unique contributions of such models in generating solutions for real and immediate educational problems. As a result, some rather sophisticated models have appeared in the literature, the mathematics of which have awed even the most interested observers with less than a college degree in mathematics. Also, the developers of such models have failed to provide means by which parameters evident in the model could be estimated. No matter how closely the model may mirror the conceptual framework, any estimation of parameters on other than an objective basis defeats

A MATHEMATICAL MODEL FOR ALLOCATION OF SCHOOL RESOURCES TO OPTIMIZE A SELECTED OUTPUT
By
JACKSON K. McAFEE
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF EDUCATION
UNIVERSITY OF FLORIDA
1972
ACKNOWLEDGMENTS
The writer wishes to acknowledge the efforts of certain persons who provided encouragement and assistance in the preparation of this study. The writer is particularly indebted to Dr. Michael Y. Nunnery, chairman of the supervisory committee, for his interest and many valuable suggestions, and to other members of the committee, Dr. Ralph B. Kimbrough and Dr. Irving J. Goffman. Thanks are given to Dr. Charles M. Bridges, Jr., Dr. Donald W. Hearn, and Dr. Gene A. Barlow for time generously spent discussing with the writer various aspects of the study. Finally, deep appreciation is extended to the writer's wife, Esther, who spent long and arduous hours in the typing of the original manuscript.
TABLE OF CONTENTS
Page
ACKNOWLEDGMENTS . . . ii
LIST OF TABLES . . . v
LIST OF FIGURES . . . vi
ABSTRACT . . . vii
CHAPTER
I INTRODUCTION . . . 1
    The Problem . . . 4
    Assumptions . . . 10
    Definition of Terms . . . 10
    Review of the Literature . . . 12
    Procedures . . . 29
    Organization of the Remainder of the Study . . . 32
II DEVELOPMENT AND DESCRIPTION OF THE MATHEMATICAL MODEL . . . 33
    Conceptual Considerations . . . 34
    Mathematical Considerations . . . 48
    A Mathematical Model . . . 88
III EMPIRICAL APPLICATION OF THE MODEL . . . 90
    Design of the Empirical Application . . . 91
    Results of the Empirical Application . . . 110
IV SUMMARY, DISCUSSION AND SUGGESTIONS FOR FUTURE STUDIES . . . 140
    Summary . . . 140
    Discussion . . . 146
    Suggestions for Future Studies . . . 151
APPENDIX . . . 157
SELECTED BIBLIOGRAPHY . . . 161
BIOGRAPHICAL SKETCH . . . 168
LIST OF TABLES
TABLE / Page
1 VARIABLES POSTULATED AS PREDICTORS OF LOCAL SCHOOL DISTRICT PRODUCTIVITY FOR A GIVEN STATE . . . 102
2 BASIC DATA DESCRIPTION . . . 112
3 INTERACTION TERMS TESTED FOR THE PRODUCTION FUNCTION . . . 115
4 REGRESSION COEFFICIENTS FOR THE PRODUCTION FUNCTION (DEPENDENT VARIABLE: READING ACHIEVEMENT--X21) . . . 117
5 AN EXAMINATION OF THE INTERACTION BETWEEN THE PERCENTAGE OF TEACHERS WITH LESS THAN FOUR YEARS' TRAINING (X14) AND MEDIAN TEACHING SALARY (X1?) . . . 120
6 REGRESSION COEFFICIENTS FOR THE BUDGET EQUATION (DEPENDENT VARIABLE: INSTRUCTIONAL EXPENDITURES PER PUPIL--X24) . . . 122
7 VARIABLES MANIPULATED IN THE MATHEMATICAL PROGRAMMING ANALYSIS . . . 130
8 EFFECT OF DISTRICT SIZE ON ACHIEVEMENT AND THE DISTRIBUTION OF SCHOOL RESOURCES (X14 = 5.87) . . . 135
9 THE EFFECT OF INCREMENTING THE BUDGET ON ACHIEVEMENT (X22 = 9,180) . . . 137
LIST OF FIGURES
FIGURE / Page
1 A MODEL OF THE EDUCATIONAL PROCESS . . . 38
2 METHOD FOR INVESTIGATING TECHNICAL EFFICIENCY . . . 55
3 AN EXAMPLE OF A CONCAVE FUNCTION f . . . 83
4 AN EXAMPLE OF A NONCONCAVE FUNCTION f . . . 83
Abstract of Dissertation Presented to the Graduate Council of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Education
A MATHEMATICAL MODEL FOR ALLOCATION OF SCHOOL RESOURCES TO OPTIMIZE A SELECTED OUTPUT
By
Jackson K. McAfee
August, 1972
Chairman: Dr. Michael Y. Nunnery
Major Department: Educational Administration
Educational decision-makers are faced with the task of allocating school resources in such a way as to maximize student outcomes. Decision-makers are constrained in their choice of plans by a limited budget and by social, political, and legal forces. The rationale for this study was a response to the need for studies that demonstrate the utility and limitations of mathematical models as aids in improving educational planning and decision-making.
The problem of the study was to develop a mathematical model to facilitate the decision-making process in selected areas of educational activity by optimizing the allocation of scarce resources and to empirically illustrate its application. In development of the model attention was given to
1. Determining the educational production function that describes the input-output relationship between selected variables;
2. Determining the optimal combination of inputs to maximize output subject to certain costs and other constraints;
3. Determining the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.
In the empirical application, data were utilized to illustrate the model. The data consisted of aggregate measures on input and output factors in 181 local school districts in a given state. Twenty-four variables were measured providing information on (1) student and community inputs, (2) median reading achievement at the sixth-grade level, and (3) selected school resources. In particular, certain teacher characteristics were considered: experience, training, and level of salary. Also, classroom size, pupil-support personnel ratio, and the size of the district were used.
Given these inputs (i.e., student, community, and school) into the educational process, stepwise multiple regression was used to estimate the parameters of the production function (reading achievement--dependent variable) and the budget equation (per-pupil instructional expenditures--dependent variable). The proportion of variance explained in the dependent variables was 0.79 and 0.94, respectively.
The optimization strategy employed the techniques of mathematical programming with the production function as an objective function and the budget equation as one of the constraints. Specifically, the SUMT (Sequential Unconstrained Minimization Technique) algorithm was employed to solve the mathematical programming problems.
Looking at the population of school districts as a whole, the analysis revealed that by reallocating existing school resources and maintaining a minimum standard for the pupil-support personnel ratio, the predicted increase in reading achievement that would result was 0.38 standard deviations. Given the existing allocation of school resources, if the per-pupil budget was incremented by $100 so as to maximize reading achievement, the predicted increase would be 0.73 standard deviations.
Throughout the analysis it was apparent that the percentage of teachers with advanced training was the dominant effect followed by median teacher salary. The importance of salary was a function of the percentage of teachers with less than four years' training; the greater the percentage, the greater the effect of salary. When the percentage was zero, the effect of salary was minor compared to the effect of advanced training.
CHAPTER I
INTRODUCTION
For too long the United States has tolerated a system of compulsory universal education without pertinent public accountability. As a result critical deficiencies are now becoming evident--such as the failure to serve inner-city students effectively. . . . They (public) have a right to know how well students are learning as well as a responsibility to pay for the educational program.1
Across the land, state legislatures have sounded the clarion call of "accountability" for educational institutions. Politicians have asked for specific and quantitative information on the schools' output. Educators' requests for increased educational expenditures have been matched by legislators' demands for increased productivity. Concurrently, educational institutions have begun efforts to respond to the demands of new technologies. Experimentation with the concepts of systems analysis and operations research has occurred in an attempt to place
1L. J. Stiles, "Assessment of Educational Productivity," Journal of Educational Research, LX (April, 1967), 1.
decision-making activities on a more rational plane. The Planning-Programming-Budgeting System (PPBS) found its place in the public sector, first at the federal level and then in state and local governmental agencies.2 "Attention is focused on choosing from many possible objectives those specific objectives to be achieved, and then choosing from alternative courses of action that plan which will accomplish the chosen objectives at the lowest possible cost, or accomplish some more optimum set of objectives at a specified cost."3 For school purposes, the core of PPBS is the program budget that reports programs to be accomplished and allocates expenditures in terms of objectives relating to student outputs rather than in terms of objects to be purchased. Two outcomes of a PPB system are cost-effectiveness and cost-benefits analyses. Cost-effectiveness is concerned with the extent to which given dollar inputs if utilized in alternative ways would increase certain educational outputs. The main requirement of cost-benefits
2Marvin C. Alkin, "Evaluating the Cost-Effectiveness of Instructional Programs," UCLA Symposium on Problems in the Evaluation of Instruction, Sponsored by the Center for the Study of Evaluation (Los Angeles: December, 1967).
3H. Thomas James, The New Cult of Efficiency and Education (Pittsburgh: University of Pittsburgh Press, 1969), p. 40.
analysis is that both input and output measures be specified in the same unit, namely dollars; then the ratio of benefits to costs for a given program is examined.4
With new developments in technology, economists and educational researchers have increased their efforts in the study of the internal efficiency of education. These studies require measurements of both the inputs and outputs of educational systems--a difficult task. Even more difficult is the recognition that there are too many variables affecting school performance for researchers to experimentally control and manipulate them all at once. "Thus researchers have had to rely on survey-type correlation analyses and infer from them the causes of variation in educational success."5 Hence, input-output models from economics have been adapted for the study of productivity in education. This has not been done without experiencing difficulties. Usually in industry, input-output relationships are such that empirical relationships can be determined within a theoretical framework. Such is not the case in education where concern with behavioral characteristics is wanting of both a theoretical and quantitative basis.
4Alkin.
5Michigan Department of Education, Research into Correlates of School Performance, Assessment Report No. 3 (1970), p. 14.
Numerous input-output studies have taken place in education in recent years; however, most of these have not considered questions of cost-effectiveness. Of the cost-effectiveness studies, few have related their techniques to input-output analysis. Consequently, the focus of this study was the development of a model that integrates both concepts and leads to decision-making on a plane conceived by the advocates of PPBS. The demands of the age are such that if the educational enterprise is to survive, progress must be made towards the development of a theory of decision-making that effectively relates the inputs of a school system to its outputs.
The Problem
Statement of the Problem
The problem of the study was to develop a mathematical model to facilitate the decision-making process in selected areas of educational activity by optimizing the allocation of resources and to empirically illustrate its application. In development of the model attention was given to
1. Determining the production function that describes the input-output relationship between selected variables;
2. Determining the optimal combination of inputs to maximize output subject to certain costs and other constraints;
3. Determining the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.
In the empirical application, data obtained from the National Educational Finance Project were utilized to illustrate the model. On the basis of these data an educational production function was estimated and optimal allocations of resources were determined subject to certain constraints.
Delimitations
The study was confined as follows:
1. To educational outcomes of a cognitive nature, that is, behaviors dealing with the recall or recognition of knowledge and development of intellectual abilities and skills;
2. To the investigation of allocative efficiency; using the dollar budget in such a way that the resources that are purchased yield the best outcome that can be attained for the given budget (i.e., to identify feasible ways of improving output within the context of the present system by using the components presently employed in the system);
3. To an investigation of school districts within a given state in the empirical application;
4. To an investigation of aggregate educational policy (what is true for the population as a whole may not be true for individual school districts).
Limitations
The study was subject to the following limitations:
1. The use of unidimensional outcomes;
2. The explanatory power of the particular input variables selected and their usefulness from a decision-making context;
3. The problems associated with imprecise measurement of the variables selected;
4. The use of a developmental design and the limitations inherent in determining a mathematical model:
    a. the lack of a theoretical basis for postulating a specific model;
    b. the necessity for a statistical dimension and the resulting restrictions on the complexity of the model design;
    c. because social behavior is less predictable than physical phenomena, mathematical models are a less accurate approximation of reality in the social sciences;
5. The use of survey data and the inability to attach causality to estimated parameters.
Justification for the Study
Since the most commonly used tool for measuring economic relationships is multiple regression analysis, it is no surprise that as researchers have examined the internal efficiency of education, regression analysis has been the primary tool.6 Also, the economists' stress on regression coefficients rather than the proportion of variance explained is a new development in educational research.7 As the review of the literature describes, educational production functions have had too many flaws to be effectively used in the past.8 Nevertheless, an assessment of the optimality of resource allocation in education requires a knowledge of the production process.9 Benson, in considering
6Arthur S. Goldberger, Topics in Regression Analysis (New York: Macmillan, 1968), p. 1.
7Mary Jane Bowman, "Economics of Education," Review of Educational Research, XXXIX (December, 1969), 642.
8Elchanan Cohn, "Towards Rational Decision-Making in Secondary Education," Institute for Research on Human Resources (University Park: Pennsylvania State University, 1970), p. 8.
9Ibid., p. 2.
resource allocation in the education process to attain objectives, has stated that the following conditions must be met: (1) that objectives can be precisely stated; (2) units in which the objectives are achieved can be defined; (3) variables which influence the achievement of an objective can be identified and that units of these variables can be defined and priced; and (4) the relationships between input variables and objectives can be specified.10 Bowman has stated, "Proper identification of production functions is indeed necessary for sound cost-effectiveness or cost-benefits analysis in educational decision making."11 Bowles has stated, "Without knowledge of the educational production process, we are unable to estimate the relative effectiveness of each factor input per unit of cost, nor are we able to compare the relative productivity of resources devoted to different types of education or non-educational purposes."12 Entwisle and Conviser noted that "

10 Charles S. Benson, The School and the Economic System (Chicago: Science Research Associates, 1966), p. 102.
11 Bowman, p. 652.
12 Samuel Bowles and Henry M. Levin, "More on Multicollinearity and the Effectiveness of Schools," Journal of Human Resources, III (Summer, 1968), 28.
input-output analysis deals only with the possibility of certain states of affairs. It is not suitable for making determinations as to the most desirable state of affairs, such as producing certain outputs at minimum costs. "There has developed a more general approach known as linear programming, which is appropriate for optimization problems, and input-output analysis turns out to be a special case of linear programming."13 Bowles stated that "the task of statistical and educational researchers should be directed towards estimating the structural parameters of an equation representing a learning process."14

Thus, if efficient resource allocation is to be more than a figure of speech, the previous statements must be considered. To the extent that a more effective marshaling of school resources maximizes student outputs, the attainment of the goals of education is enhanced and support for proper financing is generated. Hence, the study was justifiable for the following reasons:

13 Doris R. Entwisle and Richard Conviser, "Input-Output Analysis in Education," High School Journal, LII (January, 1969), 194.
14 Samuel Bowles, Planning Educational Systems for Economic Growth (Cambridge, Mass.: Harvard University Press, 1969), p. 400.
1. Refinements in the techniques that develop production functions are needed if a meaningful analysis is to occur;
2. Appropriate optimization techniques must be developed if the goals of resource allocation are to be met;
3. Effective interfacing between the production function and the budget equation must result if cost-effectiveness analysis is to be realized.

Assumptions

1. It was assumed that instruments used to measure inputs and outputs in the model were valid.
2. It was assumed that cost per unit of input could be determined for certain inputs used in the model.
3. It was assumed that a relationship existed between the inputs and outputs used in the model.

Definition of Terms

Budget equation. The relationship between the dollar budget accorded educational decision-makers and the transformation of that budget into school inputs.

Cost-effectiveness analysis. A procedure by which the costs of alternative means of achieving a stated effectiveness or, conversely, the effectiveness of alternative means
for a given cost, are compared to determine that alternative or combination of alternatives that either gives the greatest expected effectiveness for a given expected cost or a given expected effectiveness for the least expected cost.

Constraint. Any procedure, practice or condition that limits the manner in which an activity or decision-making is carried out.

Decision-making. The series of actions and interactions through which questions or issues are resolved or disposed.

Interfacing. The process of using certain variables measured in the same units in both the production function and the budget equation for the purpose of optimization.

Mathematical programming. A systematic method for finding maximum and minimum values of mathematical functions subject to certain constraints.

Marginal effect. The increase in output which can be obtained by a unit increase in an input.

Multiple regression analysis. The process of estimating a variable $y$ from the knowledge of several predictor variables $x_1, x_2, \ldots, x_n$.
Optimization. The best or most favorable amounts of inputs for the purposes of maximizing outputs or minimizing costs.

Parameter. A measure computed from all observations in a population.

Production function. The relationship between an outcome of education and the human and material resources which education consumes.

Resource allocation. The process of assigning or allocating resources among the various components of an educational system.

Review of the Literature

The review summarizes past studies which were pertinent to the purposes of the present study. The main areas of concern were
1. Input-output analysis,
2. Cost-effectiveness analysis,
3. Methodological concerns.
Since methodological considerations were of prime importance, emphasis was focused on this aspect.
Input-Output Analysis

In an attempt to determine factors that are related to school performance, educators have employed various models or research paradigms. Input-output models have their origins in the work of Leontief in economics. Leontief conceptualized a transactions table for displaying relationships between economic inputs and outputs for purposes of budget analysis and planning.15 Hoffenberg applied this model to a school district budget. The assumption was that the activities which a school district carries on are interrelated; hence, Hoffenberg determined the structural framework of these relationships.16 The goal of such input-output studies is to obtain estimates of the marginal effects of an increase in a given input variable.

Researchers have adapted the input-output model of Leontief in an attempt to account for the causes of variation in school performance. "Researchers employing this

15 Marvin C. Alkin, "The Use of Quantitative Methods as an Aid to Decision-Making in Educational Administration," A paper read at a meeting of the American Educational Research Association (Los Angeles: 1969).
16 Marvin Hoffenberg, "Application of Leontief Input-Output Analysis to School District Budgeting," Center for the Study of Evaluation: Working Paper No. 12 (Los Angeles: UCLA Graduate School of Education, 1970).
paradigm have: (1) identified a criterion of school performance as a dependent variable and measures thought to influence performance as independent variables; (2) operationally measured these variables in a sample of educational systems; (3) computed relationships between independent and dependent variables; and (4) drawn inferences from the relationships as to what factors, either singly or in combination, account for variation in school performance."17

The mathematical relationship between inputs and an output is termed a production function. When the outputs are behavioral changes in students, their relationships to inputs are termed psychological production functions.18 Such a function seeks to relate performance increments to inputs.

Three types of studies exist which seek to determine the relationships between school inputs and educational outcomes. The first type of study attempts to link total educational expenditures to output measures. The second group includes those that have attempted to estimate the effects of various functional

17 Michigan Department of Education, p. 1.
18 J. Alan Thomas, The Productive School: A Systems Analysis Approach to Educational Administration (New York: Wiley & Sons, Inc., 1971), p. 12.
components of expenditure on school outcomes, and the third set of studies represents estimates of the relationships between resources as measured in physical terms and school outputs (educational production functions).19

For studies of the first type, emphasis was placed on the financial side of the ledger (i.e., more money does more things) without concurrence as to what is quality.20 Actually, quality of education was seen to be a measure of school practices.

For studies of the second type, emphasis was placed on determining which group of resources that the school purchases gives the highest educational returns per dollar input. A problem was that these studies did not address themselves to how the resources were being used. Given that one has purchased a particular combination of physical resources, there are alternative ways of utilizing these resources to affect output.21

With the advent of systems analysis in industry and the concern with the nature of output, studies of the third

19 Henry M. Levin, "The Effect of Different Levels of Expenditure on Educational Output," Economic Factors Affecting the Financing of Education (Gainesville, Florida: National Educational Finance Project, 1970), II, 185.
20 Michigan Department of Education, pp. 1, 2.
21 Levin, pp. 182, 188.
type have emerged. Kershaw and McKean pioneered the application of systems analysis to education. "This work outlined persuasively and clearly the values of treating education as a system amenable to quantitative analysis, capable of evaluation in terms of efficiency, and responsive to policy decisions regarding inputs and administrative arrangements."22 After this, several input-output studies followed that sought to relate various measures of student performance to school and community characteristics. Thomas, Benson, Coleman, Burkhead, and Kiesling sought to determine the factors most related to achievement. In each of these studies, the most highly related characteristics were found to be socioeconomic.23 Most of these studies used regression methods to determine the production functions. Criticisms of this methodology are discussed in a later section.

Cost-effectiveness Studies

It is assumed that resources should be allocated so as to maximize the achievement of a given

22 Jerry Miner, "Financial Support of Education," Implications for Education of Prospective Changes in Society, ed. by Edgar L. Morphet and Charles O. Ryan (Denver: Designing Education for the Future, an Eight-State Study, 1967), p. 313.
23 Michigan Department of Education, p. 15.
objective or set of objectives. Cost analysis provides a means for examining the manner in which resources are used.24 Temkin stated that cost-effectiveness methods are appropriate when the decision structure is suitable for planned and incremental improvement of an existing system with a time dimension weighted heavily towards the present and immediate future.25 Thomas described five elements of cost-effectiveness analysis: (1) the objective, (2) the alternatives, (3) costs, (4) operational model, and (5) a decision rule.26 If the operational model is the production function described earlier, then one should be able to estimate the allocative pattern among inputs that will yield the largest increase in output within a limited budget. Few such studies have been carried out for educational productivity, in part because the "prices" of such inputs are difficult to derive.27 As a result, most input-output models used to date have not included costs.28

24 Thomas, p. 41.
25 Sanford Temkin, "A Comprehensive Theory of Cost-Effectiveness: Administration for Change Program" (Philadelphia: Research for Better Schools, Inc., April, 1970), p. 41.
26 Thomas, p. 82.
27 Levin, p. 191.
28 Thomas, p. 17.
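The allocation problem described in this passage, finding the pattern of inputs that yields the largest increase in output within a limited budget, can be sketched for the simplest case of a linear production function with one budget constraint and an upper bound on each input. The sketch below is illustrative only and is not taken from any of the studies cited: the function name `allocate_linear`, the coefficients, the prices, the bounds, and the budget are all hypothetical. Under these assumptions the problem reduces to funding inputs in descending order of marginal effect per dollar.

```python
def allocate_linear(coeffs, prices, upper_bounds, budget):
    """Maximize sum(b_i * x_i) subject to sum(P_i * x_i) <= budget
    and 0 <= x_i <= u_i.

    With a single budget constraint and simple bounds, this linear
    program is a fractional knapsack: inputs are funded in descending
    order of marginal effect per dollar, b_i / P_i.
    """
    order = sorted(range(len(coeffs)),
                   key=lambda i: coeffs[i] / prices[i],
                   reverse=True)
    x = [0.0] * len(coeffs)
    remaining = budget
    for i in order:
        # Stop when the budget is spent or no remaining input raises output.
        if coeffs[i] <= 0 or remaining <= 0:
            break
        units = float(min(upper_bounds[i], remaining / prices[i]))
        x[i] = units
        remaining -= units * prices[i]
    return x


# Hypothetical figures: marginal effects (output points per unit of input),
# prices (dollars per unit), and the largest feasible amount of each input.
coeffs = [0.8, 0.5, 0.3]
prices = [400.0, 200.0, 300.0]
bounds = [10.0, 20.0, 15.0]

plan = allocate_linear(coeffs, prices, bounds, budget=6000.0)
gain = sum(b * xi for b, xi in zip(coeffs, plan))
print("units purchased:", plan)
print("predicted gain in output:", gain)
```

Because the marginal effects are constant, the plan spends on one input until its bound is reached before moving to the next; no input mix is chosen on the basis of diminishing returns.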
One exception is Hoffenberg, who applied an interdependency approach in an input-output framework to a school district budget. Using legal program descriptions he was able to determine
1. interprogram flows: the input distributions contributing to output,
2. direct purchases per dollar of output (what is required from each activity to give $1 of output to a designated activity),
3. direct and indirect requirements per $1 of final output from each of the activities.29

It must be noted, however, that the outputs of Hoffenberg's study were school programs rather than student outputs. Nevertheless, where psychological production functions were involved, Thomas stated that input-output relationships could be measured in terms of costs per student per unit of time for a given input. As an example, to determine the direct cost per student hour of a given service would require cost figures for (1) teachers' salaries, (2) other salaries of personnel who support the service, (3) space, and (4) equipment and materials.30

In designing a cost-effectiveness model for elementary and secondary education, Abt made the following recommendations:

29 Hoffenberg, p. 6.
30 Michigan Department of Education, p. 50.
1. The relative contributions of home, peer influence, and school instruction to student achievement should be determined,
2. The relative contributions of manipulable variables in the school environment which contribute most heavily to student attitude and achievement change should be determined,
3. The coefficients and parameter settings for environmental influences on student attitude and achievement change should be determined with enough accuracy to allow useful predictions.31

Cohn discussed the use of linear programming as a means of optimizing outputs. He regarded existing production functions as being incomplete. Hence, he used suboptimization techniques where he varied at most three variables simultaneously. Consequently, the overall results were less than optimal.32 The New York State Education Department studied the feasibility of using state department data to develop multiple regression equations required for simulating the effects of alternative expenditure configurations on pupil achievement. Various

31 Clark C. Abt, Design for an Elementary and Secondary Education Cost-Effectiveness Model (Cambridge, Mass.: Abt Associates, Inc., 1967), II, 142.
32 Cohn.
configurations of input were considered to maximize output, but optimization techniques were not discussed.33

Methodological Considerations

For most production processes, an increase in the quantity of an input is expected to yield an increase in output. However, Ribich stated:

    Observation of the effect may be difficult because the additional resource input is small relative to other changes that are occurring simultaneously. The "upshot" can be difficulty in statistically sorting out the independent influence of the additional inputs from the other influences at work, many of which are unmeasured and unmeasurable. The problem may be so severe as to submerge almost completely any evidence of output response to an input change.34

Wynne stated that most input-output studies in education have used the techniques of multiple regression analysis.35 Consequently, in light of Ribich's statement, the use of multiple regression analysis must be examined with particular attention to interpretation of results. A brief description of the method follows.

33 New York State Education Department, "Technical Report of a Project to Develop Education Cost-Effectiveness Models for New York State" (Albany: Bureau of School Progress and Evaluation, March, 1970).
34 Thomas I. Ribich, "The Effect of Educational Spending on Poverty Reduction," Economic Factors Affecting the Financing of Education, II, 214.
35 Edward Wynne, "School Output Measures as Tools for Change," Education and Urban Society, XII (November, 1969).
The process whereby performance on a single criterion variable is predicted from a knowledge of several predictor variables is called multiple regression. Predictors, undoubtedly, will differ in their predictive efficiency. Therefore, computational processes are directed towards determining a separate regression weight (partial regression coefficient) for each predictor, in order to achieve the best possible prediction. The result may be a regression equation of the form:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$

where $y$ = criterion (dependent) variable, $\hat{y}$ = predicted value, and $x_i$ = predictor (independent) variable.

A measure of overall efficiency in prediction is given by $R$ (multiple correlation coefficient), which is the correlation of predicted values $\hat{y}$ to actual values $y$. More importantly, $R^2$ gives us the proportion of variance of $y$ accounted for by the regression relationship.36

The partial regression coefficient $b_i$ estimates the marginal effect upon $y$ with all other $x$'s held constant.37 However, if the sample variation of $x_i$ is small, the

36 Goldberger.
37 Ibid., p. 26.
imprecision which attaches to its coefficient will tend to be large.38 If the chief concern is with the marginal effect, then a high correlation is needed between $y$ and $x_i$, or the confidence limits on $\beta_i$ (the parameter estimated by $b_i$) will be too wide.39 Next, the reliability of the instruments used to measure the $x_i$ must be considered. Regression coefficients $b_i$ will tend to overestimate $\beta_i$ with less than perfect reliability of the instruments used.40

But of all of the problems discussed thus far, the most serious is the problem of multicollinearity. Multicollinearity arises when some or all of the explanatory variables are so highly correlated one with another that it becomes very difficult, if not impossible, to disentangle their influences and obtain a reasonably precise estimate of their separate effects.41 Gordon stated that small variations among the correlations of a highly related set

38 Ibid., p. 60.
39 Phillip Lyle, Regression Analysis of Production Costs and Factory Operations (New York: Hafner Publishing Co., Inc., 1957), p. 65.
40 George W. Bohrnstedt, "Observations on the Measurement of Change," Sociological Methodology, ed. by Edgar F. Borgatta (San Francisco: Jossey-Bass, Inc., 1969), p. 128.
41 Goldberger, p. 80.
of independent variables can create large variations among their regression coefficients.42 A standard approach to this problem has been to utilize additional information as an aid in estimation. Knowledge of the ratios of regression coefficients from theory or from previous samples can be used. Even incorrect information can reduce the imprecision of estimation by using Bayesian techniques.43

Finally, one may ask if $\beta_i$ (beta weight, the standardized regression coefficient) is not a more effective statistic. Bohrnstedt stated that the regression coefficient $b_i$ is more appropriate for the study of change than $\beta_i$, since $b_i$ is relatively stable across subsamples of a population whereas $\beta_i$ may vary significantly as a function of the standard deviations of the variables in the sample.44

The application of regression theory in estimating educational production functions has been discussed in the literature. In attempting to interface the production function to the budget equation, particular care must be placed on the interpretation of regression coefficients.

42 Robert A. Gordon, "Issues in Multiple Regression," American Journal of Sociology, LXXIII (March, 1968), 162.
43 Goldberger, p. 82.
44 Bohrnstedt, p. 120.
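The quantities discussed in this description of the method (the partial regression coefficients, the proportion of variance accounted for, and the standardized beta weights) can be illustrated with a small least-squares computation. The sketch below is a minimal illustration, not the procedure of any study cited: the data are invented, the helper names `solve`, `fit_ols`, `r_squared`, `sd`, and `beta_weights` are hypothetical, and the beta weight is computed under the usual assumption that it equals the raw coefficient multiplied by the ratio of the predictor's standard deviation to that of the criterion.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs, y):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(xs, y, b):
    """Proportion of the variance of y accounted for by the regression."""
    pred = [b[0] + sum(bi * xi for bi, xi in zip(b[1:], r)) for r in xs]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def sd(values):
    """Sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def beta_weights(xs, y, b):
    """Standardized coefficients: b_i scaled by sd(x_i) / sd(y)."""
    sy = sd(y)
    return [b[j + 1] * sd([r[j] for r in xs]) / sy for j in range(len(xs[0]))]


# Invented data: six observations on two predictors and one criterion.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [14.0, 13.5, 17.0, 20.5, 23.0, 24.5]

b = fit_ols(xs, y)
print("b0, b1, b2:", b)
print("R squared:", r_squared(xs, y, b))
print("beta weights:", beta_weights(xs, y, b))
```

The raw coefficients are expressed in units of the criterion per unit of each predictor, which is the form needed when inputs are later priced; the beta weights are unit-free and depend on the sample standard deviations.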
The other aspect of regression analysis is the proportion of accounted variance. The Coleman report has been soundly criticized for the manner in which it sought to identify important variables on the basis of their contributions to the proportion of variability explained. As Bowles and Levin stated, when stepwise procedures are used, the relative contributions to variability depend upon the order in which the independent variables are entered into the equation. They suggested that Coleman should have used the regression coefficients.45 Even though this is a more effective procedure, if multicollinearity is severe, the regression coefficients may be woefully inaccurate.

Additional problems in interpretation are noted. Levin suggested that output measures should be value-added, that is, the change that has occurred while in school.46 Bohrnstedt raised serious questions involving the use of gain scores in measuring change.47 Nephew, in constructing an input-output model, "partialed out" the variance due to nonschool variables and then worked with residual measures

45 Bowles and Levin.
46 Levin, p. 176.
47 Bohrnstedt.
of school output using multiple regression equations.48 Goldberger showed that regressing on the residuals will underestimate the regression coefficients of the remaining variables.49

Bowles, in a comprehensive study of the educational production function, found that the function generally explains only a small percentage of the variability of school achievement using the full range of variables. A crucial deficiency is the absence of a theory of the learning process. A conclusion was that it would be useful to think of distinct educational production technologies at least for black and white and rich and poor students separately. Also, due to "ceiling effects" it would be more appropriate to work with ratios of regression coefficients rather than absolute magnitudes.50 This is related to Levin's assertion that the law of diminishing marginal returns is operative in the production function, that is, as more and more

48 Charles T. Nephew, Guides for the Allocation of School District Financial Resources, Ed.D. Dissertation (Buffalo: State University of New York, 1969).
49 Arthur S. Goldberger, "Note on Stepwise Least Squares," Journal of the American Statistical Association, LVI (March, 1961).
50 Samuel Bowles, "Educational Production Function: Final Report," Research Project Supported by the United States Office of Education (Harvard University, February, 1969), pp. 5, 13, 18.
of a given input is consumed its marginal effect decreases.51

In an attempt to refine techniques that have been used, factor analysis has been tried to meaningfully group the variables to avoid the problem of multicollinearity. The results were disappointing since variables grouped under statistical considerations were not necessarily meaningful from a decision-making context.52 Wynne suggested the use of high and low productive schools to identify factors that discriminate between them on the basis of productivity.53 Kiesling stated that it is difficult to determine an accurate production function since one cannot easily gather data in units proper to each variable.54

While considerable research has taken place in the study of input-output relationships, less has been done in the interfacing of the production function and the budget equation, as conceived by Levin, for purposes of optimizing

51 Levin, p. 177.
52 New York State Education Department.
53 Wynne.
54 Herbert J. Kiesling, "The Study of Cost and Quality of New York School Districts: Final Report," Research supported by the United States Office of Education, Project No. 80264 (Bloomington: Indiana University, 1970), p. 66.
outputs subject to budget constraints. As Levin stated, the educational establishment needs a production function and budget relationship to tie educational expenditures to educational outcomes.55

Essentially, the concern is that of optimization, but limitations on optimization in an educational context have been expressed. Tracz spoke of the lack of suitable objective functions or performance criteria as strong limitations. Also, education is a nonstationary process, which limits the usefulness of fixed coefficients in objective functions.56 Alkin identified impediments to the use of linear programming. First, there is the failure to designate precisely educational programs within educational systems and the attendant financial costs. This is one of the challenges to be met by Planning-Programming-Budgeting Systems (PPBS). Secondly, there is a lack of specificity in the designation of educational outcomes.57

55 Levin, p. 178.
56 George S. Tracz, "An Overview of Optimal Control Theory Applied to Educational Planning," A paper read at a meeting of the American Educational Research Association (Los Angeles, 1969), p. 5.
57 Alkin, "The Use of Quantitative Methods as an Aid to Decision-Making in Educational Administration," p. 6.
A more serious limitation to optimization using linear programming is the nature of fixed marginal effects when using a linear production function. If the production function and budget equation were expressed respectively as:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$

$$R = P_1 x_1 + P_2 x_2 + \cdots + P_n x_n$$

where $R$ = total budget, and $P_i$ = price per unit of input $x_i$, then according to Levin, the objective would be to satisfy the condition

$$\frac{\partial y / \partial x_1}{P_1} = \frac{\partial y / \partial x_2}{P_2} = \cdots = \frac{\partial y / \partial x_n}{P_n}$$

(i.e., the additional output from each input relative to its price should be equal for all inputs).58 However, with constant marginal effects, questions of optimal economies of scale cannot be considered.

Generalizations from the Literature

The review of literature appears to justify the following generalizations.
1. There is a need for decision-making based upon input-output relationships.

58 Levin, p. 178.
2. The challenge for input-output studies is to determine the proper mix of inputs that will maximize output.
3. In general, input-output studies have not interfaced with cost-effectiveness studies.
4. Serious reservations regarding methodology in input-output studies have been expressed.
5. A mathematical model is appropriate for systematically determining optimal allocation strategies.

Procedures

In the process of developing and applying the mathematical model, attention was given to (1) conceptual considerations, (2) mathematical considerations, and (3) an empirical illustration.

Conceptual Considerations

The process of developing a mathematical model required that one proceed in the presence of a conceptual framework. Consideration was given to a conceptualization of the process of education. The process was viewed in an input-output context. The inputs consisted of community and student factors, and school resources. The outputs were educational outcomes that are desired such as cognitive achievement, affective growth, and physical development.
The relationships between inputs and outputs were termed educational production functions.

As decision-makers seek to effectively allocate resources, their choices are constrained by various social, economic, and legal forces. Thus attention was given to an explicit description of constraints on the process of education.

Efficient resource allocation requires two types of efficiency, technical and allocative. Consideration was given to distinctions between these two types of efficiency and their implications for estimating educational production functions.

Given an understanding of the process of education as revealed by the production function and the various constraints operating on the process, the need for optimization strategies whereby decision-makers can seek to maximize certain outcomes subject to various constraints was deliberated.

Mathematical Considerations

Given a conceptual framework of the process of education, the translation to a mathematical model required consideration of several issues.
Attention was given to estimating single output equations versus multiple output equations.

In using multiple regression to estimate the production function, consideration was given to (1) interpretation of regression coefficients, (2) determining goodness of fit, (3) linear versus nonlinear models, (4) reliability of measurements, (5) the use of gain scores, (6) population data versus sample data, (7) causality and multiple regression, and (8) experimental design versus survey design.

Given the need for optimization, the mathematical formulation of an optimization strategy was elaborated. In particular, an algorithm for solving nonlinear programming problems was discussed.

An Empirical Illustration

Given the final form of the mathematical model, steps were taken to empirically illustrate the model. Using real data, an educational production function was estimated. Subject to certain constraints, optimal combinations of inputs were determined to maximize a selected output. Given increments in the budget constraint, the problem was re-solved. The exact procedures for the empirical analysis are detailed in Chapter III.
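The kind of computation summarized here, maximizing a nonlinear production function subject to a budget and then re-solving as the budget is increased, can be sketched under strong simplifying assumptions. The sketch below is not the algorithm developed in this study. It assumes a hypothetical separable production function with diminishing returns, $y = b_0 + \sum_i b_i \ln(1 + x_i)$, and uses a simple incremental rule: each successive dollar goes to the input whose next increment adds the most output. The names `output` and `allocate_concave` and all numerical values are invented for illustration.

```python
import math


def output(b0, coeffs, x):
    """Hypothetical concave production function: b0 + sum b_i * ln(1 + x_i)."""
    return b0 + sum(b * math.log(1.0 + xi) for b, xi in zip(coeffs, x))


def allocate_concave(coeffs, prices, budget, step=1.0):
    """Spend the budget in increments of `step` dollars, each time on the
    input whose next increment yields the largest gain in output.

    For a separable function with diminishing returns, this drives the
    marginal output per dollar toward equality across inputs.
    """
    x = [0.0] * len(coeffs)
    spent = 0.0
    while spent + step <= budget:
        gains = [
            b * (math.log(1.0 + xi + step / p) - math.log(1.0 + xi))
            for b, p, xi in zip(coeffs, prices, x)
        ]
        best = max(range(len(x)), key=lambda i: gains[i])
        x[best] += step / prices[best]
        spent += step
    return x


# Invented coefficients and unit prices for two inputs.
coeffs = [6.0, 3.0]
prices = [2.0, 1.0]

# Re-solve the problem for successive increments in the budget.
for budget in (100.0, 110.0, 120.0):
    x = allocate_concave(coeffs, prices, budget)
    per_dollar = [b / (p * (1.0 + xi)) for b, p, xi in zip(coeffs, prices, x)]
    print(budget, x, output(50.0, coeffs, x), per_dollar)
```

In contrast to the linear case, the marginal output per dollar printed for each input is nearly equal at the solution, so both inputs are purchased rather than one alone.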
Organization of the Remainder of the Study

In Chapter II the mathematical model is developed and described. In Chapter III an empirical application of the model is demonstrated. In Chapter IV a summary, discussion, and suggestions for future studies are given.
CHAPTER II

DEVELOPMENT AND DESCRIPTION OF THE MATHEMATICAL MODEL

While interest centers on the development of a mathematical model which describes in a systematic fashion a strategy of optimization, it must be remembered that mathematical models per se do not evolve in the absence of a conceptual framework. Antecedent to a mathematical model is a conceptual model which is a verbalization of concepts and ideas that purport to describe the phenomenon under investigation. The attempt to translate the abstractions of a conceptual model into equivalent mathematical statements only accentuates the fact that both are man-made. Thus, the conceptual model serves only as an approximation to reality, and the mathematical model is an approximation to the conceptual model. In this sense, the conceptual model becomes an upper limit for the adequacy of the mathematical model. In short, the mathematical model cannot describe the phenomena under consideration in any better fashion than the conceptual model which precedes it.
The first part of this chapter is devoted to a discussion of conceptual considerations for a mathematical model. Secondly, mathematical considerations implicit in model development are discussed. This provides the basis for stating the methodology which must be employed to make the mathematical model operational. Finally, the complete mathematical model is presented.

Conceptual Considerations

The basic notion of a model pertains to a simplification of reality that allows the user to analyze particular aspects of a process, in this case the production of education. The conceptual framework represents the theoretical backdrop for the mathematical model.

Armor has presented a model of aggregate educational processes.1 The basic structure of this model is discussed with some modifications made to allow a consideration of educational processes at the levels of student, school, district, and state. To formulate a conceptual framework whereby the mathematical model can be developed, it is necessary to consider (1) an educational production function, (2) constraints and the

1 David J. Armor, "School and Family Effects on Black and White Achievement: A Reexamination of the USOE Data," in On Equality of Educational Opportunity, ed. by Frederick Mosteller and Daniel P. Moynihan (New York: Random House, 1971), pp. 171-177.
educational process, (3) technical and allocative efficiency, and (4) the need for optimization.

An Educational Production Function

Any model must begin by deciding what are the appropriate goals of schooling. First and foremost is the learning of basic cognitive skills such as reading, writing, and arithmetic. The mastery of such skills is deemed necessary to acquire a level of competence sufficient to function in a technological society. Secondly, schools offer instruction in a variety of subject fields that enable students to discover interests and aptitudes relevant to future occupational choices. Third, school programs seek to establish positive affective outcomes regarding self-concept, identification with others, openness to experience, and a commitment to social institutions and the culture in which they are enshrined. Finally, schools provide opportunities for aiding the physical and psychological welfare of children with clinics, physical education, school-lunch programs, and guidance services.2

Schools attempt to achieve goals by the use of programs, staff, and facilities. No doubt, the program

2 Ibid., p. 171.
determines how the school will be organized, what facilities will be required, and the number and type of staff that will be employed. Each of these factors is potentially capable of manipulation by school authorities. The manner in which these factors are manipulated creates variations across schools and school districts. Concurrently, it may be assumed that these variations create different levels of effectiveness in attaining the goals described above.

It is also apparent that the success schools have in working towards these goals is dependent in part upon the background and characteristics which students bring with them to school. This fact has been amply demonstrated in such research as the Equality of Educational Opportunity Survey (EEOS).^3

^3 James S. Coleman et al., Equality of Educational Opportunity (Washington, D.C.: U.S. Government Printing Office, 1966).

Armor has distinguished several components of the characteristics of students which precede their entry into school and interact with the school environment throughout their school careers. First, there is the environment of the student's family. In the basic unit of society, family lifestyles, goals, attitudes, and morals have significant impact on how well the child succeeds in school. Second, the community or neighborhood in which the student lives
also helps to define the student. The interpersonal relationships which a student develops with other members of the community, and the manner in which the community manifests itself in facilities, housing, and programs can serve as a debilitating or helping force for the student's growth in school. Finally, there is the peer or student-body factor. The extent of similarity between aggregated student characteristics and the goals of school programs has consequences for the success of students in schools.^4

^4 Armor, p. 173.

Thus, the school can be viewed as a collectivity, with certain inputs from the community, with various school input characteristics and with outputs in the form of the goals discussed above, namely, cognitive and affective outcomes and physical development.

Figure 1 represents a schematic of the model outlined above. This schematic represents a revision of the schematic described by Armor. The solid arrows indicate assumed causal directions. The dotted arrows indicate weaker, but plausible causal directions. In this instance, it is felt that causal directions may be two-way. As an example, the quality of a school may be the factor determining what types of families are
moving into or out of a community and thus influencing the nature of the community.

FIGURE 1. A MODEL OF THE EDUCATIONAL PROCESS

Hence, the model states that the outputs of an individual (ith) student at time t are a function (g) of community inputs cumulative to time t; of the quality and quantity of school inputs consumed by him cumulative to time t; of his characteristics (initial endowment and attitudes); and of the characteristics of his peers.
Thus

$$A_{it} = g\left(C_i^{(t)}, S_i^{(t)}, P_i^{(t)}, I_i\right)$$

where
$A_{it}$ = vector of educational outcomes of the ith student at time t,
$C_i^{(t)}$ = vector of community inputs relevant to the ith student cumulative to time t,
$S_i^{(t)}$ = vector of school inputs relevant to the ith student cumulative to t,
$P_i^{(t)}$ = vector of peer influences cumulative to t,
$I_i$ = vector of initial endowments of the ith individual.

The above equation employs mathematical symbolism and can be called a mathematical model. The use of such symbols allows the expression of the conceptual model in a precise fashion and lessens the possibility of ambiguous interpretations. If the vector A consists of two or more components such as $A = (A_1, A_2, A_3)$ where $A_1$ = reading achievement, $A_2$ = citizenship attitudes, and $A_3$ = social consciousness, then the model generates a system of equations that are to be solved simultaneously. Alternatively, one can conceive of A as a composite of educational outcomes; that is, A is a single number representing an index of educational output. It is also possible for A to represent a single output such as reading achievement. The last two instances describe what has been termed an educational production function. The production function attempts to reveal the relationships
between outcomes of education and the human and material resources which education consumes. Community inputs, peer influences, and initial individual endowments serve as controls in the production function since these factors are necessarily confounded with school inputs. These factors are explicitly brought into the production function, since failure to do so would obscure the true relationships of school inputs to output factors.

The production function allows one to analyze the effects of changes in inputs to the educational process. To properly allocate school resources requires one to discover which resources affect outputs and to estimate the effects that changes in their allocation might have. Policymakers want information about the possible effect of hiring better trained or smarter teachers, about the influence of small pupil/teacher ratios, and so forth. In short, they want to know what will happen if they change the status quo. Thus, it would seem that in setting school policy and in long-range educational planning, knowledge of the educational production function is essential to efficient resource allocation.

Production function estimation requires a knowledge of the different factors important in producing the product and the interactions between the inputs. A typology of
variables associated with educational attainment involves the following categories: (1) student inputs, (2) community inputs, and (3) school inputs. The first two categories represent nonmanipulable inputs into the school system. School inputs can be further subdivided into: (1) teacher variables, (2) administration variables, (3) facilities, (4) expenditures, and (5) types of programs. To the extent that variables representing important determinants of achievement or their proxies are not included in the model, the adequacy of the model is diminished. The fact that a theory of learning is lacking on the educational scene increases the probability of important aspects being left out of the model. But the key lies in the ability of the model to predict performance even when proxy variables are used. There is ample evidence that the discovery of predictive factors may be useful on practical grounds even if these factors have no causal significance. The appendix lists variables deemed desirable for production function estimation.

Dimensionality of output is a major concern. Previous input-output analyses of the educational process have generally employed some measure of academic achievement as the dependent variable. As Mosteller and Moynihan stated, "Lest it seem that academic achievement must be the
only job of the schools, let us remember that studies do not find adult social achievement well predicted by academic achievement."^5 Thus it would seem that other educational outputs as suggested in the conceptual model must be brought into the analysis if results are to be meaningful. This can be done in one of two ways. First, the various outputs can be entered separately into the analysis. This leads to a system of simultaneous equations as previously explained. Secondly, the estimation of the educational production process can be based upon some measure of total educational output. That is, in some way the outputs should be weighted by a common factor (utility, priority, social value) in order to obtain a total index of output.

Another issue centers on whether the measures of output employed should represent current levels of performance or represent gains made over a specified period of time. Ideally, it is desired to know what students have gained during the time they have been under instruction, how much of the gain may be reasonably attributed to the instruction, and how much to factors beyond the reach of the school. Attempts at using this approach have fallen short of success due to measurement problems.

^5 Frederick Mosteller and Daniel P. Moynihan, "A Pathbreaking Report," On Equality of Educational Opportunity, p. 7.
Constraints and the Educational Process

Any attempt to describe the process of education by production function estimation must recognize various forces (i.e., social, political, economic, legal) that act as constraints upon the process. This has been one of the shortcomings of previous input-output analyses of educational systems and may explain in some degree their lack of impact upon policy decisions.

The most obvious constraint is the budgetary one. Resources for education, as for all other purposes, public or private, are limited. Decision-makers must operate within the framework of a limited budget. Thus, the key policy questions they face are these: when a school changes its inputs in various ways, how much improvement in achievement scores (or other desired good) will be obtained and for how much money? Evaluation of alternative policies necessarily involves consideration of both the effects on output of different changes in the inputs to the production process and the costs of these changes.

It is also true that existing state laws, accreditation requirements, and market demands may serve to inhibit alternative choices among the use of school resources. Examples follow:
1. Pupil-teacher ratio: analysis may show this to have negligible effect, but accreditation requirements prevail.
2. Teachers' salaries: again the effect may be small relative to other variables, but disproportionate increases may be needed to attract and retain qualified teachers due to market demands.
3. Length of school year: analysis may suggest an optimal length different from that prescribed by state law.

No doubt, the process of estimating an educational production function might suggest changes in state laws and regulations with respect to education. But more importantly, by considering present constraints, educational decision-makers can affect educational outcomes now, as best they can, until laws and regulations or conditions are changed.

Technical and Allocative Efficiency

As attempts are made to allocate resources to best effect educational outcomes, it is apparent that two types of efficiency are involved. Levin has termed them technical and allocative efficiency. Technical efficiency refers to using the physical resources of a school in such a way that more educational output is achieved than might be done under some alternative utilization. Allocative efficiency refers to using the dollar budget of the school in such a way that more output is achieved than could be attained if a different
combination of physical resources were purchased even when technical efficiency is satisfied.^6

It is at this point that a major distinction occurs between the concept of an educational production function and an industrial production function. The methods that have previously been used to estimate educational production functions employ an average concept; that is, the final function represents what the average effects are for various inputs. On the other hand, an industrial production function is conceptually designed to express the maximum product obtainable from the input combination at the existing state of technical knowledge. Thus the industrial production function represents a frontier of potential attainment for given input combinations. It establishes the most technically efficient combinations of current practices. On the other hand, if it is desired only to estimate how much output on the average could be obtained for a firm in the industry with a certain set of inputs, then the average concept would obviously be the correct one to employ. But this function cannot answer questions regarding technical efficiency. It can resolve the question of allocative efficiency at the existing state of technical knowledge.

^6 Henry M. Levin, "The Effect of Different Levels of Expenditure on Educational Output," p. 182.
The Need for Optimization

A number of input-output analyses have been performed in education and these studies have provided a necessary description of the educational system. However, the policy implications of such studies have been hard to realize since they generally have been devoid of cost considerations. A simple example follows to indicate the necessity for cost analysis in conjunction with the production function.

Suppose two variables $x_1$, $x_2$ are found to have the following relationship to y:

$$y = 4x_1 + x_2.$$

It appears that $x_1$ has a greater effect on y than $x_2$. Hence, researchers viewing this equation would stress the importance of $x_1$ in their conclusions. But suppose further it is known that the prices of $x_1$ and $x_2$ are respectively $P_1 = 5$ and $P_2 = 1$. Then it becomes clear that within a limited budget, more can be accomplished by using $x_2$ rather than $x_1$. Suppose that the budget equation happens to be:

$$5x_1 + x_2 = 20.$$

This would allow us to use at most 4 units of $x_1$ when $x_2 = 0$ or 20 units of $x_2$ when $x_1 = 0$. The maximum value of y would be y = 20 when $x_1 = 0$ and $x_2 = 20$. It is realized that this is a rather naive example, for which the solution is intuitively obvious. However, in a real-life situation,
if one had a production function consisting of 20 or more variables, each having different prices, then it becomes intuitively impossible to know what the best combination of inputs would be. The point is that input-output analyses of schools and educational policy miss the mark if they are done irrespective of price considerations for the various school inputs into the educational process. Conclusions drawn from an examination of a production function may be severely revised as a result of the joint consideration of budget constraints. What is needed, then, is a systematic procedure for determining the best combination of inputs when dealing with large numbers of variables.

Conceptual Considerations in Retrospect

In review, the process of education has been viewed in an input-output framework. It was found that in addition to school inputs, community and student inputs also affect the magnitude of educational outputs. Also, recognition must be made of various constraints that are operating at each level of the process. The most pervasive of these is the budgetary constraint. Finally, given the conceptual framework of the educational process, the task of decision-makers is to optimize educational outcomes subject to various constraints which explicitly and implicitly operate on the system.
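The two-input example given earlier (y = 4x_1 + x_2 with prices 5 and 1 and a budget of 20) can be checked mechanically. The following is a minimal sketch only, under assumptions that go beyond the text: the production function is strictly linear, the single budget equation is the only constraint, and input quantities cannot be negative. Under those assumptions the best plan spends the entire budget on the input with the largest effect per dollar. The function name `best_allocation` is hypothetical and is not a procedure taken from this study.

```python
def best_allocation(effects, prices, budget):
    """Maximize sum(effects[i] * x[i]) subject to sum(prices[i] * x[i]) == budget
    and x[i] >= 0, for a purely linear production function (an assumption).

    With one linear budget constraint and no other limits, the optimum
    puts the whole budget on the input with the highest effect per dollar.
    """
    # Effect on output obtained per dollar spent on each input.
    per_dollar = [b / p for b, p in zip(effects, prices)]
    best = max(range(len(per_dollar)), key=lambda i: per_dollar[i])

    quantities = [0.0] * len(effects)
    quantities[best] = budget / prices[best]
    output = sum(b * x for b, x in zip(effects, quantities))
    return quantities, output


# The example from the text: y = 4*x1 + x2, prices 5 and 1, budget 20.
quantities, output = best_allocation([4.0, 1.0], [5.0, 1.0], 20.0)

# For comparison: spending the whole budget on x1 instead.
output_all_x1 = 4.0 * (20.0 / 5.0)
```

Here `quantities` places all 20 budget units on x_2 even though x_1 has the larger coefficient. Once additional constraints (minimum standards, legal limits) or nonlinear terms enter, this shortcut no longer applies and a general optimization routine would be required.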
Mathematical Considerations

In the recent past there has occurred an emerging interest in applying to education the methods of economic analysis. These methods have been variously called operations analysis, systems analysis, input-output analysis, cost-effectiveness analysis, and cost-benefits analysis. They are viewed as a potential means of increasing the efficiency of educational planning and decision-making. These types of analyses can be utilized in education to deal with such questions as how much of an increase in reading achievement of sixth-grade children is likely to result from any given reduction in the size of sixth-grade classes, all other things being equal. Essentially, these types of analyses allow the planner to view an educational system as a set of input-output or production relationships which can be controlled in a way that will optimize the use of scarce educational resources. The operational form of such analyses usually involves a mathematical model which describes in a quantitative sense the structural characteristics of the phenomena under investigation.

In developing a mathematical model, in the present case, it is necessary to consider (1) ordinary least squares versus two-stage least squares, (2) single versus multiple outputs, (3) technical and allocative efficiency, (4) prices of the
school inputs, (5) statistical difficulties in using multiple regression, and (6) strategies for optimization.

Ordinary Least Squares Versus Two-Stage Least Squares

Previously, the educational process was viewed as composed of multiple outputs. This was reflected in the equation

$$A_{it} = g\left(C_i^{(t)}, S_i^{(t)}, P_i^{(t)}, I_i\right)$$

where $A_{it}$ is a vector of educational outcomes. Two ways have been identified for working with multiple outputs. One leads to a system of equations (production functions) to be solved simultaneously. The other consists of constructing an index of total educational output so that the educational process may be represented in a single equation. A major difficulty with the latter approach is that some outputs are likely to be causally related to other outputs. Thus, an increase in positive attitudes is likely to lead to an increase in academic achievement which in turn will cause better attitudes and so on. The use of ordinary least squares to estimate the parameters of a single equation with this type of feedback existing among the outputs will probably lead to biased estimates. When the above situation exists a better approach is to employ
simultaneous equations. The parameters of these equations can be estimated using two-stage least squares.^7

Single Versus Multiple Outputs

If one adopts the simultaneous equations approach, then each equation estimates a single output. Difficulties arise when one attempts to allocate resources so as to maximize all outputs. Most likely, not all outputs can be weighted equally; that is, one may not be willing to assign the same value or utility to each of the outputs. Thus, one is required to subjectively determine the weights that describe the utility of each output. Hence one seeks to maximize a utility function of the various outputs. It is at this point that trouble is likely to occur. Gilbert and Mosteller stated, "... setting many desirable maxima makes it hard to know what programs might possibly lead to achievement of these goals. A well-known mathematical theorem states that one cannot ordinarily maximize two or more variables simultaneously."^8

^7 Henry M. Levin, "A New Model of School Effectiveness," in Do Teachers Make a Difference? (Washington, D.C.: U.S. Government Printing Office, 1970).

^8 John P. Gilbert and Frederick Mosteller, "The Urgent Need for Experimentation," On Equality of Educational Opportunity, p. 382.

Consequently, it would
be vain to hope to maximize the outcomes of each student in several different areas. In the utility function, only a few of the outputs would be maximized, at the expense of other outputs.

The above seems to suggest that analysis might be more fruitful if it were restricted to a single output, which is not a composite of other outputs. That this should be so seems to follow for additional reasons. McNamara stated:

"Mathematical applications to management should stay clear of large general models and concentrate on specific problem areas. The mistake that educators have made for years involves trying to solve huge problems that do not have immediate or feasible solutions. If the general area of operations analysis is to persevere in educational management, then the appropriate approach seems to be the consistent application of these techniques to small problems."^9

^9 James F. McNamara, "Mathematical Programming Models in Educational Planning," Review of Educational Research, XLI (December, 1971), 440.

Such an approach recognizes that resources should be allocated to programs. As an example, given a reading program, interest centers on how resources might be allocated to maximize reading achievement. No doubt, if the focus is on reading achievement only, analysis is likely to indicate a distribution of resources that maximizes reading achievement at the expense of other
outputs. This need not be, however. As the various constraints operating on the reading program are considered, constraints can also be introduced that establish minimum standards for other outputs. For example, in the reading program, analysis may suggest that the ratio of students to counselors be increased so that more resources are available for increasing the training of classroom teachers. Such a decision would likely have a negative influence on certain affective outcomes. Thus, in the analysis, a constraint can be introduced that maintains a minimum standard for the student-counselor ratio. This standard would be determined either as a subjective judgment or as a result of previous analyses in which the appropriate affective outcomes were the outputs under consideration.

Technical and Allocative Efficiency

In deciding upon the methodology to employ in estimating the production function, distinctions must be made between technical efficiency and allocative efficiency. It will be recalled that the industrial production function expresses the maximum product obtainable from the input combination at the existing state of technical knowledge. The process of using ordinary least squares to estimate the production function, on the other hand, expresses the
average product obtainable from a firm in the industry.

Educational production functions have generally been estimated using ordinary least squares analysis. The reason for this is that given the lack of a comprehensive theory of learning, it is not known if all the relevant inputs have been identified. Likewise, it has not been possible to provide adequate data for some of the inputs presently identified. Given this state of affairs, contentment must rest with the way things are rather than the way things should be. In this context, the average concept is the correct one to apply since it can answer questions of allocative efficiency at the existing state of technical knowledge.

For purposes of optimization, then, ordinary least squares should be used for estimating the production function. If the output being considered is perceived to have feedback effect on certain independent variables in the equation, then a system of simultaneous equations must be solved in which affected independent variables become the dependent variables in other equations. This system is solved using two-stage least squares. The process of using ordinary least squares or two-stage least squares to estimate the parameters of one or more equations is more generally termed multiple regression analysis.
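The contrast between ordinary and two-stage least squares under feedback can be illustrated with a small numerical sketch. It is offered under stated assumptions and is not the estimation procedure of this study: there is one output y, one explanatory variable x that shares a disturbance u with y, and one hypothetical exogenous variable z that influences x but is unrelated to u. All data values are made up, and the function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def simple_regression(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """Estimate y = a + b*x when x is correlated with the disturbance in y.

    z is a hypothetical exogenous variable assumed to affect x but to be
    unrelated to the disturbance in the y equation.
    """
    # Stage 1: regress x on z and keep only the part of x explained by z.
    a1, b1 = simple_regression(z, x)
    x_fitted = [a1 + b1 * zi for zi in z]
    # Stage 2: regress y on the fitted values of x.
    return simple_regression(x_fitted, y)


# Made-up data. The disturbance u enters both x and y, so x and the
# error in the y equation are correlated; the true slope of y on x is 2.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

ols_intercept, ols_slope = simple_regression(x, y)
tsls_intercept, tsls_slope = two_stage_least_squares(z, x, y)
```

In this constructed case the ordinary least squares slope overstates the true value of 2, while the two-stage estimate recovers it, because u was chosen to be exactly uncorrelated with z in the sample. With real data the recovery is only approximate and depends on how well the exogenous variable satisfies the stated assumption.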
It should be mentioned, at this point, that multiple regression analysis can provide clues regarding the technical efficiency of the units observed. Each unit in the analysis is composed of a particular combination of input values and an output value. Multiple regression analysis estimates what the average output is with this combination of input factors. Figure 2 illustrates this process. Estimated output is plotted against the actual output observed. If the units of data represent the mean scores of schools on some educational outcome, then it can be argued that in a group of schools working with equivalent inputs (community, student, school), these being indicated by the estimated output, every school in the group should be able to do what the most productive schools in that group have done. This allows questions of technical efficiency to be answered. By observing the high productive and low productive schools in a given group, a whole series of questions can be asked about the characteristics of high productive schools that make them different from the low productive schools: questions about school organization, courses offered, methods of instruction, teacher attitudes and behavior, special services, community involvement, quality of facilities, level of financial support, and the
like. Results of such analyses would help to identify new factors that are related to output. Incorporation of these new factors into the production function advances the level of technical knowledge.

FIGURE 2. METHOD FOR INVESTIGATING TECHNICAL EFFICIENCY (axis label: based on inputs and control variables)

Determining Prices of the School Inputs

Thus far, it has been suggested that multiple regression analysis be employed for estimating the production function and that consideration be given to maximizing single outputs only. Also, as was seen earlier, one of the constraints to be considered is the budgetary constraint. Having identified the relevant school inputs in the production function, their prices must be estimated. In linear form, the budget constraint can be expressed as

$$B = \sum_{i=1}^{n} P_i x_i \qquad (i = 1, 2, \ldots, n)$$
where
B represents the total budget,
$x_i$ = the ith school input in the production function,
$P_i$ = price of the ith input per unit.

The costs of some variables are obvious and relatively easy to obtain, such as teachers' salaries and instructional expenditures per unit. For other variables, the costs appear to be intractable using traditional accounting methods. However, Cohn, Katzman, Levin and Riew,^10 by using least squares techniques, were able to obtain some idea of how changing a unit of a nonpriced variable can affect total cost. Cohn determined the relationship of total per-school costs to such variables as
$Z_1$ = average number of college semester hours per teaching assignment,
$Z_2$ = average number of different subject matter assignments per high school teacher,
$Z_3$ = average class size,
$Z_4$ = number of units of subjects offered.

^10 Elchanan Cohn, "Economies of Scale in Iowa High School Operation," Journal of Human Resources, III (Fall, 1968); Martin T. Katzman, "Distribution and Production in a Big City Elementary School System," Yale Economic Essays, VIII (Spring, 1968); Henry M. Levin, "A Cost-Effectiveness Analysis of Teacher Selection," Journal of Human Resources, V (Winter, 1970); and John Riew, "Economies of Scale in High School Operation," Review of Economics and Statistics, XLVIII (August, 1966).
Thus it appears that costs of the various school variables can be determined with the aid of least squares analysis.

Statistical Considerations in Using Multiple Regression

Both the production function and the budget equation are to be estimated using multiple regression techniques. It is necessary, then, that a critical discussion of all the statistical issues relevant to multiple regression analysis be considered. Thus far, the terms multiple regression and least squares analysis have been used. The problem of predicting or estimating the value of a dependent variable given various independent variables is correctly termed multiple regression. Ordinary least squares is one technique for determining the regression equation. Since ordinary least squares will generally be employed in the estimation process, the two terms will be used interchangeably, unless otherwise noted.

Ideally, the process of estimating a regression equation should be based on a mathematical model including all of the factors believed to have an influence on the dependent variable. In the linear case, a typical regression equation has the form:

$$y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n + e$$
where
$\hat{y}$ = estimated y (dependent variable),
$x_i$ = independent variable,
$b_i$ = regression coefficient (i = 1, 2, ..., n),
$b_0$ = constant term (for scaling purposes),
e = error term or residual ($e = y - \hat{y}$).

It is important that sufficient variation be observed in all of the independent variables so that valid estimates of the regression coefficients may be made. Otherwise, existing differences among schools may not be great enough to reveal the importance of school resources to a given output. Technically, one is not allowed to extrapolate the results of a regression analysis beyond the range of observed data. At the same time, the effects of certain resources may not be apparent in the range of existing variation. Reducing the pupil/teacher ratio, for example, may make no difference until instruction can be really individualized, which might require a pupil/teacher ratio of less than 10:1. Also, suppose that output is functionally related to both community and school inputs. Further, suppose that school inputs vary only slightly in a particular sample and that home inputs vary widely. Statistically, most of the variance in output would be attributable to community factors. Thus, it becomes imperative that in any sample used to estimate the production function, the population extremes be well represented.
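The linear regression equation described above can be fitted by ordinary least squares through the normal equations. The sketch below is illustrative only: it uses plain Python rather than a statistical package, the data are made up (a hypothetical pupil/teacher ratio and a hypothetical count of special services), and the function names are assumptions introduced here rather than part of the study.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: inputs may be perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bn] minimizing the sum of squared residuals."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the constant b0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Estimated y for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Made-up observations: (pupil/teacher ratio, number of special services).
rows = [(20.0, 1.0), (25.0, 3.0), (30.0, 2.0), (35.0, 5.0), (40.0, 4.0)]
# Output generated exactly from y = 50 - 0.5*x1 + 2*x2, so residuals vanish.
y = [50.0 - 0.5 * x1 + 2.0 * x2 for x1, x2 in rows]

coeffs = fit_ols(rows, y)
residuals = [yi - predict(coeffs, r) for r, yi in zip(rows, y)]
```

Because the illustrative output was generated without error, the fitted coefficients reproduce the generating values and every residual is zero up to rounding. With observed school data the residuals would not vanish, and the caution above about the range of observed variation applies to any use of `predict` outside that range.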
Interpretation of regression coefficients

The regression coefficient has the following interpretation: $b_i$ represents the amount of change in y that can be associated with a unit change in $x_i$, with the remaining independent variables held constant. Since in a multiple regression analysis the remaining variables are controlled, $b_i$ is termed a partial regression coefficient. An interesting property of the partial regression coefficient is that regardless of whether $x_i$ is influenced by or influences other independent variables in the equation, the same regression coefficient for $x_i$ will be obtained.^11 The size of the regression coefficient depends on the unique contribution of the independent variable to the prediction of the dependent variable plus an apportioned amount of the influence it shares with other variables in and out of the equation.

^11 Robert L. Linn and Charles E. Werts, "Assumptions in Making Causal Inferences from Part Correlation, Partial Correlations, and Partial Regression Coefficients," Psychological Bulletin, LXXII (1970), 309.

Multiple regression analysis enables one to find the relationship between a given input factor and an output, with all other factors controlled. More precisely, this means that the partial regression coefficient is a weighted average of regression coefficients obtained by fixing
control variables at all possible values in the range observed and computing separate regression coefficients. Any indirect effects that the control variables may be having on y acting through $x_i$ are isolated. Thus, it is important that variables entering the analysis are distinct, that is, they are not measuring the same factor and hence are not identical. Otherwise, controlling for them becomes tantamount to partialling the relationship out of itself. This is known as the partialling fallacy.^12 The use of stepwise regression avoids the inclusion of variables that are not distinct. Controls on distinct independent variables which are unnecessary will not produce misleading results as long as we do not attempt to control on a dependent variable.

The size of a regression coefficient may still be influenced by variables that have not explicitly been brought into the analysis. This is discussed in more detail in the section on causality and multiple regression.

^12 Robert A. Gordon, "Issues in Multiple Regression," American Journal of Sociology, LXXIII (March, 1968), 592.

Regression models and goodness of fit

By correlating the estimated y's with the actual y's, a measure of the goodness of fit of the multiple regression
equation to the data is obtained. Such a correlation is termed multiple correlation (R). $R^2_{y \cdot x_1, \ldots, x_n}$ gives the proportion of variability in the dependent variable explained by the regression relationship. If A is a set of original variables and B is a set of added variables, any increment to an $R^2_{y \cdot A}$ due to the addition of B can be tested for significance by the F ratio

$$F = \frac{\left(R^2_{y \cdot A,B} - R^2_{y \cdot A}\right) / b}{\left(1 - R^2_{y \cdot A,B}\right) / (n - a - b - 1)}, \qquad df = b \text{ and } (n - a - b - 1).$$

$R^2_{y \cdot A,B}$ is the incremental $R^2$ based on a + b independent variables, where
a = number of original variables,
b = number of added variables.

When R is based on sample data, there is a tendency for it to be systematically biased upwards. This happens because the least squares process capitalizes on any sampling error that overestimates zero-order correlation coefficients. This is known as the problem of shrinkage: the tendency for R to decrease as the sample grows larger. A correction formula is given by
\[ \hat{R}^2 = 1 - (1 - R^2)\,\frac{N - 1}{N - k - 1} \]

where

$\hat{R}$ = unbiased estimator of population R
R = multiple correlation found in a sample of size N
k = number of independent variables.

Of course, if one is dealing with population data, shrinkage is not a problem.

Multicollinearity

Past studies that have estimated educational production functions have generally exhibited evidence of high degrees of interrelationship among the independent variables. This is known as the problem of multicollinearity. It affects the regression analysis by increasing the standard errors of the regression coefficients for the collinear variables; that is, the estimated regression coefficients exhibit more variance and in a sense are unstable. Small sampling variations among the correlations of a highly related set of independent variables can create large variations among their regression coefficients. Still, the coefficient values themselves are not biased. Attempts to avoid the effects of multicollinearity have involved one or more of the following: (1) introducing extraneous information, (2) stratifying, and (3) providing for simultaneous relationships. Another alternative would be to study an
entire population rather than a sample. As an example, if one were concerned with educational policy for a given state, it would be almost as easy to conduct an analysis of data for all the districts in the state, assuming that appropriate data collection techniques were carried out by the state department of education. Since in a sample regression equation the regression coefficients are unbiased estimators, it would follow that in the population regression equation the regression coefficients would represent the true effects for the variables present.

The need for nonlinear models

Most applications of multiple regression employ the linear additive model. However, for purposes of estimating production functions, the following limitations of such a model are apparent.

1. The model does not allow for economies of scale.
2. It does not allow for interactions between inputs.
3. It implies that the marginal effect of a given input is the same regardless of the level of usage.

What the above suggests is that, in estimating educational production functions, nonlinearities in the observed data must most likely be considered. The provision of any given power u of $x_1$, that is $x_1^u$, allows for u - 1 bends in the
regression curve of y on $x_1$. In most research in the behavioral sciences, provision for more than one or two bends will rarely be necessary. A simple parabola may often give a reasonably good fit to the data, especially when it is realized that the curve may be quite flat and the data need not extend far enough to complete the bend (i.e., the parabola may be a reasonable fit within the limits of variation given in the problem). To obtain a measure of the goodness of fit to the parabola, the multiple R between y and $x_1$, $x_1^2$ is used. The difference between $R^2$ and $r^2_{y x_1}$ (assuming linearity) will give a measure of the degree to which ability to predict has been improved.

Another type of nonlinearity is often encountered in research studies, namely, the concept of interaction. The importance of this concept is revealed by Mosteller and Moynihan in their reanalysis of the Coleman study (EEOR). It is recalled that Coleman concluded that the relative effects of school resources are small compared to community and student inputs. Mosteller and Moynihan stated:

    To the simple of mind or heart, such findings might be interpreted to mean that 'schools don't make any difference'. This is absurd. Schools make a very great difference to children. Children don't think up algebra on their own. . . . But given that schools have reached their present levels of quality, the observed variation in schools was reported by
    EEOR to have little effect upon school achievement. This actually means [that of] a large joint effect owing to both schools and home background (including region, degree of urbanization, SES, ethnic group) little is unique to school, or homes. They vary together.¹³

Cohen stated that this joint effect is carried by a variable z of the form $z = x_i x_j$.¹⁴ This is known as the first-order interaction effect.

Interaction terms that measure the joint contribution to explained variance of two or more variables can be very important. If two variables are highly intercorrelated, the first entered into the regression analysis will be assigned both its unique contribution to the explained variance and its jointly explained variance with all other variables (the interaction terms). This is precisely what happened in the Coleman report.

¹³ Mosteller and Moynihan, p. 21.
¹⁴ Jacob Cohen, "Multiple Regression as a General Data-Analytic System," Psychological Bulletin, LXX (1968), 436.

If one were to employ an interaction term in a simple three-variable model, the equation would take the following form:

\[ x_3 = a_{13} x_1 + a_{23} x_2 + a_{123} x_1 x_2. \]

How does one interpret the coefficient $a_{123}$? The partial derivative of $x_3$ with respect to $x_1$ expresses the slope of the regression surface with other independent variables held constant (i.e., amount of change in $x_3$ associated
with a unit change in $x_1$). Thus, in the above equation:

\[ \frac{\partial x_3}{\partial x_1} = a_{13} + a_{123} x_2 \]

and the amount of change expected in $x_3$ is dependent upon the level of $x_2$. In an educational production function this could mean that the effect on performance of increasing the percentage of students from white-collar homes is not independent of the percentage of teachers in a school who are experienced and who have a master's degree. Consider

\[ \frac{\partial^2 x_3}{\partial x_1 \, \partial x_2} = a_{123}; \]

this expresses the rate of change of $\partial x_3 / \partial x_1$ when $x_2$ changes. In other words, this is the effect of $x_2$ on the effect of $x_1$ on $x_3$.

Thus the concept of interaction consists essentially of a residual category of all types of effects that are nonadditive. It always refers to the joint effects of two or more independent variables on some particular dependent variable. Since nonadditive relationships constitute a residual category, it would be rather surprising if interaction effects could always be interpreted in a simple manner. Nevertheless, an interaction model would enable us to determine whether variables are complements (that is, act together to improve performance) or substitutes (namely,
one variable has a greater effect in the absence of some other variable than it has in its presence). The implication for educational decision-making is that it would allow the use of input-background interaction in conjunction with the demographic composition of a student body to achieve desired performance levels.

Reliability of measurement

In conducting multiple regression analyses one must always be concerned with the reliability of measuring instruments. If an output is unreliably measured and if the errors are random, the increase in the standard deviation of output offsets the decrease in the correlation between output and input, leaving the regression coefficient unbiased. Given

\[ y = r_{yx} \left( \frac{S_y}{S_x} \right) x + e, \]

the above is apparent. Though the regression coefficient is unbiased, an increase in the standard error of estimate and a corresponding decrease in the multiple R can be expected. There are formulas that allow one to correct the attenuation in correlations.

If, on the other hand, random errors are made in measuring x, $r_{yx}$ will be underestimated and $S_x$ will be overestimated, leading to biased results. This type
of problem can also be corrected, but the work involved is tedious.¹⁵

Quite frequently the level of data treatment will call for aggregated data. If the unit of analysis is at the district level, then aggregated or mean measures on the variables would be observed. Shaycroft has given a formula for estimating the reliability of group means:

\[ r_{\bar{a}\bar{a}} = 1 - \frac{(1 - r_{aa})\, S_a^2}{n\, S_{\bar{a}}^2} \]

where

$r_{\bar{a}\bar{a}}$ = reliability of group means
$r_{aa}$ = reliability of individual responses
$S_a^2$ = variance of individual responses
$S_{\bar{a}}^2$ = variance of group means
n = number of individuals in a group.¹⁶

Suppose $r_{aa} = .50$ (reliability of some psychological tests), $S_a^2 : S_{\bar{a}}^2 = 10 : 1$, and n = 100; then $r_{\bar{a}\bar{a}} = .95$.

¹⁵ Albert Madansky, "The Fitting of Straight Lines when Both Variables are Subject to Error," Journal of the American Statistical Association, LIV (1959), 173-205.
¹⁶ Marion Shaycroft, "The Statistical Characteristics of School Means," Studies of the American High School, ed. by John C. Flanagan et al., Cooperative Research Project 176, Project Talent, University of Pittsburgh, 1962.
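The worked example can be checked numerically. The Python sketch below assumes that the formula for the reliability of group means has the form 1 - (1 - r_aa)(S_a^2 / S_abar^2) / n, a reading that reproduces the value of .95 quoted in the text. The function and argument names are illustrative and do not come from the source.

```python
def group_mean_reliability(r_individual, variance_ratio, n):
    """Reliability of group means under the assumed form of the formula.

    r_individual:   reliability of individual responses (r_aa)
    variance_ratio: variance of individual responses divided by the
                    variance of group means (S_a^2 / S_abar^2)
    n:              number of individuals in a group
    """
    return 1.0 - (1.0 - r_individual) * variance_ratio / n


# Values from the example in the text: r_aa = .50, ratio 10 : 1, n = 100.
example = group_mean_reliability(0.50, 10.0, 100)
print(round(example, 2))
```

Because the unreliable share (1 - r_aa) is divided by n, the result moves toward 1 as the group size grows, which is the point the example is making.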
Actually, for most purposes, it can be seen that when n is sufficiently large (100 or more), $r_{\bar{a}\bar{a}}$ is close to 1. Consequently, when dealing with aggregated data, reliability is not seen to be a problem.

Use of gain scores

Many authorities have expressed a preference for measures of output that are "value added." If Y is a measure of output, it would be preferable to consider Y as a measure of the net gain achieved within a specified period of time. However, the reliabilities associated with gain scores are typically lower than those of either of the scores used to obtain the gain score. McAfee studied the reliabilities of gain scores using Monte Carlo simulation.¹⁷ It was found that the average reliability of gain scores with a pretest reliability of 0.90 is only 0.16. Thus, at the level of individual data treatment, the use of gain scores cannot be justified. Once again, however, if aggregated data are used, the problem of reliability is nearly resolved. Using the above example of Shaycroft's formula and replacing $r_{aa} = 0.50$ with $r_{aa} = 0.16$ gives $r_{\bar{a}\bar{a}} = 0.92$.

¹⁷ Jackson K. McAfee, "Problems in Measuring Change," unpublished paper, University of Florida, 1972.
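Two of the claims in this subsection lend themselves to a short numerical sketch. The first function uses the standard classical-test-theory expression for the reliability of a difference score when pretest and posttest have equal variances. This expression is not taken from the source, which relied on Monte Carlo simulation, and the pretest-posttest correlation of 0.80 is a hypothetical value. The second function assumes the form of the group-mean formula that reproduces the figures quoted in the text.

```python
def gain_score_reliability(r_pre, r_post, r_pre_post):
    """Classical reliability of a gain (difference) score, equal variances assumed."""
    return ((r_pre + r_post) / 2.0 - r_pre_post) / (1.0 - r_pre_post)


def group_mean_reliability(r_individual, variance_ratio, n):
    """Assumed form: 1 - (1 - r_aa) * (S_a^2 / S_abar^2) / n."""
    return 1.0 - (1.0 - r_individual) * variance_ratio / n


# Hypothetical case: both tests have reliability 0.90 and correlate 0.80.
individual_gain = gain_score_reliability(0.90, 0.90, 0.80)
print(round(individual_gain, 2))

# Figures from the text: gain-score reliability 0.16, variance ratio 10 : 1, n = 100.
aggregated = group_mean_reliability(0.16, 10.0, 100)
print(round(aggregated, 2))
```

Under these assumptions the individual gain score is far less reliable than either test, while the aggregated figure rounds to the 0.92 reported above.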
There remains one other difficulty with the use of gain scores. In using gain models, the school effects are probably understated, since prior performance levels, which are due both to school and environment, are summarily subtracted out. Psychologists note that changes in a child's self-concept are usually the result of similar experiences repeated over a long period of time. Thus, increases in a child's achievement in a given year may have been the result of prior school influences operating over a period of years. It may be best to deal with levels of outcome only as an expression of the cumulative influences of schooling to that point in time.

Population data vs. sample data

The possibility of using either population or sample data has been considered. Sample data are often used with the goal of generalizing results to the population sampled. The use of inferential statistics in multiple regression requires a set of restrictive assumptions.

1. The variance must be homogeneous; that is, the variance of y values about the regression surface must be the same at all combinations of x values.
2. The errors ($e = y - \hat{y}$) must be independent of each other.
3. The x values must be measured with essentially no error.
4. The distribution of y values must be normal.

The use of population data requires only that assumption (3) be satisfied.

Causality and multiple regression

In assessing the results of a multiple regression, distinctions must be made between associated effects and causal effects of the independent variables on the dependent variable. It is one thing to say that an increase in y of b units is associated with an increase of one unit in $x_1$, and another to say that an increase of b units in y is caused by an increase of one unit in $x_1$. To make a statement of causality requires further assumptions regarding the specification of variables to be included in the model. In particular, if the regression model is used for simulation purposes, the following must hold true if regression coefficients are to be interpreted causally.

1. All variables which might affect the dependent variable are either included in the regression equation or are uncorrelated with the variables which are included. The absence of relevant influences will, in general, lead
to 'specification error' (i.e., bias in the calculated regression coefficients due to incorrect specification of the structural model). This also implies that the regression residuals (e) are uncorrelated with any of the $x_i$. The error term represents all those unmeasured implicit factors which influence output, but are assumed or known to be uncorrelated with any of the $x_i$ in question.
2. All $x_i$ are measured without error. Random measurement error in output should not bias the regression coefficients.
3. Terms are included in the regression to handle any curvilinear or interactive effects.
4. The dependent variable has no effect on any of the independent variables. When it does have such an effect, one must solve a system of simultaneous equations to obtain unbiased estimates of the regression coefficients. One must also use two-stage least squares instead of ordinary least squares in the estimation process.

Given that most estimates of educational production functions have been based on cross-sectional data in a survey-type analysis, it would be improper to attribute causality to the regression coefficients, since their sizes are so sensitive to the proper specification of the equations. If relevant influences have not been brought explicitly into
the equation, this can lead to overestimation of the relationships between output and input variables in the equation. Also, the independent variables used may be proxies for the causative variables. Thus, an increase in teacher salary may not be accompanied by a corresponding increase in all the other attributes for which teacher salary is serving as a proxy.

It can be said that the regression coefficients are unique regardless of whatever causal interpretation is given to the correlations between independent variables that are in the equation. Suppose two independent variables are correlated. Causally, this correlation might correspond to a direct effect of $x_1$ on $x_2$ or of $x_2$ on $x_1$, or to an indirect effect of $x_1$ on $x_2$ through $x_3$, or to a simultaneous effect of $x_3$ on both $x_1$ and $x_2$. But the regression coefficient $b_1$, the effect of $x_1$ on y, will be the same whether $x_1$ causes $x_2$ or $x_2$ causes $x_1$.

Where causal inferences are desired, one should select variables that are distinct and avoid the use of indices that are thought to be measures of the "same" underlying variable. Generally, factor analysis has been employed to resolve redundancies among the indicators of a given factor. Blalock stated that factor analysis techniques and the interpretations given to the factors extracted ordinarily presuppose certain limited kinds of causal
models.¹⁸ In particular, it is assumed that there are no direct causal links among the indicator variables and that these measured variables are caused by the underlying variables, making the intercorrelations among the indicators completely spurious. Factor analysis may therefore not be appropriate for census data or other types of analyses in which some of the measured variables are caused by other measured variables. But given the invariance of the regression coefficient to causal directions among measured variables, stepwise regression can prevent the inclusion of redundant information and thus select the indicator that is optimally related to the dependent variable.

¹⁸ Hubert M. Blalock, Causal Inferences in Nonexperimental Research (Chapel Hill: University of North Carolina Press, 1961), p. 168.

Experimental design vs. survey design

It is doubtful that in estimating a production function all of the relevant influences on output can be identified or, if identified, can be properly measured. In other words, it is doubtful that all the relevant variables can be controlled. An alternative to rigid controls lies in the experimental tradition of randomization. Randomization does not control for implicit variables. Rather, it transforms the causal scheme into the form
[Diagram of the causal scheme: x acting on y, with v and z also acting on y]

where v and z are the implicit influences. They continue to operate on y but do not systematically affect the estimation of the relationship between x and y as measured by the regression coefficient. They do increase sampling error.

According to Blalock, of the implicit factors that are confounded with variables $x_i$ in their influence on y, there exist two types:

1. Forcings: those variables that are impinging from the outside environment;
2. Properties: those variables that are conceived to be properties of the system at the time of observation.¹⁹

¹⁹ Ibid., p. 22.

Randomization reduces the effects of property variables; however, it does not handle forcing variables. This parallels the Campbell-Stanley tradition regarding sources of external invalidity. In school effects studies the interactions of student selection and school variables, testing and school variables, and various reactive arrangements cannot be handled with randomization. Thus
randomization allows simplifying assumptions to be made about a large class of 'property variables' that can be made to operate independently of the causal variables under study. But other 'forcing variables' cannot so readily be ruled out. Even in experimental designs, simplifying assumptions must always be made if causal models are to be evaluated. In short, causal laws can never actually be demonstrated empirically. This is true even where experimentation is possible.

Thus, even though experimentation has its limitations, it is certainly a far more effective device for eliminating alternative plausible causal schemes than is the survey study. However, in the carrying out of large-scale studies for the estimation of educational production functions, recourse to experimentation is not likely to be feasible. This is the current state of affairs in educational research. If there is any consolation, it lies in a statement by Gilbert and Mosteller:

    Our own experiences with observational studies . . . has been that the controlled study usually does not contradict the observational study, but it does clarify matters and make them firmer.²⁰

²⁰ Gilbert and Mosteller, p. 373.
Strategies for Optimization

Given that an educational production function has been estimated using multiple regression analysis, the regression coefficient assesses the effect of an independent variable on output with the proviso that other variables in the equation remain unchanged. Unfortunately, the condition that other things remain unchanged cannot be met in the real world. A policy of raising teachers' salaries would probably change the monies spent on other educational resources, resources that might be necessary to the success of the student in the school. What is needed is a process that assesses the effects of variables acting in all their simultaneity, a process that specifically notes the consequences on other resources when a given resource is changed.

Essentially, interest centers on a situation in which the use and level of resources are constrained by various factors. In particular, the budget equation serves as a primary constraint in the educational process. If, for a given budget, all dollars are already charted for consumption, then an increase in any resource must necessarily mean decreases in others. At the same time, subject to various constraints, decision-makers are attempting to optimize the allocation of resources so as to maximize educational outcomes.
Optimization is the process of finding a best solution among several feasible alternatives. What is needed, then, is a systematic procedure for determining the best combination of inputs subject to certain constraints. Such a procedure is mathematical programming. If a model of the educational process describes a linear system such as

\[ y = \sum_i b_i x_i \]

subject to the following constraints

\[ \sum_j a_{ij} x_j \le c_i, \]

then the techniques of linear programming can determine optimal solutions. The $x_i$ are the variables that can be manipulated to achieve the desired objective. $y = \sum_i b_i x_i$ is called the objective function, since the objective is to optimize y. Most frequently, nonnegativity restrictions will be placed on the $x_i$ (i.e., $x_i \ge 0$ for all i). The constants ($a_{ij}$, $b_i$, $c_i$) are termed parameters; they are factors that affect the objective function but cannot be manipulated as the variables can.

Constraints not only provide regions of acceptable values of the variables but also provide a mechanism for relating the mathematical expression to real-world conditions. The important contribution of this approach is that it realistically views the process of education as a system constrained by political, social, and economic considerations.
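A linear system of this kind can be illustrated with a very small example. The Python sketch below is not the simplex method used by linear programming software; it relies on the fact that a bounded linear program attains its optimum at a corner of the feasible region, and it simply inspects every corner of a two-variable problem. The weights, constraint coefficients, and limits are hypothetical values chosen for illustration.

```python
from itertools import combinations


def intersect(row_a, rhs_a, row_b, rhs_b):
    """Point where two constraint boundaries cross, or None if they are parallel."""
    det = row_a[0] * row_b[1] - row_a[1] * row_b[0]
    if abs(det) < 1e-12:
        return None
    x1 = (rhs_a * row_b[1] - row_a[1] * rhs_b) / det
    x2 = (row_a[0] * rhs_b - rhs_a * row_b[0]) / det
    return (x1, x2)


def maximize_two_inputs(b, A, c):
    """Maximize b[0]*x1 + b[1]*x2 subject to A x <= c and x >= 0.

    Works only for two variables and a bounded feasible region: every
    pair of constraint boundaries is intersected and the best feasible
    corner is kept.
    """
    # Nonnegativity restrictions written as -x1 <= 0 and -x2 <= 0.
    rows = list(A) + [(-1.0, 0.0), (0.0, -1.0)]
    rhs = list(c) + [0.0, 0.0]
    best = None
    for i, j in combinations(range(len(rows)), 2):
        point = intersect(rows[i], rhs[i], rows[j], rhs[j])
        if point is None:
            continue
        feasible = all(
            r[0] * point[0] + r[1] * point[1] <= limit + 1e-9
            for r, limit in zip(rows, rhs)
        )
        if not feasible:
            continue
        value = b[0] * point[0] + b[1] * point[1]
        if best is None or value > best[0]:
            best = (value, point)
    return best


# Hypothetical parameters: objective weights b, constraint matrix A, limits c.
b = (3.0, 2.0)
A = [(1.0, 1.0), (1.0, 3.0), (1.0, 0.0)]
c = [4.0, 6.0, 3.0]
best_value, best_point = maximize_two_inputs(b, A, c)
print(best_value, best_point)
```

In this example the best corner lies where the first two constraints meet, so both are binding at the optimum; relaxing either limit would raise the attainable value of y.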
As stated previously, the linear approach simply ignores such phenomena as economies of scale and nonadditive interactions among input variables. Actually, the process of education might better be approached using nonlinear programming, that is, problems in which one or more terms of either the objective function or any one of the constraints is represented by a nonlinear function. Generally, the system can be expressed as:

\[ y = g(x_1, x_2, \ldots, x_n) \]

subject to

\[ f_i(x_1, x_2, \ldots, x_n) \le c_i. \]

Thus, programming is the mathematical method for the analysis and computation of optimum decisions which do not violate the limitations imposed by inequality side-conditions.

Associated with each programming problem is a dual problem. The process of finding the optimal values for the original problem also gives the optimal solution for the dual problem. Without discussing all of the ramifications of this concept, there is one important implication for production function analyses. In the process of maximizing the production function (objective function), a level of output at a minimum cost (budget equation) is attained. This is to say that there is no other way of allocating resources to the inputs such that this level of output
can be attained for a lower cost. Given, then, that cost has been minimized at an optimal level of output, the budget can be incremented by x units and the effect observed on output.

Up to this point, consideration has been given only to maximizing output given a specific budget. But the process of incrementing the budget has broader implications for policy analysis. Specifically, educational decision-makers should approach the financing of educational programs by asking:

1. What are the outcomes that are desired and at what levels?
2. What programs are necessary to achieve the outcomes that are desired?
3. What resources are required to carry out the programs?

Given this approach towards the financing of educational programs and the optimization techniques described above, decision-makers are in a position to estimate the additional resources required to bring about incremental improvements in educational outcomes. As an example, suppose the statewide average on reading achievement for all twelfth-grade students is 11.7 years. Suppose further that a production function has been established between
reading achievement and various inputs into the educational process. Then educational leaders are in a position to determine the minimal cost to bring about any desired level of improvement. If the decision is made, for instance, to increase the reading level to 12.4 years, then an optimal combination of inputs can be determined that will minimize costs. The advantage of this approach is that decision-makers can have some idea of what additional resources allocated to education will buy.

In the process of developing a mathematical model, the idea of integrating the concepts of multivariate statistics and mathematical programming may seem to some to be a novel approach. It is one, however, that was anticipated as early as 1963. At that time Wegner stated:

    Multivariate statistics is concerned with the estimation of parameters and the determination of the structure of relations between variables. Mathematical programming assumes a mathematical model of known structure and is concerned with the determination of optimal policies subject to known structural restrictions. Each topic can supplement the other, and the union of the two disciplines would provide a more powerful tool than either discipline could provide in itself . . . problems requiring a combination of the two disciplines are becoming increasingly common.²¹

²¹ P. Wegner, "Relationship Between Multivariate Statistics and Mathematical Programming," Applied Statistics, XII (November, 1963), 146-150.
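The reading-achievement example can be turned into a toy calculation. The Python sketch below searches whole-unit combinations of two inputs for the cheapest plan that raises the average from 11.7 to 12.4 years. Only those two figures come from the text; the linear production function, the gain per unit, the unit costs, and the observed ranges are all hypothetical, and an exhaustive search stands in for a programming algorithm only because the example is so small.

```python
def cheapest_plan(target_gain, gain_per_unit, cost_per_unit, max_units):
    """Search every whole-unit combination of two inputs for the lowest-cost
    plan whose predicted gain reaches the target.

    Gains are expressed in hundredths of a grade-equivalent year so that
    the arithmetic stays in exact integers.
    """
    best = None
    for x1 in range(max_units[0] + 1):
        for x2 in range(max_units[1] + 1):
            gain = gain_per_unit[0] * x1 + gain_per_unit[1] * x2
            if gain < target_gain:
                continue  # this plan does not reach the desired level
            cost = cost_per_unit[0] * x1 + cost_per_unit[1] * x2
            if best is None or cost < best[0]:
                best = (cost, x1, x2)
    return best


# Raising the average from 11.7 to 12.4 years is a gain of 70 hundredths.
# All coefficients, prices, and ranges below are hypothetical.
plan = cheapest_plan(
    target_gain=70,
    gain_per_unit=(10, 5),
    cost_per_unit=(3, 1),
    max_units=(5, 8),
)
print(plan)  # (cost, units of input 1, units of input 2)
```

The search exhausts the cheaper input up to the top of its observed range before drawing on the more expensive one, which is the kind of trade-off the cost-minimizing formulation is meant to expose.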
An Algorithm for Solving Nonlinear Programming Problems

In estimating the production function and the budget equation, multiple regression analysis was an appropriate methodology. It is now necessary to determine an appropriate methodology to solve a nonlinear programming problem in which the production function is the objective function and the budget equation is one constraint. Other constraints would include the ranges of the variables, since predictions cannot be extrapolated beyond the ranges observed.

Nonlinear programming problems generally fall into two classes: those in which the constraints are linear and those in which they are not. Since the budget equation considers questions of economies of scale, there is every likelihood that the equation will include nonlinear terms.

The most crucial question regarding nonlinear problems deals with the nature of the objective function. If interest centers on maximizing the objective function, then questions of concavity must be answered. A concave function is defined as a function such that a straight line segment joining any two points on the surface lies on or below the surface. Figure 3 illustrates a concave function. Figure 4 illustrates a function which is not concave. The figures also illustrate the basic difference between the two functions. In Figure 4 the
function attains two local maximum values, $M_1$ and $M_2$. This means that within small neighborhoods of $M_1$ and $M_2$, these points are the maximum values. In this case $M_2$ is the global maximum, since it is the largest value f assumes in the region. In Figure 3, on the other hand, only one local maximum is observed. For a concave function, a local maximum is always a global maximum over a given range.

FIGURE 3. AN EXAMPLE OF A CONCAVE FUNCTION f

FIGURE 4. AN EXAMPLE OF A NONCONCAVE FUNCTION f

This illustrates the difficulty of using a nonlinear programming algorithm when the objective function is not concave. The algorithm always converges to a local optimum whether or not it is the global optimum. Thus if the algorithm is started at $x_1$, the solution converges on $M_1$. Starting at $x_2$, it will converge on $M_2$. When dealing with
a nonconcave function, theoretically one would need to start the algorithm at each point in the range to be assured of attaining the global optimum. Practically, by starting the algorithm at various points in the range, the global optimum can be approximately determined correct to a certain number of decimal places. The number of points used determines the accuracy of the estimates. Of course, all of this is avoided if the function is concave, for then any local optimum is a global optimum.

An all-purpose algorithm for solving nonlinear programming problems with nonlinear constraints is the SUMT algorithm. SUMT stands for Sequential Unconstrained Minimization Technique. The technique was perfected by Fiacco and McCormick.²² The general programming problem is to determine a vector x that solves:

\[ \text{minimize } f(x), \quad \text{subject to } g_i(x) \ge 0, \quad i = 1, 2, \ldots, m, \]

where $x = (x_1, x_2, \ldots, x_n)$. (Note that a maximization problem can always be transformed to a minimization problem by multiplying the objective function through by (-1).)

²² Anthony V. Fiacco and Garth P. McCormick, "Computational Algorithm for the Sequential Unconstrained Minimization Technique for Nonlinear Programming," Management Science, X (July, 1964), 601-617.
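The practice of starting a local search from several points can be sketched in a few lines of Python. The objective below is a hypothetical nonconcave function with two local maxima on the range [-1.5, 1.5], and a fixed-step gradient ascent stands in for whatever local algorithm is actually used; neither comes from the source.

```python
def f(x):
    """Hypothetical nonconcave objective with two local maxima on [-1.5, 1.5]."""
    return -(x * x - 1.0) ** 2 + 0.3 * x


def f_prime(x):
    """Derivative of f."""
    return -4.0 * x * (x * x - 1.0) + 0.3


def climb(start, step=0.02, iterations=2000, lower=-1.5, upper=1.5):
    """Fixed-step gradient ascent, kept inside the permitted range."""
    x = start
    for _ in range(iterations):
        x = x + step * f_prime(x)
        x = max(lower, min(upper, x))
    return x


def multistart(starts):
    """Run the local search from each start and keep the best result."""
    candidates = [climb(s) for s in starts]
    return max(candidates, key=f)


left_only = climb(-1.5)  # a single start on the left finds only the lower peak
best = multistart([-1.5, -0.5, 0.5, 1.5])
print(round(left_only, 3), round(best, 3))
```

A single start on the left settles on the smaller of the two peaks, while the set of four starts also reaches the larger one. With more starting points spread over the range, the chance of missing the global maximum falls, at the price of more computation.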
This problem is solved by transforming it into a sequence of unconstrained minimization problems. Define the function

\[ P(x, r_1) = f(x) + r_1 \sum_{i=1}^{m} \frac{1}{g_i(x)} \]

where $r_1$ is a positive constant. Proceed from $x^0$ (starting point) to a point $x(r_1)$ that minimizes $P(x, r_1)$ in the feasible domain (the set of points satisfying the constraints of the problem). Form the new function

\[ P(x, r_2) = f(x) + r_2 \sum_{i=1}^{m} \frac{1}{g_i(x)}, \qquad r_1 > r_2 > 0. \]

Continuing in this way, points $x(r_k)$, $k = 1, 2, 3, \ldots$, are generated that respectively minimize $\{P(x, r_k)\}$, with $r_k \to 0$ as $k \to \infty$. This sequence of P minima converges to an optimum of the original problem; that is, $x(r_k) \to x^*$, where $x^*$ is the optimal vector, and $f(x(r_k)) \to f(x^*)$ as $r_k \to 0$.

The principal technique used to minimize P at step i is to evaluate the gradient of P at $x(r_{i-1})$ and then to descend the gradient a fixed distance. The negative of the gradient gives the direction of steepest descent of the surface P at a given point and thus suggests the optimal direction to
proceed in order to attain the minimum value of the P function. When the original problem is not concave, it is necessary to start at several points to determine an approximation to the global optimum.

Mathematical Considerations in Retrospect

In review, a detailed discussion of mathematical considerations relative to the development of a mathematical model has taken place. As a result of this discussion, several generalizations can be made.

1. Analysis should be confined to a single output. By employing appropriate constraints, minimum standards to be achieved by other outputs can be set.
2. The use of a nonlinear model that incorporates higher-degree and interaction terms should be considered. This is necessary to consider questions of economies of scale and nonadditive effects.
3. The use of gain scores at the level of the individual is not justified. Such scores are simply not reliable. Aggregated gain scores are reliable for sufficiently large n, but their use is dismissed on the grounds that it leads to underestimation of the effects of school variables.
4. The measurement of independent variables must essentially be errorless. Random error in measuring the dependent variables does not bias the regression coefficients, but does increase the standard error of estimate.
5. Where possible, one should use population data rather than sample data. The use of sample data requires additional assumptions regarding the shape of the distribution (i.e., normal) and the homogeneity of variance. Furthermore, if independent variables are interrelated, multicollinearity may affect the stability of the regression coefficients. If population data are used, the stability of regression coefficients is affected only by specification error (assuming no measurement error in the independent variables).
6. For analysis to be meaningful, it is necessary that the variables be constrained by real-world conditions. This implies the use of mathematical programming.
7. The production function should be estimated using cross-sectional or longitudinal data in a survey-type design. While the experimental approach is more effective in eliminating plausible alternative causal schemes, the process of randomization and its concomitant, sampling, may lead to unstable regression coefficients.
8. The experimental approach should be followed in investigating the residuals of production function analysis (i.e., the question of technical efficiency). This leads to the discovery of previously unknown implicit factors that influence output and are correlated with certain inputs. Their inclusion in the regression equation allows us to control for their effects on other inputs, which lessens specification error and improves the ability of the model to predict.

A Mathematical Model

Given a conceptual framework for the process of education, a mathematical model is presented which translates these conceptualizations into statements capable of quantitative analysis. Given the educational production function:

\[ A_{it} = g\bigl(C_i(t), \ldots\bigr) \]

for a given set of community, peer-group, and individual inputs, the above can be transformed to

\[ A_{it} = g\bigl(k, S_i(t)\bigr) \]

where k is a constant. Thus for a given set of school inputs, the problem is to maximize

\[ A_{it} = g\bigl(k, S_i(t)\bigr) \]
subject to

$$\sum f_i(k, S_i^{(t)}) \le B_i .$$

The production function is estimated by using multiple regression analysis, and the entire system is solved by methods of mathematical programming.
CHAPTER III

EMPIRICAL APPLICATION OF THE MODEL

In light of the prevailing emphasis on modeling the educational process, there is an emerging literature that advocates the use of mathematical models as a means of increasing the efficiency of educational planning and decision-making. Too often, however, the intent is simply to focus on the advantages of applying models rather than to provide empirical research that illustrates the unique contributions of such models in generating solutions for real and immediate educational problems. As a result, some rather sophisticated models have appeared in the literature, the mathematics of which have awed even the most interested observers with less than a college degree in mathematics. Also, the developers of such models have failed to provide means by which parameters evident in the model could be estimated. No matter how closely the model may mirror the conceptual framework, any estimation of parameters on other than an objective basis defeats
the primary purpose of model development. Any subjective determination of the values of the parameters only obscures the intended objectivity. Thus the process of estimating the educational production function necessarily requires a statistical dimension for parameter estimation. At the same time, a statistical dimension necessarily restricts the complexity of model design. Thus the choice which a designer must face is between developing highly sophisticated models, for which parameters cannot be objectively determined using current means, and developing less sophisticated models with a statistical dimension. Considering the urgent need to demonstrate the utility of such models, the choice made is the model presented in the previous chapter.

In this chapter, using real data, an empirical illustration of the utility of the model is provided. In the first section the design of the empirical application is described. In the second section the results of the empirical application are discussed.

Design of the Empirical Application

Research Design

The application was a survey study consisting of three phases. In the first phase, multiple regression analysis
was used to estimate the production function and the budget equation. In the second phase, mathematical programming was used to maximize the production function subject to various constraints. The constraints included the budget equation and the bounds of observed variation in the independent variables as well as other constraints deemed appropriate by the investigator. In the third phase, the budget equation was incremented several times and the entire system was re-solved using mathematical programming.

It must be stressed that the absolute effect of schools on outputs is not being assessed, but rather the effect of schools on variation in output levels. No doubt schools have an overall baseline effect (i.e., school factors are important for a certain basic level of output). Thus, the analysis did not deal with the effects of "no program"¹ as compared with having a program. Also, the analysis did not consider the question of technical efficiency. This did not invalidate the results of the analysis for the policy-maker. Knowing which types of inputs seem associated with success is important for suggesting reallocation to best affect outcomes within the present technology.

¹ As an example of the effects of "no program," the City of New York reported a loss of two months in reading level for the year 1968-69 owing to the two-month teachers' strike. See Mosteller and Moynihan, p. 27.
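The second and third phases can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not the study's: the production function and budget equation are taken to be linear, the coefficients, unit costs, and bounds are invented, and the names `allocate` and `predicted_output` are hypothetical. With a linear objective, a single linear budget constraint, and simple bounds, funding inputs in order of output gained per dollar gives the optimum; a nonlinear production function would need a nonlinear programming routine instead.

```python
def allocate(coef, cost, bounds, budget):
    """Maximize sum(coef[j] * x[j]) subject to sum(cost[j] * x[j]) <= budget
    and lower[j] <= x[j] <= upper[j]. Assumes every unit cost is positive."""
    x = {j: float(bounds[j][0]) for j in coef}  # start every input at its lower bound
    remaining = budget - sum(cost[j] * x[j] for j in x)
    if remaining < 0:
        raise ValueError("budget does not cover the lower bounds")
    # Fund inputs in decreasing order of output gained per dollar spent.
    for j in sorted(coef, key=lambda name: coef[name] / cost[name], reverse=True):
        if coef[j] <= 0 or remaining <= 0:
            continue  # never buy an input that does not raise output
        step = min(bounds[j][1] - x[j], remaining / cost[j])
        x[j] += step
        remaining -= step * cost[j]
    return x


def predicted_output(coef, x):
    """Value of the (assumed linear) production function, intercept omitted."""
    return sum(coef[j] * x[j] for j in coef)


# Hypothetical regression coefficients, unit costs, and observed ranges.
coef = {"input_a": 0.12, "input_b": 0.22, "input_c": 0.04}
cost = {"input_a": 30.0, "input_b": 110.0, "input_c": 5.0}
bounds = {"input_a": (10, 60), "input_b": (0, 10), "input_c": (0, 50)}

# Third phase: increment the budget and re-solve each time.
results = {}
for budget in (1000, 1500, 2000, 2500):
    plan = allocate(coef, cost, bounds, budget)
    results[budget] = predicted_output(coef, plan)
    print(budget, {j: round(v, 2) for j, v in plan.items()}, round(results[budget], 2))
```

Comparing the plans across budget levels shows which inputs absorb each increment, which is the kind of information the third phase is meant to give the policy-maker.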
All research is plagued by the presence of confounding variables. In survey-type research, the inability to manipulate independent variables keeps one from making causal assertions. As an example, an assessment of the potential effects of school resources would have to take into account the possibility that the apparent effects of family background are actually the results of advantaged children attending better-than-average schools. The advantage of multiple regression analysis is that it controls confounding variables by entering them into the regression equation. It is the responsibility of the investigator to anticipate the possible sources of bias as fully as he can and to enter them in the regression analysis. What can never be known is whether or not all such influences have been accounted for. Thus, the results of analysis must always describe associated effects of the independent variables.

Data Base

Data were used for the purpose of illustrating the workability of the model. This differs from the normal usage of data in educational research where the intent is to generalize sample results to a specific population. This was not the goal of the study. It has been stated
previously that the major shortcoming of most studies where mathematical models have been developed is the absence of empirical testing of the proposed model. Yet, this is a crucial stage, for it is at this point that necessary questions regarding the types of data available and the existence of appropriate computational algorithms determine in fact whether or not the model can be made operational. It is one thing to conceptualize the process of education and another to carry this conceptualization through the stages of data collection and data analysis to the point where practical implications can be realized in quantitative statements of policy.

While the above is not intended to grant complete license in the use of the data, it explains why the use differed somewhat from the usual requirements associated with sample usage. Concurrently, this is allied with the stated intention to study a complete population. The statistical difficulties that can be avoided by studying a complete population have been enumerated in the previous chapter. In estimating the production function the sampling unit is at the level of the state, the district, the school, or the individual student. In all but the last, the investigator is generally concerned with a finite population of only several hundred. The
significance of this fact was elaborated by Glass and Stanley:

If the ratio of population size to sample size is larger than 100, the techniques appropriate to making inferences to finite populations and those appropriate for infinite populations give essentially the same results. It is customary to use statistical techniques based on the assumption that infinite populations are being sampled whenever the population is reasonably large ... and the sample from the population does not constitute an appreciable proportion of the population.²

Blalock stated that where interest centers on the regression coefficient in a regression analysis, the sample size should be at least 150.³ Putting the two together implies that in most instances where the population is finite (1,000 or less), it would be better to consider the entire population.

For purposes of model illustration, the investigator had access to what was considered the best available data at the present time. The data were collected in a study of school productivity under the direction of the National Educational Finance Project (NEFP).⁴ The study was based on a sample of 181 school districts in a given state. Due to the unique purposes for which the data were used, the sample cannot be considered to be a representative sample of school districts throughout the state. However, the extremes are well represented in the sample. As Hanushek and Kain stated:

For a study of the educational production process, it is more important to obtain wide variation in educational practice and experience than to have a representative sample of the population of schools or students.⁵

Description of the Variables

Variables were selected from the NEFP data that were descriptive of both in-school and out-of-school factors of each school district. Although many variables have been identified in related research, the variables selected for this study were either used previously by researchers to examine correlates of school performance or selected by the Research Staff of the National Educational Finance Project as potential predictors of school productivity.

² Gene V. Glass and Julian C. Stanley, Statistical Methods in Education and Psychology (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1970), p. 241.
³ Blalock, p. 187.
⁴ See Scott N. Rose, A Study to Identify Variables to Predict Local School District Productivity in Two States (Ed.D. dissertation, University of Florida, 1972).
⁵ Eric A. Hanushek and John F. Kain, "On the Value of Equality of Educational Opportunity as a Guide to Public Policy," On Equality of Educational Opportunity, p. 118.
Data for the following variables were compiled from available records at the state department of education for a given state, except for variables x₁ through x₃:⁶

Income per pupil (x₁): Data for x₁ were taken directly from Personal Income by School Districts in the United States.

Income under $3,000 (x₂): The percentage of gross incomes less than $3,000 was computed by totaling the number of tax returns per district and dividing this total into the number of returns reporting gross income less than $3,000.

Income over $10,000 (x₃): The percentage of gross incomes over $10,000 was computed by totaling the number of tax returns per district and dividing this total into the number of returns reporting gross incomes over $10,000.

ESEA Title I pupils (x₄): x₄ was computed by forming a ratio of the number of pupils eligible for Title I programs to pupils in ADM. ADM was used as the denominator rather than ADA because the attendance habits of the two groups (ESEA Title I pupils and total pupils) would not necessarily be the same.

Minority enrollment (x₅): The district percentages of pupils enrolled during 1968-69 who were nonwhite, Spanish-speaking, Oriental, or American Indian were obtained directly from the state department of education.

Attendance (x₆): x₆ was calculated by forming a ratio of ADA to ADM. The ADA and ADM were 1968-69 school year figures.

Future training (x₇): The percentage of graduates receiving post-high-school education was computed by forming a ratio of the number of 1969 graduates entering future training to total 1969 graduates.

Size of school district (x₈): ADM for the 1968-69 school year was used as the indicator of school district size.

Percentage enrolled (x₉): The percentage of children age 5-17 enrolled in public school was calculated from information contained in the fall 1968 school census report filed by each district with the state department of education.

Transportation cost (x₁₀): Transportation cost per pupil was computed by dividing the 1968-69 school year cost for transportation by the number of pupils in ADA.

Local fiscal effort (x₁₁): Local fiscal effort was determined by forming a ratio of local revenue per pupil in ADA to the adjusted gross income per pupil in ADA. Local revenue per ADA was computed by dividing total local revenue by the number of pupils in ADA during the same year.

Expenses of instruction (x₁₂): x₁₂ was computed by taking the percentage of total current expense disbursed for instruction during the 1968-69 school year.

Longevity experience (x₁₃): x₁₃ was calculated by forming a ratio of teachers with 20 or more years of experience to the total number of teachers for the 1968-69 school year.

Teacher preparation (x₁₄): x₁₄ was computed by forming a ratio of teachers with less than four years' training to the total number of teachers for the 1968-69 school year.

Teacher experience (x₁₅): x₁₅ was calculated by forming a ratio of teachers with less than five years' experience to the total number of teachers for the 1968-69 school year.

Advanced preparation (x₁₆): x₁₆ was computed by forming a ratio of teachers with either an advanced degree or 30 hours of professional training beyond their bachelor's degree to the total number of teachers for the 1968-69 school year.

Median teacher salary (x₁₇): x₁₇ was the 1968-69 median teacher salary for each school district.

Average class size (x₁₈): Average class size for the 1968-69 school year was determined by dividing the number of district pupils in ADA by the number of classroom teachers in the district.

Pupil-support personnel ratio (x₁₉): The pupil-support personnel ratio was calculated by forming a ratio of district pupils in ADA during the 1968-69 school year to the number of certified nonteaching personnel employed in the district for the same year.

Expenses for transportation (x₂₀): x₂₀ was computed by taking the percentage of total current expense disbursed for transportation during the 1968-69 school year.

Median reading achievement (x₂₁): x₂₁ was the median score for the school district during 1969 on a standardized reading achievement test for sixth-grade pupils developed by the state department of education.

Average daily attendance (x₂₂): x₂₂ was the average daily attendance for the school district for the 1968-69 school year.

Total current expenditure (x₂₃): x₂₃ was the total current expenditure for the school district for the 1968-69 school year.

Instructional expenditures per pupil (x₂₄): x₂₄ was computed by forming the ratio of expenses of instruction to ADA (x₂₂).

⁶ Data for variables x₁ through x₃ were taken from Personal Income by School Districts in the United States, Dewey H. Stollar and Gerald Boardman (Gainesville, Florida: National Educational Finance Project, 1971).
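Several of the variables above are simple ratios of district totals. As a minimal sketch of those definitions, the following Python computes a few of them for one invented district; the field names and all figures are hypothetical and are not taken from the NEFP data.

```python
def derived_variables(d):
    """Ratio variables built from district totals, following the definitions above."""
    return {
        "title_i_ratio": d["title_i_eligible"] / d["adm"],            # x4: Title I eligible / ADM
        "attendance": d["ada"] / d["adm"],                            # x6: ADA / ADM
        "transport_cost_per_pupil": d["transport_cost"] / d["ada"],   # x10
        "avg_class_size": d["ada"] / d["classroom_teachers"],         # x18
        "pupil_support_ratio": d["ada"] / d["support_personnel"],     # x19
        "instr_exp_per_pupil": d["instruction_expense"] / d["ada"],   # x24
    }


# One hypothetical district record (all values invented for illustration).
district = {
    "adm": 5000,                     # average daily membership
    "ada": 4600,                     # average daily attendance
    "title_i_eligible": 750,
    "classroom_teachers": 200,
    "support_personnel": 40,         # certified nonteaching personnel
    "transport_cost": 230_000.0,     # dollars
    "instruction_expense": 2_300_000.0,
}

values = derived_variables(district)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```

Note that the Title I ratio and the attendance ratio use ADM as the denominator while the per-pupil cost and staffing ratios use ADA, as the definitions specify.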
Table 1 lists the variables described above. Also, a mnemonic label is assigned to each variable to expedite future reference.

Discussion of the Variables

The particular units in which one defines each of the variables used in a study determine the level of data treatment. Above, the observations are measures on the variables at the district level. Variables defined as aggregate amounts per district lead to an estimation of a state production function. If interest centers on a production function at the district level, then the unit of measurement becomes the school. If interest centers on a production function at the school level, then the individual student or classroom must become the unit of measurement.

At the state level, unaggregated units of measurement could be used, that is, one could use the individual student or school as the unit of measurement. The necessity for these units of measurement would depend upon the ratio of between-district variation to within-district variation. No doubt, aggregated data will contain less variance, but unless the ratio differs radically from one, no great harm is seen.
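The ratio of between-district to within-district variation mentioned above can be computed with a standard sum-of-squares decomposition. The sketch below uses invented scores for three hypothetical districts; the function name `variance_components` is an assumption made for illustration.

```python
from statistics import mean


def variance_components(groups):
    """Split the total sum of squares into between-group and within-group parts."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    between = sum(len(scores) * (mean(scores) - grand_mean) ** 2
                  for scores in groups.values())
    within = sum((s - mean(scores)) ** 2
                 for scores in groups.values() for s in scores)
    return between, within


# Hypothetical school-level scores grouped by district (values invented).
scores_by_district = {
    "district_a": [50, 52, 54],
    "district_b": [60, 62, 64],
    "district_c": [70, 72, 74],
}

between, within = variance_components(scores_by_district)
ratio = between / within
share_between = between / (between + within)
print(between, within, ratio, round(share_between, 3))
```

In this invented case nearly all of the variation lies between districts, so district means would lose little information; a ratio near or below one would mean that aggregation hides most of the variation.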
TABLE 1

VARIABLES POSTULATED AS PREDICTORS OF LOCAL SCHOOL DISTRICT PRODUCTIVITY FOR A GIVEN STATE

[Table body not legible in the source scan.]

Note: The expression in parentheses is a mnemonic label for the variable title.
The reason for restricting the units of measurement at a given level to those discussed above follows. Statistically, it can be shown that the ability to predict with a multiple regression equation is greatest when using the mean values of the independent variables. As an example, in estimating a state production function based on data at the district level, the state mean on the criterion variable could be predicted best by using state means on all independent variables. For educational policy-makers operating at the state level, this would be the appropriate level of data treatment.

It is important to realize that since information about school resources was gathered only on a district-wide basis, the school resource inputs cannot account for any of the differences among schools or among students within the same school district. Therefore, the percentage of total variance that lies among school districts is a kind of upper limit on the amount of variance for which school inputs can account. On the other hand, by working with aggregated scores for each variable, the Law of Large Numbers can be used to advantage, that is, greater accuracy and less measurement error will be obtained with macroscopic variables. Thus in using aggregative statistics the data may not be adequate for decisions about individual
schools, but they will be adequate for deciding policy for school districts as a whole. At any rate, attention must be given to the ecological fallacy, that is, the relationships that were demonstrated hold only at the aggregate level and not necessarily for individual schools.

The only output included in the present list of variables was the sixth-grade median reading achievement score for the district (x₂₁). While there is some concern over whether or not cognitive skills are the most useful outcomes of schooling, knowing something meaningful about success in the area of cognitive scores is far superior to knowing nothing at all, and this knowledge should allow some general inferences that would be most helpful in pointing general policy directions.

Given that inputs in the educational process include community and student as well as school inputs, the inputs observed in the sample data appear to strike a balance among the three types of inputs. The socioeconomic variables can be seen as measures of both student and community inputs. Persons of different socioeconomic status (SES) face different kinds of life situations and in adapting to these they may develop different sets of values and life styles. In short, SES variables serve as proxies for a variety of values, attitudes, and motivations related to academic
performance. The same can be said of community inputs, for which family values and life styles are manifested in an aggregate sense for community life.

The school variables appear to measure mainly the characteristics of teachers and the instructional program as reflected in instructional expenditures, average class size, and pupil-support personnel ratio. Separate programs (reading, science) as they exist in the schools are not differentiated. It also becomes apparent that the most serious source of bias arises in estimating the prices of the various school inputs. It can be argued that living costs are higher in larger cities and that a part of the salary differential between large and small school districts may be considered a corrective of local price variation. In short, a cost-of-living index may be needed to help explain the variation in per-pupil expenditures across districts.

In summary, the variables from the sample data may be considered to be less than an ideal set. It would seem that only a relatively small number of the school variables potentially available have been considered. This is not considered a problem if it is kept in mind that the primary goal was simply to illustrate the model using the best available data, and that it was not crucial if analysis was restricted to a limited area of educational policy.
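The cost-of-living correction suggested above amounts to deflating each district's nominal expenditure by a local price index. The following Python is a minimal sketch of that arithmetic; no such index appears in the study's data, so the index values, district names, and the function name `deflate` are all hypothetical.

```python
def deflate(expenditure_per_pupil, cost_of_living_index, base=100.0):
    """Express a nominal per-pupil expenditure in base-price dollars."""
    return expenditure_per_pupil * base / cost_of_living_index


# Hypothetical districts: nominal spending differs, local price levels differ too.
districts = {
    "large_city": {"expenditure": 660.0, "index": 110.0},
    "small_town": {"expenditure": 570.0, "index": 95.0},
}

adjusted = {name: deflate(d["expenditure"], d["index"])
            for name, d in districts.items()}
print(adjusted)
```

In this invented case the two districts spend different nominal amounts but the same amount in base-price dollars, which is the kind of spurious variation a cost-of-living index would remove.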
The Data Analysis Plan

The first step in the data analysis involved estimating the production function and the budget equation using multiple regression. A program from Biomedical Computer Programs (BMD)⁷ was used. The particular program is identified as BMD02R Stepwise Regression. The program computes a sequence of multiple regression equations in a stepwise manner. At each step one variable is added to the regression equation. The variable added is the one which has the highest partial correlation with the dependent variable partialed on the variables which have already been added. In addition, variables can be forced into the regression equation. Regression equations with or without a regression intercept may be selected.

This program proved highly satisfactory for several reasons. First, of the variables containing redundant information (highly related), stepwise regression selected only the ones that were optimally related to the dependent variable, omitting the remaining variables from the equation. This helped to avoid the partialling fallacy. Secondly, since most school variables were wanted in the equation, these variables were forced. Thirdly, the stepwise procedure allowed the contribution to the multiple R of each variable as it entered the equation to be viewed (i.e., what has been gained by allowing this variable to come into the equation).

⁷ Biomedical Computer Programs, ed. by W. J. Dixon (Berkeley: University of California Press, 1971).

The process of estimating these two equations was exploratory in nature. Thus, as a first step all linear terms were considered, some being forced into the equation. Secondly, all quadratic and cubic terms were tested. Then selected interaction terms were tested. It was not possible to examine all interaction terms simultaneously. In a sample of 181, the maximum number of terms that the regression equation could have was 180. Yet with 20 independent variables there were 190 possible pairwise interactions. Thus analysis was limited to those interactions that were logically viewed as having potential impact on school achievement (community-school and school-school interactions). Finally, the analysis ended with nonlinear functions consisting of certain forced linear terms and any other terms which made a significant contribution to the multiple R (linear, higher degree, or interaction terms).

Given the production function and the budget equation, the second major stage of the data analysis was solving a mathematical programming problem in which the production function was the objective function and the constraints
included the budget equation and bounds on the observed variation in the independent variables. The program used for solving nonlinear programming problems comes from the SHARE Program Library.⁸ The particular program is SDA 3189SUMT. The purpose of this program is to solve nonlinear mathematical programming problems where the objective function and constraints may be nonlinear. The program uses SUMT (Sequential Unconstrained Minimization Technique) to solve the mathematical programming problem. Users must supply a subroutine to read in the problem data and three subroutines to evaluate the problem functions and their first and second partial derivatives. If the objective function is not concave, then it is necessary to start the algorithm at various points in the feasible domain of solutions.

⁸ The SHARE Programming Library consists of a central file and documentation system available at most IBM installations.

Results of the Empirical Application

In analyzing the data, it was recalled from the statement of the problem that three problems were to be solved:

1. To determine the production function that describes the input-output relationship between selected variables;

2. To determine the optimal combination of inputs to maximize output subject to the budget and other constraints;

3. To determine the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.

To solve (1) multiple regression was used. Also, the budget equation was estimated using multiple regression. Problems (2) and (3) were solved using mathematical programming.

Basic Data Description

Repeatedly, the necessity for knowing the ranges of each of the variables in the observed data has been indicated. This was essential since predictions cannot be extrapolated beyond the range of observed variation. At the same time, the means and standard deviations of each variable are listed to provide additional information regarding the variability of the data. This information is shown in Table 2. It was believed that these data provided the widest possible variation in existing data for a given state.

Estimating the Production Function

A multiple regression equation was developed using median reading achievement (x₂₁) as the dependent variable and all of the other variables as independent variables except for ADM (x₈). This represented a choice between ADM (x₈) and ADA (x₂₂). ADA was preferred over ADM since
TABLE 2

BASIC DATA DESCRIPTION

[Table body not legible in the source scan. Per the text, the table reports the range, mean, and standard deviation of each variable.]
other variables were computed on the basis of ADA rather than ADM.

In estimating the production function it was imperative that all of the relevant school inputs be represented in the equation. This was accomplished by forcing in such variables. In this case variables x₁₂ through x₁₉ were forced into the equation. It is noted that variables x₁₀ (transportation cost per pupil) and x₂₀ (percent of total current expenditures funded for transportation) deal with school policy. However, these variables were not subject to manipulation by school authorities (i.e., no one expects the school board to move the students closer to their schools). Consequently, they were left free to enter the equation if they had significant impact on school achievement. Also, ADA (x₂₂) was forced into the equation. The rationale for this choice was that the investigator wanted ADA represented in the regression equation so that questions regarding the relationship of district size and achievement, constrained by costs, could be considered.

In testing for nonlinearities in the independent variables considered, all square terms and cubic terms and certain interaction terms were tested (i.e., tested to see if significant contributions to the variance in achievement existed). In regression analysis, however, the number of terms must always be less than the number of cases in
115 the data. This made it necessary to conduct two separate runs. In the first run" all square and cubic terms were tested. Those that proved to be insignificant were omitted from further consideration in the second "run." In the second "run," selected interaction terms were considered. Again, the possible number of interaction terms outnumbered the number of cases. Logically, though, consideration need only be given to those possible interaction terms which can reasonably be assumed to have an effect on achievement. In this case, interest centered on the interactions between school and student background variables and the interactions between certain school variables. Table 3 lists the interaction terms that were tested. TABLE 3 INTERACTION TERMS TESTED FOR THE PRODUCTION FUNCTION 3 Interactions Between StudentBackground and School Variables X 2 X 13 X 3 X 13 X 4 X 13 X 5 X 13 X 2 X 14 X 3 X 14 X 4 X 14 X 5 X 14 X 2 X 15 X 3 X 15 X 4 X 15 X 5 X 15 X 2 X 16 X 3 X 16 X 4 X 16 X 5 X 16 X 2 X 17 X 3 X 17 X 4 X 17 X 5 X 17 X 2 X 18 X 3 X 18 X 4 X 18 X 5 X 18 X 2 X 19 X 3 X 19 X 4 X 19 X 5 X 19
TABLE 3 (CONTINUED)

Interactions Between School Variables

    X_{13} X_{14}    X_{14} X_{15}    X_{15} X_{16}    X_{16} X_{17}
    X_{13} X_{16}    X_{14} X_{17}    X_{15} X_{17}    X_{16} X_{18}
    X_{13} X_{17}    X_{14} X_{18}    X_{15} X_{18}    X_{17} X_{18}
    X_{13} X_{18}

(a) See Table 1 for a description of variable codes

Determining whether a variable that was tested made a significant contribution involved making a choice as to the significance level to be employed. For those terms that were free to come into the equation (i.e., non-forced), it was felt that they should enter only if a noteworthy contribution was being made. The investigator observed that by setting the F-to-enter at 6.80 (F_{.99; 1, 160} = 6.80), variables coming into the equation made an addition to the multiple R of generally 1 or more percent. It was felt that in the latter steps of the regression analysis, any variable that could make a contribution of 1 or more percent should enter the equation. Table 4 gives the results of the analysis for estimating the production function. The equation appears to predict
TABLE 4
REGRESSION COEFFICIENTS FOR THE PRODUCTION FUNCTION
(DEPENDENT VARIABLE: READING ACHIEVEMENT, X_{21})

No.  Variable Title               Coefficient (a)  F-to-Enter (b)  F-to-Remove (c)  Type (d)

Linear Terms
 3   Incomes over $10,000               0.150           33.95           13.26         (2)
 6   Attendance                       -52.807            8.91           14.35         (2)
 7   Future Training                    0.093           10.35           14.35         (2)
12   Expenses for Instruction           4.292           14.26            6.87         (3)
13   Longevity Experience              -0.042            3.13            1.04         (3)
14   Teacher Preparation               -1.511            5.09           11.12         (3)
15   Teacher Experience                -0.001            2.80            0.00+        (3)
16   Advanced Preparation               0.120           40.23           10.53         (3)
17   Median Teacher Salary              0.002 (e)        3.19            0.00+        (3)
18   Average Class Size                 0.220            3.39            1.75         (3)
19   Pupil-Support Ratio                0.011            3.41            9.63         (3)
22   Average Daily Attendance          -0.004 (e)        5.51            0.71         (3)

Quadratic Terms
 6   X_6^2                              0.291          133.39           15.16         (2)
12   X_{12}^2                           0.035            7.51            7.51         (2)

Interaction Terms
     X_4 X_{19} (f)                     0.002           19.34           10.55         (2)
     X_5 X_{18} (g)                    -0.005            9.42           16.35         (2)
     X_{14} X_{17}                      0.016           11.88            8.80         (2)

Constant Term                        2282.167

R^2 = 0.79

(a) The regression coefficient is the b weight (unstandardized partial regression coefficient)
(b) F test of significance of a single variable in a stepwise regression at the step of entry into the equation
(c) F test of significance of a single variable in a stepwise regression after the final step of the regression
(d) 2 = free variable; 3 = forced variable
(e) Variables X_{17}, X_{22} were scaled so that truncation errors would not occur. X_{17}: 1 unit = $100; X_{22}: 1 unit = 1,000 pupils
(f) X_4: ESEA Title I Pupils
(g) X_5: Minority Enrollment
fairly well as evidenced by R^2 = 0.79. The interpretation of associated effects is not as simple in the case where nonlinear terms are involved. One must use partial derivatives to determine the effects. As an example:

    \partial x_{21} / \partial x_6 = -52.807 + 2(0.291) x_6.

The effect of x_6 on x_{21} depends on the level of x_6. If the x_6 effect at the mean value of x_6 (93.9) is considered, then

    \partial x_{21} / \partial x_6 = 1.843.

This states that an increase of 1 percent in ADA/ADM (x_6) at the mean level of x_6 was associated with an increase in reading achievement (x_{21}) of 1.843 units. While the designated regression coefficients could provide pages of discussion, it was not the primary goal to discuss all of the pertinent interpretations. However, a few are worthy of discussion since they highlight the necessity for a nonlinear model. Looking at the regression coefficient for average class size (x_{18}), a positive relationship is seen. The zero-order correlation coefficient between average class size (x_{18}) and achievement (x_{21}) was 0.036. It would seem that increases in class size were associated with increases in achievement. Of importance, though, was the interaction of class size with minority
enrollment (x_5). Increases in the percentage of minority students were associated with decreases in achievement. Thus, if minorities constituted 50 percent of the school population, an increase of one unit in class size was associated with a decrease of 0.030 units in achievement (0.220 - 0.005(50) = -0.030). This is where the zero-order correlation coefficient can be misleading since it did not reveal the interaction that was occurring between class size and minority enrollment. The regression coefficient for the percentage of teachers with less than four years' training (x_{14}) was as expected (b = -1.511). However, this percentage interacted with median salary (x_{17}). The interaction coefficient stated that increases in median salary for a given percentage of x_{14} were associated with increases in achievement. In fact, for x_{17} = $10,000, an increase of 1 percent in x_{14} was associated with an increase of 0.099 units in achievement. As in all such cases where trouble is suspected, recourse was made to a frequency distribution. Table 5 reveals the results of such analysis. What was observed was that as the salary level increased the range of x_{14} became more and more restricted. Thus when 100 \le x_{17} < 110, then 0 \le x_{14} \le 12.5.
If the interaction is to be meaningful when x_{17} = 105, then x_{14} must be in the interval from 0 to 12.5. This fact became more crucial when the stage of programming analysis was considered.

TABLE 5
AN EXAMINATION OF THE INTERACTION BETWEEN THE PERCENTAGE OF TEACHERS WITH LESS THAN FOUR YEARS' TRAINING (X_{14}) AND MEDIAN TEACHING SALARY (X_{17})

Interval No. (a)    X_{17} (b)        X_{14} (c)     No. of Cases
      1             [69.5, 80) (d)    [2, 32.1]          22
      2             [80, 90)          [0, 25]            73
      3             [90, 100)         [0, 23.5]          48
      4             [100, 110)        [0, 12.5]          25
      5             [110, 120)        [0, 3]             10
      6             [120, 130)        [0.9, 1]            2

(a) Given an interval for X_{17}, such as [69.5, 80), the interval of corresponding values for X_{14} was determined. This means that as X_{17} assumed values in the interval [69.5, 80), the range of X_{14} was restricted to the interval [2, 32.1].
(b) X_{17} was scaled. 1 unit = $100
(c) X_{14} represented a percentage
(d) [ , ) means that the interval includes the number on the left, but not on the right.

In summary, the analysis estimated a production function which predicted very well, and most of the parameters conformed to expectations.
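The partial-derivative reading of Table 4 used in this section can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the minus signs on the attendance coefficient and on the minority-by-class-size coefficient are assumptions inferred from the worked figures in the text (1.843 and a decrease of 0.030), not values confirmed elsewhere.

```python
# Marginal effects implied by the production function (coefficients from Table 4).
# ASSUMPTION: the negative signs below are inferred from the worked figures in the text.

def attendance_effect(x6):
    """d(x21)/d(x6) = -52.807 + 2(0.291)*x6, where x6 is ADA/ADM in percent."""
    return -52.807 + 2 * 0.291 * x6


def class_size_effect(x5):
    """d(x21)/d(x18) = 0.220 - 0.005*x5, where x5 is minority enrollment in percent."""
    return 0.220 - 0.005 * x5


def allowed_prep_range(x17):
    """Observed range of x14 for a salary level x17 (units of $100), per Table 5."""
    table5 = [
        (69.5, 80.0, (2.0, 32.1)),
        (80.0, 90.0, (0.0, 25.0)),
        (90.0, 100.0, (0.0, 23.5)),
        (100.0, 110.0, (0.0, 12.5)),
        (110.0, 120.0, (0.0, 3.0)),
        (120.0, 130.0, (0.9, 1.0)),
    ]
    for low, high, x14_range in table5:
        if low <= x17 < high:          # intervals are closed on the left only
            return x14_range
    return None                        # salary level outside the tabulated intervals


print(round(attendance_effect(93.9), 3))   # effect at the mean attendance ratio
print(round(class_size_effect(50.0), 3))   # effect when minorities are 50 percent
print(allowed_prep_range(105.0))           # x14 range observed when x17 = 105
```

Evaluated at the mean attendance ratio of 93.9, the first function reproduces the 1.843 reported above, and the second gives the decrease of 0.030 quoted for a district with 50 percent minority enrollment.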
Estimating the Budget Equation

Basically, the same procedure was followed for estimating both the budget equation and the production function. The total current expenditure for instruction per-pupil (x_{24}) was the dependent variable. A per-pupil cost variable was preferred since using total cost leads to scaling difficulties. Since most of the school variables used in the study were considered instructional expenditures it was felt that per-pupil instructional expenditures was a better choice for a dependent variable than per-pupil total current expenditures (x_{23}). The same variables were forced or allowed to be free as in the previous analysis with some exceptions. Since total current expenditure (x_{23}) was highly related to x_{24} it was omitted from the analysis. The use of this variable would not be meaningful, since x_{23} was not a component of x_{24}. Rather, the reverse is true. Likewise, the interactions of school variables were not necessarily meaningful in comparing costs, and were thus omitted from the analysis. Given the remaining variables, as before, square terms, cubic terms and interaction terms were tested. The results are given in Table 6. Other than linear terms, only three square terms were significant (x_5^2, x_{16}^2, x_{17}^2). No cubic terms or interaction terms entered the equation.
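The entry rule used for the free terms in both regressions can be illustrated with the usual partial F statistic for adding one term to a regression. The sketch below is not the stepwise program used in the study. The function name is hypothetical, and the example figures are assumptions chosen to mirror the text: 181 districts, 17 terms in the equation, and a gain of 0.01 in R^2 (reading the "1 or more percent" criterion as a gain in R^2).

```python
def f_to_enter(r2_before, r2_after, n_cases, n_terms_after):
    """Partial F for one added term.

    F = (gain in R^2) / (unexplained share of variance per residual degree of freedom)
    """
    residual_df = n_cases - n_terms_after - 1
    if residual_df <= 0:
        # The number of terms must always be less than the number of cases.
        raise ValueError("too many terms for the number of cases")
    return (r2_after - r2_before) / ((1.0 - r2_after) / residual_df)


F_THRESHOLD = 6.80  # F-to-enter applied to the free (non-forced) terms

# ASSUMPTION: illustrative values only, not figures reported in the study.
f_value = f_to_enter(r2_before=0.78, r2_after=0.79, n_cases=181, n_terms_after=17)
print(round(f_value, 2), f_value >= F_THRESHOLD)
```

With these illustrative values a term that adds one point to R^2 clears the 6.80 threshold, which is in line with the observation that variables entering under this rule generally contributed 1 percent or more.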
TABLE 6
REGRESSION COEFFICIENTS FOR THE BUDGET EQUATION
(DEPENDENT VARIABLE: INSTRUCTIONAL EXPENDITURES PER PUPIL, X_{24})

No.  Variable Title               Coefficient (a)  F-to-Enter (b)  F-to-Remove (c)  Type (d)

Linear Terms
10   Transportation Cost                7.352           13.86          277.51         (2)
12   Expenses for Instruction           7.730           41.30           51.33         (3)
13   Longevity Experience               0.364            2.19            0.72         (3)
14   Teacher Preparation               -0.674           27.70            0.78         (3)
15   Teacher Experience                 0.409           11.06            0.72         (3)
16   Advanced Preparation              -8.817            6.22           17.51         (3)
17   Median Teacher Salary            -14.256 (e)      197.31           15.30         (3)
18   Average Class Size                -6.662           99.21            8.54         (3)
19   Pupil-Support Ratio               -0.032           12.97            0.78         (3)
20   Expenses for Transportation       85.886          136.10          235.80         (2)
22   Average Daily Attendance          -0.358 (e)       12.77           36.82         (3)

Quadratic Terms
 5   X_5^2 (f)                          0.021           13.26           51.88         (2)
16   X_{16}^2                           0.091           21.20           20.77         (2)
17   X_{17}^2                           0.090           22.06           22.06         (2)

Constant Term                        1125.488

R^2 = 0.94

(a) The regression coefficient is the b weight (unstandardized partial regression coefficient)
(b) F test of significance of a single variable in a stepwise regression at the step of entry into the equation
(c) F test of significance of a single variable in a stepwise regression after the final step of the regression
(d) 2 = free variable; 3 = forced variable
(e) Variables X_{17}, X_{22} were scaled so that truncation errors would not occur. X_{17}: 1 unit = $100; X_{22}: 1 unit = 1,000 pupils
(f) X_5: Minority Enrollment
The equation does appear to explain most of the variation in per-pupil instructional expenditures across districts with a multiple R^2 = 0.94. Perhaps the most noteworthy result of the analysis was the lack of a square term for ADA (x_{22}). Per-pupil cost was not quadratic as many have hypothesized. Rather, the larger the district, the lower the per-pupil cost. Given the range in ADA (316 to 953,107), which was about as wide as can be, there appeared to be no optimal size for per-pupil costs. The F-to-enter for x_{22}^2 at the last step in the analysis was 0.8893. Variables x_{16} (ADPREP) and x_{17} (MEDSAL) had significant square terms. Thus the cost of increasing either one of these variables depended on their levels. The larger x_{16} was, the more it cost to increase it by one additional unit. Exactly what implicit factors were operating to cause this was not clear. One thing was clear: by examining the F-to-remove ratios, the square terms made a significant contribution to the multiple R and thus a parabola provided a better fit to the observed data than a straight line.

In summary, it appears that the budget equation estimated the prices of the school inputs reasonably well. Furthermore, the coefficients gave the price of each variable while other variables were held constant. This was a necessary requirement for the programming analysis.
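Because x_{16} and x_{17} enter the budget equation with square terms, the price of one more unit of either is a derivative rather than a constant. The following is a minimal sketch with hypothetical function names. It assumes that the linear coefficients of these two variables are negative and the square coefficients positive, which is an interpretation of Table 6 rather than something the table states outright.

```python
# Marginal per-pupil cost implied by the quadratic terms of the budget equation.
# ASSUMPTION: linear coefficients negative, square coefficients positive.

def marginal_cost_adprep(x16):
    """Dollars per pupil for one more percentage point of x16 (ADPREP)."""
    return -8.817 + 2 * 0.091 * x16


def marginal_cost_medsal(x17):
    """Dollars per pupil for one more unit of x17 (MEDSAL, 1 unit = $100)."""
    return -14.256 + 2 * 0.090 * x17


for x16 in (52.49, 84.3):              # population mean and observed maximum
    print(x16, round(marginal_cost_adprep(x16), 2))
```

Under these assumptions the price of a further point of x_{16} rises from about $0.74 per pupil at the population mean (52.49) to about $6.53 at the observed maximum (84.3), which is the sense in which the cost of increasing x_{16} depends on its level.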
Formulation of the Mathematical Programming Problem

In solving a mathematical programming problem the educational production function becomes the objective function and the constraints include the budget equation as well as bounds on the observed variation of the independent variables. Model specification of the production function requires community and student inputs as well as school inputs. However, community and student inputs cannot be manipulated by school authorities. These inputs cannot be considered as variables in the programming analysis. Rather, the objective is to maximize the production function for a given set of community and student inputs. More specifically, given the production function:

    A_{it} = g(C_i^{(t)}, S_i^{(t)}, I_i, P_i^{(t)}),

for a given set of community, peer-group, and individual inputs, the above can be transformed to:

    A_{it} = g(k, S_i^{(t)})

where k is a constant. Thus for a given set of school inputs, the problem is to

    Maximize    A_{it} = g(k, S_i^{(t)})
    subject to  \sum f(k, S_i^{(t)}) \le B.

Since the variables were macroscopic (i.e., level of data treatment was the district), individual inputs were
deleted from the model. Interest centered on the analysis of educational policy for the population as a whole. Thus, for community and student inputs the mean values of the relevant variables were used. Statistically, this had an enormous benefit, since in the production function the mean level of achievement for the complete population could be predicted perfectly (i.e., no residual error occurs). Thus, the school inputs (x_{12}, ..., x_{19}) should become the variables in the programming analysis, all other variables assuming constant values equal to the population means. Further reflection revealed, however, that variable x_{12} (percent of total current expenditure for instruction) was not manipulable in this analysis. All other school variables (x_{13}, ..., x_{19}) related to instructional resources and the dependent variable in the budget equation was total instructional expenditure per-pupil (x_{24}). Thus, ways of reallocating resources in the instructional category with a given budget for instructional expenses were being considered. This does not change the percentage of total current expenditure going to instruction, since the instructional budget remains the same. Thus in the analysis x_{12} remained constant. Another difficulty centered on the interaction term, x_{14} x_{17} (PREP · MEDSAL). Preliminary data analysis revealed
that the maximum observed value for x_{14} (32.1) should be used with x_{17} set at 122.5. Nevertheless, an examination of the frequency distribution did not support this conclusion since the range of x_{14} became more and more restricted as x_{17} increased. This problem was discussed earlier in the chapter. The problem was resolved by making x_{14} constant. A number of reasons can be advanced for setting x_{14} = 0. While teachers with less than four years' training may be skillful at teaching reading, it can be questioned whether or not their training has been sufficient for them to function as well in helping students attain other outcomes, both cognitive and affective. It is doubtful that beginning teachers with less than four years' training would be prepared to teach the new science and mathematics programs that are now a part of the elementary curriculum. In the past, school districts have not always been able to employ all fully certified staff. Thus, the value of x_{14} = 5.87, which was the mean value observed in the population, was also considered. School district size, as determined by average daily attendance (ADA, x_{22}), was potentially manipulable to the extent that consolidation of school districts can and should occur. Earlier, no optimal scale of economy with respect to district size and per-pupil cost was reported.
It was observed that the larger the district the lower the per-pupil cost. However, achievement was negatively related to district size (b = -0.004). It is possible, therefore, that there would be an optimum size for the school district for maximizing achievement, constrained by cost. Preliminary data analysis suggested the unrealistic figure of approximately 700,000, unrealistic because other outcomes would probably be negatively affected and in most instances it would not be feasible geographically to reorganize districts on such a scale. Thus x_{22} can be set at various levels and the effects of different district sizes on achievement and the distribution of resources for a given per-pupil budget can be observed. The foregoing suggests that analyses be conducted at various levels of x_{14} (PREP < 4) and x_{22} (ADA). In addition to the constraints previously described it was also observed that the sum of variables x_{13} (EXP > 20) and x_{15} (EXP < 5) must always be less than 100 and that the same held true for x_{14} (PREP < 4) and x_{16} (ADPREP). With the above in mind, then, the mathematical programming problem to be solved reduced to the following:

Let x_{21} = \varphi(x) + 32.744 - 1.511 x_{14} - 0.004 x_{22}.
Maximize

    \varphi(x) = -0.042 x_{13} - 0.001 x_{15} + 0.120 x_{16} + 0.002 x_{17} + 0.016 x_{14} x_{17} + 0.179 x_{18} + 0.002 x_{19}

subject to:

I.    (1)  0.364 x_{13} + 0.409 x_{15} - 8.817 x_{16} - 14.256 x_{17} - 6.662 x_{18} - 0.032 x_{19} + 0.091 x_{16}^2 + 0.090 x_{17}^2
           \le -862.286 + 0.674 x_{14} + 0.358 x_{22}        (Budget Equation)

II.   (2)  x_{13} + x_{15} < 100
      (3)  x_{14} + x_{16} < 100

III.  (4)  1.7 \le x_{13} \le 52.6
      (5)  14.7 \le x_{15} \le 60.6
      (6)  22.2 \le x_{16} \le 84.3
      (7)  69.5 \le x_{17} \le 137.5
      (8)  12.3 \le x_{18} \le 22.2
      (9)  44.9 \le x_{19} \le 898.3

Scaled Variables: x_{17}: 1 unit = $100; x_{22}: 1 unit = 1,000 pupils
Constants: x_{14}, x_{22}

Since x_{21} = \varphi(x) + constant, it follows that in maximizing \varphi(x), x_{21} is also being maximized. For that reason programming algorithms never have a constant term in the objective function, since it is not needed and its omission simplifies the programming process. Thus a simple transformation of variables can always eliminate the constant term in the objective function. Of importance was whether
or not the objective function was concave. Given that x_{14} = constant, \varphi(x) was a linear function. Linear functions are always concave. Thus \varphi(x) was a concave function. Because the budget constraint is a convex quadratic and the remaining constraints are linear, the feasible region was convex as well. This meant that any local optimum determined by the algorithm would also be a global optimum.

Results of the Programming Analysis

In order to expedite the comparisons that were to be made with each "run" of the analysis, Table 7 lists the variables manipulated in the programming analysis along with the mean values observed in the population. Obviously, after each optimization the results can be compared against the mean values. For the first "run" x_{14} (PREP < 4) and x_{22} (ADA) were set at their mean values. The results were

    x_{21}    x_{13}    x_{15}    x_{16}    x_{17}    x_{18}    x_{19}
    48.0      1.7       14.7      73.5      9950      22.2      898.3
              LB        LB        OPT       OPT       UB        UB

    LB   Lower bound of the variable
    UB   Upper bound of the variable
    OPT  Optimal value of the variable (LB < OPT < UB).

The analysis suggested that an optimal allocation of resources among the given variables would predict an increase in achievement (x_{21}) of 4.9 units. This represented an increase of 0.71 standard deviations. Experience did not
TABLE 7
VARIABLES MANIPULATED IN THE MATHEMATICAL PROGRAMMING ANALYSIS

No.  Variable Title                              Mean        Standard Deviation
13   Longevity Experience                        17.57             8.92
14   Teacher Preparation (a)                      5.87             5.46
15   Teacher Experience                          37.38             9.57
16   Advanced Preparation                        52.49            12.05
17   Median Teacher Salary                    9,149.64         1,157.65
18   Average Class Size                          18.50             1.66
19   Pupil-Support Ratio                        136.55            82.20
21   Median Reading Achievement (b)              43.07             6.86
22   Average Daily Attendance (a)             9,180.50        70,852.56
24   Instructional Expenditures Per-Pupil (c)   764.51           139.32

(a) For each programming problem, X_{14} and X_{22} assumed constant values
(b) X_{21} was the variable maximized
(c) X_{24} was the budget constraint
seem to count (x_{13}, x_{15}), nor did class size (x_{18}) or the pupil-support ratio (x_{19}). Advanced training (x_{16}) was deemed most important of the variables affecting achievement; salary (x_{17}) counted also. If the salary increments were eliminated for experience, and class size and pupil-support ratios increased to their maximums, enough resources would be released within the existing budget to increase x_{16} (ADPREP) by 1.74 S.D. and x_{17} (MEDSAL) by 0.69 S.D. It should be noted at this point that in a programming analysis, variables are assumed to be independent of each other, that is, a change in one should not automatically cause change in others. Thus, given the existing salary increments for advanced training, an increase in x_{16} (ADPREP) was likely to lead to an increase in x_{17} (MEDSAL). The analysis could not reveal what that change would be. However, increases in x_{17} can be viewed as that accruing beyond that due to x_{16}. To express it another way, the above suggests that the median salary could still be increased by 0.69 S.D. (perhaps by raising the base salary) after increases in median salary due to increases in advanced preparation had been attributed. Some may suggest that by allowing the pupil-support ratio (x_{19}) to go to its maximum value, other outcomes (most likely affective ones) would be affected. This
could be an instance of increasing one output at the expense of another. With the model, though, minimum standards to be achieved by other outputs can be set. Thus a pupil-support ratio of 136.5 (population mean) may be necessary to maintain minimum standards on other outputs. Consider the second "run," then, where x_{19} = 136.5.

    x_{21}    x_{13}    x_{15}    x_{16}    x_{17}    x_{18}    x_{19}
    45.7      1.7       14.7      70.1      9670      22.2      136.5
              LB        LB        OPT       OPT       UB        FIXED

By requiring a maximum pupil-support ratio of 136.5, achievement increased by 0.38 S.D. as against 0.71 S.D. previously. More resources were expended to maintain this level for x_{19}, resources that take away from advanced training and salary with consequent effects on reading achievement. Likewise, one could argue that increasing classroom size to its maximum value would negatively affect other outcomes. Consider then if an average class size of 18.5 (population mean) is maintained.

    x_{21}    x_{13}    x_{15}    x_{16}    x_{17}    x_{18}    x_{19}
    44.1      1.7       14.7      65.8      9320      18.5      136.5
              LB        LB        OPT       OPT       FIXED     FIXED

Achievement increased by 0.15 S.D. By "trading off" only on the resources expended for experience, less is available to increase x_{16} and x_{17}. Actually, there is evidence to suggest that decision-makers, given the choice of reducing
class size or increasing salaries, have chosen to increase salaries.[9] Thus in future "runs" class size assumed its maximum value.

[9] Herbert J. Kiesling, The Relationship of School Inputs to Public School Performance in New York State (Santa Monica, Calif.: The Rand Corporation, October, 1969), P-4211, p. 24.

The runs considered so far have had x_{14} (PREP < 4) equal to its mean value. Consider what happened when x_{14} = 0:

    x_{21}    x_{13}    x_{15}    x_{16}    x_{17}    x_{18}    x_{19}
    46.1      1.7       14.7      75.4      7970      22.2      136.5
              LB        LB        OPT       OPT       UB        FIXED

Comparing this with a previous "run" a greater predicted increase in achievement (0.44 S.D.) was found as expected, that is, decreases in x_{14} (PREP < 4) should be associated with increases in achievement (b = -1.511). Looking at x_{17} (MEDSAL), however, a decrease from the mean value of 9,150 was noted. This was due to the interaction term x_{14} x_{17}, which was discussed earlier. The effect that salary had on achievement interacted with x_{14}. The greater the percentage of teachers with less than four years' training, the more important the level of salary became. However, when x_{14} = 0, the b weight for salary was only 0.002, which was small compared to advanced training (b = 0.120). No
wonder then that analysis suggested that more money be spent on x_{16} (ADPREP). One interpretation of the above could be that by reducing the base salary by a given amount, this money could then be expended for teachers with advanced training. What the overall median salary would be as a result is not clear. If decision-makers were reluctant to decrease the base salary, then the model allows a lower bound to be set on the salary level and the problem can be resolved.

The effect that district size had on achievement and the distribution of resources was considered. Throughout, it was assumed that x_{14} = 5.87 (population mean). Table 8 gives the results of the analysis at various levels of x_{22} (ADA). The table shows that by increasing the size of the district, the per-pupil cost was lowered and hence within a given budget more money could be expended on x_{16} and x_{17} with the consequent effects on achievement. Johns and Morphet concluded that the optimal size of a school district should be approximately 50,000 students.[10]

[10] Edgar L. Morphet et al., Educational Organization and Administration (Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1967), p. 174.

In going from the population mean of 9,180 to 50,000, with the same per-pupil budget, the percentage of teachers with advanced
TABLE 8
EFFECT OF DISTRICT SIZE ON ACHIEVEMENT AND THE DISTRIBUTION OF SCHOOL RESOURCES (X_{14} = 5.87)

X_{22}        X_{21} (b)   X_{13}   X_{15}   X_{16}   X_{17}    X_{18}   X_{19}
  1,000         45.7        1.7      14.7     69.6     9,630     22.2     136.5
  5,000         45.7        1.7      14.7     69.8     9,650     22.2     136.5
  9,180 (a)     45.7        1.7      14.7     70.1     9,670     22.2     136.5
 20,000         45.8        1.7      14.7     70.6     9,710     22.2     136.5
 30,000         45.9        1.7      14.7     71.2     9,760     22.2     136.5
 50,000         46.0        1.7      14.7     72.2     9,840     22.2     136.5
 70,000         46.1        1.7      14.7     73.2     9,920     22.2     136.5
100,000         46.3        1.7      14.7     74.6    10,030     22.2     136.5

(a) 9,180 was the mean value observed in the population
(b) 43.1 was the mean value observed in the population
training (x_{16}) could be increased by 2.1 percent and the median salary (x_{17}) could be increased $170 above that due to increases in x_{16}.

To complete the final stage of the data analysis the per-pupil budget was incremented by fixed amounts. It was assumed that x_{22} = 9,180 and two levels of x_{14}, namely x_{14} = 0 and x_{14} = 5.87, were considered. Table 9 gives the results of the analysis. The interpretation of these results presented some problems. First, we note that increasing the per-pupil budget by $100 had a greater effect on achievement when x_{14} = 5.87 than when x_{14} = 0 (x_{21} = 48.1 as against x_{21} = 47.2). That this was so is obscured by the following. At some point in increasing the budget $100, for the case where x_{14} = 0, x_{16} reaches its maximum value. From that point on, only salary (x_{17}) can make a contribution, but its effect was far less than that of x_{16}. That may be why x_{21} was greater when x_{14} = 0 than when x_{14} = 5.87 and the budget increment was $0, but was reversed when incremented $100. This would suggest that the lower half of the table is more easily interpreted than the upper half. At any rate, when x_{14} = 5.87, a budget increment of $100 raised achievement from 45.7 to 48.1 (increase of 0.35 S.D.). In incrementing $200, achievement increased to 49.7. The rate of increase was
TABLE 9
THE EFFECT OF INCREMENTING THE BUDGET ON ACHIEVEMENT (X_{22} = 9,180)

    LB   Lower bound of the variable
    UB   Upper bound of the variable
    OPT  Optimal value of the variable (LB < OPT < UB)

(a) X_{19} was manipulated, but the upper bound was fixed at 136.5
less for the second $100. This is so because x_{16} reached its maximum value sometime during the second $100 increase. With only salary left to make a contribution, the rate of increase became far less. This suggests a practical problem in incrementing the budget. When highly contributing variables reach their maximum values what would happen if extrapolation were made beyond the observed range of values? Given that x_{16} (ADPREP) makes a major contribution, it could be that extending x_{16} beyond its observed maximum (84.3) would continue to result in similar contributions. Of course, such analysis would have to be regarded as highly speculative. Perhaps the budget increments that were chosen were too large. It may be unlikely that a school district would increase its per-pupil budget by $100 in a given year. A more moderate increase, in which the chief variables do not attain their maximum values, could generate a new population of values. Those districts that were previously near the observed maximum would use the budget increment to increase values of x_{16} beyond the previous maximum value. With a new population of values, another production function analysis would establish parameters based on the new population. Essentially, then, the process of estimating the production function can be conceived as iterative in
nature. Each new year provides additional information by which the analysis can be revised and extended.
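The programming problem formulated in this chapter can be explored with a short script. What follows is a sketch under stated assumptions, not the algorithm used in the study. The signs of the coefficients are assumptions chosen so that the reported solutions come close to satisfying the budget constraint with equality; x_{13} and x_{15} are held at their lower bounds and x_{18} at its upper bound, as in the reported "runs"; and the remaining trade-off between x_{16} and x_{17} is found by a simple scan along the binding budget constraint. All function and variable names are hypothetical.

```python
import math

# Values held fixed in the sketch (x17 in $100, x22 in 1,000 pupils).
X13, X15, X18 = 1.7, 14.7, 22.2            # lower, lower and upper bounds
X16_BOUNDS, X17_BOUNDS = (22.2, 84.3), (69.5, 137.5)


def achievement(x16, x17, x14, x19, x22):
    """Predicted x21 = phi(x) + 32.744 - 1.511*x14 - 0.004*x22 (signs assumed)."""
    phi = (-0.042 * X13 - 0.001 * X15 + 0.120 * x16 + 0.002 * x17
           + 0.016 * x14 * x17 + 0.179 * X18 + 0.002 * x19)
    return phi + 32.744 - 1.511 * x14 - 0.004 * x22


def budget_slack(x16, x17, x14, x19, x22):
    """Right side minus left side of the budget constraint; >= 0 means affordable."""
    lhs = (0.364 * X13 + 0.409 * X15 - 8.817 * x16 - 14.256 * x17
           - 6.662 * X18 - 0.032 * x19 + 0.091 * x16 ** 2 + 0.090 * x17 ** 2)
    rhs = -862.286 + 0.674 * x14 + 0.358 * x22
    return rhs - lhs


def best_allocation(x14=5.87, x19=136.5, x22=9.1805, step=0.01):
    """Scan x17; at each value spend all of the remaining budget on x16."""
    best = None
    n_steps = int(round((X17_BOUNDS[1] - X17_BOUNDS[0]) / step))
    for i in range(n_steps + 1):
        x17 = X17_BOUNDS[0] + i * step
        # Slack as a function of x16 is c + 8.817*x16 - 0.091*x16**2, where c is
        # the slack at x16 = 0; the largest affordable x16 is the larger root.
        c = budget_slack(0.0, x17, x14, x19, x22)
        disc = 8.817 ** 2 + 4 * 0.091 * c
        if disc < 0:
            continue                        # this salary level is unaffordable
        x16 = (8.817 + math.sqrt(disc)) / (2 * 0.091)
        x16 = min(x16, X16_BOUNDS[1], 100.0 - x14)   # bound (6) and constraint (3)
        if x16 < X16_BOUNDS[0]:
            continue
        x21 = achievement(x16, x17, x14, x19, x22)
        if best is None or x21 > best[0]:
            best = (x21, x16, x17)
    return best


x21, x16, x17 = best_allocation()           # settings of the second "run"
print(f"x21 = {x21:.1f}, x16 = {x16:.1f}, x17 = {x17:.1f}")
```

With the default arguments, which correspond to the second "run" (x_{14} = 5.87, x_{19} = 136.5, x_{22} = 9,180), the scan lands close to the reported solution (x_{16} = 70.1, x_{17} = $9,670, x_{21} = 45.7). Small differences are to be expected because the published coefficients are rounded. Calling best_allocation(x14=0.0) gives a comparable approximation to the run in which x_{14} = 0.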
CHAPTER IV

SUMMARY, DISCUSSION AND SUGGESTIONS FOR FUTURE STUDIES

Summary

Educational decision-makers are faced with the task of allocating school resources in such a way as to maximize student outcomes. The problem of how best to proceed is complicated by the question of priorities among the outcomes to be attained, and because of the multiplicity of student outcomes, it is not likely that all outcomes can be maximized at once. Also, aside from questions of technical efficiency, decision-makers are constrained in their choice of plans by a limited budget and by social, political, and legal forces. Thus, the process of maximizing outcomes is irrevocably placed in the context of planning the best available use of scarce resources.

The desire for rational decision-making entails that decision-makers have some empirical notions of how the various inputs into the educational process relate to student outputs. The prices of school inputs must be known if the question of allocative efficiency is to be considered.
Given this background, the rationale for the present study was a response to the need for studies that demonstrate the utility and limitations of mathematical models as aids in improving educational planning and decision-making. The problem of the study was to develop a mathematical model to facilitate the decision-making process in selected areas of educational activity by optimizing the allocation of scarce resources and to empirically illustrate its application. In development of the model attention was given to

1. Determining the educational production function that describes the input-output relationship between selected variables,
2. Determining the optimal combination of inputs to maximize output subject to certain costs and other constraints,
3. Determining the optimal combination of inputs to maximize output, subject to certain constraints, given increments in the budget constraint.

To accomplish the above first required that the process of education be conceptualized. Mathematical models do not evolve in the absence of a conceptual framework. Basically, the process of education was viewed as a function of the various inputs into the process. These inputs include student, community, and school factors. The
first two are regarded as essential to the conceptualization of the process, as these factors necessarily interact with, indeed may be causal determinants of, the various school factors. Given this interaction among the various input factors together with presumed causal directions, the process of education results in certain outcomes. Among these are cognitive achievement, affective growth, and physical development.

With this conceptual framework, the translation to a mathematical model occurred. Concurrent with this translation was the desire to allocate school resources in such a way that would maximize certain student outcomes. Thus a mathematical model was first required to estimate the educational production function and the budget equation. Given the production function as an objective function and the budget equation as one of the constraints, an optimization strategy was employed. Other constraints can be based on empirical evidence or the subjective desires of decision-makers. Also, the constraints should include minimum standards to be achieved by other outputs and bounds on the observed variation in the independent variables.

To estimate the production function and the budget equation, cross-sectional data were used in a survey-type design. The data consisted of aggregate measures on input
factors in 181 local school districts in a given state. Multiple regression techniques were used to estimate the parameters of the production function and the budget equation. The optimization strategy employed the techniques of mathematical programming, with the production function as an objective function and the budget equation as one of the constraints.

The school inputs used in the production function described only a limited area of educational activity. In particular, certain teacher characteristics were considered: experience, training, and level of salary. Also, classroom size, pupil-support personnel ratio, and the size of the district were used.

Given the various inputs (i.e., student, community, and school) into the educational process, the following non-school inputs were found to make significant contributions in explaining the variation in reading achievement at the sixth-grade level (x_{21}):

1. Percentage of gross incomes over $10,000 in the school district (x_3).
2. Attendance: ratio of ADA to ADM (x_6).
3. Percentage of graduates receiving post-high-school education or training (x_7).

Also, the following interactions of non-school inputs with school inputs were found to be significant:
4. Interaction of the percentage of ESEA Title I students with the pupil-support personnel ratio (x_4 x_{19}).
5. Interaction of the percentage of minority enrollment with average class size (x_5 x_{18}).

School inputs were forced into the regression equation since estimates of the effect of each were needed. The above led to the estimation of a production function with an R^2 = 0.79.

In estimating the budget equation, per-pupil instructional expenditures (x_{24}) was used as the dependent variable. School variables were forced into the equation so that the prices of each could be estimated. Only one non-school variable made a significant contribution, namely the percentage of minority enrollment (x_5). Curiously, it was the square of this factor (x_5^2) which entered the equation. This states that as the percentage of minority enrollment increased, per-pupil cost increased quadratically rather than linearly. Two school variables had significant square terms, namely percentage of teachers with advanced preparation (x_{16}) and median teacher salary (x_{17}). Inclusion of transportation factors (x_{10} and x_{20}) significantly increased the predictive power of the model. The budget equation was estimated with an R^2 = 0.94.
Given the production function and the budget equation, together with bounds on the observed variation in the independent variables, a mathematical programming problem was formulated. For meaningful analysis, only the school variables x_14, x_15, ..., x_19 were manipulated, at various levels of x_14 and x_22.

Two levels of x_14 (percentage of teachers with less than four years of training) were considered: x_14 = 0 and x_14 = 5.87 (the population mean). Some practical difficulties in using x_14 = 0 occurred because of the indicated bounds on the variables. The clearest results emerged when x_14 = 5.87. At this level, it was found that by reallocating existing resources while maintaining a minimum standard for the pupil-support personnel ratio (x_19 = 136.5, the population mean), the predicted increase in achievement (x_21) was 0.38 standard deviations. The analysis suggested that by eliminating salary increments due to experience and by increasing classroom size to the maximum observed, enough resources are released to increase the percentage of teachers with advanced training (x_16) by 17.6 percent and to increase the median salary (x_17) by $520 above the increase due to x_16. Given this level of efficiency, if the per-pupil budget were incremented by $100, the predicted increase in achievement would be 0.35 standard deviations. Given the existing allocation of school
resources, if a $100 increment were made so as to maximize achievement, the predicted increase would be 0.73 standard deviations, which equals the 0.38 from reallocation plus the 0.35 from the increment.

It was found that increases in district size produced only marginal increases in achievement. However, if district size were increased from 9,180 (the population mean) to 50,000, enough resources would be released to increase x_16 by 2.1 percent and x_17 by $170 above the increase due to x_16.

Throughout the analysis it was apparent that the percentage of teachers with advanced training (x_16) was the dominant effect, followed by salary (x_17). The importance of salary was a function of the percentage of teachers with less than four years of training (x_14): the greater x_14 was, the greater the effect of salary. When x_14 = 0, the effect of salary was minor compared to the effect of advanced training.

Discussion

Given that the primary focus of the study was to develop a mathematical model to aid decision-makers in educational planning, the most relevant concern is whether the model would actually work under real conditions. If the results suggested in the analysis for the 181 school districts involved were actually implemented, would changes
in achievement be within a reasonable range of what was predicted? That question cannot presently be answered. The only way the adequacy of the model could be known would be to implement the policies suggested by the analysis. Even then, a one-shot case approach could not constitute grounds for adequate verification. To prove the adequacy of the model would require testing it on a sufficiently large number of cases and then considering the ratio of successes to failures, that ratio being the ultimate criterion of effectiveness. Thus, if model forecasts were shown to be reasonably correct 95 percent of the time, it could be stated that the model had described and predicted reality very well and could be expected to do so in the near future.

The primary limitation of the present study was in the research design. The major difficulty in a survey study is that causal significance can never be attached to the conclusions. Thus, in the present study, differences in achievement could be estimated among schools having different percentages of teachers with advanced training. What could not be known was whether actually increasing the percentage for given schools would produce the same differences. To establish such cause-and-effect relationships generally requires an experimental, longitudinal study
design. To find out what happens to a system when you interfere with it, you have to interfere with it, not just passively observe it.

In the present study it was recognized that omission of relevant variables can bias the estimates of the regression coefficients. Given the nature of a survey study, it cannot be known with certainty whether all influences acting on the dependent variables through the independent variables have been brought into the analysis. Nevertheless, merely pointing to this weakness does not discredit the study. One needs to show that a bias exists and that it matters. In the context of the conceptual framework it is important to recognize the nature and extent of the controls that were used in the empirical application. For instance, an attempt was made to control for student background through the use of SES variables (percentage of incomes above $10,000); for community influences through the use of such variables as attendance and future training; and for student peer-group effects through the percentage of minority enrollment and of ESEA Title I pupils. No doubt other influences were omitted from the model, but the extent to which the coefficients were biased is not clear. It could be that, given the present controls, the remaining sources of bias only slightly affect the coefficients.
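The remark that one must show "that a bias exists and that it matters" can be made concrete with the standard omitted-variable result for least squares: if y depends on x1 and x2 but x2 is left out, the fitted coefficient on x1 equals its true coefficient plus the coefficient of x2 times the slope of x2 on x1. The sketch below checks that identity on a tiny invented data set. The data and the "true" coefficients are assumptions chosen for illustration, not values from the study.

```python
def slope(x, y):
    """Least-squares slope of y on x in a simple regression with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: x1 is a school input, x2 an omitted background factor.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
b1, b2 = 2.0, 3.0  # assumed true coefficients, no error term
y = [b1 * a + b2 * b for a, b in zip(x1, x2)]

short = slope(x1, y)   # coefficient on x1 when x2 is left out
delta = slope(x1, x2)  # how strongly the omitted factor moves with x1
bias = b2 * delta      # omitted-variable bias in the coefficient on x1

print(short, b1 + bias)
```

The bias vanishes when either the omitted factor has no effect (b2 = 0) or it is unrelated to the included input (delta = 0), which is why the extent of the controls already in the model bears on how much the remaining omissions matter.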
Given this state of affairs, the chief contribution of the present study can be viewed as analytical. Analytical studies of this type typically search for the most important relationships between the inputs of the educational system and its outputs, the principal purpose being to locate possibilities for improvement that appear worth exploring. The findings of such analyses are necessarily probabilistic. Even so, statements about "What would happen if ... ?" are likely to lead to far wiser decisions about educational policies than the kinds of uninformed hunches on which educational decision-makers often rely. Theoretically, there is every reason to suppose that the type of analysis displayed in the present study should be highly useful in helping educators obtain some idea of how best to deploy available funds, facilities, and personnel so as to maximize the educational outcomes students will attain, or to arrive at informed judgments about what trade-offs might be made among several kinds of inputs.

Perhaps one of the more important conclusions emerges from the analysis of the real data used to illustrate the model. Of the school effects studies that have taken place, one of the most frequent conclusions has been that because of the relatively small regression coefficients of school variables
as compared to community and student variables, no proposed reallocation among present school variables could affect achievement very much. Smith, in his reanalysis of the Coleman study, described the situation as follows:

    I think that we have now milked the EEOS data dry with respect to the 'determinants' of verbal achievement. The overall results should be clear to policymakers and researchers alike. With regard to the differences among schools in resources that we conventionally measure and consider in making policy, there are few that give us any leverage over students' achievement. Within the fairly broad boundaries of existing variations, the simple manipulation of per-pupil expenditure or the hiring of more experienced teachers or the instituting of a new curriculum does not lead to dramatic changes in students' verbal achievement. The myth that the reallocation of conventional inputs will lead to a redistribution of achievement outputs can no longer be accepted.¹

¹ Marshall S. Smith, "Equality of Educational Opportunity: The Basic Findings Reconsidered," On Equality of Educational Opportunity, p. 395.

Yet the results of the analysis in the empirical application suggested that by reallocating existing school resources so as to maximize achievement, a significant increase would result. Furthermore, incrementing the per-pupil budget by $100 led to a predicted increase in mean achievement of 0.73 standard deviations. But it was also true that the regression coefficients of the school variables in the analysis were small compared to those of other inputs. This appears to
describe a situation in which mathematicians are prone to lament, "Intuition can lead one astray." For it is one thing to look simply at regression coefficients and to speculate about the effects of reallocating resources, and another to estimate the prices of these resources and to employ an analysis that specifically seeks to maximize achievement within the confines of a given budget.

Suggestions for Future Studies

Better Data and Variables

The primary goal of the study was to develop a model. To illustrate the utility of the model, reference to real data was made. While the data that were used could by no means be considered an ideal set, the purpose of illustrating the model was served. No doubt future applications of this and similar models will require a more comprehensive set of data and variables. A list of potential variables that could find their place in such a model is given in the appendix.

To avoid problems in interpretation it is suggested that future studies use mean teacher salary as a variable rather than the median, and that the absolute numbers as well as the percentages for the teacher variables be collected. In the present study it was not possible to indicate what the final median salary would be as the result
of a reallocation. This results from the fact that the median is an ordinal measure. However, given the mean and the absolute numbers for each teacher variable, this problem could be resolved.

It is also felt that a cost-of-living index is needed to avoid biasing the parameters of the budget equation. It may be that the absence of this variable does not seriously affect the estimation of the parameters, but this cannot be known for sure until such a variable has been brought explicitly into the analysis.

Bayesian Procedures

One of the shortcomings observed in the present analysis was found when the budget constraint was incremented. Very quickly one reached the bounds of observed variation in the relevant variables, and analysis was inhibited beyond that point. At the same time, it is recognized that commitment to the modeling process in education does not mean a one-shot survey but a periodic reassessment. Thus the production function should be estimated annually. Doing so leads to two distinct advantages. First, with each new year to be considered, variation in the ranges of variables previously observed is expected to occur. Also, if the results of previous analyses have been implemented, then wider ranges are to be expected in the
important variables. If such a variable has previously been deployed to its maximum value, then a new population presents a new range of variation and analysis proceeds.

More importantly, with periodic assessment, one can proceed in a manner analogous to Bayesian statistical procedures. This means that prior information based on previous years' analyses can be used to improve and perfect current estimation of the regression coefficients of the various input factors. Given that cyclical variations may occur from year to year, the process is deemed important for determining the average effects of variables over a period of years.²

² Donald L. Meyer, "Bayesian Statistics," Review of Educational Research, XXXVI (December, 1966), 503-516.

Technical Efficiency

While the present study was limited to the investigation of allocative efficiency, the question of technical efficiency should not be avoided. In a previous chapter, methods of investigating technical efficiency were considered. In particular, it is suggested that the residuals of production function analysis for this and other studies be examined closely. An experimental approach should be employed in ferreting out additional variables
that can lead to new technologies in education. For all districts that have essentially equivalent combinations of inputs, the residual reveals the most productive districts in the group. Comparisons of these districts with other districts in the group on the types of programs, staff, and facilities can lead to technical improvements.

Causality and Policy Implications

Even though the results of the analysis in the present study were not statements of causality, it is clear that if decision-makers were expected to implement the results of the analysis, causal significance would be implicitly assumed. Social policy research must always contend with this dilemma. As a consequence, methodological approaches to this problem are more and more being considered. Blalock has been a strong leader in this area and has contributed much to its development.³ The most relevant term to describe this approach is path analysis. It involves a type of regression analysis called "causal." One postulates various alternative causal schemes for explaining a phenomenon under investigation. The analysis eliminates the less plausible causal schemes.

³ Blalock. See also Blalock, Methodology in Social Research (New York: McGraw-Hill, 1968).

Of course, causality can
never be determined empirically with exactitude, but the analysis can narrow the field of possible alternatives. And it is precisely this fact with which policy research must contend. It is the regression coefficients that give the laws of science. Educators must begin to make statements of the form, "A unit increase in x is believed to cause an increase of b units in y." Correlations are of limited value to policy-makers. They need to know the change to be expected from altering the status quo. Regression coefficients can give this information.

Beyond Cognitive Achievement

In thinking about how much can be accomplished by improvements in education, it is necessary to go outside the internal system of school achievement, especially cognitive achievement, to consider the possible size of the long-run effects of improvements in education on occupation and income for those who have been deprived. It has been stated that previous studies show that academic achievement is not necessarily related to future success in life. Yet the American dream is, for most, realized in the dual category of occupation and income. For maximizing the goals of both school and society it would be well to look at the relationship of various school inputs
to future occupation and income. Such research may be in the distant future, but it is a goal that should not be denied.

More Empirical Studies

It is hardly to be questioned that studies confined to the practical applications of models should be energetically pursued to provide, eventually, a firmer basis than now exists for dealing with broad questions of educational policy. Mathematical models such as the one developed in this study allow decision-makers to form expectations of future consequences, these expectations being based on known empirical relationships and the decision-makers' judgments.

None of the foregoing means that in applied work the researcher even pretends to be able to find the best of all possible decisions. The data are too incomplete and sometimes inaccurate, the tools of analysis are often too blunt, and the researcher's knowledge of the educational process is too limited for him to be able to come up with anything more than approximations to the ideal of the true optimum. Nevertheless, an analysis which is specifically designed to look for optimal decisions, crude and approximative though it may be, is very likely to do much better than the workable but relatively arbitrary rules of thumb which play so prominent a part in educational practice.
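As a closing illustration of the periodic reassessment proposed under Bayesian Procedures in this chapter, the sketch below shows one simple way a prior from earlier years might be combined with a new yearly estimate of a single regression coefficient. It assumes normal distributions, known sampling variances, independent yearly samples, and a coefficient that is stable over time. None of these assumptions, and none of the numbers, come from the study; they are hypothetical.

```python
def combine(prior_mean, prior_var, estimate, estimate_var):
    """Normal-normal update: precision-weighted average of prior and new estimate."""
    prior_precision = 1.0 / prior_var
    data_precision = 1.0 / estimate_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * estimate)
    return post_mean, post_var


# Hypothetical yearly results for one coefficient: (estimate, squared standard error).
yearly = [(0.020, 0.0001), (0.030, 0.0001), (0.026, 0.00005)]

# The first year serves as the prior; each later year updates it.
mean, var = yearly[0]
for estimate, estimate_var in yearly[1:]:
    mean, var = combine(mean, var, estimate, estimate_var)

print(mean, var)
```

Each added year shrinks the variance of the pooled coefficient, which is the sense in which earlier analyses "improve and perfect" the current estimate. If coefficients drift or cycle from year to year, a model that allows for that drift would be needed instead.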
APPENDIX
A LIST OF POTENTIAL VARIABLES FOR ESTIMATING THE EDUCATIONAL PRODUCTION FUNCTION

I. Pupil Characteristics
   Ethnic origins (percentages)
   Percentage of transiency
   Ratio of aggregate attendance to aggregate membership
   Percentage of children receiving free lunch
   Percentage of vocational students

II. Community Characteristics
   Adjusted gross income per pupil
   Median family income
   Median years of schooling in adult population
   Percentage of adults in unskilled, semiskilled, and service occupations
   Percentage of children in two-parent families
   Percentage of uncrowded housing units
   Percentage of children 5-17 enrolled in public schools
   Percentage of nonwhite population
III. School Variables
   A. Teacher variables
      Average class size
      Pupil-teacher ratio
      Teacher experience
      Teacher salaries
      Teachers' verbal ability
      Percentage of teacher turnover
      Teachers' academic preparation (college hours in subject taught)
      Number of instructional assignments per teacher
      Percentage of teachers on tenure
   B. Administration variables
      Number of professional supportive staff per 1,000 pupils
      Median administrative salary
      Subject matter offerings
      Length of school year
      Ability grouping
      Textbook supply
      District enrollment
      Per-pupil paraprofessional ratio
      Number of full-time employees in auxiliary services per pupil
   C. Facilities
      Number of library books
      Age of building
      Average size of high schools
      School site size
      Percentage of makeshift classrooms (portables, etc.) per 1,000 students
      Percentage of pupils on double session
      Science laboratory facilities
   D. Expenditures
      Instructional expenditures per pupil
      Administrative expenditures per pupil
      Total expenditures per pupil
      Expenditures on instructional materials and supplies per pupil
SELECTED BIBLIOGRAPHY

Abt, Clark C. Design for an Elementary and Secondary Education Cost-Effectiveness Model, Volume II. Cambridge, Mass.: Abt Associates, Inc., 1967.
Alkin, Marvin C. "Evaluating the Cost-Effectiveness of Instructional Programs." UCLA Symposium on Problems in the Evaluation of Instruction, Sponsored by the Center for the Study of Evaluation. Los Angeles: December, 1967.
________. "Preliminary Analysis of Data for a Secondary School Input-Output Model." Center for the Study of Evaluation, Report No. 42, UCLA Graduate School of Education. Los Angeles: February, 1969.
________. "The Use of Quantitative Methods as an Aid to Decision-Making in Educational Administration." A paper read at a meeting of the American Educational Research Association. Los Angeles: February 5-8, 1969.
Andrews, Frank M. Multiple Classification Analysis. Ann Arbor, Michigan: University of Michigan Press, 1967.
Armor, David J. "School and Family Effects on Black and White Achievement: A Reexamination of the USOE Data," in On Equality of Educational Opportunity. Edited by Frederick Mosteller and Daniel P. Moynihan. New York: Random House, 1971.
Benson, Charles S. The School and the Economical System. Chicago: Science Research Associates, 1966.
Blalock, Hubert M. Causal Inferences in Non-Experimental Research. Chapel Hill: University of North Carolina Press, 1961.
________. Methodology in Social Research. New York: McGraw-Hill, 1968.
Blau, Peter M. and Otis D. Duncan. The American Occupational Structure. New York: Wiley & Sons, Inc., 1967.
Bohrnstedt, George W. "Observations on the Measurement of Change," in Sociological Methodology. Edited by Edgar F. Borgatta. San Francisco: Jossey-Bass, Inc., 1969.
Bowles, Samuel. "Education Production Function: Final Report." Research Project Supported by the United States Office of Education. Harvard University. February, 1969.
________. Planning Educational Systems for Economic Growth. Cambridge, Mass.: Harvard University Press, 1969.
Bowles, Samuel and Henry M. Levin. "More on Multicollinearity and the Effectiveness of Schools," Journal of Human Resources, III (Summer, 1968).
Bowman, Mary Jane. "Economics of Education," Review of Educational Research, XXXIX (December, 1969).
Cohen, Jacob. "Multiple Regression as a General Data-Analytic System," Psychological Bulletin, LXX (1968).
Cohn, Elchanan. "Economies of Scale in Iowa High School Operation," Journal of Human Resources, III (Fall, 1968).
________. "Towards Rational Decision-Making in Secondary Education." Institute for Research on Human Resources. University Park: Pennsylvania State University, 1970.
Coleman, James S. et al. Equality of Educational Opportunity. Washington, D.C.: U.S. Government Printing Office, 1966.
Dixon, W. J. (ed.) Biomedical Computer Programs. Berkeley: University of California Press, 1971.
Dunnell, John P. Input-Output Analysis of Suburban Elementary School Districts. Ed.D. Dissertation, Illinois State University, 1970.
Entwisle, D. R. and R. Conviser. "Input-Output Analysis in Education," High School Journal, LII (January, 1969).
Farrar, Donald E. and Robert R. Glauber. "Multicollinearity in Regression Analysis: The Problem Revisited," The Review of Economics and Statistics, XLIX (February, 1967).
Fiacco, Anthony V. and Garth P. McCormick. "Computational Algorithm for the Sequential Unconstrained Minimization Technique for Nonlinear Programming," Management Science, X (July, 1964).
Gilbert, John P. and Frederick Mosteller. "The Urgent Need for Experimentation," in On Equality of Educational Opportunity. Edited by Frederick Mosteller and Daniel P. Moynihan. New York: Random House, 1971.
Glass, Gene V. and Julian C. Stanley. Statistical Methods in Education and Psychology. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1970.
Goldberger, Arthur S. "Note on Stepwise Least Squares," Journal of the American Statistical Association, LVI (March, 1961).
________. Topics in Regression Analysis. New York: Macmillan, 1968.
Gordon, Robert A. "Issues in Multiple Regression," American Journal of Sociology, LXXIII (March, 1968).
Hantanaka, Michio. Workability of Input-Output Analysis. Ph.D. Dissertation, University of Michigan, 1967.
Hanushek, Eric A. and John F. Kain. "On the Value of Equality of Educational Opportunity as a Guide to Public Policy," in On Equality of Educational Opportunity. Edited by Frederick Mosteller and Daniel P. Moynihan. New York: Random House, 1971.
Hare, Van Court, Jr. Systems Analysis: A Diagnostic Approach. New York: Harcourt, Brace and World, Inc., 1967.
Hartley, Harry. Educational Planning-Programming-Budgeting: A Systems Approach. New York: Prentice-Hall, Inc., 1968.
Hoffenberg, Marvin. "Application of Leontief Input-Output Analysis to School District Budgeting." Center for the Study of Evaluation: Working Paper No. 12, UCLA Graduate School of Education. Los Angeles: October, 1970.
James, H. Thomas. The New Cult of Efficiency and Education. Pittsburgh: University of Pittsburgh Press, 1969.
Katzman, Martin T. "Distribution and Production in a Big City Elementary School System," Yale Economic Essays, VIII (Spring, 1968).
Kelly, Francis J. Multiple Regression Approach. Carbondale, Ill.: Southern Illinois University Press, 1969.
Kiesling, Herbert J. The Relationship of School Inputs to Public School Performance in New York State. Santa Monica, Calif.: The Rand Corporation, October, 1969.
________. "The Study of Cost and Quality of New York School Districts: Final Report." Research supported by the United States Office of Education, Project No. 80264. Bloomington, Indiana: Indiana University, 1970.
Levin, Henry M. "A Cost-Effectiveness Analysis of Teacher Selection," Journal of Human Resources, V (Winter, 1970).
________. "The Effect of Different Levels of Expenditure on Educational Output," in Economic Factors Affecting the Financing of Education. Edited by Roe L. Johns et al. Gainesville, Florida: National Educational Finance Project, Vol. II, 1970.
________. "A New Model of School Effectiveness." In Do Teachers Make a Difference? Washington, D.C.: U.S. Government Printing Office, 1970.
Linn, Robert L. and Charles E. Werts. "Assumptions in Making Causal Inferences from Part Correlation, Partial Correlations, and Partial Regression Coefficients," Psychological Bulletin, LXXII (1970).
Lyle, Phillip. Regression Analysis of Production Costs and Factory Operations. New York: Hafner Publishing Co., Inc., 1957.
McAfee, Jackson K. "Problems in Measuring Change." Unpublished paper. University of Florida, 1972.
McNamara, James F. "Mathematical Programming Models in Educational Planning," Review of Educational Research, XLI (December, 1971).
Madansky, Albert. "The Fitting of Straight Lines when Both Variables are Subject to Error," Journal of the American Statistical Association, LIV (1959).
Meyer, Donald L. "Bayesian Statistics," Review of Educational Research, XXXVI (1966).
Michigan Department of Education. Research into Correlates of School Performance, Assessment Report No. 3, 1970.
Miner, Jerry. "Financial Support of Education," in Implications for Education of Prospective Changes in Society. Edited by Edgar L. Morphet and Charles O. Ryan. Denver, Colorado: Designing Education for the Future, an Eight-State Study, 1967.
Morphet, Edgar L. et al. Educational Organization and Administration. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1967.
Mosteller, Frederick and Daniel P. Moynihan, eds. "A Pathbreaking Report," in On Equality of Educational Opportunity. Edited by Frederick Mosteller and Daniel P. Moynihan. New York: Random House, 1971.
Nephew, Charles T. Guides for the Allocation of School District Financial Resources. Ed.D. Dissertation, State University of New York at Buffalo, 1969.
New York State Education Department. "Technical Report of a Project to Develop Education Cost-Effectiveness Models for New York State." Albany, N.Y.: Bureau of School Progress and Evaluation, March, 1970.
Ribich, Thomas I. "The Effect of Educational Spending on Poverty Reduction," in Economic Factors Affecting the Financing of Education. Volume II. Edited by Roe L. Johns et al. Gainesville, Florida: National Educational Finance Project, 1970.
Riew, John. "Economies of Scale in High School Operation," Review of Economics and Statistics, XLVIII (August, 1966).
Rose, Scott N. A Study to Identify Variables to Predict Local School District Productivity in Two States. Ed.D. Dissertation, University of Florida, 1972.
Seiler, Karl, III. Introduction to Systems Cost-Effectiveness; Operations Research No. 17. New York: Wiley-Interscience, 1969.
Shaycroft, Marion. "The Statistical Characteristics of School Means," in Studies of the American High School. Edited by John C. Flanagan et al. Cooperative Research Project No. 176, Project Talent, University of Pittsburgh, 1962.
Smith, Marshall S. "Equality of Educational Opportunity: The Basic Findings Reconsidered," in On Equality of Educational Opportunity. Edited by Frederick Mosteller and Daniel P. Moynihan. New York: Random House, 1971.
Stiles, L. J. "Assessment of Educational Productivity," Journal of Educational Research, LX (April, 1967).
Stollar, Dewey H. and Gerald Boardman. Personal Income by School Districts in the United States. Gainesville, Florida: National Educational Finance Project, 1971.
Temkin, Sanford. "A Comprehensive Theory of Cost-Effectiveness: Administration for Change Program." Philadelphia, Pa.: Research for Better Schools, Inc., April 1970.
Thomas, J. Alan. The Productive School: A Systems Analysis Approach to Educational Administration. New York: Wiley & Sons, Inc., 1971.
Tinbergen, Jan and H. C. Bos. Econometric Models of Education: Some Applications. Paris: Organization for European Economic Co-Operation, 1965.
Tracz, George S. "An Overview of Optimal Control Theory Applied to Educational Planning." A paper read at a meeting of the American Educational Research Association, Los Angeles: February 5-8, 1969.
Wegner, P. "Relationship Between Multivariate Statistics and Mathematical Programming," Applied Statistics, XII (November, 1963).
Wynne, Edward. "School Output Measures as Tools for Change," Education and Urban Society, XII (November, 1969).
BIOGRAPHICAL SKETCH

Jackson K. McAfee was born December 27, 1939, at Vero Beach, Florida. In June, 1957, he was graduated from Vero Beach High School. He received the degree of Bachelor of Arts from North Park College in June, 1961, with a major in mathematics. From September, 1961, to June, 1963, he taught mathematics at North Park Academy, Chicago, Illinois. He received the degree of Master of Arts in mathematics from Northwestern University in August, 1965. In 1967 he attended a National Science Foundation Summer Institute in Computer Science and Related Mathematics at the University of Oklahoma. He taught mathematics at the Vero Beach Senior High School from September, 1963, to June, 1970. In September, 1970, he entered the University of Florida to work toward the degree of Doctor of Education.

Jackson K. McAfee is married to the former Esther Tungseth and they have two children, Bret and Michele. He is a member of Kappa Delta Pi, Phi Delta Kappa, the American Association of School Administrators, and the American Educational Research Association.
I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Education.

Professor of Educational Administration

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Education.

Ralph B. Kimbrough
Professor of Educational Administration

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Education.

This dissertation was submitted to the Dean of the College of Education and to the Graduate Council, and was accepted as partial fulfillment of the requirements for the degree of Doctor of Education.

August, 1972

Dean, College of Education

Dean, Graduate School

