ANALYSIS OF ON-FARM RESEARCH
Economic Analysis in Small Farm Systems
Spring Semester, 1994
Peter E. Hildebrand
Food and Resource Economics Department
University of Florida
Gainesville, FL 32611-0240
To reflect realistic responses, many kinds of research whose
purpose is to generate technology must be conducted on the kinds
of farms where the technology is expected to be useful.
This can create much more variability in the data than does
research conducted on experiment stations where many factors not
included as treatments are controlled. Often, non-treatment
factors on-station are also controlled at levels high enough so
that they do not limit the potential of the variables being
tested. The result is to create environments that are much less
variable and much more productive than those found on most farms.
For researchers trained only in on-station research, the lack of
control over non-experimental variables and the resulting high
coefficients of variation (CVs) can be exasperating. Many believe
that "good" research cannot be done on farms for these reasons.
On the other hand, it is becoming well recognized that farmers
can seldom use research results from experiment stations
directly. Fertility research is one kind of result that can
seldom be extrapolated successfully from experiment stations to
farms. Because experiment stations have been used for research,
their soils have been modified by amendments to the point that
they no longer resemble the soils on most farms.
Following are a series of exercises to familiarize you with the
nature of on-farm research results. The data are taken from a
real on-farm maize fertility trial conducted by CIMMYT in Mexico.
In the trial, there were eight farms each with three replications
of all 12 treatments. Unfortunately, only the individual
treatment averages by farm are available. Furthermore, the data
on the characteristics of the individual fields where the trials
were located are not available. Nevertheless, this is an
excellent data set to work with. Therefore, these data will be
used to show a number of ways the results can be analyzed and
interpreted. The first series of exercises will look at the data
set as a whole. The second series will consider groups of farms
so that more specific recommendations can be made.
Below are data generated from a fertilizer trial on maize conducted on 8 farms by personnel from
CIMMYT in Mexico. Notice that in this trial there are four levels of N and three of P2O5. It is a 3 x 4
factorial (with 12 plots per block). There were three blocks or replications per farm, but only the
average of the three blocks for each treatment is available and shown in the table.
Maize yields (kg ha-1 of 14 percent moisture grain) by fertilizer treatment, 8 farms.

                        Fertilizer treatment (kg ha-1)
N:         0    50   100   150     0    50   100   150     0    50   100   150
P2O5:      0     0     0     0    25    25    25    25    50    50    50    50  Avg.
Farm
1        400  1240  3630  3760   790  2580  4230  4720  1670  2510  3280  3660  2710
2       1530  2600  5140  5320  1670  3790  5100  6830  1410  4130  5890  6270  4140
3       4150  4860  4800  4870  4440  5000  4970  5280  5120  5660  6360  6620  5180
4       2420  3820  5230  4480  2360  4540  6260  7170  1610  4410  5380  6580  4520
5       1640  1920  2080  2190  2040  3210  3120  2930  1440  3440  3320  3620  2580
6       1610  2940  4140  4340  1810  3920  3610  3810  1180  3890  5380  4920  3460
7       4740  5410  4290  4920  4910  5220  5380  5140  5100  4880  4540  5280  4980
8       1210  2330  1970  2230  1530  2780  2490  2800  1370  3510  3750  4350  2530
Avg.    2210  3140  3910  4010  2440  3880  4400  4840  2360  4050  4740  5160  3760
Source: Perrin, Richard K., et al. 1976. From Agronomic data to farmer recommendations. An economics
training manual. CIMMYT. Information bulletin 27.
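The table above can also be held as an array, which makes it easy to recompute the averages as a check on data entry. The following is a minimal Python sketch, assuming NumPy is available; the names YIELDS, PRINTED_FARM_AVG and PRINTED_TREATMENT_AVG are illustrative and not part of the original exercise. The printed averages appear to be rounded to the nearest 10 kg, so the check allows a difference of 5.

```python
import numpy as np

# Farm-by-treatment yields (kg/ha) copied from the table.
# Rows: farms 1-8. Columns: P2O5 = 0, 25, 50, each with N = 0, 50, 100, 150.
YIELDS = np.array([
    [ 400, 1240, 3630, 3760,  790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
], dtype=float)

# Averages as printed in the "Avg." column and "Avg." row of the table.
PRINTED_FARM_AVG = np.array([2710, 4140, 5180, 4520, 2580, 3460, 4980, 2530])
PRINTED_TREATMENT_AVG = np.array(
    [2210, 3140, 3910, 4010, 2440, 3880, 4400, 4840, 2360, 4050, 4740, 5160]
)

farm_avg = YIELDS.mean(axis=1)        # one average per farm (row)
treatment_avg = YIELDS.mean(axis=0)   # one average per treatment (column)

# Flag any recomputed average that differs from the printed one by more than 5 kg.
farm_ok = np.abs(farm_avg - PRINTED_FARM_AVG) <= 5
treatment_ok = np.abs(treatment_avg - PRINTED_TREATMENT_AVG) <= 5

print("farm averages agree:     ", bool(farm_ok.all()))
print("treatment averages agree:", bool(treatment_ok.all()))
print("overall average:         ", round(float(YIELDS.mean()), 1))
```

A mismatch in one row or column points to the cell that was mistyped.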
1. First look at the data. Are the farms quite similar or are they different? If they are quite different,
we should probably look at them as if there were going to be more than one recommendation domain.
2. Either way, let's begin to examine the data by looking at the overall average for all farms. First,
let's look at the response of the maize to P2O5. Notice that we have three levels of P2O5 for each of
the four levels of N. Begin by summarizing the average data for different levels of N as follows:
Maize response for:
N = 0    N = 50    N = 100    N = 150
Yield (Mg ha-1)
3. Now, plot the P2O5 response data for N = 0 on a graph and calculate a quadratic response
equation (production function) using the visiographic procedure.
4. Repeat this process for the other three levels of N.
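As an algebraic cross-check on the hand-fitted curves, the sketch below solves for the quadratic Y = a + b P + c P^2 that passes through the three average points at each level of N. It is not the visiographic procedure itself; it relies only on the fact that three points determine a quadratic exactly. It assumes NumPy, and the names AVG_YIELD and quadratic_through_points are illustrative.

```python
import numpy as np

P_LEVELS = np.array([0.0, 25.0, 50.0])   # kg/ha of P2O5

# Treatment averages (kg/ha) over the 8 farms, from the "Avg." row of the table.
# Key: N level (kg/ha). Values: yields at P2O5 = 0, 25, 50.
AVG_YIELD = {
    0:   [2210, 2440, 2360],
    50:  [3140, 3880, 4050],
    100: [3910, 4400, 4740],
    150: [4010, 4840, 5160],
}


def quadratic_through_points(p, y):
    """Solve Y = a + b*P + c*P^2 exactly through three (P, Y) points."""
    design = np.column_stack([np.ones_like(p), p, p ** 2])
    a, b, c = np.linalg.solve(design, np.asarray(y, dtype=float))
    return a, b, c


coeffs = {n: quadratic_through_points(P_LEVELS, y) for n, y in AVG_YIELD.items()}

for n, (a, b, c) in coeffs.items():
    print(f"N = {n:3d}: Y = {a:.0f} + {b:.2f} P + {c:.4f} P^2")
```

If the hand-drawn curve gives noticeably different coefficients, the difference is worth tracing back to the plotted points.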
1. Load the data from the table in Exercise No. 1 into a spreadsheet.
Use the same orientation as in the table, i.e., farms are the rows and
treatments are the columns. You can calculate the averages as a means
of verifying accuracy of the data entered. This data set will be the
basis for a number of analyses we will be doing. Be sure to save it
in this form.
2. Now we will set up the data to facilitate estimation of the
production functions from Exercise No. 1 on the computer using the
data from all eight farms. By copying and moving your data, set up
another working table as below in order to calculate the response to
P2O5:
P2O5   (P2O5)2    N = 0   N = 50   etc.
   0         0      400     1240
   0         0     1530
  25       625      790
  25       625     1670
  50      2500     1670
  50      2500     1410
  50      2500     1370     3510
You should have 24 rows of data (be careful not to include the
3. Using symbols in an x-y graph format, look at the data for N = 0
(compare yield only with the levels of P2O5, not the squared values).
4. You can now estimate the equation by regression using (in Quattro
Pro) the commands T-A-R, with the two columns for P2O5 and (P2O5)2 as
the independent variables and the column for yield as the dependent
variable. The equation should be the same as the first you estimated
in Exercise No. 1 by the visiographic method.
5. Repeat for the other three levels of N. Are all the equations the
same as you got from the visiographic method? They should be.
6. Show the four equations on a single graph. Can you figure out how
to do this and get a smooth curve, not just two straight lines going
through three points?
7. Interpret the results.
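The regression steps above can be sketched in Python for readers without Quattro Pro. This is a sketch under stated assumptions rather than the course's own procedure: it assumes NumPy, uses ordinary least squares through np.linalg.lstsq, and the names fit_p_response, fits and curves are illustrative. Because every P2O5 level has the same number of observations (8 farms), the fitted quadratic passes through the three treatment means, which is why it should agree with the curve fitted to the averages. Evaluating each equation on a fine grid of P2O5 values gives points for a smooth curve.

```python
import numpy as np

N_LEVELS = [0, 50, 100, 150]
P_LEVELS = np.array([0.0, 25.0, 50.0])

# Rows: farms 1-8. Columns: P2O5 = 0, 25, 50, each with N = 0, 50, 100, 150.
YIELDS = np.array([
    [ 400, 1240, 3630, 3760,  790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
], dtype=float)


def fit_p_response(n_index):
    """Regress yield on P and P^2 for one N level using all 8 farms (24 rows)."""
    cols = [n_index, n_index + 4, n_index + 8]     # columns for P2O5 = 0, 25, 50
    y = YIELDS[:, cols].T.ravel()                  # 8 farms at P=0, then P=25, then P=50
    p = np.repeat(P_LEVELS, YIELDS.shape[0])       # matching P2O5 level for each row
    design = np.column_stack([np.ones_like(p), p, p ** 2])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef                                    # a, b, c in Y = a + b*P + c*P^2


fits = {n: fit_p_response(i) for i, n in enumerate(N_LEVELS)}

# Evaluate each equation on a fine grid so it can be plotted as a smooth curve.
p_grid = np.linspace(0.0, 50.0, 51)
curves = {n: a + b * p_grid + c * p_grid ** 2 for n, (a, b, c) in fits.items()}

for n, (a, b, c) in fits.items():
    print(f"N = {n:3d}: Y = {a:.1f} + {b:.3f} P + {c:.4f} P^2")
```

The intercepts may differ slightly from those based on the printed averages, since the printed averages are rounded.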
1. For each of the production functions in Exercise FA2, find
mathematically where production is maximum. Do these correspond with
2. With maize priced at $1,000 per Mg ($ is no particular
currency), N at $8 per kg and P2O5 at $10 per kg, find the level
of phosphorus at which profit is maximized. Repeat for each level of N.
Discuss and interpret the results.
3. If you have time, you should repeat exercises FA1 to FA3 varying
nitrogen for each level of phosphorus.
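The calculus behind these steps can be sketched as follows. For Y = a + b P + c P^2 with c < 0, production is maximum where dY/dP = b + 2cP = 0, and profit is maximum where the value of the marginal product equals the price of P2O5, that is, where b + 2cP equals the price ratio. With yield in kg, $1,000 per Mg is $1 per kg, so the ratio is 10. The cost of N is constant at a fixed N level and does not affect the best P2O5 rate. The sketch assumes NumPy, fits the curves to the printed treatment averages, and uses illustrative names (p_optima, results).

```python
import numpy as np

PRICE_MAIZE = 1.0    # $ per kg of grain ($1,000 per Mg)
PRICE_P2O5 = 10.0    # $ per kg of P2O5

P_LEVELS = np.array([0.0, 25.0, 50.0])

# Treatment averages (kg/ha) from the "Avg." row; key is the N level (kg/ha).
AVG_YIELD = {
    0:   [2210, 2440, 2360],
    50:  [3140, 3880, 4050],
    100: [3910, 4400, 4740],
    150: [4010, 4840, 5160],
}


def p_optima(n_level):
    """P2O5 rates giving maximum yield and maximum profit at one N level."""
    # np.polyfit returns the highest power first: c (P^2), b (P), a (constant).
    c, b, a = np.polyfit(P_LEVELS, AVG_YIELD[n_level], 2)
    if c >= 0:
        raise ValueError("response is not concave; no interior maximum")

    def yield_at(p):
        return a + b * p + c * p * p

    p_max = -b / (2 * c)                                   # dY/dP = 0
    p_profit = (PRICE_P2O5 / PRICE_MAIZE - b) / (2 * c)    # dY/dP = price ratio
    return {
        "p_max": p_max,
        "y_max": yield_at(p_max),
        "p_profit": p_profit,
        "y_profit": yield_at(p_profit),
        # Rates beyond 50 kg/ha lie outside the tested range (extrapolation).
        "within_tested_range": 0.0 <= p_profit <= 50.0,
    }


results = {n: p_optima(n) for n in AVG_YIELD}

for n, r in results.items():
    print(f"N = {n:3d}: max yield at P2O5 = {r['p_max']:.1f}, "
          f"max profit at P2O5 = {r['p_profit']:.1f} "
          f"(within tested range: {r['within_tested_range']})")
```

Any optimum that falls outside 0 to 50 kg ha-1 of P2O5 is an extrapolation and should be read with caution.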
Factor x Factor
So far we have analyzed this data set by finding responses to
phosphorus for set levels of nitrogen (FA1, FA2, FA3-1 and FA3-2) or
by finding responses to nitrogen for set levels of phosphorus (FA3-3).
By doing this, it is possible to answer questions such as, "If I apply
25 kg ha-1 of P2O5, how much nitrogen should I apply?" But it does not
answer the question of how much of each amendment is best. To do this
requires the analysis of nitrogen and phosphorus simultaneously.
1. To obtain a first estimate of the nature of the 3-dimensional
surface (shown in two dimensions), construct a graph with nitrogen on
the vertical (Y) axis and phosphorus on the horizontal (X) axis. Use
the values from the experiment as shown in the table in Exercise FA1.
Then using the averages from each treatment, write the corresponding
yields for each N-P combination at the intersection in the graph. For
example, the value for 0-0 (N-P) is 2210 and for 50-25 is 3880.
Follow the visiographic procedure from Hildebrand and Poey, pp. 108-
2. Draw in iso-quant contours for 2500, 3000, 3500, etc.
3. Using the prices from Exercise FA3, draw in the iso-cost contours
and the expansion path.
4. Interpret your work.
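A rough numerical aid for the graph is sketched below. It lays out the treatment averages on the N-by-P grid, estimates where a yield contour crosses each phosphorus column by straight-line interpolation between tested N levels, and gives the slope of the iso-cost lines for the stated prices. Linear interpolation is a stand-in for the hand-drawn contours, not the visiographic procedure. The sketch assumes NumPy, and the names AVG, n_on_isoquant and ISOCOST_SLOPE are illustrative.

```python
import numpy as np

N_LEVELS = np.array([0.0, 50.0, 100.0, 150.0])   # vertical axis (kg/ha)
P_LEVELS = np.array([0.0, 25.0, 50.0])           # horizontal axis (kg/ha)

# Treatment averages (kg/ha) from the "Avg." row of the table.
# AVG[i, j] is the yield at N_LEVELS[i] and P_LEVELS[j].
AVG = np.array([
    [2210, 2440, 2360],
    [3140, 3880, 4050],
    [3910, 4400, 4740],
    [4010, 4840, 5160],
], dtype=float)

PRICE_N = 8.0     # $ per kg of N
PRICE_P = 10.0    # $ per kg of P2O5

# An iso-cost line is PRICE_N * N + PRICE_P * P = constant.
# With N on the vertical axis its slope is dN/dP = -PRICE_P / PRICE_N.
ISOCOST_SLOPE = -PRICE_P / PRICE_N

# Print the grid as it would appear on the graph (largest N at the top).
for i in range(len(N_LEVELS) - 1, -1, -1):
    cells = "  ".join(f"{v:5.0f}" for v in AVG[i])
    print(f"N = {N_LEVELS[i]:3.0f} | {cells}")
print("          " + "  ".join(f"P={p:3.0f}" for p in P_LEVELS))


def n_on_isoquant(target, p_index):
    """N at which the yield column for one P level crosses `target`.

    Uses linear interpolation between tested N levels and assumes yield rises
    with N within the column (true for these averages). Returns nan when the
    target lies outside the yields observed in that column.
    """
    column = AVG[:, p_index]
    if target < column.min() or target > column.max():
        return float("nan")
    return float(np.interp(target, column, N_LEVELS))


for target in (2500, 3000, 3500, 4000, 4500):
    points = [n_on_isoquant(target, j) for j in range(len(P_LEVELS))]
    print(target, [None if np.isnan(v) else round(v, 1) for v in points])

print("iso-cost slope (dN/dP):", ISOCOST_SLOPE)
```

The expansion path joins the points where an isoquant has the same slope as the iso-cost lines, so the printed slope is the one to match when drawing it.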
Factor x Factor
In the procedure used in Exercise FA4, we could determine the best
combination of N and P to use, but would need to make successive
approximations to determine the most profitable combination of the two
for any price combination. In order to be more "precise" a
mathematical production function incorporating both amendments can be
calculated by regression. The form we will use here is quadratic
(with an NP interaction term).
1. In a spreadsheet, set up the data in the following form:
      N      P     N2     P2     NP      Y
      0      0      0      0      0    400
     50      0   2500      0      0   1240
    100      0  10000      0      0   3630
    150      0  22500      0      0   3760
      0     25      0    625      0    790
     50     25   2500    625   1250   2580
   etc.   etc.   etc.   etc.   etc.   etc.
By using all the data, you should have 96 (8 X 12) rows. Calculate
the production function Y = f(N,P,N2,P2,NP) by blocking the first five
columns as independent variables in the regression menu. In the
output, the coefficients are in the same order as the columns.
2. Determine mathematically the quantities of N and P at which
production is maximized. How much maize is produced with these
quantities of N and P? Do these values correspond with your graph
from Exercise FA4? Explain any differences.
3. Using the same prices as before, find mathematically the
quantities of N and P that maximize profit. How much maize is
produced at this level of fertilizer use? Do these quantities fall on
the expansion path from your graph from Exercise FA4? Explain any
differences.
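The steps of this exercise can be sketched in Python as follows. The sketch fits Y = b0 + b1 N + b2 P + b3 N^2 + b4 P^2 + b5 NP to all 96 rows by ordinary least squares, then solves the two first-order conditions. For maximum production both marginal products are set to zero; for maximum profit they are set to the price ratios (8 for N and 10 for P2O5 when yield is in kg and maize is $1 per kg). It assumes NumPy; the column scaling is a numerical-stability choice of this sketch, and names such as solve_first_order are illustrative. Whether the stationary point is a true maximum is checked from the second derivatives instead of being assumed.

```python
import numpy as np

N_LEVELS = np.array([0.0, 50.0, 100.0, 150.0])
P_LEVELS = np.array([0.0, 25.0, 50.0])
PRICE_MAIZE, PRICE_N, PRICE_P = 1.0, 8.0, 10.0   # $ per kg

# Rows: farms 1-8. Columns: P2O5 = 0, 25, 50, each with N = 0, 50, 100, 150.
YIELDS = np.array([
    [ 400, 1240, 3630, 3760,  790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
], dtype=float)

# Long format: one row per farm-treatment combination (96 rows).
n = np.tile(np.tile(N_LEVELS, 3), 8)       # N cycles fastest within each farm
p = np.tile(np.repeat(P_LEVELS, 4), 8)     # P changes every four columns
y = YIELDS.ravel()

X = np.column_stack([np.ones_like(n), n, p, n ** 2, p ** 2, n * p])
scale = np.abs(X).max(axis=0)              # rescale columns before solving
beta = np.linalg.lstsq(X / scale, y, rcond=None)[0] / scale
b0, b_n, b_p, b_nn, b_pp, b_np = beta

fitted = X @ beta
r2 = 1.0 - float(((y - fitted) ** 2).sum()) / float(((y - y.mean()) ** 2).sum())

# Second derivatives of the fitted surface.
hessian = np.array([[2 * b_nn, b_np], [b_np, 2 * b_pp]])
is_maximum = bool(np.all(np.linalg.eigvalsh(hessian) < 0))


def marginal_products(n_rate, p_rate):
    """dY/dN and dY/dP of the fitted surface at the given rates."""
    return np.array([b_n + 2 * b_nn * n_rate + b_np * p_rate,
                     b_p + b_np * n_rate + 2 * b_pp * p_rate])


def predict(n_rate, p_rate):
    return (b0 + b_n * n_rate + b_p * p_rate + b_nn * n_rate ** 2
            + b_pp * p_rate ** 2 + b_np * n_rate * p_rate)


def solve_first_order(mp_n, mp_p):
    """Rates at which the marginal products equal the given targets."""
    return np.linalg.solve(hessian, np.array([mp_n - b_n, mp_p - b_p]))


n_max, p_max = solve_first_order(0.0, 0.0)
n_opt, p_opt = solve_first_order(PRICE_N / PRICE_MAIZE, PRICE_P / PRICE_MAIZE)

print("coefficients (const, N, P, N2, P2, NP):", np.round(beta, 4))
print("R2:", round(r2, 3), " stationary point is a maximum:", is_maximum)
print(f"max production: N = {n_max:.1f}, P = {p_max:.1f}, Y = {predict(n_max, p_max):.0f}")
print(f"max profit:     N = {n_opt:.1f}, P = {p_opt:.1f}, Y = {predict(n_opt, p_opt):.0f}")
```

Solutions outside the tested ranges (N up to 150, P2O5 up to 50 kg ha-1) are extrapolations of the fitted surface and deserve the same caution as any extrapolated result.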
GROUPING FARMS INTO
In the previous exercises, we have been looking at all the farms as a
single group. Any recommendations made from these analyses would
presumably apply to all the farms in the sample and all other similar
farms. However, it is evident from the data in Exercise FA1 that the
farms (or at least the fields) where the trials were conducted are
quite different. The environments in the fields from farms 1, 5 and
8, for example, are obviously poorer for producing maize than are the
environments in the fields from farms 3, 4 and 7. Unfortunately, as
mentioned in the introduction to these exercises, no information is
available on the characteristics of the environments at these
locations. We do not know if the lower yields are associated with
less rainfall, poorer fertility, late planting, less effective
weeding, or what. But we do know that for some reason there are
distinct environmental differences in the fields and on the farms
where these trials were planted. Even though these differences are
real, and form the real world of the farmers, they are the reasons
that many researchers complain about doing research on farms.
Without minimizing the importance of characterizing each environment
where on-farm trials are conducted, there is a convenient means
available to quantify the evident environmental differences. This is
to convert the overall farm (field) average (over all treatments) into
an index that reflects quality of the environment in each field and
differentiates it from the environments in other fields. Thus for
example, for farm (field) 1, the average yield is 2710 kg ha-1. This
converts into an index (without units) of 2710. The index for farm
(field) 2 is 4140. It is quite obvious that environment 2,
representing farm or field 2, with an index of 4140 is a better
environment for producing maize than environment 1 with an index of
2710. Even though some statistical purists feel the use of this index
in regression is marred, it has been in use for over 50 years and has
been invaluable in the absence of any other available method.
We will use the term EI for this "environmental index." In the first
exercise that follows, EI will be used in conjunction with the
production function. In later exercises, it will be used as an
important component in Modified Stability Analysis.
1. To the table in Exercise FA5, add four more columns of data: EI,
NEI, N2EI and PEI. Then blocking all nine columns as independent
variables, calculate the production function:
Y = f(N,P,N2,P2,NP,EI,NEI,N2EI,PEI)
Compare the R2 values and the deviations from regression (Std Err of
Y Est) for this equation and the one from Exercise FA5. How do you
explain the differences?
2. What would you expect this equation to look like if you set the EI
equal to the average EI for all farms (3760 from the table in Exercise
FA1)? Do this and compare the resulting equation with the equation
from Exercise FA5. Did you expect the similarity? Explain why this
happens. Solve for maximum production and maximum profit. How do the
quantities of N, P and Y compare with those from Exercise FA5?
3. Now set EI to a value typical of the poorest environments, say EI = 2500.
Notice how the equation changes. Solve this equation for maximum
production and maximum profit. Compare resulting values of N, P and Y
with those from the previous equation. Explain the results.
4. Do the same for a high EI, say EI = 5000. Compare with the
results of the other production functions.
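The steps above can be sketched in Python as follows. The sketch fits the five-term and nine-term equations to the same 96 rows, reports R2 and the standard error of the estimate for each, and folds the EI terms into a five-term equation in N and P for any chosen EI. It assumes NumPy and takes EI as the farm averages printed in the table; the column scaling is a numerical-stability choice of this sketch, and the names fit_ols and collapse_at_ei are illustrative. In this balanced layout, collapsing at the mean of the EI values used in the fit should reproduce the five-term equation.

```python
import numpy as np

N_LEVELS = np.array([0.0, 50.0, 100.0, 150.0])
P_LEVELS = np.array([0.0, 25.0, 50.0])

# Rows: farms 1-8. Columns: P2O5 = 0, 25, 50, each with N = 0, 50, 100, 150.
YIELDS = np.array([
    [ 400, 1240, 3630, 3760,  790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
], dtype=float)

# Environmental index: farm averages as printed in the "Avg." column.
EI_BY_FARM = np.array([2710, 4140, 5180, 4520, 2580, 3460, 4980, 2530], dtype=float)

# Long format: one row per farm-treatment combination (96 rows).
n = np.tile(np.tile(N_LEVELS, 3), 8)
p = np.tile(np.repeat(P_LEVELS, 4), 8)
ei = np.repeat(EI_BY_FARM, 12)
y = YIELDS.ravel()


def fit_ols(columns, response):
    """Least squares with intercept; returns coefficients, R2 and Std Err of Y Est."""
    X = np.column_stack([np.ones_like(response)] + list(columns))
    scale = np.abs(X).max(axis=0)                  # rescale columns before solving
    beta = np.linalg.lstsq(X / scale, response, rcond=None)[0] / scale
    resid = response - X @ beta
    sse = float(resid @ resid)
    r2 = 1.0 - sse / float(((response - response.mean()) ** 2).sum())
    std_err = (sse / (len(response) - X.shape[1])) ** 0.5
    return beta, r2, std_err


base_terms = [n, p, n ** 2, p ** 2, n * p]
ei_terms = base_terms + [ei, n * ei, n ** 2 * ei, p * ei]

beta5, r2_5, se_5 = fit_ols(base_terms, y)
beta9, r2_9, se_9 = fit_ols(ei_terms, y)


def collapse_at_ei(beta, ei_value):
    """Fold the EI terms into a five-term equation in N and P for one EI value."""
    b0, b_n, b_p, b_nn, b_pp, b_np, b_e, b_ne, b_nne, b_pe = beta
    return np.array([
        b0 + b_e * ei_value,       # constant
        b_n + b_ne * ei_value,     # N
        b_p + b_pe * ei_value,     # P
        b_nn + b_nne * ei_value,   # N2
        b_pp,                      # P2 (no EI interaction in this model)
        b_np,                      # NP (no EI interaction in this model)
    ])


print(f"five terms: R2 = {r2_5:.3f}, Std Err of Y Est = {se_5:.0f}")
print(f"nine terms: R2 = {r2_9:.3f}, Std Err of Y Est = {se_9:.0f}")
for value in (3760, 2500, 5000):
    print(f"EI = {value}:", np.round(collapse_at_ei(beta9, value), 4))
```

Each collapsed equation is an ordinary quadratic in N and P, so it can be solved for maximum production and maximum profit in the same way as any two-variable quadratic production function.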
The treatments in the trial we are using reflect continuous variables
even though the treatments themselves are discrete levels and
combinations. Another way of analyzing on-farm research data, whether
the treatments reflect continuous or discrete variables, is by means
of Modified Stability Analysis (MSA).
Begin with the data in exactly the same form as the table for
Exercise FA1, that is, with the treatments across the top forming the
columns and the farms or fields (environments) forming the rows.
The procedure uses regression to estimate the response of each
treatment to environment. This response can be either linear or
curved (usually quadratic).
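The regression step just described can be sketched in Python as follows. Each treatment's yields across the eight farms are regressed on EI, giving one response line per treatment. This covers only the regression step, not the full set of steps or the interpretation criteria of MSA. It assumes NumPy and takes EI as the farm averages printed in the table; the names treatment_response, responses and predict are illustrative. Passing degree=2 gives a curved (quadratic) response instead of a linear one.

```python
import numpy as np

# Treatments as (N, P2O5) pairs in the column order of the table.
TREATMENTS = [(n, p) for p in (0, 25, 50) for n in (0, 50, 100, 150)]

# Rows: farms 1-8 (environments). Columns: the 12 treatments above.
YIELDS = np.array([
    [ 400, 1240, 3630, 3760,  790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
], dtype=float)

# Environmental index: farm averages as printed in the "Avg." column.
EI = np.array([2710, 4140, 5180, 4520, 2580, 3460, 4980, 2530], dtype=float)


def treatment_response(column, degree=1):
    """Regress one treatment's yields on EI across farms.

    Returns np.polyfit coefficients, highest power first
    (slope then intercept when degree is 1).
    """
    return np.polyfit(EI, YIELDS[:, column], degree)


responses = {t: treatment_response(j) for j, t in enumerate(TREATMENTS)}


def predict(treatment, ei_value):
    """Estimated yield of a treatment in an environment with the given EI."""
    return float(np.polyval(responses[treatment], ei_value))


for (n_rate, p_rate), (slope, intercept) in responses.items():
    print(f"N = {n_rate:3d}, P2O5 = {p_rate:2d}: Y = {intercept:8.1f} + {slope:.3f} EI")

# Compare treatments in a poor and in a good environment.
for ei_value in (2500, 5000):
    best = max(TREATMENTS, key=lambda t: predict(t, ei_value))
    print(f"EI = {ei_value}: highest estimated yield from N-P2O5 = {best}")
```

Lines that cross within the observed range of EI suggest that the ranking of treatments differs between poor and good environments, which is what the grouping of farms is meant to explore.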
Use the steps in the Training Guide¹ and the data from the CIMMYT
maize fertilizer trial to analyze these results by MSA.
1 Hildebrand, P.E. 1993. Steps in the analysis and
interpretation of on-farm research-extension data based on
modified stability analysis: a training guide. Staff Paper
SP93-11. Food and Resource Economics Department, University of