ANALYSIS OF ON-FARM RESEARCH
Economic Analysis in Small Farm Livelihood Systems
Fall Semester, 1996
Peter E. Hildebrand
Food and Resource Economics Department
University of Florida
Gainesville, FL 32611-0240
PRODUCTION FUNCTION EXERCISES
In order to reflect realistic responses, many kinds of research whose purpose is to generate technology must be
conducted on the kinds of farms where the technology is expected to be useful. This can create much more variability
in the data than does research conducted on experiment stations where many factors, those not included as treatments,
are controlled. Often, non-treatment factors on-station are also controlled at levels high enough so they do not limit
the potential of the variables being tested. The result is to create environments that are much less variable and much
more productive than those found on most farms.¹
For researchers trained only in on-station research, the lack of control over non-experimental variables and the
resulting high CVs can be exasperating. Many believe that "good" research cannot be done on farms for these
reasons. On the other hand, it is becoming well recognized that farmers can seldom use research results from
experiment stations directly. Fertility research is one kind of research whose results can seldom be extrapolated
successfully from experiment stations to farms. Because experiment stations have been used for research, their soils
have been modified by amendments to the point that they no longer resemble the soils on most farms.
Following are a series of exercises to familiarize you with the nature of on-farm fertilizer research results. The data
are taken from a real on-farm maize fertility trial conducted by CIMMYT in Mexico. In the trial, there were four
farms, each with three replications of all 12 treatments, and the trials were conducted in two different years.
Unfortunately, only the individual treatment averages by farm and by year are available. Furthermore, the data on
the characteristics of the individual fields where the trials were located are not available nor are climatic data for each
year. Nevertheless, this is an excellent data set to work with. Therefore, these data will be used to show a number
of ways the results can be analyzed and interpreted. The first series of exercises will look at the data set as a whole.
The second series will consider groups of farms so that more specific recommendations can be made.
1 For more information see Hildebrand and Russell 1996, Chapter 1.
Below are data generated from a fertilizer trial on maize conducted on four farms over two years by personnel from CIMMYT
in Mexico. Notice that in this trial, there are four levels of N and three of P2O5. It is a 3 x 4 factorial (with 12 plots per
block). There were three blocks or replications per farm, but only the average of the three blocks for each treatment is
available and shown in the table.
Table 1. Maize yields (kg ha-1 of 14 percent moisture grain) by fertilizer treatment, 4 farms, 2 years
Farm/year                     Fertilizer treatment (kg ha-1)
N:         0    50   100   150     0    50   100   150     0    50   100   150
P2O5:      0     0     0     0    25    25    25    25    50    50    50    50  Avg.
1/A      400  1240  3630  3760   790  2580  4230  4720  1670  2510  3280  3660  2710
2/A     1530  2600  5140  5320  1670  3790  5100  6830  1410  4130  5890  6270  4140
3/A     4150  4860  4800  4870  4440  5000  4970  5280  5120  5660  6360  6620  5180
4/A     2420  3820  5230  4480  2360  4540  6260  7170  1610  4410  5380  6580  4520
1/B     1640  1920  2080  2190  2040  3210  3120  2930  1440  3440  3320  3620  2580
2/B     1610  2940  4140  4340  1810  3920  3610  3810  1180  3890  5380  4920  3460
3/B     4740  5410  4290  4920  4910  5220  5380  5140  5100  4880  4540  5280  4980
4/B     1210  2330  1970  2230  1530  2780  2490  2800  1370  3510  3750  4350  2530
Avg.    2212  3140  3910  4014  2444  3880  4395  4835  2362  4054  4738  5162  3762
Source: Perrin, Richard K., et al. 1976. From Agronomic data to farmer recommendations. An economics training
manual. CIMMYT. Information bulletin 27.
1. First look at the data. Are the farms and years quite similar or are they different? If they are quite different, we should
probably look at them as if there were going to be more than one recommendation domain.
2. Either way, let's begin to examine the data by looking at the overall average for all farms. First, let's look at the response
of the maize to P2O5. Notice that we have three levels of P2O5 for each of the four levels of N. Begin by summarizing the
average data for different levels of N as follows:
Maize response for:
N=0 N=50 N=100 N=150
Yield (Mg ha-1)
3. Now, plot the P2O5 response data for N = 0 on a graph (do this on graph paper and by hand rather than on a computer)
and calculate a quadratic response equation (production function) using the visiographic procedure (Hildebrand and Poey
1985, p. 85 ff).
4. Repeat this process for the other three levels of N.
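If you want to check the summary table from step 2 before plotting by hand, the treatment averages can be reproduced with a short script. The following is only a sketch in Python (the exercises themselves assume hand calculation and, later, a spreadsheet); the names YIELDS, TREATMENTS and treatment_means are made up for this illustration.

```python
# Table 1 yields (kg ha-1). Rows are environments (farm/year); columns follow
# the order of Table 1: N = 0, 50, 100, 150 within P2O5 = 0, 25, 50.
N_LEVELS = [0, 50, 100, 150]
P_LEVELS = [0, 25, 50]
TREATMENTS = [(n, p) for p in P_LEVELS for n in N_LEVELS]

YIELDS = {
    "1/A": [400, 1240, 3630, 3760, 790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    "2/A": [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    "3/A": [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    "4/A": [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    "1/B": [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    "2/B": [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    "3/B": [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    "4/B": [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
}


def treatment_means(yields):
    """Average each treatment column over all environments (kg ha-1)."""
    sums = [0.0] * len(TREATMENTS)
    for row in yields.values():
        for j, value in enumerate(row):
            sums[j] += value
    return {t: s / len(yields) for t, s in zip(TREATMENTS, sums)}


means = treatment_means(YIELDS)

# Response to P2O5 at each level of N, in Mg ha-1 as in the summary table.
print("P2O5  " + "".join(f"N={n:<6}" for n in N_LEVELS))
for p in P_LEVELS:
    print(f"{p:<6}" + "".join(f"{means[(n, p)] / 1000:<8.2f}" for n in N_LEVELS))
```

Each column of the printed table holds the three points you plot for one level of N.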
1. Load the data from the table in Exercise No. 1 in a spreadsheet. Use the same orientation as in Table 1, i.e., farms and
years (the environments) are the rows and treatments are the columns. You can calculate the averages as a means of verifying
accuracy of the data entered. This data set will be the basis for a number of analyses we will be doing. Be sure to save it in
this form. Your table should look just like Table 1.
2. Now we will set up the data to facilitate estimation of the production functions from Exercise No. 1 using the data from all
eight environments (four farms in each of two years). By copying and moving your data, set up another working table as
below in order to calculate the response to P2O5. Perhaps the easiest way to do this is to work on the same spreadsheet as
Table 1 then after you have finished this table MOVE it to a separate spreadsheet (I personally prefer a separate spreadsheet
to a new page on the same spreadsheet, but you can do it any way you want).
Table 2. Data arrayed for quadratic regression for phosphorus and for different levels of nitrogen.
N = 50
You should have 24 rows of data (4 farms x 2 years x 3 levels of P2O5). Be careful not to include the averages from Table 1!
3. Using markers in an x-y graph format, look at the data for N = 0 (compare yield only with the levels of P2O5, not the
squared values). Your graph should look something like the one below (5167/exer2a.wb2). Discuss.
[Figure: scatter plot of maize yield against P2O5 level for N = 0.]
Exercise FA2, continued
4. You can now estimate the quadratic equation by regression using Quattro Pro with the two columns for P2O5 and (P2O5)2 as
the independent variables and the column for yield as the dependent variable. The equation should be the same as the first you
estimated in Exercise No. 1 by the visiographic method. Is it? However, by doing regression by statistical procedures, you
now have additional information. The Quattro Pro printout shows you the R2 value, the standard error of the Y estimate and
the standard errors of the coefficients as well as the degrees of freedom. Quattro Pro provides you a means of estimating the
significance of each of the regression coefficients (t test) as well as of the equation as a whole (F test).
t test. To test whether each individual coefficient differs from zero, divide each coefficient by the standard error of the
coefficient to get the t value for that coefficient. Then, using @TDIST(t value, df, # tails), you can find the probability level
associated with this t value. If you have prior knowledge of what the sign of each coefficient should be (and in this case you
should, from theory, prior knowledge and looking at your data), you can use a one-tailed t test.
F test. The t test determines the significance of each of the coefficients individually. Each coefficient can be accepted or
rejected in accordance with results of the test. The F test is a test of the significance of the complete equation and, at the same
time, a test of all the coefficients combined. The value of F is calculated from values already in the printout:
F = [R^2 / (1 - R^2)] * [(n - k - 1) / k]
where R^2 is the R Squared value, (n-k-1) is the degrees of freedom in the printout and k is the number of coefficients (2 in
this example). In an F table, the k value is the numerator degrees of freedom and the (n-k-1) value is the denominator degrees
of freedom. The denominator degrees of freedom is on the printout. The probability level for the F test is determined in
Quattro Pro using
@FDIST(F value, num df, denom df)
After you have calculated your regression equation, you can array your values as below for N=0:
N = 0    Regression Output:
Constant                          2212.5
Std Err of Y Est                  1554.18
R Squared                         0.00432
No. of Observations               24
Degrees of Freedom                21
                        P2O5      (P2O5)2
X Coefficient(s)        15.5      -0.25
Std Err of Coef.        56.03659  1.07676
"t"                     0.276605  -0.2322
Prob t                  0.392394  0.40932
"F"                     0.045583
Prob F                  0.955534
[Figure: MAIZE RESPONSE, Phosphorus. Fitted curve for N = 0; horizontal axis P2O5, 0 to 50 kg ha-1.]
5. Repeat for the other three levels of N. Are all the equations the same as you got from the visiographic method? They
should be.
6. Using the spreadsheet, show the four equations on a single graph like the one above. Can you figure out how to do this
and get a smooth curve, not just two straight lines going through three points?
7. Interpret the results.
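The regression and the t and F tests described in step 4 can also be checked outside the spreadsheet. The sketch below is an illustration only, written in Python with no add-on libraries; it is not the Quattro Pro procedure. The helper names (solve, ols, t_upper_tail) are assumptions of this example. The t probability is obtained by numerical integration of the t density, and the F probability uses a closed form that holds only when the numerator degrees of freedom equal 2, as they do here.

```python
import math

P_LEVELS = [0, 25, 50]
# N = 0 columns of Table 1 (kg ha-1): eight environments per P2O5 level.
YIELD_N0 = {
    0: [400, 1530, 4150, 2420, 1640, 1610, 4740, 1210],
    25: [790, 1670, 4440, 2360, 2040, 1810, 4910, 1530],
    50: [1670, 1410, 5120, 1610, 1440, 1180, 5100, 1370],
}


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(X, y):
    """Least squares: coefficients, their std errors, R2, Std Err of Y Est, df."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    ybar = sum(y) / len(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    df = len(y) - k
    s2 = sse / df
    se = []
    for i in range(k):  # i-th diagonal element of the inverse of X'X
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append(math.sqrt(s2 * solve(xtx, unit)[i]))
    return beta, se, 1 - sse / sst, math.sqrt(s2), df


def t_upper_tail(t, df, steps=2000):
    """One-tailed probability P(T > |t|), by Simpson integration of the t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    b = abs(t)
    h = b / steps
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    s = pdf(0) + pdf(b) + sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
    return 0.5 - s * h / 3


# 24 rows: constant, P2O5, (P2O5)2 as regressors; yield as dependent variable.
X = [[1.0, p, p * p] for p in P_LEVELS for _ in YIELD_N0[p]]
y = [float(v) for p in P_LEVELS for v in YIELD_N0[p]]
beta, se, r2, se_y, df = ols(X, y)

t_vals = [b / s for b, s in zip(beta[1:], se[1:])]
prob_t = [t_upper_tail(t, df) for t in t_vals]
k = 2                                      # number of X coefficients
F = r2 / (1 - r2) * df / k
prob_F = (1 + 2 * F / df) ** (-df / 2)     # closed form valid for 2 numerator df only

print("Constant", beta[0], "X Coefficients", beta[1:], "Std Err of Coef.", se[1:])
print("R Squared", r2, "Std Err of Y Est", se_y, "df", df)
print("t", t_vals, "Prob t", prob_t, "F", F, "Prob F", prob_F)
```

The printed values can be compared line by line with the N = 0 regression output shown above.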
1. For each of the production functions in Exercise FA2, find mathematically where production is maximum. Do these
correspond with your graphs?
2. For prices of maize of $0.60 per kg ($ is no particular currency), N of $8 per kg and P2O5 of $10 per kg, find the
level of phosphorus at which profit is maximized. Repeat for each level of N. Discuss and interpret the results.
3. If you have time, you should repeat exercises FA1 to FA3 varying nitrogen for each level of phosphorus.
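As a worked illustration of steps 1 and 2, take the N = 0 equation from Exercise FA2, Y = 2212.5 + 15.5 P - 0.25 P2. Production is maximum where the marginal product dY/dP = b1 + 2 b2 P equals zero, and profit is maximum where the marginal product equals the price ratio (price of P2O5 divided by price of maize). The Python sketch below applies both conditions; the function names are invented for this example, and it assumes a quadratic with a negative squared-term coefficient. The cost of N is fixed at any given level of N, so it does not affect the best level of P.

```python
def max_production(b1, b2):
    """Rate where dY/dP = b1 + 2*b2*P = 0 (a maximum only if b2 < 0)."""
    return -b1 / (2 * b2)


def max_profit(b1, b2, price_input, price_maize):
    """Rate where dY/dP equals the price ratio; negative solutions are set to zero."""
    rate = (price_input / price_maize - b1) / (2 * b2)
    return max(rate, 0.0)


def predict(b0, b1, b2, p):
    """Yield (kg ha-1) from the quadratic production function."""
    return b0 + b1 * p + b2 * p * p


# Coefficients of the N = 0 equation from the regression output in Exercise FA2.
B0, B1, B2 = 2212.5, 15.5, -0.25
PRICE_MAIZE, PRICE_P = 0.60, 10.0          # $ per kg, as given in step 2

p_max = max_production(B1, B2)
y_max = predict(B0, B1, B2, p_max)
p_profit = max_profit(B1, B2, PRICE_P, PRICE_MAIZE)

print(f"Maximum production at P2O5 = {p_max:.1f} kg ha-1, yield = {y_max:.2f} kg ha-1")
print(f"Maximum profit at P2O5 = {p_profit:.1f} kg ha-1")
```

For N = 0 the first kilogram of P2O5 adds 15.5 kg of maize, worth $9.30, which is less than its $10 cost, so the profit-maximizing rate is zero even though production peaks at a positive rate. Keep in mind that none of the N = 0 coefficients was significant.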
Factor x Factor
So far we have analyzed this data set by finding responses to phosphorus for set levels of nitrogen (FA1, FA2 and FA3) or by
finding responses to nitrogen for set levels of phosphorus (FA3-3). By doing this, it is possible to answer questions such as,
"If I apply 25 kg ha-1 of P2O5, how much nitrogen should I apply?" But it does not answer the question of how much of each
amendment is best. To do this requires the analysis of nitrogen and phosphorus simultaneously.
1. To obtain a first estimate of the nature of the 3-dimensional surface (shown in two dimensions) construct a graph with
nitrogen on the vertical (Y) axis and phosphorus on the horizontal (X) axis. Use the treatment (fertilizer) values from the
experiment as shown in the table in Exercise FA1. Then using the yield averages for each treatment, write the corresponding
yields for each N-P combination at the relevant intersection in the graph. For example, the value for 0-0 (N-P) is 2212 and for
50-25 is 3880. Follow the visiographic procedure from Hildebrand and Poey, pp. 108-113.
2. Draw in iso-quant contours for 2500, 3000, 3500, etc.
3. Using the prices from Exercise FA3, draw in the iso-cost contours and the expansion path.
4. Interpret your work.
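The grid for step 1 and the iso-cost lines for step 3 can be laid out numerically before you draw them. The Python sketch below is a rough aid, not the visiographic procedure itself: it prints the treatment averages from Table 1 with N on the vertical axis and P2O5 on the horizontal axis, computes points on an iso-cost line, and locates where a contour crosses one edge of the grid by straight-line interpolation (an assumption of this example; freehand contours will differ somewhat). All function names are hypothetical.

```python
N_LEVELS = [0, 50, 100, 150]
P_LEVELS = [0, 25, 50]
# "Avg." row of Table 1 (kg ha-1), keyed by (N, P2O5).
MEAN_YIELD = {
    (0, 0): 2212, (50, 0): 3140, (100, 0): 3910, (150, 0): 4014,
    (0, 25): 2444, (50, 25): 3880, (100, 25): 4395, (150, 25): 4835,
    (0, 50): 2362, (50, 50): 4054, (100, 50): 4738, (150, 50): 5162,
}
PRICE_N, PRICE_P = 8.0, 10.0               # $ per kg, from Exercise FA3

# Grid as it appears on the graph: N increases upward, P2O5 to the right.
for n in reversed(N_LEVELS):
    print(f"N={n:<5}" + "".join(f"{MEAN_YIELD[(n, p)]:>7}" for p in P_LEVELS))
print("P2O5=  " + "".join(f"{p:>7}" for p in P_LEVELS))


def iso_cost_n(cost, p):
    """N (vertical axis) on the iso-cost line for a given outlay and P2O5 rate."""
    return (cost - PRICE_P * p) / PRICE_N


def crossing(target, x0, y0, x1, y1):
    """Rate between x0 and x1 at which yield reaches target (linear interpolation)."""
    return x0 + (x1 - x0) * (target - y0) / (y1 - y0)


# Every iso-cost line has the same slope on this graph: -price of P / price of N.
slope = -PRICE_P / PRICE_N

# Where the 2500 kg contour crosses the P2O5 = 0 edge of the grid.
n_2500 = crossing(2500, 0, MEAN_YIELD[(0, 0)], 50, MEAN_YIELD[(50, 0)])
print(f"Iso-cost slope: {slope}; 2500 kg contour meets P2O5 = 0 near N = {n_2500:.1f}")
```

The expansion path joins the points where each contour has the same slope as the iso-cost lines.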
Factor x Factor
In the procedure used in Exercise FA4, we could determine the best (least cost) combination of N and P to use, but would need
to make successive approximations to determine the most profitable combination of the two. In order to be more "precise" a
mathematical production function incorporating both amendments can be calculated by regression. The form we will use here
is quadratic (with an NP interaction term).
1. In a spreadsheet, set up the data in the following form:
By using all the data, you should have 96 (8 x 12) rows. Calculate the production function Y = f(N,N2,P,P2,NP) by blocking
the first five columns as independent variables in the regression menu. In the output, the coefficients are in the same order as
the columns.
2. Determine mathematically the quantities of N and P for which production is maximized. How much maize is produced with
these quantities of N and P? Do these values correspond with your graph from Exercise FA4? Explain any differences.
3. Using the same prices as before, find mathematically the quantities of N and P that maximize profit. How much maize is
produced at this level of fertilizer use? Do these quantities fall on the expansion path from your graph from Exercise FA4?
Explain any differences.
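The three steps above can be sketched in Python as a cross-check on the spreadsheet work. This is an illustration under stated assumptions, not the regression menu procedure: the helper names are invented, and N and P2O5 are rescaled to units of 100 kg ha-1 inside the regression purely to keep the arithmetic stable (results are converted back to kg ha-1). Production is maximum where both marginal products are zero; profit is maximum where each marginal product equals its price ratio. Either solution is a true maximum only if the surface is concave, which the script checks, and either may fall outside the range of rates used in the trial.

```python
# Table 1 yields (kg ha-1); columns: N = 0, 50, 100, 150 within P2O5 = 0, 25, 50.
TREATMENTS = [(n, p) for p in (0, 25, 50) for n in (0, 50, 100, 150)]
YIELDS = {
    "1/A": [400, 1240, 3630, 3760, 790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    "2/A": [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    "3/A": [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    "4/A": [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    "1/B": [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    "2/B": [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    "3/B": [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    "4/B": [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
}
PRICE_MAIZE, PRICE_N, PRICE_P = 0.60, 8.0, 10.0


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def design_row(n_kg, p_kg):
    """Regressors 1, N, N2, P, P2, NP with rates in units of 100 kg ha-1."""
    n, p = n_kg / 100.0, p_kg / 100.0
    return [1.0, n, n * n, p, p * p, n * p]


def predict(b, n_kg, p_kg):
    """Yield (kg ha-1) from the fitted surface."""
    return sum(bi * xi for bi, xi in zip(b, design_row(n_kg, p_kg)))


def marginal_products(b, n_kg, p_kg):
    """dY/dN and dY/dP in kg maize per kg of input."""
    n, p = n_kg / 100.0, p_kg / 100.0
    return ((b[1] + 2 * b[2] * n + b[5] * p) / 100.0,
            (b[3] + 2 * b[4] * p + b[5] * n) / 100.0)


def optimum(b, ratio_n=0.0, ratio_p=0.0):
    """Rates (kg ha-1) where dY/dN = ratio_n and dY/dP = ratio_p."""
    n, p = solve([[2 * b[2], b[5]], [b[5], 2 * b[4]]],
                 [100 * ratio_n - b[1], 100 * ratio_p - b[3]])
    return 100 * n, 100 * p


# 96 rows: every treatment in every environment.
X = [design_row(n, p) for row in YIELDS.values() for (n, p) in TREATMENTS]
y = [float(v) for row in YIELDS.values() for v in row]
k = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
b = solve(xtx, xty)     # order: constant, N, N2, P, P2, NP (scaled units)

concave = b[2] < 0 and 4 * b[2] * b[4] - b[5] ** 2 > 0
n_max, p_max = optimum(b)
n_pi, p_pi = optimum(b, PRICE_N / PRICE_MAIZE, PRICE_P / PRICE_MAIZE)

print("Surface is concave:", concave)
print("Maximum production at N, P2O5 =", n_max, p_max, "yield", predict(b, n_max, p_max))
print("Maximum profit at N, P2O5 =", n_pi, p_pi, "yield", predict(b, n_pi, p_pi))
```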
GROUPING FARMS INTO
In the previous exercises, we have been looking at all the farms as a single group. Any recommendations made from these
analyses would presumably apply to all the farms in the sample and all other similar farms. However, it is evident from the
data in Exercise FA1 that the farms (or at least the fields) where the trials were conducted, or the climate in each year, are
quite different. The environments in the fields from farms 1A, 1B and 4B, for example, are obviously poorer for producing
maize than are the environments in the fields from farms 3A, 4A and 3B. Unfortunately, as mentioned in the introduction to
these exercises, no information is available on the characteristics of the environments at these locations. We do not know if
the lower yields are associated with less rainfall, poorer fertility, late planting, less effective weeding, or what. But we do
know that for some reason there are distinct environmental differences in the fields and on the farms where these trials were
planted. Even though these differences are real, and form the real world of the farmers, they are the reasons that many
researchers complain about doing research on farms. However, in order for us to have confidence in the conclusions we make
regarding maize response to N and P, for example, it is necessary to conduct such trials on farms and under real farm
conditions. Because these conditions and the growing environments they create are so highly variable, it is necessary to have a
means to separate recommendations into specific domains of similar environments and farmer needs.
Without minimizing the importance of characterizing each environment where on-farm trials are conducted, there is a
convenient means available to quantify the evident environmental differences. This is to convert the overall farm (field)
average (over all treatments) into an index that reflects quality of the environment in each field and differentiates it from the
environments in other fields. Thus for example, for farm (field) 1A, the average yield is 2710 kg ha-1. This converts into an
index (without units) of 2710. The index for farm (field) 2A is 4140. It is quite obvious that environment 2A, representing
farm or field 2, with an index of 4140 is a better environment for producing maize than environment 1A with an index of
2710. Even though some statistical purists feel the use of this index in regression is marred, it has been in use for over 50
years and has been invaluable in the absence of any other available method.
We will use the term EI for this "environmental index" as a component of additional variables in the production function from
the previous exercise.
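The index described above is simply the average of the 12 treatment yields in each environment. As a small illustration (Python, with made-up variable names), the following computes EI for each farm/year from the Table 1 data and ranks the environments from poorest to best. The values in the "Avg." column of Table 1 are these same averages rounded to the nearest 10.

```python
# Table 1 yields (kg ha-1), one row of 12 treatment averages per environment.
YIELDS = {
    "1/A": [400, 1240, 3630, 3760, 790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    "2/A": [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    "3/A": [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    "4/A": [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    "1/B": [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    "2/B": [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    "3/B": [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    "4/B": [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
}

# Environmental index: mean yield over all treatments in that environment.
EI = {env: sum(row) / len(row) for env, row in YIELDS.items()}
ranked = sorted(EI, key=EI.get)             # poorest environment first
mean_ei = sum(EI.values()) / len(EI)

for env in ranked:
    print(f"{env}  EI = {EI[env]:7.1f}")
print(f"Average EI = {mean_ei:.1f}")
```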
1. To the table in Exercise FA5, add four more columns of data: EI, NEI, N2EI and PEI. Then, blocking all nine columns as
independent variables, calculate the production function:
Y = f(N,N2,P,P2,NP,EI,NEI,N2EI,PEI)
Compare the R2 values for the two equations and the deviations from regression (Std Err of Y Est). How do you explain the
differences?
2. What would you expect this equation to look like if you set the EI (in all the terms where EI appears) equal to the average
EI for all farms (3760 from the table in Exercise FA1)? Do this and compare the resulting equation with the equation from
Exercise FA5. Did you expect the similarity? Explain why this happens. Solve for maximum production and maximum
profit. How do the quantities of N, P and Y compare with those from Exercise FA5?
3. Now set EI to represent the poorest environments, say EI = 2500. Notice how the equation changes. Solve this equation
for maximum production and maximum profit. Compare resulting values of N, P and Y with those from the previous
equation. Explain the results.
4. Do the same for an EI representing high-yielding environments, say EI = 5000. Compare with the results of the other
production functions.
5. Discuss the implications for making specific recommendations.
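The mechanics of steps 1 to 4 can be sketched in Python as follows. This is an illustration with invented helper names, not the spreadsheet procedure. To keep the arithmetic stable, N and P2O5 are expressed in units of 100 kg ha-1 and EI in Mg ha-1, so the coefficients are in those units rather than the ones your spreadsheet will show. Fixing EI at a chosen value folds the four EI terms into the constant and the N, N2 and P coefficients, leaving an equation of the same form as in Exercise FA5. Because every environment received the same 12 treatments, substituting the exact average EI should reproduce the FA5 equation up to rounding.

```python
TREATMENTS = [(n, p) for p in (0, 25, 50) for n in (0, 50, 100, 150)]
YIELDS = {
    "1/A": [400, 1240, 3630, 3760, 790, 2580, 4230, 4720, 1670, 2510, 3280, 3660],
    "2/A": [1530, 2600, 5140, 5320, 1670, 3790, 5100, 6830, 1410, 4130, 5890, 6270],
    "3/A": [4150, 4860, 4800, 4870, 4440, 5000, 4970, 5280, 5120, 5660, 6360, 6620],
    "4/A": [2420, 3820, 5230, 4480, 2360, 4540, 6260, 7170, 1610, 4410, 5380, 6580],
    "1/B": [1640, 1920, 2080, 2190, 2040, 3210, 3120, 2930, 1440, 3440, 3320, 3620],
    "2/B": [1610, 2940, 4140, 4340, 1810, 3920, 3610, 3810, 1180, 3890, 5380, 4920],
    "3/B": [4740, 5410, 4290, 4920, 4910, 5220, 5380, 5140, 5100, 4880, 4540, 5280],
    "4/B": [1210, 2330, 1970, 2230, 1530, 2780, 2490, 2800, 1370, 3510, 3750, 4350],
}
# Environmental index in Mg ha-1 (mean yield of the environment / 1000).
EI = {env: sum(row) / len(row) / 1000.0 for env, row in YIELDS.items()}


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(X, y):
    """Least-squares coefficients and R2."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    ybar = sum(y) / len(y)
    return beta, 1 - sse / sum((yi - ybar) ** 2 for yi in y)


def base_terms(n_kg, p_kg):
    """Constant, N, N2, P, P2, NP (rates in units of 100 kg ha-1)."""
    n, p = n_kg / 100.0, p_kg / 100.0
    return [1.0, n, n * n, p, p * p, n * p]


def full_terms(n_kg, p_kg, ei):
    """The FA5 terms plus EI, NEI, N2EI and PEI."""
    n, p = n_kg / 100.0, p_kg / 100.0
    return base_terms(n_kg, p_kg) + [ei, n * ei, n * n * ei, p * ei]


def collapse(b, ei):
    """Fix EI at one value, leaving an equation in N and P of the FA5 form."""
    return [b[0] + b[6] * ei, b[1] + b[7] * ei, b[2] + b[8] * ei,
            b[3] + b[9] * ei, b[4], b[5]]


envs = list(YIELDS)
y = [float(v) for e in envs for v in YIELDS[e]]
b5, r2_5 = ols([base_terms(n, p) for e in envs for (n, p) in TREATMENTS], y)
b9, r2_9 = ols([full_terms(n, p, EI[e]) for e in envs for (n, p) in TREATMENTS], y)

ei_mean = sum(EI.values()) / len(EI)
at_mean = collapse(b9, ei_mean)     # compare with b5
at_poor = collapse(b9, 2.5)         # EI = 2500 kg ha-1
at_good = collapse(b9, 5.0)         # EI = 5000 kg ha-1

print("R2 without EI:", r2_5, " with EI:", r2_9)
for label, eq in (("FA5", b5), ("mean EI", at_mean), ("EI 2500", at_poor), ("EI 5000", at_good)):
    print(label, [round(c, 2) for c in eq])
```

Each collapsed equation can then be solved for maximum production and maximum profit in the same way as the FA5 equation.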
Hildebrand, P.E. and F. Poey. 1985. On-farm agronomic trials in farming systems research and extension. Lynne Rienner
Publishers, Boulder CO.
Hildebrand, P.E. and J.R. Russell. 1996. Adaptability analysis: A method for the design, analysis and interpretation of
on-farm research-extension. Iowa State University Press, Ames.