extensive
testing
on farms
in four parts
PART IV
Foreign Agricultural Service April 1954
U.S. DEPARTMENT OF AGRICULTURE
This publication has
been prepared for use in
the technical cooperation
program of the Foreign
Operations Administration.
a guide to
EXTENSIVE TESTING ON FARMS
by Henry Hopp
in 4 parts
Part IV: Using Data to Design
Extensive Tests on
Farms
Foreign Agricultural Service
UNITED STATES DEPARTMENT OF AGRICULTURE
Washington, D. C.
April 1954
CONTENTS
                                                          Page
Preface .................................................. 1
Improving the estimate of number of farms required ....... 2
    Plot variability ..................................... 3
    Location variability ................................. 15
    Treatment variability ................................ 23
Modifying the estimate for large plots ................... 28
Determining the best size of plot ........................ 30
PREFACE
Parts II and III of this Guide describe procedures for laying out the two
kinds of extensive tests: result tests and experiments on farms. One of
the important items in the cost of such projects is their size, i.e., the
number of farms they involve and the area of the plots. Of course you
want to keep the undertaking as small as possible, and yet you want to
meet the requirements for adequate design. In the previous parts, some
rather crude methods were given for deciding on the number of farms to
include in an extensive test: Part II, on result tests, gave a rule-of-thumb
method; Part III, on farm experiments, gave a somewhat better method,
making "informed guesses" and applying to them certain simple statistical
procedures.
Neither of these methods is likely to be very precise. They might miss
the mark by a great deal. The one source of really accurate information
would be data from surveys or experiments.
Fortunately such data are often available: surveys already may have been
made in the area; experiments already may have been conducted at one or
several research stations. If such data pertinent to an extensive test
are available, they can be used in estimating the number of farms as well
as the size of plots. The methods involve conventional statistical procedures
and therefore require some knowledge of statistical analysis. For
tests that are critical, costly, or time-consuming, the added inconvenience
that statistical methods entail may be minor compared with the savings they
accomplish.
This part of the Guide (1) describes the procedure for using data in
estimating the number of farms required for the extensive test, (2) tells
how to modify this procedure when you want plots larger than the research
plots, and (3) gives a method for determining the best size for extensive
test plots.
This part, then, serves as a supplement to the three preceding parts. You
will need it only when you wish to improve the design that you have already
arrived at through the previous parts; and the directions that you find
here can be used only in connection with that design.
IMPROVING THE ESTIMATE OF NUMBER OF FARMS REQUIRED
In Part III of this Guide you learned how to decide on the number of
farms to have in an extensive test. You estimated three kinds of variability
(plot, location, and treatment), and you learned to do it by a
method we might dignify as "informed guessing." With this rather crude
method, your decision as to the number of farms will not be completely
reliable, but you will have to be satisfied with it if you lack actual
yield data from surveys or experiments. If, however, you have or can
get some pertinent yield data and think it worthwhile to determine the
number of farms more reliably, you can substitute more accurate methods
for estimating each of the three kinds of variability.
The method to use will depend on, first, the kind of variability you are
estimating; second, the source of your data, i.e., whether they are from
surveys or from experiments; and third, the competence and facilities
available for making statistical computations, i.e., whether you have a
trained technician and a calculating machine for him to work with.
Three methods are presented here.
Use Method 1 for plot or location variability if you have, or can obtain,
survey data on yields and if you do not have a technician and machine
available.
Use Method 2 for plot or location variability if you have, or can obtain,
survey data and if you have a technician and machine.
Use Method 3 for plot, location, or treatment variability if you have data
from experiments in the area and if you have a technician and machine.
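The three rules above can be restated as a small decision function. This is only an illustrative sketch: the name `choose_method` and its arguments are hypothetical, not part of the Guide, and the function ignores the later caution that Method 3 is suitable for location variability only when the experiments cover enough farms.

```python
def choose_method(kind, has_survey_data, has_experiment_data, has_technician_and_machine):
    """Return 1, 2, or 3 following the rules stated in the text, or None.

    kind is "plot", "location", or "treatment". None means that none of the
    three methods applies, so the guess procedure of Part III remains the fallback.
    """
    if kind not in ("plot", "location", "treatment"):
        raise ValueError("kind must be 'plot', 'location', or 'treatment'")
    # Method 3: experimental data plus a technician and machine; any kind of variability.
    if has_experiment_data and has_technician_and_machine:
        return 3
    # Methods 1 and 2 work from survey data and cover only plot or location variability.
    if kind != "treatment" and has_survey_data:
        return 2 if has_technician_and_machine else 1
    return None


print(choose_method("plot", True, False, False))      # survey data, no machine
print(choose_method("location", True, False, True))   # survey data and a machine
print(choose_method("treatment", True, False, True))  # survey data cannot give treatment variability
```

Preferring Method 3 whenever it is available follows the statement that it gives the most precise estimate; a real planner would still check whether the experimental data are representative.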
Although each of these methods will give you a more precise estimate of
the optimum number of farms than the guessing method, the third one will
give you the most precise estimate of all. This method, however, can be
used only if some experiments already have been performed in your area
on the practice you are testing. For Methods 1 and 2, usually, data are
more easily obtained.
You do not have to use the same method for estimating all three kinds of
variability. Thus, you might be planning an extensive test on corn fertilizers,
and, though you may not have data on the subject, you may have
some that were obtained in experiments with corn varieties. You can use
these data for determining plot variability, following Method 3. But, if
the research had been done at only two or three locations, the data may
be insufficient for getting location variability. So, for location variability,
you make a little survey and obtain an estimate by Method 2.
Finally, lacking specific information on corn fertilizer experiments, you
may have to fall back on the guess procedure in Part III of this Guide to
estimate treatment variability.
LOCATION VARIABILITY
Method 1
If survey data (the kind used for Method 1) do not already exist for
your area, you can obtain some by making a quick survey of farms. An
example of data collected in such a survey is shown in Table 6: 15
farms were selected at random, and yield per acre of the crop in
question was obtained from the farms' total yield of that crop, not
from small sample plots. If, however, you get yields from small plots,
follow instead the procedure in Method 2 (page 18).
The first step in determining location variability from the yield data
in table 6 is to add all the yields and divide by the number of locations,
thus arriving at the mean yield (in this example, 32). Now
find the difference between the mean yield and each location yield,
and enter these differences, or deviations, in the third column. Rank
the locations in increasing order of the deviations: Farm 2 has the
smallest deviation and ranks first; Farm 4 has the largest deviation
and ranks fifteenth. Now list the 15 farms in the order of this rank
(table 7). The midpoint, or median, rank is 8; and the deviation
corresponding to this rank is 9. Location variability is now obtained
as follows:

$$\text{Location variability} = \frac{\text{Deviation for midpoint rank}}{0.7} = \frac{9}{0.7} = 13$$

To express the variability in percent, use this equation:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{13 \times 100}{32} = 41\%$$

Use this estimate in the calculations of extensive-test error (Part III,
page 15).
4/ Snedecor, op. cit., sec. 2.16.
Table 6. Survey data on average yield of entire farms, for 15
         randomly selected farms.

Farm    Yield per acre    Deviation from mean    Rank
  1          27                   5                4
  2          33                   1                1
  3          23                   9                7
  4          71                  39               15
  5          31                   1                2
  6          24                   8                6
  7          21                  11               10
  8          21                  11               11
  9          57                  25               14
 10          15                  17               13
 11          41                   9                8
 12          31                   1                3
 13          27                   5                5
 14          42                  10                9
 15          19                  13               12

Total yield    483
Mean yield      32
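Table 7. Relisting of farms of table 6 in order of rank.

Rank    Farm    Deviation
  1       2         1
  2       5         1
  3      12         1
  4       1         5
  5      13         5
  6       6         8
  7       3         9
  8      11         9
  9      14        10
 10       7        11
 11       8        11
 12      15        13
 13      10        17
 14       9        25
 15       4        39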
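The Method 1 arithmetic for location variability can be checked with a few lines of Python. This is a minimal sketch that copies the yields from table 6 and rounds at the same points as the hand calculation (mean to 32, variability to 13); the variable names are illustrative only.

```python
from statistics import mean, median

# Yield per acre for the 15 farms of table 6.
yields = [27, 33, 23, 71, 31, 24, 21, 21, 57, 15, 41, 31, 27, 42, 19]

mean_yield = round(mean(yields))                      # 32.2, rounded to 32 as in the text
deviations = [abs(y - mean_yield) for y in yields]    # third column of table 6
median_deviation = median(deviations)                 # deviation at the midpoint rank (8 of 15)

# Rule of the text: divide the median deviation by 0.7, then round as the text does.
location_variability = round(median_deviation / 0.7)
location_variability_pct = location_variability * 100 / mean_yield

print(median_deviation, location_variability, round(location_variability_pct))
```

With an odd number of farms the median is simply the middle deviation, so ranking ties in any order gives the same result.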
Method 2
With Method 2, which calls for a calculating machine, you can make a
more accurate estimate of location variability from the survey data.
We will illustrate the procedure for two types of data: one in which
yields of entire farms are available, and the other in which yields
of sample plots are available.
The data in table 6 can be used as an example of yields for entire
farms. Start with the yields in the second column of the table and
add their squares:5/

$$27^2 + 33^2 + \cdots + 19^2 = 18{,}777$$

From this value, the uncorrected sum of squares, subtract a "correction
factor" obtained by squaring the sum of the yields and dividing by the
number of farms:

$$\text{Correction factor} = \frac{483^2}{15} = 15{,}553$$

Subtracting 15,553 from 18,777, you get 3,224. This value is called
the corrected sum of squares. Now divide by the number of farms less
1, that is, by 15 - 1, to get the mean square:

$$\text{Mean square} = \frac{\text{Sum of squares}}{\text{Number of farms} - 1} = \frac{3{,}224}{14} = 230$$

The square root of 230, or 15, is the location variability. Express
it in percent as follows:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{15 \times 100}{32} = 47\%$$

5/ Snedecor, op. cit., sec. 2.8.
This 47 percent is a more exact computation of the location variability
than the 41 percent obtained by Method 1. This becomes, then, the
estimate to use in calculating extensive-test error (Part III, page 15).
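The whole-farm version of Method 2 is the ordinary sample standard deviation, so it can be reproduced without a calculating machine. The sketch below follows the same steps (uncorrected sum of squares, correction factor, mean square, square root) on the table 6 yields; it keeps unrounded intermediate values, which here lead to the same rounded answers as the text.

```python
from math import sqrt

# Yield per acre for the 15 farms of table 6.
yields = [27, 33, 23, 71, 31, 24, 21, 21, 57, 15, 41, 31, 27, 42, 19]
n = len(yields)

uncorrected_ss = sum(y * y for y in yields)       # sum of the squared yields
correction = sum(yields) ** 2 / n                 # (total yield)^2 / number of farms
corrected_ss = uncorrected_ss - correction
mean_square = corrected_ss / (n - 1)              # divide by number of farms less 1

location_variability = sqrt(mean_square)
mean_yield = sum(yields) / n
location_variability_pct = location_variability * 100 / mean_yield

print(uncorrected_ss, round(correction), round(mean_square))
print(round(location_variability), round(location_variability_pct))
```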
Now let us illustrate the procedure when the yields are for small plots
on farms rather than for entire farms.6/ You must measure 2 plots side
by side on each farm. Hence, you can use here the data collected for
plot variability in table 1. However, summarize the data as shown in
table 8.
The first step is to square each observation, or the value for each
plot, and add the squares to get the uncorrected sum:

$$31^2 + 21^2 + \cdots + 21^2 = 37{,}769$$

Then compute the correction factor:

$$\text{Correction factor} = \frac{(\text{Total yield})^2}{\text{Number of plots}} = \frac{965^2}{30} = 31{,}041$$

The difference between the uncorrected sum of squares and this
correction factor, 37,769 - 31,041, or 6,728, is called the total sum
of squares. Enter it in the indicated place in table 9. In the next
column, enter the total degrees of freedom: the total number of observations
less 1, or 30 - 1.
The next value to compute is the sum of squares for farms. Add the
squares of the farm totals (in last column of table 8) and divide by
the number of plots per farm:

$$\frac{66^2 + 54^2 + \cdots + 38^2}{2} = \frac{74{,}889}{2} = 37{,}444$$

and then subtract the correction factor, already computed as 31,041, to
get 6,403. Enter this value in table 9 as the sum of squares for farms.
In the next column enter the degrees of freedom: the number of farms
less 1, or 15 - 1.
Now, for plots, obtain the sum of squares and degrees of freedom by
subtracting the values for farms from the values for total.
6/ Snedecor, op. cit., sec. 10.6.
Table 8. Survey data on yields of 2 adjoining plots on each of 15 farms.
         (Each plot is of the size to be used in the extensive test.)

Farm    Plot 1    Plot 2    Sum
  1       31        35       66
  2       21        33       54
  3       25        20       45
  4       74        68      142
  5       32        31       63
  6       21        28       49
  7       24        18       42
  8       20        22       42
  9       52        61      113
 10       13        17       30
 11       46        36       82
 12       33        28       61
 13       30        24       54
 14       38        46       84
 15       17        21       38

Total yield ........ 965
Number of plots ....  30
Mean yield .........  32
Table 9. Analysis of variance of data in table 8, for arriving
         at location and plot variabilities.

Source of                   Sum of     Degrees of    Mean               Variance
variation                   squares    freedom       square    Units    component
Farms                        6,403        14          457
Plots                          325        15           22        1         22
Total                        6,728        29
Difference due to farms                               435        2        218
Obtain the mean squares for farms and plots by dividing the degrees of
freedom into the sum of squares. Then, to find out what variation was
due to the farms, subtract the mean square for plots from the mean
square for farms. You have to do this to eliminate the plot-to-plot
variation from the farm differences.
The next column of table 9, "units," gives the number of individual
plots that were represented in the numbers you squared. For farms, you
squared a value that represented 2 plots; hence, enter 2 in the bottom
line of that column. For plots, you squared the values for individual
plot yields; hence, enter 1 for plots. Divide the mean squares by the
respective number of units to get the values called variance components.
The square roots of these variance components are the variability
values we seek. For farms (locations) the variability is the square
root of 218, or 14.8. Expressed in percent of the mean, it is:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{14.8 \times 100}{32} = 46\%$$

Use this variability in calculating extensive-test error (Part III,
page 15).
If the plots of the survey are approximately the same size as those
you will use in the extensive test, the variance component for plots
can be used to determine plot variability. Follow the same procedure
as with location variability:
$$\text{Plot variability} = \sqrt{22} = 4.7$$

You express this variability in percent as follows:

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield}} = \frac{4.7 \times 100}{32} = 15\%$$

Note that you have the same answer here as you obtained by Method 2 for
plot variability, page 6. Use this estimate in your calculation of
extensive-test error (Part III, page 15).
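The analysis of variance of table 9 can be reproduced as follows. This is a sketch under the same layout as table 8 (15 farms, 2 adjoining plots each), with illustrative variable names. It keeps unrounded intermediate values, so a figure may differ by one unit from the hand-rounded numbers in the text; for example, the plot variability comes to about 14.5 percent here rather than the 15 percent obtained from 4.7 and 32.

```python
from math import sqrt

# Plot yields from table 8 (same data as table 1).
plot1 = [31, 21, 25, 74, 32, 21, 24, 20, 52, 13, 46, 33, 30, 38, 17]
plot2 = [35, 33, 20, 68, 31, 28, 18, 22, 61, 17, 36, 28, 24, 46, 21]

farms = len(plot1)
plots_per_farm = 2
n = farms * plots_per_farm
all_yields = plot1 + plot2

correction = sum(all_yields) ** 2 / n
total_ss = sum(y * y for y in all_yields) - correction
farm_totals = [a + b for a, b in zip(plot1, plot2)]
farm_ss = sum(s * s for s in farm_totals) / plots_per_farm - correction
plot_ss = total_ss - farm_ss                      # obtained by subtraction, as in the text

farm_df = farms - 1
plot_df = (n - 1) - farm_df
farm_ms = farm_ss / farm_df
plot_ms = plot_ss / plot_df

# Variance components: remove plot-to-plot variation from the farm mean square,
# then divide by the number of plots behind each squared value ("units").
farm_component = (farm_ms - plot_ms) / plots_per_farm
plot_component = plot_ms / 1

mean_yield = sum(all_yields) / n
location_pct = sqrt(farm_component) * 100 / mean_yield
plot_pct = sqrt(plot_component) * 100 / mean_yield

print(round(farm_ms), round(plot_ms), round(farm_component))
print(round(location_pct), round(plot_pct, 1))
```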
PLOT VARIABILITY
Method 1
Method 1 uses survey data and requires no calculating machine. Perhaps
you already have data from surveys previously made; but, if you do not,
you can get some by collecting simple crop-cutting samples from 2 adjoining
plots on each of 10 or more farms, having each plot of the same size
as plots you later will use in the extensive test. From such data you
can make a fair approximation of plot variability, even without a calculating
machine.
Let us consider an example. You have collected data, let us say, on
yields from 2 adjoining plots on each of 15 farms. Now, for each farm
enter the yield for each plot in columns 2 and 3 of table 1; in column
4 enter the difference between them. When you have done this for all
farms, use column 5 to number the differences consecutively in increasing
order of magnitude. Thus, Farm 5, which has the smallest difference
between plots, gets the rank of 1, and Farm 2, which has the largest
difference, gets the rank of 15. Now list the 15 farms in the order of
this rank (table 2). The midpoint, or median, rank is 8; and the difference
corresponding to this rank is 6. Now use this difference to
obtain the figure for plot variability, thus:

$$\text{Plot variability} = \frac{\text{Difference at midpoint rank}}{1.4} = \frac{6}{1.4} = 4.3$$

To express this plot variability in percent you must first calculate
the mean yield of the plots: divide the sum of all the yields in table
1 (965) by the total number of plots (30). Then use the mean yield (32)
in this equation:

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield}} = \frac{4.3 \times 100}{32} = 13\%$$
1/ George W. Snedecor, Statistical Methods, 4th ed., 1946, sec. 2.16.
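Table 1. Survey data on yields of 2 adjoining plots on each of 15
         farms. (Each plot is of the size to be used in the
         extensive test.)

Farm    Plot 1    Plot 2    Difference    Rank
  1       31        35          4           3
  2       21        33         12          15
  3       25        20          5           6
  4       74        68          6           8
  5       32        31          1           1
  6       21        28          7          11
  7       24        18          6           9
  8       20        22          2           2
  9       52        61          9          13
 10       13        17          4           4
 11       46        36         10          14
 12       33        28          5           7
 13       30        24          6          10
 14       38        46          8          12
 15       17        21          4           5

Total yield        965
Number of plots     30
Mean yield          32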
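Table 2. Relisting of farms of table 1 in order of rank.

Rank    Farm    Difference
  1       5         1
  2       8         2
  3       1         4
  4      10         4
  5      15         4
  6       3         5
  7      12         5
  8       4         6
  9       7         6
 10      13         6
 11       6         7
 12      14         8
 13       9         9
 14      11        10
 15       2        12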
Substitute this value for the one you got by the "guessing" procedure
in Part III and use it in calculating the extensive-test error (Part III,
p. 15).
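Method 1 for plot variability reduces to a median of paired differences. The sketch below uses the plot yields listed in table 8, which the text identifies as the same survey data as table 1; the variable names are illustrative.

```python
from statistics import median

# Yields of the 2 adjoining plots on each of the 15 farms (table 8).
plot1 = [31, 21, 25, 74, 32, 21, 24, 20, 52, 13, 46, 33, 30, 38, 17]
plot2 = [35, 33, 20, 68, 31, 28, 18, 22, 61, 17, 36, 28, 24, 46, 21]

differences = [abs(a - b) for a, b in zip(plot1, plot2)]   # column 4 of table 1
median_difference = median(differences)                    # difference at the midpoint rank

# Rule of the text: divide the median difference by 1.4.
plot_variability = median_difference / 1.4
mean_yield = sum(plot1 + plot2) / (2 * len(plot1))
plot_variability_pct = plot_variability * 100 / mean_yield

print(median_difference, round(plot_variability, 1), round(plot_variability_pct))
```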
Method 2
Method 2 calls for survey data and requires a calculating machine; use
of the machine will help you obtain a more accurate measure of variability.
Start with data like those in table 1, but omit the last
column; instead, square each difference in the fourth column and add the
squares:

$$4^2 + 12^2 + \cdots + 4^2 = 649$$

To find plot variability, divide this sum of squares by the number of
plots, and extract the square root of the quotient:

$$\text{Plot variability} = \sqrt{\frac{\text{Sum of squares of differences}}{\text{Number of plots}}} = \sqrt{\frac{649}{30}} = 4.7$$

Plot variability in percent is obtained in the same way as for Method 1:

$$\text{Plot variability in percent} = \frac{4.7 \times 100}{32} = 15\%$$
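The same figures can be obtained in a few lines of Python. This sketch again takes the plot yields from table 8 and rounds at the points where the hand calculation rounds (variability to 4.7, mean yield to 32), so that the final percentage agrees with the text.

```python
from math import sqrt

# Yields of the 2 adjoining plots on each of the 15 farms (table 8).
plot1 = [31, 21, 25, 74, 32, 21, 24, 20, 52, 13, 46, 33, 30, 38, 17]
plot2 = [35, 33, 20, 68, 31, 28, 18, 22, 61, 17, 36, 28, 24, 46, 21]
number_of_plots = len(plot1) + len(plot2)

# Square each farm's difference between its two plots and add the squares.
sum_sq_differences = sum((a - b) ** 2 for a, b in zip(plot1, plot2))

# Divide by the number of plots and take the square root.
plot_variability = round(sqrt(sum_sq_differences / number_of_plots), 1)
mean_yield = round(sum(plot1 + plot2) / number_of_plots)
plot_variability_pct = plot_variability * 100 / mean_yield

print(sum_sq_differences, plot_variability, round(plot_variability_pct))
```

Dividing the squared differences by the number of plots (twice the number of farms) is what makes the result a per-plot figure rather than a figure for the difference between two plots.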
Bear in mind that use of a calculating machine does not of itself assure
a reliable estimate of plot variability. The estimate is good only so
far as the data are representative of all the farms in the area. You
should therefore evaluate both the source of the survey data and the
amount that is available. You must have measures of plot variability
from a sufficient number of farms, preferably selected at random, before
you can place much confidence in your mathematical estimate. You may
even modify this estimate by judging the representativeness of the data,
and so arrive at a better estimate. For instance, your data may have
come from soils that are more uniform than other soils in the region and
may, in your judgment, underestimate plot variability for the region as
a whole. If so, you might increase your calculated value by a small
amount; thus, in our example, you might increase it from 15 percent to
20. The estimate you finally arrive at is the one to use in calculating
the extensive-test error (Part III, p. 15).
2/ Ibid., sec. 4.2.
Method 3
Method 3, which uses data from experiments and requires a calculating
machine, gives the most exact estimate of variability. It is the method
that you are likely to choose if you are working in an area that already
has an agricultural research organization, for such organizations usually
accumulate data of the kind required for the method. These accumulated
experimental data, however, will be useful to you only if they have been
collected from farms varied enough to be fairly representative of plot
variability in the region. If they have not, you will do better to use
data obtained from surveys and to analyze them by Methods 1 or 2. Above
all, avoid the error of making a plot-variability estimate from experiments
that were all conducted at one location, as at a research station.
If you are working in an area where research experiments are under way,
you can enlist the assistance of the technician carrying out the experiments:
he can obtain plot variability from the analysis of variance he
makes of his data. Which item in his analysis is the measure of plot
variability will depend on the experimental design he has used; one example,
however, will suffice here to show how variability can be measured
by analysis of experimental data.
Our example is based on a hypothetical experiment set up in a randomized
block design, which is a rather usual type. The experiment tests four
treatments, covers four farms, and has two replications, or blocks, per
farm.3/ Actual experiments may differ in design from this one, but all
well-designed experiments have these characteristics: (1) two or more
treatments, or factors, under test; (2) two or more replications of the
treatments at each location; (3) repetition of the same or closely related
treatments at several locations.
The first step in the analysis is to tabulate the data on yields from
each plot by treatment and location and to add up the total yields for
treatments, farms, and replications (table 3).
Next, prepare a worksheet like that shown in table 4; at the head of it
note the total number of plots in the experiment (32) and the total yield
from these plots (1,035). Now, using a calculating machine, proceed to
get the various sums of squares, entering the figures for each step in
the table.
3/ Snedecor, op. cit., sec. 11135.
Table 3. Experimental data on yields of 32 plots in an experiment with 4 treatments,
         4 farms, and 2 replications per farm. (Each plot is of the size to be
         used in the extensive test.)

                  Farm 1             Farm 2             Farm 3             Farm 4
Treatment    Rep.  Rep.  Total  Rep.  Rep.  Total  Rep.  Rep.  Total  Rep.  Rep.  Total   Total
              1     2            1     2            1     2            1     2
A             39    31     70    36    32     68    27    34     61    35    38     73     272
B (check)     34    35     69    27    24     51    26    21     47    32    33     65     232
C             47    39     86    37    39     76    30    39     69    42    41     83     314
D             36    25     61    26    23     49    21    36     57    21    29     50     217
Total        156   130    286   126   118    244   104   130    234   130   141    271   1,035
Table 4. Worksheet for calculating sums of squares of data in
         table 3. (Number of plots, 32; sum of yields, 1,035)

Item                     Total     Treatments    Farms      Treatments    Replications
                                                            x farms       on farms
Uncorrected sum
  of squares             34,903     273,493      269,529      69,063        135,533
Divisor                       1           8            8           2              4
Quotient                 34,903      34,187       33,691      34,532         33,883
Correction factor        33,476      33,476       33,476      33,476         33,476
Corrected sum
  of squares              1,427         711          215       1,056            407
Less treatments                                                  711
Less farms                                                       215            215
Remainder                                                        130            192
The first one is designated as "total": it is the sum of the squares of
each individual yield number shown in table 3. The uncorrected value is
as follows:
$$39^2 + 34^2 + \cdots + 29^2 = 34{,}903$$

Enter this value in the first line under the column headed "Total."
Immediately beneath it enter the "Divisor," which is merely the number
of plots included in each number you have just squared; in this case it
is only 1. Divide the uncorrected sum of squares by this divisor to
get the quotient 34,903. Before you can arrive at a corrected sum of
squares, you have yet to compute a correction factor:

$$\text{Correction factor} = \frac{(\text{Total yield})^2}{\text{Number of plots}} = \frac{1{,}035^2}{32} = 33{,}476$$

Subtract this correction factor from the quotient,

$$34{,}903 - 33{,}476 = 1{,}427,$$
and you have the corrected sum of squares to enter as the last figure
in the column.
The second sum of squares is calculated from the treatment totals of
table 3:
$$272^2 + 232^2 + 314^2 + 217^2 = 273{,}493$$

Since each treatment total includes the yields of 8 plots, the divisor
is 8. From this point proceed as you did in finding the corrected sum
of squares for the total, and you will arrive at the corrected sum of
squares for treatments: 711.
The third sum of squares is calculated from the farm totals in table 3:

$$286^2 + 244^2 + 234^2 + 271^2 = 269{,}529$$

Again the divisor is 8 since each farm total includes 8 plots. Proceeding
as before, you obtain the corrected sum of 215.
The fourth sum of squares is calculated from the totals of each treatment
on each farm in table 3:

$$70^2 + 69^2 + \cdots + 50^2 = 69{,}063$$

Since each squared number in this calculation includes 2 plots, the
divisor is 2. Proceeding as before, you obtain the number 1,056. But
this is not the corrected sum, as it would have been in the first 3
columns. From it you must first subtract the sums of squares for
treatments (711) and for farms (215) to get 130, the sum of squares for
treatments x farms.
The fifth sum of squares is calculated from the totals of each replication
on each farm in table 3:

$$156^2 + 130^2 + 126^2 + \cdots + 141^2 = 135{,}533$$

The divisor is 4 since each replication total includes 4 plots. Now
proceed as before until you obtain the value 407; this number, however,
contains not only the replication differences but the farm differences
as well. Hence subtract the sum of squares for farms (215); the remainder,
192, is the corrected sum of squares for replications on farms.
Now transfer these 5 sums of squares to the analysis-of-variance sheet
(table 5). In the first column list the various sources of variation,
i.e., the various factors that contributed to the yield: treatments,
farms, treatments x farms, and replications on farms. For each of these,
and for the total, sums of squares are entered (from table 4) in the
second column. For the item just before the total, "treatments x replications
on farms," the sum of squares is obtained by subtracting all
other sums of squares from the total sum of squares.
The third column in table 5 is headed "degrees of freedom." For each
source of variation the number of degrees of freedom is as follows:
    Treatments: Number of treatments (4) less 1.
    Farms: Number of farms (4) less 1.
    Treatments x farms: Degrees for treatments (3)
        multiplied by degrees for farms (3).
    Replications on farms: Number of replications
        on each farm less one (2 - 1) multiplied by
        number of farms (4).
    Treatments x replications on farms: Total degrees
        (number of observations less 1, or 32 - 1)
        less the degrees found thus far (3 + 3 + 9 + 4).
In the last column are listed the mean squares. These are obtained
by dividing the degrees of freedom into the sum of squares. For example,
the mean square for "treatments x replications on farms" is
179/12, or 15.
Table 5. Analysis of variance of data in table 3, for arriving at an
         estimate of plot variability.

Source of variation                  Sum of     Degrees of    Mean
                                     squares    freedom       square
1. Treatments                           711          3          237
2. Farms                                215          3           72
3. Treatments x farms                   130          9           14
4. Replications on farms                192          4           48
5. Treatments x replications
   on farms (plot variability)          179         12           15
6. Total                              1,427         31
You are now ready to calculate the plot variability:

$$\text{Plot variability} = \sqrt{\text{Mean square for ``treatments x replications on farms''}} = \sqrt{15} = 3.87$$

To express this variability in percent, use the following equation,
which calls for the mean yield of untreated plots. These plots are
the ones that were shown in table 3 as receiving treatment B, i.e.,
the check treatment.

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield of check plots}} = \frac{3.87 \times 100}{29.0} = 13\%$$

Use this estimate in the calculations of extensive-test error (Part III,
page 15).
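The worksheet of table 4 and the analysis of variance of table 5 can be reproduced with the sketch below. It assumes the table 3 layout (4 treatments, 4 farms, 2 replications per farm); the dictionary `yields` and the helper `corrected_ss` are illustrative names, not part of the Guide. Intermediate values are not rounded, so sums of squares agree with the worksheet only after rounding.

```python
from math import sqrt

# yields[treatment][farm index] = (replication 1, replication 2), copied from table 3.
yields = {
    "A": [(39, 31), (36, 32), (27, 34), (35, 38)],
    "B": [(34, 35), (27, 24), (26, 21), (32, 33)],   # check treatment
    "C": [(47, 39), (37, 39), (30, 39), (42, 41)],
    "D": [(36, 25), (26, 23), (21, 36), (21, 29)],
}
treatments = list(yields)
t, f, r = len(treatments), 4, 2          # treatments, farms, replications per farm

all_plots = [y for trt in treatments for pair in yields[trt] for y in pair]
n = len(all_plots)
correction = sum(all_plots) ** 2 / n     # (total yield)^2 / number of plots


def corrected_ss(totals, plots_per_total):
    """Sum of squared totals, divided by the divisor, less the correction factor."""
    return sum(x * x for x in totals) / plots_per_total - correction


treatment_totals = [sum(a + b for a, b in yields[trt]) for trt in treatments]
farm_totals = [sum(sum(yields[trt][j]) for trt in treatments) for j in range(f)]
cell_totals = [sum(yields[trt][j]) for trt in treatments for j in range(f)]
rep_totals = [sum(yields[trt][j][k] for trt in treatments)
              for j in range(f) for k in range(r)]

total_ss = corrected_ss(all_plots, 1)
treatment_ss = corrected_ss(treatment_totals, f * r)
farm_ss = corrected_ss(farm_totals, t * r)
tf_ss = corrected_ss(cell_totals, r) - treatment_ss - farm_ss   # treatments x farms
rep_ss = corrected_ss(rep_totals, t) - farm_ss                  # replications on farms
error_ss = total_ss - treatment_ss - farm_ss - tf_ss - rep_ss   # treatments x reps on farms

error_df = (n - 1) - ((t - 1) + (f - 1) + (t - 1) * (f - 1) + f * (r - 1))
plot_variability = sqrt(error_ss / error_df)                    # about 3.87
check_mean = sum(a + b for a, b in yields["B"]) / (f * r)       # mean yield of check plots
plot_variability_pct = plot_variability * 100 / check_mean

print(round(treatment_ss), round(farm_ss), round(tf_ss), round(rep_ss), round(error_ss))
print(error_df, round(plot_variability, 2), round(plot_variability_pct))
```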
The tests at the different farms may not involve exactly the same
experimental design. Often the treatments are not exactly the same.
As long as they are reasonably similar, you can combine them, even if
the number of treatments or replications differ. In order to combine
plot variability from different experiments, calculate first the plot
variability in percent for each experiment separately. Then square the
percentages. Add the squares, and obtain the mean of the sum by dividing
it by the number of experiments. Finally, take the square root of
the mean. As an example:

Experiment          Plot variability    Squares
                       (percent)
    1                     12              144
    2                     15              225
    3                     10              100
    4                     13              169
Sum of squares                            638
Mean of squares (638/4)                   160
Mean plot variability (√160)             12.6%
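Combining the experiments in this way is a root-mean-square of the percentages. A minimal sketch with the four example values:

```python
from math import sqrt

# Plot variability, in percent, from each of the four experiments in the example.
percent_variabilities = [12, 15, 10, 13]

sum_of_squares = sum(p * p for p in percent_variabilities)
mean_of_squares = sum_of_squares / len(percent_variabilities)
mean_plot_variability = sqrt(mean_of_squares)

print(sum_of_squares, round(mean_plot_variability, 1))
```

Squaring before averaging gives more weight to the more variable experiments than a simple average of the percentages would (the simple average here is 12.5).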
A word of caution about Method 3: The calculations are rather exact,
but in using the resulting information for designing an extensive test,
it is generally necessary to exercise some discretion as to applicability.
If the plots to be used in the extensive test are much larger
than those used in the experiment, additional calculations are necessary
(see page 28).
Furthermore, in order to feel confident about using the information from
experiments, you must be fairly sure that soil variability at the experiment
stations is reasonably representative of soil variability on farms
in the region. If the soil is more uniform at the experiment stations
than it is likely to be on the extensive-test farms, you would be justified
in raising somewhat the estimate of plot variability.
Method 3
Method 3, which uses experimental data for estimating variability,
should be used for location variability only if the experiments have
been conducted on enough farms to constitute a fair sampling of the
region. A minimum of 10 farms is a good rule-of-thumb requisite.
Most experiments are conducted in only one location, or in only a few,
and therefore cannot serve the purpose.
Table 3 shows experimental data from only 4 farms, not enough to make
a reasonably precise estimate of location variability. When data are
from so few locations, do not use Method 3 for calculating location
variability. Instead, use one of the other methods already described.
They are based on a more inclusive sample and also require much less
computation.
Another disadvantage of using experimental data for determining location
variability is that the calculations are usually complicated. You will
have to call on the technicians at the research station to do the calculations.
This may be feasible, and we will therefore give an example
of the procedure in the next section, on treatment variability. In
that section we will use the same example for both treatment and location
variability.
TREATMENT VARIABILITY
Only data from experiments can be used to calculate treatment variability.
Hence, Methods 1 and 2, the ones that work with survey data,
are not applicable. But, in using Method 3, be sure that your data
are from an experiment that meets the following requirements: (1)
experimental farms must be representative of the region as to both
anticipated treatment effects and treatment variability and (2) experimental
treatments must be similar to those planned for the extensive
test. Unless the experiment meets both these requirements, you had
better use the method given in Part III (page 13) of this Guide.
Let us assume that the data in table 3 are from an experiment that
meets these requirements and therefore can be used to illustrate the
procedure of Method 3. Start with the analysis of variance that was
made of the data in table 5.7/ 8/ You now rewrite the table, using
symbols as shown in table 10. The purpose of the revision is to help
you isolate the variance component for treatment variability from the
rest of the information obtained in the experiment. The revision looks
imposing but is not difficult if you follow directions step by step.
7/ William G. Cochran and Gertrude M. Cox, Experimental Designs,
New York, 1950, sec. 14.1.
8/ Snedecor, op. cit., sec. 11.135.
Copy the first column of table 5 into the first column of table 10.
Filling in the next section of the table, "Calculating mean square in
symbols," is accomplished in 3 steps that will be easy if you simply
follow directions. By the time you have finished the third step, you
will have written equations that show the various components in symbols
for each mean square.
Step 1 is to assign a letter as a symbol for each source of variation,
thus: T for treatments; F for farms; R(F) for replications, or plots,
on farms; and all the symbols, TFR(F), for the total. After each
symbol, write the number that applied in the experiment: table 3 will
remind you that there were 4 treatments, 4 farms, and 2 replications
per farm, making 32 plots in all.
In Step 2, write the coefficient for each variance in Step 1. This consists
of the symbols that are missing. To clarify: all the symbols, T,
F, and R(F), are found in the total; but not all are found in each
variance. Whichever ones are missing in each are now to be written
under Step 2, in small letters. Thus, for variance T, both F and R(F) are
missing; write them so: fr(f). When you get down to the fourth line,
note that the variance R(F) lacks both T and F; but the F is not
written in Step 2 because there is a rule that when a symbol appears
in parentheses in the variance of Step 1, this symbol shall not be
repeated in the coefficient.
Now you come to Step 3, which, when finished, will give you the complete
formulas. First copy each variance into this column, preceding it with
its coefficient; for "treatments," for instance, write fr(f)T. Now add
to it all other variances that also contain the identifying symbol T,
not forgetting to precede each variance with its coefficient. In this
case, the other variances that contain T are TF and TR(F); the latter,
not having a coefficient to precede it, is added all by itself. Continue
thus to the end of the column. When you get to the last line,
all you will have to enter will be TR(F), for there is no other variance
with all these symbols, neither does it have a coefficient to
precede it.
Having calculated the mean squares in symbols, copy in the last column
the numerical mean squares from table 5; and then you have everything
you need to solve the symbol equations for the numerical value of the
variance components. Start with TR(F), the plot variability, on the
fifth line. You need do no more than look in the last column to find
that its value is 15. Now you can solve for R(F) on line 4:

$$t\,\mathrm{R(F)} + \mathrm{TR(F)} = 48$$
Table 10. Revision of table 5 to obtain variance components for determining
          variability by Method 3.

                               Calculating mean square in symbols                       Mean square
Source of variation       Step 1        Step 2          Step 3                          in numbers
                          (Variance)    (Coefficient)   (Completed formula)
Treatments                T (4)         fr(f)           fr(f)T + r(f)TF + TR(F)             237
Farms (location
  variability)            F (4)         tr(f)           tr(f)F + r(f)TF + tR(F) + TR(F)      72
Treatments x farms
  (treatment
  variability)            TF            r(f)            r(f)TF + TR(F)                       14
Replications on
  farms                   R(F) (2)      t               tR(F) + TR(F)                        48
Treatments x replications
  on farms
  (plot variability)      TR(F)                         TR(F)                                15
Total                     TFR(F) (32)
Since t = 4 (from the second column) and TR(r) = 15, rewrite the equation
thus:
4R(F) + 15 = 48
48 15
R(F) .25
= 8.25
Now proceed to solve
the bottom:
for treatment variability, on the third line from
r() TF + TR(F) = 14
2TF + 15 = 14
14 15
TF =
2
= 0.5
This negative value is unusual; generally TF is positive.
variance component should be taken as zero.
A negative
When you obtain a positive value for TF25, for examplesimply take
its square root to obtain the treatment variability:
Treatment variability = \ TF
S'V25
= 5
To express this treatment variability in percent, use the following
equation, which calls for the mean yield of check, or untreated, plots
(see table 3 for yields of plots receiving treatment B):
    Treatment variability in percent
        = (Treatment variability x 100) / Mean yield, check plots
        = (5 x 100) / 29
        = 17%
Use this estimate in your calculation of extensive-test error (Part III,
page 15).
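The solving steps above can be sketched in a few lines of Python. This is only an illustration: the variable and function names are invented here, the mean squares are those in the last column of table 10, and the value 25 for TF is the hypothetical positive example used in the text.

```python
import math

# Mean squares from the last column of table 10.
ms_tf = 14.0    # treatments x farms
ms_rf = 48.0    # replications on farms
ms_trf = 15.0   # treatments x replications on farms (plot variability)

t = 4           # number of treatments
r = 2           # replications on each farm


def clamp(component):
    """Take a negative variance component as zero, as the text directs."""
    return max(component, 0.0)


comp_trf = ms_trf                          # TR(F) is read off directly
comp_rf = clamp((ms_rf - comp_trf) / t)    # (48 - 15) / 4
raw_tf = (ms_tf - comp_trf) / r            # (14 - 15) / 2, negative here
comp_tf = clamp(raw_tf)


def percent_variability(component, check_mean):
    """Square root of a variance component, as a percent of the check mean."""
    return math.sqrt(component) * 100 / check_mean


# Hypothetical positive TF of 25 with a check-plot mean yield of 29.
example_percent = percent_variability(25.0, 29.0)
```

Here `comp_rf` comes out at 8.25, `raw_tf` at -0.5 (so `comp_tf` is taken as zero), and `example_percent` rounds to 17.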
While we were discussing plot variability early in this section, we
pointed out that experimental data can be used for estimating location
variability also, provided the experiment covers enough locations to
provide an adequate sample. The experiment summarized in table 5 had
only 4 locations, hardly enough for a region of any size. Nevertheless
we can use this experiment to illustrate the procedure for determining
location variability. Refer again to table 10, and write the equation
for farms:
    tr(f)F + r(f)TF + tR(F) + TR(F) = 72
Substitute the coefficients and the variabilities already solved
(taking R(F) as 8):
    (4 x 2 x F) + (2 x 0) + (4 x 8) + 15 = 72
    8F + 0 + 32 + 15 = 72
    F = (72 - 32 - 15) / 8
      = 25 / 8
      = 3.12
Then, since location variability is simply the square root of F,
    Location variability = √F
                         = √3.12
                         = 1.77
To express this location variability in percent, use this equation:
    Location variability in percent
        = (Location variability x 100) / Mean yield of check plots
        = (1.77 x 100) / 29
        = 6%
This is the estimate that you will use in calculating extensive-test
error (Part III, page 15).
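The location calculation can be checked the same way. The sketch below is illustrative only; the names are invented here, and R(F) is entered as 8, the rounded value used in the worked arithmetic, rather than 8.25.

```python
import math

t = 4             # number of treatments
r = 2             # replications on each farm
ms_farms = 72.0   # mean square for farms (table 10)

comp_tf = 0.0     # treatments x farms, negative value taken as zero
comp_rf = 8.0     # replications on farms, rounded as in the worked example
comp_trf = 15.0   # plot variability component

# Solve tr(f)F + r(f)TF + tR(F) + TR(F) = 72 for F.
comp_f = (ms_farms - r * comp_tf - t * comp_rf - comp_trf) / (t * r)

location_variability = math.sqrt(comp_f)
location_percent = location_variability * 100 / 29   # check-plot mean is 29
```

With these inputs `comp_f` is 3.125, the location variability rounds to 1.77, and the percentage rounds to 6.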
MODIFYING THE ESTIMATE FOR LARGE PLOTS
If you have determined plot variability from data taken from plots of
the same size as those you will use in the extensive test, you will not
need the information in this section. But you may wish to use larger
plots in the extensive test, especially in order to enhance its
demonstrational value. If so, you will need the information given here.
When the extensive-test plots are to be much larger than the
experimental plots, there is an additional adjustment to make. The
procedure is quite simple. If the experimental plots were 5 square feet
and the extensive-test plots will be 500 square feet, each of the latter
will contain 100 experimental-plot units. Now refer to table 11. In the
first column is listed a range of ratios between the sizes of the
extensive-test plots and the experimental plots. The remaining columns
give corresponding factors: one for crop tests, the other for animal
tests. For a plot-size ratio of 100, the factor for crop tests is 3.2.
Now, if the plot variability in the experimental plots is 13 percent
(for example, page 15), then
    Plot variability for extensive test (%)
        = Plot variability in experiment (%) / Factor
        = 13 / 3.2
        = 4%
The plot variability thus obtained is the one to use in calculating
extensive-test error (Part III, page 15).
A word of explanation will clarify the two sets of factors in table 11.
In crop tests, the plot variability does not decrease in proportion to
the size of the plots: as the size of the plots is increased, more
variable ground is likely to be included. The variability between two
plots lying side by side is less than the variability between two plots
some distance apart. Thus, the set of factors for field crops, which
is the fourth root of the ratios shown in the first column, takes into
consideration this increased variability of soil with increasing plot
size.2/ In animal tests, on the contrary, this circumstance does not
apply; therefore, the factors used are simply the square roots of the
ratios shown in the first column. The factors for animal tests can be
used also for any other tests in which increased "plot size" does not mean
a proportionate increase in the area of land per plot.
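The adjustment for plot size can be sketched as follows. This is a minimal illustration: the function names are invented here, and the factors are computed directly as fourth roots (crop tests) or square roots (animal tests) of the plot-size ratio instead of being read from table 11.

```python
def plot_size_factor(ratio, crop_test=True):
    """Factor by which plot variability is divided for larger plots.

    ratio: extensive-test plot size divided by experimental plot size.
    Crop tests use the fourth root; animal tests use the square root.
    """
    exponent = 0.25 if crop_test else 0.5
    return ratio ** exponent


def adjusted_plot_variability(experiment_percent, ratio, crop_test=True):
    """Plot variability (%) expected with the larger extensive-test plots."""
    return experiment_percent / plot_size_factor(ratio, crop_test)


# Worked example from the text: 13 percent in the experiment, ratio of 100.
crop_factor = plot_size_factor(100)
crop_result = adjusted_plot_variability(13, 100)
animal_factor = plot_size_factor(100, crop_test=False)
```

For a ratio of 100 the crop factor rounds to 3.2 and the adjusted variability to 4 percent; the animal-test factor for the same ratio is 10.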
2/ An adaptation from "An Empirical Law Describing Heterogeneity in
the Yields of Agricultural Crops," by H. Fairfield Smith, Jour.
Agric. Sci. 28(Part 1):1-23, January 1938. In table 11, an
average heterogeneity factor of 0.50 is used.
Table 11. Factors to be used in determining the plot error
          when the extensive-test plots are larger than
          the experimental plots on which the estimate of
          variability was based.

Ratio: Extensive-test plot size      Factor for       Factor for
       / Experimental plot size      crop tests 1/    animal tests 2/

               1                         1.0              1.0
               2                         1.2              1.4
               3                         1.3              1.7
               4                         1.4              2.0
               5                         1.5              2.2
               6                         1.6              2.4
               8                         1.7              2.8
              10                         1.8              3.2
              20                         2.1              4.5
              30                         2.3              5.5
              50                         2.7              7.1
             100                         3.2             10.0

1/ This factor is the fourth root of the number in the first
   column: ⁴√1, ⁴√2, etc.
2/ This factor is the square root of the number in the first
   column: √1, √2, etc.
DETERMINING THE BEST SIZE OF PLOT
Most often you will not be concerned with the best size of the plots
for an extensive test. Sometimes, however, it is necessary to limit
the size of the plots as much as possible. For example, in a test of a
new insecticide, you may not be able to get a sufficient quantity to
permit large plots in the test. The problem of best plot size arises
also when you are testing large things: trees, fruits, animals, or
even people. You may want, then, to use the smallest plot that you
can. In this section a procedure will be given to help you determine
the best number of individuals, or units, to have in a plot.
You recall from Part III of this Guide, page 15, that the error for an
extensive test is the sum of several variability components. But, if
we have a small number of individuals in a plot, the variability of
these individuals also becomes important, and it, too, must be added
to the other variabilities in our error calculations. You will now
see how to do this.
First, clearly designate the individuals, or units, that you are
dealing with: individual plants and trees, for example, or short
lengths of row in field crops, single hills with several plants per
hill, individual animals in cattle experiments, small quadrats in
pasture and forage experiments, students in a school, persons in a
family, or farmers in a community.
Next, obtain an estimate of the variability of these units. The
value you seek is the variability among units receiving no experimental
treatment. The procedure you follow is the same as the one we have
discussed for determining plot variability except that now you are
dealing with differences between individuals instead of with
differences between plots. The question you must answer is this: "If I
took 2 units per plot at random at a number of farms in the region,
what difference in yield would there be between them?" This question
may be answered by judgment (Part III, pages 13-14) or from survey
data (see the section on plot variability in this Part, Methods 1
and 2, pages 5-7).
This question may be answered also by experimental data, but then a
somewhat different procedure is used. To begin with, get measures of
individual, or unit, yields in the experiment as well as yields of the
whole plots. In table 3, for instance, yields are given for the plots
as a whole, but these must now be supplemented with data for units.
Such data can be gathered by harvesting either every unit (plant, hill,
etc.) or just two units at random in 20 or so of the plots. You will
probably do the latter since it is less work, and we will therefore
show this procedure in detail.
Let us start with the experiment shown in table 3 and assume that each
plot contains 27 plants. Now, go into 20 of these plots and, from each,
harvest 2 plants at random. List the yield data as in table 12. Get
the difference between each pair of plants and write the differences in
the last column of the table.
Now square each difference, add the squares, divide by 2 (the number of
plants per plot actually measured) and multiply by 27 (the total number
of plants per plot). This calculation gives the sum of squares for
plants in plots:
    Sum of squares (plants in plots)
        = (1.01² + 0.14² + ... + 0.21²) x 27 / 2
        = 57.7287
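The same arithmetic can be checked with a short Python sketch. The differences are the 20 values in the last column of table 12; the variable names are invented here for illustration.

```python
# Differences between the two sampled plants in each of the 20 plots
# (last column of table 12).
differences = [
    1.01, 0.14, 0.32, 0.72, 0.62, 0.04, 0.55, 0.12, 0.18, 0.04,
    0.12, 0.09, 0.43, 0.73, 0.39, 0.45, 0.06, 0.51, 0.69, 0.21,
]

plants_measured = 2    # plants actually harvested in each plot
plants_per_plot = 27   # total plants in each plot

# Square each difference, add, divide by 2, and multiply by 27.
sum_squared_differences = sum(d * d for d in differences)
sum_of_squares = sum_squared_differences / plants_measured * plants_per_plot

# Degrees of freedom = number of differences; mean square is the quotient.
degrees_of_freedom = len(differences)
mean_square = sum_of_squares / degrees_of_freedom
```

The result agrees with the figure of 57.7287 given above, and the mean square rounds to 2.89.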
Next make the analysis of variance of the data as shown in table 13.
For plots, the values can be merely copied from the plot-variability
line in table 5. For plants within plots, insert values as follows:
    Sum of squares = the sum of squares you have just now computed.
    Degrees of freedom = the number of differences listed in table 12.
    Mean square = the quotient resulting from dividing the sum of
        squares by the degrees of freedom.
    Square root of mean square = √2.89, or 1.70.
    Units = the number of plants in each plot.
    Factor = the fourth root of the number of units, since this is a
        crop test (see table 11, second column).
You have yet to enter the values of differences due to plots alone,
which are as follows:
    Mean square = the difference between the mean square for plots and
        the mean square for plants in plots = 14.9 - 2.89 = 12.0.
    Square root of mean square = √12, or 3.46.
    Units = 1.
    Factor = the fourth root of the number of units (see table 11,
        second column).
Table 12. Yields of 2 plants selected at random from
          a total of 27 plants in each of 20 plots
          of the experiment shown in table 3.

Plot    Plant 1    Plant 2    Difference
  1      0.67       1.68         1.01
  2      1.37       1.23         0.14
  3      0.94       1.26         0.32
  4      1.51       0.79         0.72
  5      1.66       1.04         0.62
  6      1.25       1.21         0.04
  7      1.39       0.84         0.55
  8      1.27       1.39         0.12
  9      1.28       1.10         0.18
 10      1.20       1.16         0.04
 11      1.19       1.07         0.12
 12      1.28       1.19         0.09
 13      1.00       1.43         0.43
 14      1.76       1.03         0.73
 15      1.11       1.50         0.39
 16      1.58       1.13         0.45
 17      1.30       1.24         0.06
 18      1.04       1.55         0.51
 19      1.21       1.90         0.69
 20      0.99       0.78         0.21
Table 13. Analysis of variance of data for 20 plots and for
          2 plants within each plot, with 27 plants per
          plot.

Source of                  Sum of     Degrees of   Mean     Square root of   Number
variation                  squares    freedom      square   mean square      of units   Factor

Plots                      179          12         14.9
Plants within plots         57.7287     20          2.89        1.70            27       2.28
Difference due to plots                            12.0         3.46             1       1.00
The variabilities are obtained by multiplying the square roots of the
mean squares by the factors in the last column of table 13:
Plant variability = 1.70 x 2.28
= 3.88
Plot variability = 3.46 x 1.00
= 3.46
These variability components are now to be expressed in percent.
Simply multiply the variability by 100 and divide by the mean yield
for the check plots. This mean yield is obtained from table 3, which
indicates the check plots as those receiving treatment B; divide the
total yield from these plots (232) by the number of plots (8). Then
    Plant variability in percent
        = (Plant variability x 100) / Mean yield (check plots)
        = (3.88 x 100) / 29
        = 13.4%
    Plot variability in percent
        = (Plot variability x 100) / Mean yield (check plots)
        = (3.46 x 100) / 29
        = 11.9%
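The chain from sum of squares to percent variability can be traced in a short sketch. The names are invented for illustration, and each intermediate value is rounded the way the worked example rounds it; carrying full precision throughout would shift the last digit slightly.

```python
import math

sum_of_squares_plants = 57.7287   # plants within plots
df_plants = 20                    # number of differences in table 12
ms_plots = 14.9                   # mean square for plots, from table 5
plants_per_plot = 27
check_mean = 232 / 8              # total check-plot yield / number of plots

# Plants within plots.
ms_plants = round(sum_of_squares_plants / df_plants, 2)         # 2.89
root_plants = round(math.sqrt(ms_plants), 2)                    # 1.70
factor_plants = round(plants_per_plot ** 0.25, 2)               # 2.28
plant_variability = round(root_plants * factor_plants, 2)       # 3.88

# Difference due to plots alone (one unit, so the factor is 1.00).
ms_difference = round(ms_plots - ms_plants, 1)                  # 12.0
root_difference = round(math.sqrt(ms_difference), 2)            # 3.46
plot_variability = round(root_difference * 1.00, 2)             # 3.46

plant_percent = round(plant_variability * 100 / check_mean, 1)
plot_percent = round(plot_variability * 100 / check_mean, 1)
```

With this rounding, `plant_percent` is 13.4 and `plot_percent` is 11.9, matching the figures above.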
Now we are ready to determine the error of the extensive test. You
do this by adding the variability components. You have just found the
components for plants and plots; let us assume values of 46 percent and
10 percent respectively for the location and treatment components in
order to illustrate the rest of the procedure. It is like the one
shown in Part III, page 15, with the addition of the plant component
and the factor (F) for the number of plants per plot, which you obtain
from table 11. For Plan A use all four components (the various plans
are given in Part III, in the Appendix):
    Plant variability     = 13/F%;  squared = (13/F)²
    Plot variability      = 12%;    squared = 144
    Location variability  = 46%;    squared = 2,116
    Treatment variability = 10%;    squared = 100
    Total of the squares  = (13/F)² + 2,360
    Extensive-test error  = √((13/F)² + 2,360)
For Plans B to H, omit the location variability:
    Plant variability     = 13/F%;  squared = (13/F)²
    Plot variability      = 12%;    squared = 144
    Treatment variability = 10%;    squared = 100
    Total of the squares  = (13/F)² + 244
    Extensive-test error  = √((13/F)² + 244)
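Written out in full, the two error expressions are as follows. Here F is the factor for the number of plants per plot (not the farm variance component of table 10), and the subscripts are labels added only for this illustration.

```latex
\[
E_{A}(F) = \sqrt{\left(\tfrac{13}{F}\right)^{2} + 12^{2} + 46^{2} + 10^{2}}
         = \sqrt{\left(\tfrac{13}{F}\right)^{2} + 2360}
\]
\[
E_{B\text{--}H}(F) = \sqrt{\left(\tfrac{13}{F}\right)^{2} + 12^{2} + 10^{2}}
                   = \sqrt{\left(\tfrac{13}{F}\right)^{2} + 244}
\]
\[
\lim_{F \to \infty} E_{A}(F) = \sqrt{2360} \approx 48.6,
\qquad
\lim_{F \to \infty} E_{B\text{--}H}(F) = \sqrt{244} \approx 15.6
\]
```

Because the plant term shrinks as F grows, the error cannot fall below the level set by the remaining components, however many plants are put in a plot.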
Now, set up table 14 to aid in determining the best number of units per
plot. Record first the specifications of the extensive test. The first
column in table 14 indicates the plan. The second column lists various
numbers of units per plot, a range from 2 to 40. You can try whatever
numbers you are interested in. In column 3, the units are transposed
into factors by referring to table 11. Since this is an experiment with
a crop, the factors in the second column of table 11 are used. For
tests in which the plots are not units of land, the factors in column 3
of table 11 would be used. Column 4 is the plant variability (13 percent)
divided by the factors of column 3. Column 5 is the square of column 4.
Column 6 is column 5 plus the remainder of the error shown at the head
of the table: 2,360 for plan A and 244 for plans B to H. Column 7 is the
square root of column 6; these values are the errors for the plans.
Column 8 is the difference to be tested, shown at the head of the table
as 30 percent, divided by the plan errors shown in column 7. In column
9 record from Part III, table 2, the number of replications required for
the values in column 8.
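Columns 3 to 8 of such a summary can be generated with a short sketch like the one below. The names are invented here; the factor is computed as a fourth root rounded to one decimal instead of being read from table 11, and column 9 is left out because it must be looked up in Part III, table 2. A hand-prepared table may differ by a unit in the last digit of some entries because of rounding.

```python
import math

PLANT_VARIABILITY = 13.0   # percent
MINIMUM_DIFFERENCE = 30.0  # percent difference to be tested
REMAINDER = {"A": 2360, "B to H": 244}   # remainder of error, by plan


def summary_row(plan, units_per_plot):
    """Return columns 3 to 8 for one plan and one number of units per plot."""
    factor = round(units_per_plot ** 0.25, 1)            # column 3 (crop test)
    plant = round(PLANT_VARIABILITY / factor, 1)         # column 4
    plant_squared = round(plant ** 2)                    # column 5
    total = plant_squared + REMAINDER[plan]              # column 6
    error = round(math.sqrt(total), 1)                   # column 7
    ratio = round(MINIMUM_DIFFERENCE / error, 2)         # column 8
    return factor, plant, plant_squared, total, error, ratio


row_a_2 = summary_row("A", 2)
row_b_10 = summary_row("B to H", 10)
```

For plan A with 2 units per plot this gives a factor of 1.2, an error of 49.8, and a ratio of 0.60; for plans B to H with 10 units it gives 1.8, 17.2, and 1.74.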
Column 9 gives the answer you seek. For plan A note that the number of
replications required is rather large and does not decrease beyond 5
plants per plot. Hence, 5 plants per plot are ample if you desire to
keep the plots as small as possible.
For plans B to H the number of replications required is much fewer.
They decrease up to 10 plants per plot but not thereafter. You have
little to gain, then, by having more than 10 plants in a plot.
Once you have selected the minimum number of plants per plot (5 for
plan A, and 10 for plans B to H), go back to Part III, table 3. Enter
the corresponding number of replications in column 4 of that table.
Then complete the rest of the summary in that table to determine the
total requirements of the extensive test with the stipulated number of
plants per plot.
Table 14. Example of a summary to aid in determining the minimum
          number of plants per plot for several designs.

SPECIFICATIONS
    Number of treatments: 7 (1 is a check)
    No. of plots on each farm: 1, 2, 3, or 4
    Minimum difference: 30%
    Anticipated variability components:
        Plants,     13%
        Plots,      12%
        Locations,  46%
        Treatments, 10%
    Error (Plan A)       = √((13/F)² + 2,360)
    Error (Plans B to H) = √((13/F)² + 244)

(1)       (2)     (3)      (4)           (5)        (6)         (7)          (8)            (9)
          Units            Plant                    Remainder                % Difference   Replications
Plan      per     Factor   variability   Column 4   of error    Error        ÷ error        required
          plot             ÷ Col. 3      squared    + Col. 5    (√Col. 6)    (Col. 7)       (Part III, table 2)

A           2     1.2      10.8          117        2477        49.8          .60           60
            5     1.5       8.7           76        2436        49.4          .61           58
           10     1.8       7.2           52        2412        49.1          .61           58
           15     2.0       6.5           42        2402        49.0          .61           58
           20     2.1       6.2           38        2398        49.0          .61           58
           30     2.3       5.7           32        2393        48.9          .61           58
           40     2.5       5.2           27        2387        48.9          .61           58

B to H      2     1.2      10.8          117         361        19.0         1.58           11
            5     1.5       8.7           76         320        17.9         1.68           10
           10     1.8       7.2           52         296        17.2         1.74            9
           15     2.0       6.5           42         286        16.9         1.78            9
           20     2.1       6.2           38         282        16.8         1.79            9
           30     2.3       5.7           32         277        16.6         1.81            9
           40     2.5       5.2           27         271        16.5         1.82            9
