in four parts
PART IV. Foreign Agricultural Service, April 1954, U.S. DEPARTMENT OF AGRICULTURE
This publication has been prepared for use in the technical cooperation program of the Foreign Operations Administration.
a guide to
EXTENSIVE TESTING ON FARMS
by Henry Hopp
in 4 parts
Part IV: Using Data to Design Extensive Tests on Farms
Foreign Agricultural Service UNITED STATES DEPARTMENT OF AGRICULTURE Washington, D. C.
Improving the estimate of number of farms required . . . 2
  Plot variability . . . 3
  Location variability . . . 15
  Treatment variability . . . 23
Modifying the estimate for large plots . . . 28
Determining the best size of plot . . . 30
Parts II and III of this Guide describe procedures for laying out the two kinds of extensive tests--result tests and experiments on farms. One of the important items in the cost of such projects is their size, i.e., the number of farms they involve and the area of the plots. Of course you want to keep the undertaking as small as possible, and yet you want to meet the requirements for adequate design. In the previous parts, some rather crude methods were given for deciding on the number of farms to include in an extensive test: Part II, on result tests, gave a rule-of-thumb method; Part III, on farm experiments, gave a somewhat better method--making "informed guesses" and applying to them certain simple statistical procedures.
Neither of these methods is likely to be very precise. They might miss the mark by a great deal. The one source of really accurate information would be data from surveys or experiments.
Fortunately such data are often available: surveys already may have been made in the area; experiments already may have been conducted at one or several research stations. If such data pertinent to an extensive test are available, they can be used in estimating the number of farms as well as the size of plots. The methods involve conventional statistical procedures and therefore require some knowledge of statistical analysis. For tests that are critical, costly, or time-consuming, the added inconvenience that statistical methods entail may be minor compared with the savings they accomplish.
This part of the Guide (1) describes the procedure for using data in estimating the number of farms required for the extensive test, (2) tells how to modify this procedure when you want plots larger than the research plots, and (3) gives a method for determining the best size for extensive-test plots.
This part, then, serves as a supplement to the three preceding parts. You will need it only when you wish to improve the design that you have already arrived at through the previous parts; and the directions that you find here can be used only in connection with that design.
IMPROVING THE ESTIMATE OF NUMBER OF FARMS REQUIRED
In Part III of this Guide you learned how to decide on the number of farms to have in an extensive test. You estimated three kinds of variability--plot, location, and treatment--and you learned to do it by a method we might dignify as "informed guessing." With this rather crude method, your decision as to the number of farms will not be completely reliable, but you will have to be satisfied with it if you lack actual yield data from surveys or experiments. If, however, you have or can get some pertinent yield data and think it worthwhile to determine the number of farms more reliably, you can substitute more accurate methods for estimating each of the three kinds of variability.
The method to use will depend on, first, the kind of variability you are estimating; second, the source of your data, i.e., whether they are from surveys or from experiments; and third, the competence and facilities available for making statistical computations, i.e., whether you have a trained technician and a calculating machine for him to work with. Three methods are presented here.
Use Method 1 for plot or location variability if you have, or can obtain, survey data on yields and if you do not have a technician and machine available.
Use Method 2 for plot or location variability if you have, or can obtain, survey data and if you have a technician and machine.
Use Method 3 for plot, location, or treatment variability if you have data from experiments in the area and if you have a technician and machine.
Although each of these methods will give you a more precise estimate of the optimum number of farms than the guessing method, the third one will give you the most precise estimate of all. This method, however, can be used only if some experiments already have been performed in your area on the practice you are testing. For Methods 1 and 2, usually, data are more easily obtained.
You do not have to use the same method for estimating all three kinds of variability. Thus, you might be planning an extensive test on corn fertilizers, and, though you may not have data on the subject, you may have some that were obtained in experiments with corn varieties. You can use these data for determining plot variability, following Method 3. But, if the research had been done at only two or three locations, the data may be insufficient for getting location variability. So, for location variability, you make a little survey and obtain an estimate by Method 2. Finally, lacking specific information on corn fertilizer experiments, you may have to fall back on the guess procedure in Part III of this Guide to estimate treatment variability.
Method 1 uses survey data and requires no calculating machine. Perhaps you already have data from surveys previously made; but, if you do not, you can get some by collecting simple crop-cutting samples from 2 adjoining plots on each of 10 or more farms, having each plot of the same size as plots you later will use in the extensive test. From such data you can make a fair approximation of plot variability, even without a calculating machine.
Let us consider an example. You have collected data, let us say, on yields from 2 adjoining plots on each of 15 farms. Now, for each farm enter the yield for each plot in columns 2 and 3 of table 1; in column 4 enter the difference between them. When you have done this for all farms, use column 5 to number the differences consecutively in increasing order of magnitude. Thus, Farm 5, which has the smallest difference between plots, gets the rank of 1, and Farm 2, which has the largest difference, gets the rank of 15. Now list the 15 farms in the order of this rank (table 2). The midpoint, or median, rank is 8; and the difference corresponding to this rank is 6. Now use this difference to obtain the figure for plot variability, thus:
Plot variability Difference at midpoint rank
To express this plot variability in percent you must first calculate the mean yield of the plots: divide the sum of all the yields in table 1 (965) by the total number of plots (30). Then use the mean yield (32) in this equation:

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield}}$$
1/ George W. Snedecor, Statistical Methods, 4th ed., 1946, sec. 2.16.
Table 1. Survey data on yields of 2 adjoining plots on each of 15 farms. (Each plot is of the size to be used in the extensive test)

Farm   Plot 1   Plot 2   Difference   Rank of difference
  1      31       35          4               3
  2      21       33         12              15
  3      25       20          5               6
  4      74       68          6               8
  5      32       31          1               1
  6      21       28          7              11
  7      24       18          6               9
  8      20       22          2               2
  9      52       61          9              13
 10      13       17          4               4
 11      46       36         10              14
 12      33       28          5               7
 13      30       24          6              10
 14      38       46          8              12
 15      17       21          4               5

Total yield        965
Number of plots     30
Mean yield          32
Table 2. Relisting of farms of table 1 in order of rank.
Rank Farm Difference
1 5 1
2 8 2
3 1 4
4 10 4
5 15 4
6 3 5
7 12 5
8 4 6
9 7 6
10 13 6
11 6 7
12 14 8
13 9 9
14 11 10
15 2 12
Substitute this value for the one you got by the "guessing" procedure in Part III and use it in calculating the extensive-test error (Part III, p. 15).
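The ranking steps of Method 1 amount to taking the median of the plot-to-plot differences. The following is a minimal Python sketch of that bookkeeping, using the yields of table 1; the variable names are illustrative and not part of the Guide. It stops at the midpoint difference and the mean yield, and leaves the conversion to plot variability to the equation given in the text.

```python
from statistics import median

# Yields of the two adjoining plots on each of the 15 farms (table 1).
plot_1 = [31, 21, 25, 74, 32, 21, 24, 20, 52, 13, 46, 33, 30, 38, 17]
plot_2 = [35, 33, 20, 68, 31, 28, 18, 22, 61, 17, 36, 28, 24, 46, 21]

# Column 4 of table 1: difference between the two plots, sign ignored.
differences = [abs(a - b) for a, b in zip(plot_1, plot_2)]

# With 15 farms the midpoint rank is 8, so the 8th smallest difference
# is the median of the differences.
midpoint_difference = median(differences)

# Mean yield of all 30 plots, needed to express variability in percent.
total_yield = sum(plot_1) + sum(plot_2)
mean_yield = total_yield / (len(plot_1) + len(plot_2))

print("differences in rank order:", sorted(differences))
print("difference at midpoint rank:", midpoint_difference)
print("mean yield:", round(mean_yield))
```

Sorting the differences reproduces the order of table 2, so the hand ranking can be checked against the printed list.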
Method 2 calls for survey data and requires a calculating machine; use of the machine will help you obtain a more accurate measure of variability. Start with data like those in table 1, but omit the last column; instead, square each difference in the fourth column and add the squares:

$$4^2 + 12^2 + \dots + 4^2 = 649$$

To find plot variability, divide this sum of squares by the number of plots, and extract the square root of the quotient:

$$\text{Plot variability} = \sqrt{\frac{\text{Sum of squares of differences}}{\text{Number of plots}}} = \sqrt{\frac{649}{30}} = 4.7$$

Plot variability in percent is obtained in the same way as for Method 1:

$$\text{Plot variability in percent} = \frac{4.7 \times 100}{32} = 15$$
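As a check on the machine work, the Method 2 calculation can be sketched in a few lines of Python. This is only an illustration with the table 1 yields; the variable names are assumptions of the sketch. Because it carries unrounded values, the percentage comes out near 14.5 rather than the 15 obtained in the text from the rounded figures 4.7 and 32.

```python
import math

# Yields of the two adjoining plots on each of the 15 farms (table 1).
plot_1 = [31, 21, 25, 74, 32, 21, 24, 20, 52, 13, 46, 33, 30, 38, 17]
plot_2 = [35, 33, 20, 68, 31, 28, 18, 22, 61, 17, 36, 28, 24, 46, 21]

# Square each plot-to-plot difference and add the squares.
differences = [a - b for a, b in zip(plot_1, plot_2)]
sum_of_squares = sum(d * d for d in differences)

# Divide by the number of plots (30) and extract the square root.
number_of_plots = len(plot_1) + len(plot_2)
plot_variability = math.sqrt(sum_of_squares / number_of_plots)

# Express the variability in percent of the mean yield.
mean_yield = (sum(plot_1) + sum(plot_2)) / number_of_plots
plot_variability_pct = plot_variability * 100 / mean_yield

print("sum of squares of differences:", sum_of_squares)
print("plot variability:", round(plot_variability, 1))
print("plot variability in percent:", round(plot_variability_pct, 1))
```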
Bear in mind that use of a calculating machine does not of itself assure a reliable estimate of plot variability. The estimate is good only so far as the data are representative of all the farms in the area. You should therefore evaluate both the source of the survey data and the amount that is available. You must have measures of plot variability from a sufficient number of farms, preferably selected at random, before you can place much confidence in your mathematical estimate. You may even modify this estimate by judging the representativeness of the data, and so arrive at a better estimate. For instance, your data may have come from soils that are more uniform than other soils in the region and may, in your judgment, underestimate plot variability for the region as a whole. If so, you might increase your calculated value by a small amount; thus, in our example, you might increase it from 15 percent to 20. The estimate you finally arrive at is the one to use in calculating the extensive-test error (Part III, p. 15).
2/ Ibid., sec. 4.2.
Method 3, which uses data from experiments and requires a calculating machine, gives the most exact estimate of variability. It is the method that you are likely to choose if you are working in an area that already has an agricultural research organization, for such organizations usually accumulate data of the kind required for the method. These accumulated experimental data, however, will be useful to you only if they have been collected from farms varied enough to be fairly representative of plot variability in the region. If they have not, you will do better to use data obtained from surveys and to analyze them by Methods 1 or 2. Above all, avoid the error of making a plot-variability estimate from experiments that were all conducted at one location, as at a research station.
If you are working in an area where research experiments are under way, you can enlist the assistance of the technician carrying out the experiments: he can obtain plot variability from the analysis of variance he makes of his data. Which item in his analysis is the measure of plot variability will depend on the experimental design he has used; one example, however, will suffice here to show how variability can be measured by analysis of experimental data.
Our example is based on a hypothetical experiment set up in a randomized block design, which is a rather usual type. The experiment tests four treatments, covers four farms, and has two replications, or blocks, per farm. 3/ Actual experiments may differ in design from this one, but all well-designed experiments have these characteristics: (1) two or more treatments, or factors, under test; (2) two or more replications of the treatments at each location; (3) repetition of the same or closely related treatments at several locations.
The first step in the analysis is to tabulate the data on yields from each plot by treatment and location and to add up the total yields for treatments, farms, and replications (table 3).
Next, prepare a worksheet like that shown in table 4; at the head of it note the total number of plots in the experiment (32) and the total yield from these plots (1,035). Now, using a calculating machine, proceed to get the various sums of squares, entering the figures for each step in the table.
3/ Snedecor, op. cit., sec. 11.13.
Table 3. Experimental data on yields of 32 plots in an experiment with 4 treatments, 4 farms, and 2 replications per farm. (Each plot is of the size to be used in the extensive test)

                  Farm 1               Farm 2               Farm 3               Farm 4
Treatment    Rep.1 Rep.2 Total    Rep.1 Rep.2 Total    Rep.1 Rep.2 Total    Rep.1 Rep.2 Total    Total
A              39    31    70       36    32    68       27    34    61       35    38    73      272
B (check)      34    35    69       27    24    51       26    21    47       32    33    65      232
C              47    39    86       37    39    76       30    39    69       42    41    83      314
D              36    25    61       26    23    49       21    36    57       21    29    50      217
Total         156   130   286      126   118   244      104   130   234      130   141   271    1,035
Table 4. Worksheet for calculating sums of squares of data in table 3. (Number of plots, 32; sum of yields, 1,035)

Item                              Total   Treatments     Farms   Treatments   Replications
                                                                   x farms       on farms
Uncorrected sum of squares       34,903      273,493   269,529       69,063        135,533
Divisor                               1            8         8            2              4
Quotient                         34,903       34,187    33,691       34,532         33,883
Correction factor                33,476       33,476    33,476       33,476         33,476
Quotient less correction factor   1,427          711       215        1,056            407
Less treatments                                                        - 711
Less farms                                                             - 215          - 215
Corrected sum of squares          1,427          711       215          130            192
The first one is designated as "total": it is the sum of the squares of each individual yield number shown in table 3. The uncorrected value is as follows:

$$39^2 + 34^2 + \dots + 29^2 = 34{,}903$$

Enter this value in the first line under the column headed "Total." Immediately beneath it enter the "Divisor," which is merely the number of plots included in each number you have just squared; in this case it is only 1. Divide the uncorrected sum of squares by this divisor to get the quotient, 34,903. Before you can arrive at a corrected sum of squares, you have yet to compute a correction factor:

$$\text{Correction factor} = \frac{(\text{Total yield})^2}{\text{Number of plots}} = \frac{1{,}035^2}{32} = 33{,}476$$

Subtract this correction factor from the quotient, 34,903 - 33,476 = 1,427, and you have the corrected sum of squares to enter as the last figure in the column.
The second sum of squares is calculated from the treatment totals of table 3:

$$272^2 + 232^2 + 314^2 + 217^2 = 273{,}493$$

Since each treatment total includes the yields of 8 plots, the divisor is 8. From this point proceed as you did in finding the corrected sum of squares for the total, and you will arrive at the corrected sum of squares for treatments--711. The third sum of squares is calculated from the farm totals in table 3:

$$286^2 + 244^2 + 234^2 + 271^2 = 269{,}529$$

Again the divisor is 8 since each farm total includes 8 plots. Proceeding as before, you obtain the corrected sum of 215. The fourth sum of squares is calculated from the totals of each treatment on each farm in table 3:

$$70^2 + 69^2 + \dots + 50^2 = 69{,}063$$

Since each squared number in this calculation includes 2 plots, the divisor is 2. Proceeding as before, you obtain the number 1,056. But this is not the corrected sum, as it would have been in the first 3 columns. From it you must first subtract the sums of squares for treatments (711) and for farms (215) to get 130, the sum of squares for treatments x farms.
The fifth sum of squares is calculated from the totals of each replication on each farm in table 3:

$$156^2 + 130^2 + 126^2 + \dots + 141^2 = 135{,}533$$

The divisor is 4 since each replication total includes 4 plots. Now proceed as before until you obtain the value 407; this number, however, contains not only the replication differences but the farm differences as well. Hence subtract the sum of squares for farms (215); the remainder, 192, is the corrected sum of squares for replications on farms.
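The worksheet steps above can be reproduced with a short Python sketch. It is an illustration only: the data are those of table 3, and the dictionary layout and the helper name `corrected_ss` are assumptions of the sketch, not part of the Guide. Each call squares a set of totals, divides by the number of plots in each total, and subtracts the correction factor.

```python
# yields[treatment][farm] = (rep. 1, rep. 2), copied from table 3.
yields = {
    "A": [(39, 31), (36, 32), (27, 34), (35, 38)],
    "B": [(34, 35), (27, 24), (26, 21), (32, 33)],
    "C": [(47, 39), (37, 39), (30, 39), (42, 41)],
    "D": [(36, 25), (26, 23), (21, 36), (21, 29)],
}
n_farms, n_reps = 4, 2

all_plots = [y for farms in yields.values() for reps in farms for y in reps]
correction = sum(all_plots) ** 2 / len(all_plots)  # 1,035 squared / 32


def corrected_ss(totals, plots_per_total):
    """Square the totals, divide by the divisor, subtract the correction factor."""
    return sum(x * x for x in totals) / plots_per_total - correction


treatment_totals = [sum(sum(reps) for reps in farms) for farms in yields.values()]
farm_totals = [sum(sum(farms[f]) for farms in yields.values())
               for f in range(n_farms)]
cell_totals = [sum(reps) for farms in yields.values() for reps in farms]
rep_totals = [sum(farms[f][r] for farms in yields.values())
              for f in range(n_farms) for r in range(n_reps)]

ss_total = corrected_ss(all_plots, 1)
ss_treatments = corrected_ss(treatment_totals, 8)
ss_farms = corrected_ss(farm_totals, 8)
# The last two columns still contain other effects, which are subtracted.
ss_treat_x_farms = corrected_ss(cell_totals, 2) - ss_treatments - ss_farms
ss_reps_on_farms = corrected_ss(rep_totals, 4) - ss_farms

for name, value in [("total", ss_total), ("treatments", ss_treatments),
                    ("farms", ss_farms),
                    ("treatments x farms", ss_treat_x_farms),
                    ("replications on farms", ss_reps_on_farms)]:
    print(f"{name:24s}{round(value):6d}")
```

The rounded results agree with the last line of each column in table 4.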
Now transfer these 5 sums of squares to the analysis-of-variance sheet (table 5). In the first column list the various sources of variation, i.e., the various factors that contributed to the yield: treatments, farms, treatments x farms, and replications on farms. For each of these, and for the total, sums of squares are entered (from table 4) in the second column. For the item just before the total, "treatments x replications on farms," the sum of squares is obtained by subtracting all other sums of squares from the total sum of squares.
The third column in table 5 is headed "degrees of freedom." For each source of variation the number of degrees of freedom is as follows:
Treatments: Number of treatments (4) less 1.
Farms: Number of farms (4) less 1.
Treatments x farms: Degrees for treatments (3) multiplied by degrees for farms (3).
Replications on farms: Number of replications on each farm less one (2 - 1) multiplied by number of farms (4).
Treatments x replications on farms: Total degrees (number of observations less 1, or 32 - 1) less the degrees found thus far (3 + 3 + 9 + 4).
In the last column are listed the mean squares. These are obtained by dividing the degrees of freedom into the sum of squares. For example, the mean square for "treatments x replications on farms" is 179/12, or 15.
Table 5. Analysis of variance of data in table 3, for arriving at an estimate of plot variability.

Source of variation                        Sum of    Degrees of   Mean
                                           squares   freedom      square
1. Treatments                                 711         3         237
2. Farms                                      215         3          72
3. Treatments x farms                         130         9          14
4. Replications on farms                      192         4          48
5. Treatments x replications on farms
   (plot variability)                         179        12          15
6. Total                                    1,427        31
You are now ready to calculate the plot variability:

$$\text{Plot variability} = \sqrt{\text{Mean square for "treatments x replications on farms"}} = \sqrt{15} = 3.87$$

To express this variability in percent, use the following equation, which calls for the mean yield of untreated plots. These plots are the ones that were shown in table 3 as receiving treatment B, i.e., the check treatment. Their total yield is 232 from 8 plots, a mean of 29.

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield of check plots}} = \frac{3.87 \times 100}{29} = 13$$
Use this estimate in the calculations of extensive-test error (Part III, page 15).
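The steps from the sums of squares to the plot variability can be sketched as follows. This Python fragment is illustrative only; it starts from the rounded sums of squares of table 5, and its variable names are assumptions. It works with the unrounded mean square (about 14.9), so the plot variability prints as 3.86 instead of the 3.87 obtained in the text from the rounded mean square of 15.

```python
import math

t, f, r = 4, 4, 2  # treatments, farms, replications per farm

# Corrected sums of squares taken from table 5.
ss_total = 1427
ss = {
    "treatments": 711,
    "farms": 215,
    "treatments x farms": 130,
    "replications on farms": 192,
}
df = {
    "treatments": t - 1,
    "farms": f - 1,
    "treatments x farms": (t - 1) * (f - 1),
    "replications on farms": (r - 1) * f,
}

# The plot-variability line is obtained by subtraction from the total.
residual = "treatments x replications on farms"
ss[residual] = ss_total - sum(ss.values())
df[residual] = (t * f * r - 1) - sum(df.values())

mean_square = {source: ss[source] / df[source] for source in ss}

plot_variability = math.sqrt(mean_square[residual])

# Mean yield of the check plots (treatment B in table 3): 232 over 8 plots.
check_mean = 232 / 8
plot_variability_pct = plot_variability * 100 / check_mean

for source in ss:
    print(f"{source:38s}{ss[source]:6d}{df[source]:4d}{mean_square[source]:8.1f}")
print("plot variability:", round(plot_variability, 2))
print("plot variability in percent:", round(plot_variability_pct))
```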
The tests at the different farms may not involve exactly the same experimental design. Often the treatments are not exactly the same. As long as they are reasonably similar, you can combine them, even if the number of treatments or replications differs. In order to combine plot variability from different experiments, calculate first the plot variability in percent for each experiment separately. Then square the percentages. Add the squares, and obtain the mean of the sum by dividing it by the number of experiments. Finally, take the square root of the mean. As an example:

Experiment    Plot variability (percent)    Square
    1                    12                   144
    2                    15                   225
    3                    10                   100
    4                    13                   169

Sum of squares                                638
Mean of squares (638/4)                       160
Mean plot variability (square root of 160)    12.6%
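Combining the experiments in this way is a root-mean-square average of the percentages. A minimal Python sketch with the four example values follows; the variable names are illustrative.

```python
import math

# Plot variability in percent from each of the four experiments.
plot_variability_pct = [12, 15, 10, 13]

# Square the percentages, average the squares, take the square root.
sum_of_squares = sum(p * p for p in plot_variability_pct)
mean_of_squares = sum_of_squares / len(plot_variability_pct)
mean_plot_variability = math.sqrt(mean_of_squares)

print("sum of squares:", sum_of_squares)
print("mean plot variability (percent):", round(mean_plot_variability, 1))
```

A root-mean-square average is never smaller than the ordinary mean of the same percentages (12.5 here), so this way of combining leans slightly toward the more variable experiments.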
A word of caution about Method 3: The calculations are rather exact, but in using the resulting information for designing an extensive test, it is generally necessary to exercise some discretion as to applicability. If the plots to be used in the extensive test are much larger than those used in the experiment, additional calculations are necessary (see page 28).
Furthermore, in order to feel confident about using the information from experiments, you must be fairly sure that soil variability at the experiment stations is reasonably representative of soil variability on farms in the region. If the soil is more uniform at the experiment stations than it is likely to be on the extensive-test farms, you would be justified in raising somewhat the estimate of plot variability.
If survey data--the kind used for Method 1--do not already exist for your area, you can obtain some by making a quick survey of farms. An example of data collected in such a survey is shown in table 6: 15 farms were selected at random, and yield per acre of the crop in question was obtained from the farms' total yield of that crop, not from small sample plots. If, however, you get yields from small plots, follow instead the procedure in Method 2 (page 18).
The first step in determining location variability from the yield data in table 6 is to add all the yields and divide by the number of locations, thus arriving at the mean yield--in this example, 32. Now find the difference between the mean yield and each location yield, and enter these differences, or deviations, in the third column. Rank the locations in increasing order of the deviations: Farm 2 has the smallest deviation and ranks first; Farm 4 has the largest deviation and ranks fifteenth. Now list the 15 farms in the order of this rank (table 7). The midpoint, or median, rank is 8; and the deviation corresponding to this rank is 9. Location variability is now obtained as follows:
Location variability =Deviation for midpoint rank
To express the variability in percent, use this equation:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{13 \times 100}{32} = 41$$
Use this estimate in the calculations of extensive-test error (Part III, page 15).
4/ Snedecor, op. cit., sec. 2.16.
Table 6. Survey data on average yield of entire farms, for 15 randomly selected farms.

Farm   Yield per acre   Deviation from mean   Rank
  1          27                  5              4
  2          33                  1              1
  3          23                  9              7
  4          71                 39             15
  5          31                  1              2
  6          24                  8              6
  7          21                 11             10
  8          21                 11             11
  9          57                 25             14
 10          15                 17             13
 11          41                  9              8
 12          31                  1              3
 13          27                  5              5
 14          42                 10              9
 15          19                 13             12

Total yield   483
Mean yield     32
Table 7. Relisting of farms of table 6 in
order of rank.
Rank Farm Deviation
1 2 1
2 5 1
3 12 1
4 1 5
5 13 5
6 6 8
7 3 9
8 11 9
9 14 10
10 7 11
11 8 11
12 15 13
13 10 17
14 9 25
15 4 39
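The ranking of tables 6 and 7 amounts to finding the median of the deviations from the mean. A minimal Python sketch of those steps follows, using the table 6 yields; the variable names are illustrative. It stops at the midpoint deviation and leaves the conversion to location variability to the equation given in the text.

```python
from statistics import median

# Yield per acre on each of the 15 randomly selected farms (table 6).
yields = [27, 33, 23, 71, 31, 24, 21, 21, 57, 15, 41, 31, 27, 42, 19]

# Mean yield, rounded to a whole number as in table 6.
mean_yield = round(sum(yields) / len(yields))

# Third column of table 6: deviation of each farm from the mean, sign ignored.
deviations = [abs(y - mean_yield) for y in yields]

# With 15 farms the midpoint rank is 8, so the 8th smallest deviation
# is the median of the deviations.
midpoint_deviation = median(deviations)

print("mean yield:", mean_yield)
print("deviations in rank order:", sorted(deviations))
print("deviation at midpoint rank:", midpoint_deviation)
```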
With Method 2, which calls for a calculating machine, you can make a more accurate estimate of location variability from the survey data. We will illustrate the procedure for two types of data: one in which yields of entire farms are available, and the other in which yields of sample plots are available. The data in table 6 can be used as an example of yields for entire farms. Start with the yields in the second column of the table and add their squares:

$$27^2 + 33^2 + \dots + 19^2 = 18{,}777$$

From this value, the uncorrected sum of squares, subtract a "correction factor" obtained by squaring the sum of the yields and dividing by the number of farms:

$$\text{Correction factor} = \frac{483^2}{15} = 15{,}553$$

Subtracting 15,553 from 18,777, you get 3,224. This value is called the corrected sum of squares. Now divide by the number of farms less 1, that is, by 15 - 1, to get the mean square:

$$\text{Mean square} = \frac{\text{Sum of squares}}{\text{Number of farms} - 1} = \frac{3{,}224}{14} = 230$$

The square root of 230, or 15, is the location variability. Express it in percent as follows:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{15 \times 100}{32} = 47$$
5/ Snedecor, op cit., sec. 2.8.
This 47 percent is a more exact computation of the location variability than the 41 percent obtained by Method 1. This becomes, then, the estimate to use in calculating extensive-test error (Part III, page 15).
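The whole-farm calculation of Method 2 is the ordinary sample standard deviation of the farm yields. The Python sketch below follows the steps of the text and then compares the result with the standard library's `statistics.stdev`; the variable names are illustrative.

```python
import math
from statistics import stdev

# Yield per acre on each of the 15 randomly selected farms (table 6).
yields = [27, 33, 23, 71, 31, 24, 21, 21, 57, 15, 41, 31, 27, 42, 19]
n_farms = len(yields)

# Uncorrected sum of squares, then the correction factor.
uncorrected = sum(y * y for y in yields)
correction = sum(yields) ** 2 / n_farms

# Corrected sum of squares divided by (number of farms - 1).
mean_square = (uncorrected - correction) / (n_farms - 1)
location_variability = math.sqrt(mean_square)

mean_yield = sum(yields) / n_farms
location_variability_pct = location_variability * 100 / mean_yield

print("uncorrected sum of squares:", uncorrected)
print("correction factor:", round(correction))
print("mean square:", round(mean_square))
print("location variability:", round(location_variability))
print("location variability in percent:", round(location_variability_pct))

# The same figure comes directly from the sample standard deviation.
print("statistics.stdev:", round(stdev(yields), 2))
```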
Now let us illustrate the procedure when the yields are for small plots on farms rather than for entire farms. 6/ You must measure 2 plots side by side on each farm. Hence, you can use here the data collected for plot variability in table 1. However, summarize the data as shown in table 8.
The first step is to square each observation, or the value for each plot, and add the squares to get the uncorrected sum:

$$31^2 + 21^2 + \dots + 21^2 = 37{,}769$$

Then compute the correction factor:

$$\text{Correction factor} = \frac{(\text{Total yield})^2}{\text{Number of plots}} = \frac{965^2}{30} = 31{,}041$$

The difference between the uncorrected sum of squares and this correction factor, 37,769 - 31,041, or 6,728, is called the total sum of squares. Enter it in the indicated place in table 9. In the next column, enter the total degrees of freedom--the total number of observations less 1, or 30 - 1.
The next value to compute is the sum of squares for farms. Add the squares of the farm totals (in last column of table 8) and divide by the number of plots per farm:

$$\frac{66^2 + 54^2 + \dots + 38^2}{2} = \frac{74{,}889}{2} = 37{,}444$$

and then subtract the correction factor, already computed as 31,041, to get 6,403. Enter this value in table 9 as the sum of squares for farms. In the next column enter the degrees of freedom--the number of farms less 1, or 15 - 1.
Now, for plots, obtain the sum of squares and degrees of freedom by subtracting the values for farms from the values for total.
6/ Snedecor, op. cit., sec. 10.6.
Table 8. Survey data on yields of 2 adjoining plots on each of 15 farms. (Each plot is of the size to be used in the extensive test)
Farm Plot 1 Plot 2 Sum
1 31 35 66
2 21 33 54
3 25 20 45
4 74 68 142
5 32 31 63
6 21 28 49
7 24 18 42
8 20 22 42
9 52 61 113
10 13 17 30
11 46 36 82
12 33 28 61
13 30 24 54
14 38 46 84
15 17 21 38
Total yield 965
Number of plots 30
Mean yield 32
Table 9. Analysis of variance of data in table 8, for arriving at location and plot variabilities.

Source of variation        Sum of    Degrees of   Mean     Units   Variance
                           squares   freedom      square           component
Farms                       6,403        14         457
Plots                         325        15          22       1         22
Total                       6,728        29
Difference due to farms                             435       2        218
Obtain the mean squares for farms and plots by dividing the degrees of freedom into the sum of squares. Then, to find out what variation was due to the farms, subtract the mean square for plots from the mean square for farms. You have to do this to eliminate the plot-to-plot variation from the farm differences.
The next column of table 9, "units," gives the number of individual plots that were represented in the numbers you squared. For farms, you squared a value that represented 2 plots; hence, enter 2 in the bottom line of that column. For plots, you squared the values for individual plot yields; hence, enter 1 for plots. Divide the mean squares by the respective number of units to get the values called variance components. The square roots of these variance components are the variability values we seek. For farms (location) the variability is the square root of 218, or 14.8. Expressed in percent of the mean, it is:

$$\text{Location variability in percent} = \frac{\text{Location variability} \times 100}{\text{Mean yield}} = \frac{14.8 \times 100}{32} = 46$$

Use this variability in calculating extensive-test error (Part III, page 15).
If the plots of the survey are approximately the same size as those you will use in the extensive test, the variance component for plots can be used to determine plot variability. Follow the same procedure as with location variability:
$$\text{Plot variability} = \sqrt{22} = 4.7$$

You express this variability in percent as follows:

$$\text{Plot variability in percent} = \frac{\text{Plot variability} \times 100}{\text{Mean yield}} = \frac{4.7 \times 100}{32} = 15$$
Note that you have the same answer here as you obtained by Method 2 for plot variability, page 6. Use this estimate in your calculation of extensive-test error (Part III, page 15).
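The analysis of table 9 can be sketched in Python as follows. The fragment is an illustration with the table 8 yields; its variable names are assumptions. It carries unrounded values throughout, so the intermediate figures differ slightly from the rounded entries of table 9, but the two variabilities agree with the text to one decimal.

```python
import math

# (plot 1, plot 2) yields on each of the 15 farms (table 8).
plots = [(31, 35), (21, 33), (25, 20), (74, 68), (32, 31),
         (21, 28), (24, 18), (20, 22), (52, 61), (13, 17),
         (46, 36), (33, 28), (30, 24), (38, 46), (17, 21)]
n_farms = len(plots)
plots_per_farm = 2

all_yields = [y for pair in plots for y in pair]
n_plots = len(all_yields)
correction = sum(all_yields) ** 2 / n_plots

# Total and farm sums of squares; plots are obtained by subtraction.
ss_total = sum(y * y for y in all_yields) - correction
ss_farms = sum(sum(pair) ** 2 for pair in plots) / plots_per_farm - correction
ss_plots = ss_total - ss_farms

df_farms = n_farms - 1
df_plots = (n_plots - 1) - df_farms

ms_farms = ss_farms / df_farms
ms_plots = ss_plots / df_plots

# Variance components: remove plot-to-plot variation from the farm mean
# square, then divide by the number of plots in each farm total.
farm_component = (ms_farms - ms_plots) / plots_per_farm
plot_component = ms_plots / 1

location_variability = math.sqrt(farm_component)
plot_variability = math.sqrt(plot_component)
mean_yield = sum(all_yields) / n_plots

print("location variability:", round(location_variability, 1))
print("plot variability:", round(plot_variability, 1))
print("location variability in percent:",
      round(location_variability * 100 / mean_yield))
```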
Method 3, which uses experimental data for estimating variability, should be used for location variability only if the experiments have been conducted on enough farms to constitute a fair sampling of the region. A minimum of 10 farms is a good rule-of-thumb requisite. Most experiments are conducted in only one location, or in only a few, and therefore cannot serve the purpose.
Table 3 shows experimental data from only 4 farms, not enough to make a reasonably precise estimate of location variability. When data are from so few locations, do not use Method 3 for calculating location variability. Instead, use one of the other methods already described. They are based on a more inclusive sample and also require much less computation.
Another disadvantage of using experimental data for determining location variability is that the calculations are usually complicated. You will have to call on the technicians at the research station to do the calculations. This may be feasible, and we will therefore give an example of the procedure in the next section, on treatment variability. In that section we will use the same example for both treatment and location variability.
Only data from experiments can be used to calculate treatment variability. Hence, Methods 1 and 2, the ones that work with survey data, are not applicable. But, in using Method 3, be sure that your data are from an experiment that meets the following requirements: (1) Experimental farms must be representative of the region as to both anticipated treatment effects and treatment variability and (2) experimental treatments must be similar to those planned for the extensive test. Unless the experiment meets both these requirements, you had better use the method given in Part III (page 13) of this Guide.
Let us assume that the data in table 3 are from an experiment that meets these requirements and therefore can be used to illustrate the procedure of Method 3. Start with the analysis of variance that was made of the data in table 5. 7/ 8/ You now rewrite the table, using symbols as shown in table 10. The purpose of the revision is to help you isolate the variance component for treatment variability from the rest of the information obtained in the experiment. The revision looks imposing but is not difficult if you follow directions step by step.
7/ William G. Cochran and Gertrude M. Cox, Experimental Designs, New York, 1950, sec. 14.1.
8/ Snedecor, op. cit., sec. 11.13.
Copy the first column of table 5 into the first column of table 10. Filling in the next section of the table, "Calculating mean square in symbols," is accomplished in 3 steps that will be easy if you simply follow directions. By the time you have finished the third step, you will have written equations that show the various components in symbols for each mean square.
Step 1 is to assign a letter as a symbol for each source of variation, thus: T for treatments; F for farms; R(F) for replication, or plots on farms; and all the symbols--TFR(F)--for the total. After each symbol, write the number that applied in the experiment: table 3 will remind you that there were 4 treatments, 4 farms, and 2 replications per farm, making 32 plots in all.
In Step 2, write the coefficient for each variance in Step 1. This consists of the symbols that are missing. To clarify: all the symbols, T, F, and R(F), are found in the total; but not all are found in each variance. Whichever ones are missing in each are now to be written under Step 2 in small letters. Thus, for variance T, both F and R(F) are missing; write them so--fr(f). When you get down to the fourth line, note that the variance, R(F), lacks both T and F; but the F is not written in Step 2 because there is a rule that when a symbol appears in parentheses in the variance of Step 1, this symbol shall not be repeated in the coefficient.
Now you come to Step 3, which, when finished, will give you the complete formulas. First copy each variance into this column, preceding it with its coefficient; for "treatments," for instance, write fr(f)T. Now add to it all other variances that also contain the identifying symbol T, not forgetting to precede each variance with its coefficient. In this case, the other variances that contain T are TF and TR(F); the latter, not having a coefficient to precede it, is added all by itself. Continue thus to the end of the column. When you get to the last line, all you will have to enter will be TR(F), for there is no other variance with all these symbols, neither does it have a coefficient to precede it.
Having calculated the mean squares in symbols, copy in the last column the numerical mean squares from table 5; and then you have everything you need to solve the symbol equations for the numerical value of the variance components. Start with TR(F), the plot variability on the fifth line. You need do no more than look in the last column to find that its value is 15. Now you can solve for R(F) on line 4:

$$t\,R(F) + TR(F) = 48$$
Table 10. Revision of table 5 to obtain variance components for determining variability by Method 3.

                                        Calculating mean square in symbols                              Mean square
Source of variation                     Step 1        Step 2          Step 3                            in numbers
                                        (Variance)    (Coefficient)   (Completed formula)
Treatments                              T (4)         fr(f)           fr(f)T + r(f)TF + TR(F)               237
Farms (location variability)            F (4)         tr(f)           tr(f)F + r(f)TF + tR(F) + TR(F)        72
Treatments x farms
  (treatment variability)               TF            r(f)            r(f)TF + TR(F)                         14
Replications on farms                   R(F) (2)      t               tR(F) + TR(F)                          48
Treatments x replications on farms
  (plot variability)                    TR(F)         -               TR(F)                                  15
Total                                   TFR(F) (32)
Since t = 4 (from the second column) and TR(F) = 15, rewrite the equation thus:

$$4\,R(F) + 15 = 48$$
$$R(F) = \frac{48 - 15}{4} = 8$$

Now proceed to solve for treatment variability, on the third line from the bottom:

$$r(f)\,TF + TR(F) = 14$$
$$2\,TF + 15 = 14$$
$$TF = \frac{14 - 15}{2} = -0.5$$

This negative value is unusual; generally TF is positive. A negative variance component should be taken as zero. When you obtain a positive value for TF--25, for example--simply take its square root to obtain the treatment variability:

$$\text{Treatment variability} = \sqrt{TF} = \sqrt{25} = 5$$

To express this treatment variability in percent, use the following equation, which calls for the mean yield of check, or untreated, plots (see table 3 for yields of plots receiving treatment B):

$$\text{Treatment variability in percent} = \frac{\text{Treatment variability} \times 100}{\text{Mean yield, check plots}} = \frac{5 \times 100}{29} = 17$$
Use this estimate in your calculation of extensive-test error (Part III, page 15).
While we were discussing plot variability early in this section, we pointed out that experimental data can be used for estimating location variability also, provided the experiment covers enough locations to provide an adequate sample. The experiment summarized in table 3 had only 4 locations, hardly enough for a region of any size. Nevertheless we can use this experiment to illustrate the procedure for determining location variability. Refer again to table 10, and write the equation for farms:
tr(f)F + r(f)TF + tR(F) + TR(F) = 72
Substitute the coefficients and the variabilities already solved:
(4 x 2F) + (2 x 0) + (4 x 8) + 15 = 72
8F + 0 + 32 + 15 = 72
F = (72 - 32 - 15) / 8 = 3.12
Then, since location variability is simply the square root of F--
Location variability = √F = √3.12 = 1.77
To express this location variability in percent, use this equation:
Location variability in percent = (Location variability x 100) / (Mean yield of check plots)
= (1.77 x 100) / 29 = 6 percent
This is the estimate that you will use in calculating extensive-test error (Part III, page 15).
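A short Python sketch of the location-variability step follows. It is an illustration under stated assumptions, not the Guide's own procedure: the function names are hypothetical, R(F) is carried as the rounded value 8 used in the text, and the mean check yield of 29 is an assumption (232 total yield divided by 8 check plots).

```python
import math


def location_variability(ms_farms, tf, rf, trf, t, r):
    """Solve t*r*F + r*TF + t*R(F) + TR(F) = ms_farms for F.

    Returns the component F and its square root (the location variability).
    """
    f_component = (ms_farms - r * tf - t * rf - trf) / (t * r)
    f_component = max(f_component, 0.0)   # negative components are taken as zero
    return f_component, math.sqrt(f_component)


def as_percent(variability, mean_check_yield):
    """Express a variability as a percentage of the mean check-plot yield."""
    return variability * 100.0 / mean_check_yield


# TF = 0 (negative value set to zero), R(F) = 8 (rounded, as in the text), TR(F) = 15.
f_component, loc_var = location_variability(72, tf=0, rf=8, trf=15, t=4, r=2)
mean_check_yield = 29.0   # assumption: 232 total yield over 8 check plots
loc_pct = as_percent(loc_var, mean_check_yield)
print(f_component, round(loc_var, 2), round(loc_pct))
```

If R(F) is carried unrounded (8.25), F works out to 3.0 rather than 3.12, so the location variability shifts slightly; the difference is a rounding effect only.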
MODIFYING THE ESTIMATE FOR LARGE PLOTS
If you have determined plot variability from data taken from plots of the same size as those you will use in the extensive test, you will not need the information in this section. But you may wish to use larger plots in the extensive test, especially in order to enhance its demonstrational value. If so, you will need the information given here.
When the extensive-test plots are to be much larger than the experimental plots, there is an additional adjustment to make. The procedure is quite simple. If the experimental plots were 5 square feet and the extensive-test plots will be 500 square feet, each of the latter will contain 100 experimental-plot units. Now refer to table 11. In the first column are listed a range of ratios between the sizes of the extensive-test plots and the experimental plots. The remaining columns give corresponding factors: one for crop tests, the other for animal tests. For a plot-size ratio of 100, the factor for crop tests is 3.2. Now, if the plot variability in the experimental plots is 13 percent (for example, page 13), then--
Plot variability for extensive test = (Plot variability in experiment) / Factor
= 13% / 3.2 = 4%
The plot variability thus obtained is the one to use in calculating extensive-test error (Part III, page 15).
A word of explanation will clarify the two sets of factors in table 11. In crop tests, the plot variability does not decrease in proportion to the size of the plots: as the size of the plots is increased, more variable ground is likely to be included. The variability between two plots lying side by side is less than the variability between two plots some distance apart. Thus, the set of factors for field crops, which is the fourth root of the ratios shown in the first column, takes into consideration this increased variability of soil with increasing plot size. 2/ In animal tests, on the contrary, this circumstance does not apply; therefore, the factors used are simply the square roots of the ratios shown in the first column. The factors for animal tests can be used also for any other tests in which increased "plot size" does not mean a proportionate increase in the area of land per plot.
2/ An adaptation from "An Empirical Law Describing the Heterogeneity in
the Yields of Agricultural Crops," by H. Fairfield Smith, Jour.
Agric. Sci. 28(Part 1):1-23, January 1938. In table 11, an
average heterogeneity factor of 0.50 is used.
Table 11. Factors to be used in determining the plot error
when the extensive-test plots are larger than
the experimental plots on which estimate of
variability was based.

Ratio:
Extensive-test plot size /     Factor for       Factor for
Experimental plot size         crop tests 1/    animal tests 2/
      1                        1.0               1.0
      2                        1.2               1.4
      3                        1.3               1.7
      4                        1.4               2.0
      5                        1.5               2.2
      6                        1.6               2.4
      8                        1.7               2.8
     10                        1.8               3.2
     20                        2.1               4.5
     30                        2.3               5.5
     40                        2.5               6.3
     50                        2.7               7.1
    100                        3.2              10.0

1/ This factor is the fourth root of the number in the first column.
2/ This factor is the square root of the number in the first
column: √1, √2, etc.
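The factors of table 11, and the adjustment they are used for, can be sketched in a few lines of Python. The function names and the `crop_test` flag are hypothetical; the only rules assumed are the ones stated in the table footnotes (fourth root for crop tests, square root for animal tests).

```python
def plot_size_factor(size_ratio, crop_test=True):
    """Factor for a given ratio of extensive-test plot size to experimental plot size."""
    exponent = 0.25 if crop_test else 0.5   # fourth root for crops, square root for animals
    return size_ratio ** exponent


def adjusted_plot_variability(experiment_variability_pct, size_ratio, crop_test=True):
    """Plot variability to use for the larger extensive-test plots."""
    return experiment_variability_pct / plot_size_factor(size_ratio, crop_test)


crop_factor = plot_size_factor(100)                    # crop test, ratio of 100
animal_factor = plot_size_factor(100, crop_test=False) # animal test, ratio of 100
adjusted = adjusted_plot_variability(13, 100)          # 13 percent in the experiment
print(round(crop_factor, 1), round(animal_factor, 1), round(adjusted, 1))
```

For ratios not listed in the table, the same functions give the intermediate factors directly.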
DETERMINING THE BEST SIZE OF PLOT
Most often you will not be concerned with the best size of the plots for an extensive test. Sometimes, however, it is necessary to limit the size of the plots as much as possible. For example, in a test of a new insecticide, you may not be able to get a sufficient quantity to permit large plots in the test. The problem of best plot size arises also when you are testing large things--trees, fruits, animals, or even people. You may want, then, to use the smallest plot that you can. In this section a procedure will be given to help you determine the best number of individuals, or units, to have in a plot.
You recall from Part III of this Guide, page 15, that the error for an extensive test is the sum of several variability components. But if we have a small number of individuals in a plot, the variability of these individuals also becomes important, and it, too, must be added to the other variabilities in our error calculations. You will now see how to do this.
First, clearly designate the individuals, or units, that you are dealing with--individual plants and trees, for example, or short lengths of row in field crops, single hills with several plants per hill, individual animals in cattle experiments, small quadrats in pasture and forage experiments, students in a school, persons in a family, or farmers in a community.
Next, obtain an estimate of the variability of these units. The value you seek is the variability among units receiving no experimental treatment. The procedure you follow is the same as the one we have discussed for determining plot variability except that now you are dealing with differences between individuals instead of with differences between plots. The question you must answer is this, "If I took 2 units per plot at random at a number of farms in the region, what difference in yield would there be between them?" This question may be answered by judgment (Part III, pages 13-14) or from survey data (see the section on plot variability in this Part, Methods 1 and 2, pages 5-7).
This question may be answered also by experimental data, but then a somewhat different procedure is used. To begin with, get measures of individual, or unit, yields in the experiment as well as yields of the whole plots. In table 3, for instance, yields are given for the plots as a whole, but these must now be supplemented with data for units. Such data can be gathered by harvesting either every unit (plant, hill, etc.) or just two units at random in 20 or so of the plots. You will probably do the latter since it is less work, and we will therefore show this procedure in detail.
Let us start with the experiment shown in table 3 and assume that each plot contains 27 plants. Now, go into 20 of these plots and, from each, harvest 2 plants at random. List the yield data as in table 12. Get the difference between each pair of plants and write the differences in the last column of the table. Now square each difference, add the squares, divide by 2 (the number of plants per plot actually measured), and multiply by 27 (the total number of plants per plot). This calculation gives the sum of squares for plants in plots:
Sum of squares (plants in plots) = (1.01² + 0.14² + ... + 0.21²) x 27 / 2 = 57.7287
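The same calculation can be checked with a short Python sketch. The function name is hypothetical; the pairs of yields are those listed in table 12.

```python
def plants_in_plots_sum_of_squares(pairs, plants_per_plot):
    """Sum of squares for plants in plots from 2 plants measured per plot.

    Squares the difference within each pair, adds the squares, divides by 2
    (plants measured per plot) and multiplies by the plants per plot.
    """
    squared_differences = sum((a - b) ** 2 for a, b in pairs)
    return squared_differences * plants_per_plot / 2


# Yields of plant 1 and plant 2 in each of the 20 plots (table 12).
table_12 = [
    (0.67, 1.68), (1.37, 1.23), (0.94, 1.26), (1.51, 0.79), (1.66, 1.04),
    (1.25, 1.21), (1.39, 0.84), (1.27, 1.39), (1.28, 1.10), (1.20, 1.16),
    (1.19, 1.07), (1.28, 1.19), (1.00, 1.43), (1.76, 1.03), (1.11, 1.50),
    (1.58, 1.13), (1.30, 1.24), (1.04, 1.55), (1.21, 1.90), (0.99, 0.78),
]

ss_plants = plants_in_plots_sum_of_squares(table_12, plants_per_plot=27)
ms_plants = ss_plants / len(table_12)   # mean square: one degree of freedom per pair
print(ss_plants, ms_plants)
```

The sign of each difference does not matter, since the differences are squared.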
Next make the analysis of variance of the data as shown in table 13. For plots, the values can be merely copied from the plot-variability line in table 5. For plants within plots, insert values as follows:
Sum of squares = the sum of squares you have just now calculated.
Degrees of freedom = the number of differences listed
in table 12.
Mean square = the quotient resulting from dividing the
sum of squares by the degrees of freedom.
Square root of mean square = √2.89, or 1.70.
Units = the number of plants in each plot.
Factor = the fourth root of the number of units, since this
is a crop test (see table 11, second column).
You have yet to enter the values of differences due to plots alone, which are as follows:
Mean square = the difference between the mean square for plots
and the mean square for plants in plots = 14.9 - 2.89 = 12.0.
Square root of mean square = √12.0, or 3.46.
Units = 1
Factor = the fourth root of the number of units (see table
11, second column).
Table 12. Yields of 2 plants selected at random from
a total of 27 plants in each of 20 plots
of the experiment shown in table 3.
Plot Plant 1 Plant 2 Difference
1 0.67 1.68 1.01
2 1.37 1.23 0.14
3 0.94 1.26 0.32
4 1.51 0.79 0.72
5 1.66 1.04 0.62
6 1.25 1.21 0.04
7 1.39 0.84 0.55
8 1.27 1.39 0.12
9 1.28 1.10 0.18
10 1.20 1.16 0.04
11 1.19 1.07 0.12
12 1.28 1.19 0.09
13 1.00 1.43 0.43
14 1.76 1.03 0.73
15 1.11 1.50 0.39
16 1.58 1.13 0.45
17 1.30 1.24 0.06
18 1.04 1.55 0.51
19 1.21 1.90 0.69
20 0.99 0.78 0.21
Table 13. Analysis of variance of data for 20 plots and for
2 plants within each plot, with 27 plants per plot.

Source of                   Sum of     Degrees of   Mean     Square root      Number
variation                   squares    freedom      square   of mean square   of units   Factor
Plots                       179        12           14.9
Plants in plots             57.7287    20           2.89     1.70             27         2.28
Difference due to plots                             12.0     3.46             1          1.00
The variabilities are obtained by multiplying the square roots of the mean squares by the factors in the last column of table 13:
Plant variability = 1.70 x 2.28 = 3.88
Plot variability = 3.46 x 1.00 = 3.46
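The entries of table 13 and the two variabilities can be recomputed with the sketch below. The function name and arguments are hypothetical. The code carries full precision, so its results differ from the text in the last digit, because the text rounds at each step.

```python
import math


def unit_and_plot_variability(ss_units, df_units, ms_plots, units_per_plot, crop_test=True):
    """Return (unit variability, plot variability) in yield units."""
    exponent = 0.25 if crop_test else 0.5      # fourth root for crops, square root otherwise
    ms_units = ss_units / df_units             # mean square, plants in plots
    ms_plots_alone = ms_plots - ms_units       # difference due to plots
    unit_var = math.sqrt(ms_units) * units_per_plot ** exponent
    plot_var = math.sqrt(ms_plots_alone)       # the factor for 1 unit is 1.00
    return unit_var, plot_var


# Sum of squares 57.7287 on 20 degrees of freedom; plot mean square 14.9; 27 plants per plot.
plant_var, plot_var = unit_and_plot_variability(57.7287, 20, 14.9, 27)
print(round(plant_var, 2), round(plot_var, 2))
```

The unrounded results are roughly 3.87 and 3.47, against 3.88 and 3.46 in the text.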
These variability components are now to be expressed in percent. Simply multiply the variability by 100 and divide by the mean yield for the check plots. This mean yield is obtained from table 3, which indicates the check plots as those receiving treatment B; divide the total yield from these plots (232) by the number of plots (8). Then--
Plant variability in percent = (Plant variability x 100) / (Mean yield, check plots)
= (3.88 x 100) / 29 = 13 percent
Plot variability in percent = (Plot variability x 100) / (Mean yield, check plots)
= (3.46 x 100) / 29 = 12 percent
Now we are ready to determine the error of the extensive test. You do this by adding the variability components. You have just found the components for plants and plots; let us assume values of 46 percent and 10 percent respectively for the location and treatment components in order to illustrate the rest of the procedure. It is like the one shown in Part III, page 15, with the addition of the plant component and the factor (F) for the number of plants per plot, which you obtain from table 11. For Plan A use all four components (the various plans are given in Part III, in the Appendix):
Plant variability = 13/F%; squared = (13/F)²
Plot variability = 12%; squared = 144
Location variability = 46%; squared = 2,116
Treatment variability = 10%; squared = 100
Total of the squares = (13/F)² + 2,360
Extensive-test error = √[(13/F)² + 2,360]
For Plans B to H, omit the location variability:
Plant variability = 13/F%; squared = (13/F)²
Plot variability = 12%; squared = 144
Treatment variability = 10%; squared = 100
Total of the squares = (13/F)² + 244
Extensive-test error = √[(13/F)² + 244]
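The two error formulas can be written as one small Python function. This is a sketch with hypothetical names; it assumes only the rule stated above, that the error is the square root of the sum of the squared components, with the plant component first divided by the factor F.

```python
import math


def extensive_test_error(plant_pct, factor, other_components_pct):
    """Square root of the sum of squared variability components (all in percent)."""
    total = (plant_pct / factor) ** 2 + sum(c ** 2 for c in other_components_pct)
    return math.sqrt(total)


# Plan A: plot 12%, location 46%, treatment 10%.  Plans B to H: omit location.
remainder_a = sum(c ** 2 for c in (12, 46, 10))
remainder_b_to_h = sum(c ** 2 for c in (12, 10))

# Factor 1.2 corresponds to 2 plants per plot in a crop test (table 11).
error_a = extensive_test_error(13, 1.2, (12, 46, 10))
error_b_to_h = extensive_test_error(13, 1.2, (12, 10))
print(remainder_a, remainder_b_to_h, round(error_a, 1), round(error_b_to_h, 1))
```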
Now, set up table 14 to aid in determining the best number of units per plot. Record first the specifications of the extensive test. The first column in table 14 indicates the plan. The second column lists various numbers of units per plot, a range from 2 to 40. You can try whatever numbers you are interested in. In column 3, the units are transposed into factors by referring to table 11. Since this is an experiment with a crop, the factors in the second column of table 11 are used. For tests in which the plots are not units of land, the factors in column 3 of table 11 would be used. Column 4 is the plant variability (13 percent) divided by the factors of column 3. Column 5 is the square of column 4. Column 6 is column 5 plus the remainder of the error shown at the head of the table: 2,360 for plan A and 244 for plans B to H. Column 7 is the square root of column 6; these values are the errors for the plans. Column 8 is the difference to be tested, shown at the head of the table as 30 percent, divided by the plan errors shown in column 7. In column 9 record from Part III, table 2, the number of replications required for the values in column 8.
Column 9 gives the answer you seek. For plan A note that the number of replications required is rather large and does not decrease beyond 5 plants per plot. Hence, 5 plants per plot are ample if you desire to keep the plots as small as possible.
For plans B to H the number of replications required is much smaller. It decreases up to 10 plants per plot but not thereafter. You have little to gain, then, by having more than 10 plants in a plot.
Once you have selected the minimum number of plants per plot--5 for plan A, and 10 for plans B to H--go back to Part III, table 3. Enter the corresponding number of replications in column 4 of that table. Then complete the rest of the summary in that table to determine the total requirements of the extensive test with the stipulated number of plants per plot.
Table 14. Example of a summary to aid in determining the minimum number of plants per
plot for several designs.

Number of treatments: 7 (1 is a check)
No. of plots on each farm: 1, 2, 3, or 4
Minimum difference: 30%
Anticipated variability components: Plants, 13%; Plots, 12%
Error (Plan A) = √[(13/F)² + 2,360]
Error (Plans B to H) = √[(13/F)² + 244]

(1)      (2)       (3)      (4)           (5)        (6)         (7)         (8)          (9)
         Units              Plant         Column 4   Remainder   Error, %    Difference   Replications
Plan     per       Factor   variability   squared    of error    (√Col. 6)   ÷ error      required
         plot               ÷ Col. 3                 + Col. 5                (Col. 7)     (Part III, table 2)
A         2        1.2      10.8          117        2477        49.8         .60         60
          5        1.5       8.7           76        2436        49.4         .61         58
         10        1.8       7.2           52        2412        49.1         .61         58
         15        2.0       6.5           42        2402        49.0         .61         58
         20        2.1       6.2           38        2398        49.0         .61         58
         30        2.3       5.7           32        2393        48.9         .61         58
         40        2.5       5.2           27        2387        48.9         .61         58
B to H    2        1.2      10.8          117         361        19.0        1.58         11
          5        1.5       8.7           76         320        17.9        1.68         10
         10        1.8       7.2           52         296        17.2        1.74          9
         15        2.0       6.5           42         286        16.9        1.78          9
         20        2.1       6.2           38         282        16.8        1.79          9
         30        2.3       5.7           32         277        16.6        1.81          9
         40        2.5       5.2           27         271        16.5        1.82          9
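Columns 3 to 8 of table 14 can be recomputed with the Python sketch below, which may help when trying other numbers of units per plot. The function name is hypothetical. The factor is taken as the fourth root of the number of units, rounded to one decimal as in table 11. Column 9 is not reproduced, since it must be read from Part III, table 2. Because the printed table rounds at every step, the computed values agree with it only to within about one unit in the last digit.

```python
import math


def plan_rows(units_options, plant_pct, remainder, min_difference_pct):
    """Recompute columns 2 to 8 of table 14 for one plan (crop test assumed)."""
    rows = []
    for units in units_options:
        factor = round(units ** 0.25, 1)       # column 3: fourth root, one decimal
        plant = plant_pct / factor             # column 4
        total = plant ** 2 + remainder         # column 6 (column 5 plus the remainder)
        error = math.sqrt(total)               # column 7
        ratio = min_difference_pct / error     # column 8
        rows.append((units, factor, plant, total, error, ratio))
    return rows


units_options = [2, 5, 10, 15, 20, 30, 40]
rows_a = plan_rows(units_options, 13, 2360, 30)        # plan A
rows_b_to_h = plan_rows(units_options, 13, 244, 30)    # plans B to H

for units, factor, plant, total, error, ratio in rows_b_to_h:
    print(units, factor, round(plant, 1), round(total), round(error, 1), round(ratio, 2))
```

Column 8 rises quickly at first and then levels off as units are added, which is the pattern the text relies on when it stops at 10 plants per plot for plans B to H.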