Citation
Selection of important properties to evaluate the use of geostatistical analysis in selected northwest Florida soils

Material Information

Title:
Selection of important properties to evaluate the use of geostatistical analysis in selected northwest Florida soils
Creator:
Ovalles, Francisco A., 1950-
Publication Date:
Language:
English
Physical Description:
xvi, 208 leaves : ill. ; 28 cm.

Subjects

Subjects / Keywords:
Dissertations, Academic -- Soil Science -- UF
Soil Science thesis Ph. D
Soils ( fast )
Florida ( fast )
City of Gainesville ( local )
Soil properties ( jstor )
Statistical discrepancies ( jstor )
Soil horizons ( jstor )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1986.
Bibliography:
Includes bibliographical references (leaves 197-206).
Additional Physical Form:
Also available online.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Francisco A. Ovalles.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
029822517 ( ALEPH )
15376316 ( OCLC )

Full Text














SELECTION OF IMPORTANT PROPERTIES TO EVALUATE
THE USE OF GEOSTATISTICAL ANALYSIS IN
SELECTED NORTHWEST FLORIDA SOILS





BY





FRANCISCO A. OVALLES




















A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA

1986



























This dissertation is dedicated to my wife, Giordana, and my
children, Johanna Fernanda and Pedro Jose.















ACKNOWLEDGMENTS

The author wishes to express his gratitude to Dr. Mary

E. Collins, chairman of the supervisory committee, for her

continuous help, guidance, patience, and personal

friendship throughout the graduate program. Appreciation

is also extended to other members of the committee, Dr.

Gustavo Antonini, Dr. Richard Arnold, Dr. Randall "Randy"

Brown, and Dr. Stewart Fotheringham, for their constructive

reviews of this work, participation on the graduate

supervisory committee, and personal friendship.

Appreciation is expressed to the Consejo Nacional de

Investigaciones Cientificas y Tecnologicas (CONICIT),

Venezuela, for the scholarship which supported the author.

Thanks are extended to Dr. Willie Harris who

introduced me to the Keepit and YT, and always was ready to

answer any of my questions.

Very special thanks are due to Dr. Gregory "Greg"

Gensheimer for lending me the geostatistical program and

his own computer to type this dissertation.

Gratitude is expressed to the staff of the Soil

Characterization Laboratory, for their friendship and

valuable assistance, to other graduate students, staff, and

faculty.

Appreciation is extended to all my friends from all

six continents (America, Africa, Asia, Europe, Oceania, and

Florida) whom I had the pleasure of knowing here.

Finally, but certainly not least, I thank my wife,

Giordana, my daughter, Johanna Fernanda, and my son, Pedro

Jose, for their love and continuous help, encouragement and

patience during this work.







































TABLE OF CONTENTS

Page

ACKNOWLEDGMENTS ........................................ iii

LIST OF TABLES ......................................... vii

LIST OF FIGURES ......................................... ix

ABBREVIATIONS .......................................... xii

ABSTRACT ............................................... xiv

INTRODUCTION ............................................. 1

LITERATURE REVIEW ........................................ 4

  Principal Component Analysis ........................... 4
  Geostatistics ......................................... 13
  Historical Development ................................ 13
  Theoretical Bases ..................................... 15
  Practical Use ......................................... 34
  Fractals .............................................. 53

DESCRIPTION OF STUDY AREA ............................... 59

  Location .............................................. 59
  Physiography, Relief, and Drainage .................... 59
  Geology ............................................... 61
  Climate ............................................... 62
  Land Use and Vegetation ............................... 62
  Soils ................................................. 63

MATERIALS AND METHODS ................................... 66

  Data Source ........................................... 66
  Location of Pedons .................................... 67
  Statistical Analyses .................................. 68
  Normality Analysis .................................... 68
  Principal Component Analysis .......................... 69
  Geostatistical Analysis ............................... 71

RESULTS AND DISCUSSION .................................. 75

  Test of Normality ..................................... 75
  Principal Component Analysis .......................... 83
  Principal Component Analysis for Standardized
    Weighted Data ....................................... 83
  Principal Component Analysis for A horizon
    Standardized Data ................................... 93
  Principal Component Analysis by Soil Series .......... 101
  Geostatistics ........................................ 113
  Semi-Variograms ...................................... 114
  Fitting Semi-Variograms .............................. 140
  Kriging .............................................. 143
  Fractals ............................................. 156

SUMMARY AND CONCLUSIONS ................................ 165

APPENDIX

  A  CLASSIFICATION OF SOIL SERIES STUDIED ............. 178

  B  GEOGRAPHIC COORDINATES OF PEDONS STUDIED .......... 181

  C  SEMI-VARIOGRAMS FOR DIRECTIONS WITH
     LARGEST VARIABILITY ............................... 186

  D  CONTOUR MAPS FOR DIRECTIONS WITH
     LARGEST VARIABILITY ............................... 192

  E  MAP OF PHYSIOGRAPHIC REGIONS IN
     NORTHWEST FLORIDA ................................. 196

LITERATURE CITED ....................................... 197

BIOGRAPHICAL SKETCH .................................... 207


















LIST OF TABLES

Table Page

1 Order, Great Group, and relative proportion
of pedons studied.................................. 65

2 Statistical moments of soil properties studied
and Kolmogorov test................................ 78

3 Proportion of total variance explained by each
principal component................................ 85

4 Eigenvectors of correlation matrix for
standardized weighted average of soil
properties......................................... 89

5 Tolerance of standardized weighted average of
soil properties by principal component............. 91

6 Correlation coefficients between standardized
weighted average of soil properties and
principal components................................ 92

7 Proportion of total variance explained by each
principal component for standardized A horizon
data ............................................... 94

8 Eigenvectors of correlation matrix for
standardized properties of A horizon............... 95

9 Tolerance of standardized properties of
A horizon by principal component................... 96

10 Correlation coefficient between standardized
properties of A horizon and principal
components.........................................97

11 Correlation coefficient between standardized
properties of A1 horizon and principal
components......................................... 99

12 Correlation coefficient between standardized
properties of Ap horizon and principal
components....................................... 100


13 Variability of studied soil properties within
and between soil series and between horizons......107

14 Important semi-variogram parameters of the
weighted average of selected soil properties......127

15 Important semi-variogram parameters of the
A horizon selected properties..................... 136

16 Goodness-of-fit values of the weighted average
of selected soil properties.......................142

17 Goodness-of-fit values of the A horizon
selected properties............................... 144

18 Fractal dimension (D value) derived from
selected soil property semi-variograms............ 158

19 Fractal dimension (D value) derived from
selected soil property semi-variograms
for a reduced study area..........................162




























LIST OF FIGURES

Figure Page

1 Relation among variance, covariance, and
semi-variance ..................................... 20

2 Common semi-variogram models...................... 27

3 Equation number 35................................31

4 Equation number 36 (a) and Equation
number 37 (b) ..................................... 32

5 Location of the counties from which
characterization data were available
for pedons selected for study..................... 60

6 Histogram (a) and normal probability plot (b)
of fine sand content.............................. 80

7 Histogram (a) and normal probability plot (b)
of organic carbon content......................... 82

8 Location of standardized weighted average
values of soil properties in the plane of
the first two principal components................ 86

9 Location of standardized weighted average
values of soil properties in the plane of
the rotated first two principal components........88

10 Soil properties with a large contribution
to the total variance by county for the
Albany series.................................... 102

11 Soil properties with a large contribution
to the total variance by county for the
Dothan series.................................... 103

12 Soil properties with a large contribution
to the total variance by county for the
Orangeburg series................................ 104

13 Location of selected soil series in the plane
of the first two principal components............106

14 Location of selected soil series in the plane
of the first two principal components derived
from important soil properties...................110

15 Location of selected pedons in the studied
area............................................. 115

16 Weighted average total sand content first
direction-independent semi-variogram ............. 120

17 Weighted average clay content first
direction-independent semi-variogram............. 121

18 Weighted average total sand content fitted
direction-independent semi-variogram............. 123

19 Weighted average total sand content
direction-dependent semi-variograms.............. 124

20 Weighted average clay content fitted
direction-independent semi-variogram............. 125

21 Weighted average clay content direction-
dependent semi-variograms........................126

22 Weighted average organic carbon content fitted
direction-independent semi-variogram.............131

23 Weighted average organic carbon content
direction-dependent semi-variograms.............. 132

24 A horizon clay content fitted direction-
independent semi-variogram....................... 134

25 A horizon clay content direction-dependent
semi-variograms..................................135

26 A horizon organic carbon content fitted
direction-independent semi-variogram............. 138

27 A horizon organic carbon content direction-
dependent semi-variograms........................ 139

28 Contour map (increment is 10.0%) (a) and diagram
(vertical exaggeration is 18x, azimuth of
viewpoint is 25°) (b) of kriged weighted
average total sand content.......................146

29 Contour map (increment is 10.0%) (a) and diagram
(vertical exaggeration is 18x, azimuth of
viewpoint is 25°) (b) of kriged weighted
average clay content............................. 147

30 Contour map (increment is 2.0%) (a) and diagram
(vertical exaggeration is 18x, azimuth of
viewpoint is 25°) (b) of kriged A horizon
clay content .....................................148

31 Diagram (vertical exaggeration is 18x, azimuth
of viewpoint is 25°) of standard errors of
kriged weighted average total sand content.......153

32 Diagram (vertical exaggeration is 18x, azimuth
of viewpoint is 25°) of standard errors of
kriged weighted average clay content............. 154

33 Diagram (vertical exaggeration is 18x, azimuth
of viewpoint is 25°) of standard errors of
kriged A horizon clay content....................155

34 Location of reduced study area...................161

35 Weighted average total sand content fitted
N-S semi-variogram ............................... 186

36 Weighted average clay content fitted
N-S semi-variogram.............................. 187

37 Weighted average organic carbon content fitted
N-S semi-variogram............................... 188

38 A horizon clay content fitted
NW-SE semi-variogram............................. 189

39 A horizon organic carbon content fitted
NW-SE semi-variogram............................. 190

40 Contour map (increment is 10.0%) derived from
weighted average total sand content N-S
semi-variogram...................................192

41 Contour map (increment is 10.0%) derived from
weighted average clay content N-S
semi-variogram ................................... 193

42 Contour map (increment is 2.0%) derived from
A horizon clay content NW-SE semi-variogram......194

43 Map of physiographic regions in northwest
Florida (Source: Brooks, 1981b)..................196

ABBREVIATIONS

a = range

A-Cl = A horizon clay content

A-OC = A horizon organic carbon content

BS = Base saturation

C = Coarse sand

c = Sill

Ca = Calcium

CEC = Cation exchange capacity

Co = County

3 = Bay
30 = Holmes
32 = Jackson
33 = Jefferson
37 = Leon
40 = Madison
57 = Santa Rosa
66 = Walton

COV = Covariance

C.V. = Coefficient of variation

EXT = Extractable acidity

F = Fine sand

G(h) = GAMMA = Semi-variance

h = Lag distance

K = Potassium

M = Medium sand


Mg = Magnesium

Na = Sodium

OC = Organic carbon

PH1 = pH-water

PH2 = pH-KCl

Sc = Selection criterion for eigenvectors

T = Tolerance

TB = Total bases

TH = Horizon thickness

TS = Total sand

VAR = Variance

VC = Very coarse sand

VF = Very fine sand





























Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

SELECTION OF IMPORTANT PROPERTIES TO EVALUATE
THE USE OF GEOSTATISTICAL ANALYSIS IN
SELECTED NORTHWEST FLORIDA SOILS

BY

FRANCISCO A. OVALLES

December, 1986

Chairman: M.E. Collins
Major Department: Soil Science

Soil variability is a limiting factor in making

accurate predictions of soil performance at any particular

position on the landscape. A large number of studies have

been made to quantify soil variability, but a large portion

of them ignored the multivariate character of soils and the

geographic aspect of soil variability. Data from 151

pedons in northwest Florida were selected (i) to determine

the important properties affecting soil variability and

(ii) to evaluate the soil variability in the area studied

using geostatistics.

Data were non-normally distributed but statistical

techniques employed did not require the assumption of

normality. This result could support the presence of

systematic patterns of soils.


Principal component analysis was used to reduce the

number of soil properties to study the soil variability.

Two sets of data were used: weighted average values of soil

properties, and A horizon properties. Horizon thickness

was used as the weighting criterion. Variables were

standardized to mean zero and variance one. Plots of soil

properties in the plane of the principal components,

varimax rotation, analysis of eigenvalues, eigenvectors,

and collinearity, and calculation of correlation

coefficients between soil properties and principal

components were used to select important properties for

evaluation of soil variability.

A nested analysis of variance indicated that

properties selected by the principal component analysis

were differentiating properties.

Geostatistical analysis was applied to the properties

selected. The within-soil series variance was used as

criterion to assess stationarity. Drift was present.

Consequently, residuals were used to compute semi-

variograms.

Semi-variograms of total sand and clay contents showed

structure. Nugget variance was present in all semi-

variograms. Ranges varied from 15 to 35 km. Soil

variability was direction-dependent. The N-S and NW-SE

were the directions of maximum variability. Organic carbon

content had a large point-to-point variation.


All observed semi-variograms had a characteristic wave

pattern that indicated a cyclic variation of soil

properties.

Kriged standard error diagrams were functions of the

nugget variance and showed areas where more samples are

required to increase the precision of estimates.

Fractal dimensions indicated the scale-dependent

character of soil variability.




































INTRODUCTION


The fundamental purpose of a soil survey is to

estimate the potentials and limitations of soils for many

specific uses. Soil delineations are mapped to be as

homogeneous as possible in order to correlate the

adaptability of soils to various crops, grasses, and trees;

and to predict their behavior and productivity under

different management practices (Soil Survey Staff, 1951;

1981).

Quality of soil surveys has been improved over the

years as a result of improved understanding of soil. But

soil variability remains as one of the main constraints to

reliable soil interpretations and is a limiting factor for

making accurate predictions of soil performance at any

particular position on the landscape.

The study and understanding of soil variability

represents a cornerstone for improving soil surveys.

Belobrov (1976), a Russian soil scientist, pointed out that

"The degree of approximation between the true and the

observed soil variability does not depend on the nature of

the soil cover, but mainly on the methods of investigation"

(p. 147).



For several years, soil scientists used methods of

investigation which did not consider the "real nature" of

soils, because they ignored the systematic variation of

soils on the landscape and assumed a random variation of

soils in space. On the other hand, despite the fact that

it has been recognized that a soil map unit is imperfect to

varying degrees, depending on the scale of the map and the

nature of the soil (Soil Survey Staff, 1975), most soil

surveys in the U.S.A. have accepted an unrealistic model in

which map units encompass soil bodies that form discrete,

internally uniform units, with abrupt boundaries at their

edges (Hole and Campbell, 1985).

Studies of soil variability have not been consistent.

These studies have considered a random variation of soils

and at the same time they have used a limited number of

observations for characterizing map units to establish the

range of variation of observed properties. The assumption

has been that properties measured at a point also represent

the unsampled neighborhood. The extent to which this

assumption is true depends on the degree of spatial

dependence among observations.

The number of studies for quantifying soil variability

has sharply increased in the last 10 years, but

quantification still remains a problem. A large proportion

of quantitative studies are based on untested assumptions,

ignore the multivariate character of soils, or use a biased

selection of properties to represent the soil variability,

increasing the risk of erroneous conclusions.

For these reasons a large soil data base was selected

in northwest Florida with the following objectives: (i) to

discover which soil properties most strongly influence the

soil variability in the area studied, and (ii) to study how

geostatistics can be used in evaluating soil variability.














LITERATURE REVIEW



Principal Component Analysis

The multivariate character of soil is well recognized;

a large set of measurements of soil properties

(morphological, chemical, physical, and mineralogical) can

be derived from a single sample. The complete set of

available data is not always used for numerical analyses.

Hole and Campbell (1985) indicated that the selection of

soil properties depends on the objectives of the study, and

also reflects the constraints imposed by cost, time,

effort, and access.

There is no doubt that logically correlated variables,

such as soil pH and base saturation, are generally so

highly covariant that one or the other should not be

included in the analysis. Particle-size fractions (sand,

silt, and clay) always add up to 100%, and therefore, the

whole set of particle-size data should not be included in

the analysis. Consequently, in the process of selecting

soil properties, there is an important question to be

answered: Are the selected soil properties the most

important to represent the variability of the complete set

of data?
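
The closure constraint on particle-size data can be illustrated numerically. The following is a minimal sketch in Python with NumPy, using simulated sand, silt, and clay percentages (hypothetical values, not data from this study). When all three fractions are retained the correlation matrix is singular, whereas dropping one fraction removes the exact linear dependence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical particle-size data for 50 samples (assumed values, for illustration only).
raw = rng.uniform(1.0, 10.0, size=(50, 3))
# Rescale each row so that sand + silt + clay = 100%.
fractions = 100.0 * raw / raw.sum(axis=1, keepdims=True)

# Correlation matrices with all three fractions and with clay dropped.
corr_all = np.corrcoef(fractions, rowvar=False)
corr_two = np.corrcoef(fractions[:, :2], rowvar=False)

# eigvalsh returns eigenvalues in ascending order, so index 0 is the smallest.
smallest_all = np.linalg.eigvalsh(corr_all)[0]
smallest_two = np.linalg.eigvalsh(corr_two)[0]

print("smallest eigenvalue, three fractions:", smallest_all)
print("smallest eigenvalue, two fractions:  ", smallest_two)
```

A smallest eigenvalue of essentially zero signals that one property is an exact linear combination of the others, which is the reason for excluding one fraction before the analysis.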



Webster (1977) pointed out that when one soil property

is measured in a set of individual sampling units, the

measured values can be represented by their positions on a

single line. The relation between any pair of individuals

can be represented by the distance between them and the

relations among several individuals can be established

simultaneously from their relative positions on the line.

At the same time, it is almost impossible to visualize

their positions on the line and the relations among more

than two individuals simultaneously. Thus, he indicated

that an alternative way of dealing with multivariate data

is to arrange the individuals along one or more new axes.

This reduction of an arrangement in many dimensions to a

few dimensions is known as ordination.

The two most common methods of ordination are Factor

Analysis (FA) and Principal Components Analysis (PCA).

Shaw and Wheeler (1985) said that in both techniques new

variables are defined as mathematical transformations of

the original data. However, FA assumes that the original

variable is influenced by various determinants: a part

shared by other variables, known as the common variance;

and a unique variance which consists of both a variance

accounted for by influences specific to each variable and

also a variance relating to measurement error. In

contrast, PCA assumes that statistical variation in the

variables is explained by the variables themselves, in this

case by the common variance. PCA is recommended when there

are high correlations between variables, a large number of

variables, and a need for only simple data reduction. The

major objective in PCA is to select a number of components

that explain as much of the total variance as possible,

whereas FA is used to explain the interrelationship among

the original variables (Afifi and Clark, 1984). PCA has

the advantage that the values of principal components

are relatively simple to compute and interpret.

PCA is a method that has been used to reduce the

number of variables without losing important information

(Webster, 1977). In general, the analysis finds the

principal axes of a multidimensional configuration and

determines the coordinates of each individual in the

population relative to those axes. Then, the data can be

represented in a few dimensions by projecting the points

orthogonally on the principal axes.

The basic idea of PCA is to create new variables

called the principal components (PC) (Afifi and Clark,

1984). Each new variable is a linear combination of the X

variables and can therefore be written as



$PC_1 = A_{11} X_1 + A_{12} X_2 + \cdots + A_{1j} X_j$ (1)

where PC = principal component

$A_{ij}$ = coefficient (eigenvector)

$X_j$ = variable









Coefficients of these linear combinations are chosen

to satisfy the following requirements:

(i) Variance $PC_1$ > variance $PC_2$ > ... > variance $PC_n$

(ii) The values of any two PCs are uncorrelated.

(iii) For any PC the sum of the squares of the coefficients

is one.
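
The three requirements above can be checked numerically. The following is a minimal sketch, not the procedure of any study cited here: it assumes that the coefficients are taken as eigenvectors of the correlation matrix of the standardized variables, and the function name `principal_components` and the synthetic data are hypothetical.

```python
import numpy as np

def principal_components(data):
    """Return (coefficients, variances, scores) from the correlation matrix.

    Rows of `data` are individuals; columns are the X variables.
    """
    # Standardize each variable to zero mean and unit variance.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]      # largest variance first
    variances = eigvals[order]
    coefficients = eigvecs[:, order]       # column i holds the coefficients of PC i
    scores = z @ coefficients              # coordinates of individuals on the PCs
    return coefficients, variances, scores

# Synthetic, purely illustrative data: four correlated variables
# generated from two hidden factors.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
data = np.column_stack([
    base[:, 0] + 0.1 * rng.normal(size=200),
    base[:, 0] + 0.3 * rng.normal(size=200),
    base[:, 1] + 0.2 * rng.normal(size=200),
    base[:, 0] - base[:, 1] + 0.2 * rng.normal(size=200),
])
coefficients, variances, scores = principal_components(data)
explained = variances / variances.sum()    # proportion of total variance per PC
```

In this sketch the component variances are the eigenvalues, so `explained` gives the kind of percentage of total variance quoted in the studies reviewed below.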

Cuanalo and Webster (1970) used PCA in a study of

numerical classification and ordination in which

morphological, physical, and chemical soil properties (pH,

clay, silt, fine sand, proportion of stones, consistence,

water tension, color, mottling, and peatiness) were

measured at depths of 13 cm and 38 cm at 85 sites and

randomly sampled within physiographic units near Oxford,

England. The variables were standardized to unit variance

and the population was centered at the origin. It was

found that the first six PCs represented almost 70% of the

total variation presented in the original data. The first

three PCs represented more than 50% of the total variance.

The first component showed large contributions from water

tension and chroma in both the topsoil and the subsoil.

In the second component, contribution of fine sand in the

topsoil (13 cm) and subsoil (38 cm) was dominant. Hue and

value made large contributions to the third component. The

projection of the population scatter on the plane defined

by the first two PCs gave the most informative display of

relations in the whole space. These authors suggested that









when numerical data are available, the data should be

examined first by ordination procedures; then, the data

selected by the ordination procedure can be used with a

numerical classification to decide if such classification

grouped data satisfactorily.

Norris (1972) used PCA to study trends in soil

variation. He described several morphological soil

properties (stage of organic matter decomposition,

percentage of stones, structure, consistence, porosity,

roots, biological activity, and color in terms of presence

or absence of gley or dark colors) in 410 pedons, 307

pedons located in woods and 103 pedons located in farmland.

The first PC accounted for 39% of the total variance, and

corresponded to a trend from deep, stoneless pedons

developed on a clayey formation to pedons developed on

shallow limestone on steep slopes. The second PC accounted

for 14% of the total variance and separated pedons located

on farmland from those located in the woods. He concluded

that the PCs served as a summary of soil variation in the

area, because they accounted for a known percentage of the

soil variation and were correctly defined in terms of the

properties used to describe the soil.

Webster and Burrough (1972) sampled the first two

horizons from 84 soil pedons and recorded selected soil

properties (soil color, CaCO3 content, depth to CaCO3,

total penetrable soil depth, clay content, organic matter









content, cation exchange capacity (CEC), pH, exchangeable

Mg and K contents, and available P content). They used PCA

to reduce the dimensionality of the data, and found that

the first two PCs accounted for 55% of the total variance

(40% the first component and 15% the second component).

Separate contributions to the components were determined by

projecting vectors on the component axes. They

established that those properties determined in the field

(CaCO3 content, depth to CaCO3, clay content, and subsoil

color) were closely correlated and well represented in one

dimension in the first component. The properties measured

in the laboratory (organic matter content, CEC, and

exchangeable Mg content) contributed most to the second

component, indicating differences in management rather than

natural soil differences. Results of the numerical

classification were supported by showing the distribution

of sampling sites in space projected on to the plane of the

first two components and showing the frequency distribution

of the first PC. There was a good agreement among the

results. Therefore, it was concluded that when PCs

represent the variables that explain soil variation the

components can be mapped as isarithms and the maps have

interpretable meaning.

Kyuma and Kawaguchi (1973) employed PCA to grade the

chemical potentiality of 41 Malayan paddy soil samples; 23

physical, chemical, and mineralogical properties were









evaluated. The first four PCs accounted for 75% of the

total variance. The first PC was highly positively

correlated with electrical conductivity, exchangeable Ca,

Mg, Na, and K contents, moisture, CEC, available Si

content, and 0.2 M HCl-soluble K. The first PC was highly

negatively correlated with the kaolin mineral content. All

of these properties were relevant to the chemical

potentiality of the soil; thus, the first PC was called the

chemical potentiality component. The standardized scores

of the first PC were computed. These scores were used for

grading soils in terms of the chemical potentiality. The

authors stated that the result of grading was reasonable.

Placed at the top of the scale were soils developed on

juvenile marine sediments. Soils having high sand and/or

kaolin content were at the bottom of the scale. The

authors concluded that PCA was useful for comparing the

soil fertility status among soils.

Burrough and Webster (1976) used PCA with Similarity

and Canonical Variate Analyses to improve soil

classification in eastern Malaysia. Morphological and

chemical properties determined by routine analysis were

recorded from 66 randomly selected sites. The first nine

PCs accounted for more than 70% of the total variance.

Scatter diagrams of pairs of components were drawn to

elucidate the population structure. Established classes

that were originally thought to be desirable overlapped










almost completely with respect to morphological and

chemical properties. Dendrograms derived from similarity

analysis confirmed the interpretations drawn from the

scatter diagrams.

Williams and Rayner (1977) employed PCA as a method

for grouping soils based on chemical composition (Fe, Ti,

Ca, K, Si, Al, P, Mg, Mn, Ni, Cu, Zn, Ga, As, Br, Rb, Sr,

Y, Zr, and Pb total contents) and other soil properties

such as particle size (sand, silt, and clay), loss on

ignition, CaCO3 content, pH, and soil moisture. The

scatter diagram showed that the first two components

divided the soils into parent material groups. This

grouping was also supported by using dendrograms derived

from similarity analysis. It was concluded, on the basis

of the PCA, that the soils sampled came from three parent

materials of different ages.

McBratney and Webster (1981) studied the relationships

between sampling points using PCA. A substantial

proportion (44%) of the total variance was explained by the

first two PCs. The first component represented color.

Varimax rotation was employed to obtain a better

interpretation of the scatter diagram but it produced no

appreciable improvement in interpretability. The scatter

diagram of PC allowed the separation of sampled points into

five different groups.









Richardson and Bigler (1984) applied PCA to selected

soil properties (clay content, pH, organic carbon content,

CaCO3 equivalent, electrical conductivity, and soluble Mg,

Ca, and Na contents) which were meaningful to soil

development and plant growth in wetlands in North Dakota.

Four routine measurements useful for characterizing and

classifying wetland soils were identified by PCA

(electrical conductivity, organic carbon content, CaCO3

equivalent, and clay content). Electrical conductivity and

soluble Mg and Na contents were the most important

variables in explaining observable differences in wetland

soils. In addition, the use of PCA allowed the examination

of the interaction of chemical and physical properties with

the landscape position of wetland soils, as well as the

variation in properties among vegetation zones, after the

data were plotted in the plane of the first two PCs.

Edmonds et al. (1985) employed PCA as a first step for

using Cluster and Discriminant Analyses to study taxonomic

variation within three soil map units. Forty different

soil properties were included in the analyses. Variables

with low variance were excluded by the analysis. PCA was

used to reduce the number of dimensions needed to ordinate

pedons in the plane of PCs (character space) and to remove

intercorrelation of soil properties. The use of PC scores

as data for Cluster Analysis avoided distortions in

coordinates of the pedons in the plane of PCs. They









compared the results with the taxonomic classification of

soils, and concluded that grouping of pedons by numerical

taxonomy did not correspond to groupings by taxa in Soil

Taxonomy.

Geostatistics

Webster and Burgess (1983) pointed out that to

describe soil variation two features of soil must be taken

into account. The first is that long range trends have no

simple mathematical form; usually, there is not any obvious

repeating pattern; and the larger the area or the more

intensive the sampling the more complex the variation

appears. The second is that the point-to-point variation

in a sample reflects real soil variation. Only a small

part is the measurement error. In addition, the same

authors indicated that earlier attempts to describe spatial

variation in geology and geography involved fitting

deterministic global equations to data, either exactly or

by least squares approximation. But the two features

mentioned above make the approach inappropriate for soil.

Thus, an alternative was to treat the soil as a random

function and to describe it using geostatistical techniques.

Historical Development

Etymologically, the term geostatistics designates the

statistical study of natural phenomena, and it is defined

as the application of the formalism of random functions to









the reconnaissance and estimation of natural phenomena

(Journel and Huijbregts, 1978).

Geostatistics was primarily developed for the mining

industry (Matheron, 1963). Geostatistics was very useful

for engineers and geologists for studying the spatial

distribution of important properties such as grade,

thickness, or accumulation of mineral deposits.

Matheron (1963) considered that, historically,

geostatistics was as old as mining itself. He indicated

that as soon as mining men concerned themselves with

foreseeing results of future work and, in particular, as

soon as they started to take and to analyze samples and

compute mean values weighted by corresponding thickness of

deposits and influence-zones, geostatistics was born.

Geostatistics started in the early 1950s in South

Africa with D.G. Krige (Olea, 1975). Krige realized that

he could not accurately estimate the gold content of mined

blocks without considering the geometrical setting

(locations and sizes) of the samples. Matheron expanded

Krige's empirical observations into a theory of the

behavior of spatially distributed variables which was

applicable to any phenomenon satisfying certain basic

assumptions, and the variables were not limited by their

physical nature.









Theoretical Bases

Classical statistics could not be used for ore

estimation because of their inability to take into account

the spatial aspect of the phenomenon (Matheron, 1963). An

aleatory variable had two essential properties: (i) the

possibility, theoretical at least, of repeating

indefinitely the test that assigned to the variable a

numerical value, and (ii) the independence of each test

from the previous and the next tests. A given ore-grade

within a deposit would not have those two properties. The

content of a block of ore was first of all unique, but on

the other hand, two neighboring ore samples were certainly

not independent.

Earth scientists usually deal with complex phenomena

which are the result of the interaction of variables,

through relationships which are in part unknown and in part

very complex (Olea, 1975). Variations are erratic and

often unpredictable from one point to another, but there is

usually an underlying trend in the fluctuations which

precludes regarding the data as resulting from a completely

random process. To characterize variables which are partly

stochastic and partly deterministic in their behavior,

Matheron (1971) introduced the term regionalized variable.

He developed the regionalized variable theory to describe

functions which vary in space with some continuity.









A regionalized variable is a continuously distributed

variable having a geographic variation too complex to be

represented by a workable mathematical function (Campbell,

1978). Although the precise nature of the variation of a

regionalized variable is too complex for a complete

description, the average rate of change over distance can

be estimated by the semi-variance. Conversely, Olea (1977)

stated that a regionalized variable is a function that

describes a natural phenomenon which has geographic

distribution.

The term geostatistics has come to mean the

specialized body of statistical techniques developed by

Matheron and associates to treat regionalized variables

(Olea, 1984). The theory of regionalized variables has two

branches: the transitive methods and the intrinsic theory

(Matheron, 1969). The first is a highly geometrical

abstraction without probabilistic hypothesis and has little

practical interest. The practical counterpart of those

geometrical abstractions is the intrinsic theory which is a

term for the application of the theory of random variables

to regionalized variables.

Matheron (1969) and Olea (1975) indicated that

regionalized variables are characterized by the following

properties: (i) localization, a regionalized variable is

numerically defined by a value which is associated with a

sample of specific size, shape, and orientation which is









called geometrical support. (ii) Continuity, the spatial

variation of a regionalized variable may be extremely large

or very small, depending on the phenomenon studied, but

despite this fact, an average continuity is generally

present. In some cases the average continuity cannot be

confirmed, and then a nugget effect is present.

(iii) Anisotropy, changes may be gradual in one direction

and rapid or irregular in another. These changes are known

as zonalities.

A basic assumption in the intrinsic theory is that a

regionalized variable is a random variate (Matheron, 1969).

The observed values are outcomes following some probability

density function. Henley (1981) considered a

regionalized variable as a random function which may be

defined in terms of a probability distribution (i.e., it

may be normally distributed with a particular mean and

variance).

Olea (1984) indicated that a spatial function can

either be described by a mathematical model or given by a

relative frequency analysis based on experimentation. The

former approach is not practical because of the complexity

of spatial functions. The latter is seriously limited by

the maximum number of samples that can be collected.

Olea (1975) stated that the difficulty of the relative

frequency approach with a regionalized variable is that a

repeated test cannot be run because each outcome is unique.









Since a large number of samples are essential to any

statistical inference, it is not possible to determine the

probability density function which rules the occurrence of

a regionalized variable.

The impossibility of obtaining the probability density

function associated with the variable is not a serious

limitation. Most of the properties of interest depend only

on the structure of the regionalized variable as specified

by its first and second moments (Olea, 1975). A key

assumption is stationarity. Stationarity is a mathematical

way to introduce the restriction that the regionalized

variable must be homogeneous. Stationarity permits

statistical inference. A test can be repeated by assuming

stationarity even though samples must be collected at

different points. All samples are assumed to be drawn from

populations having the same moments.

Several scientists have discussed the assumption of

stationarity (Henley, 1981; Huijbregts, 1975; Journel and

Huijbregts, 1978; Olea, 1975; 1984; Tipper, 1979; Trangmar

et al., 1985; Webster, 1985). Geostatistics invokes a

stationary constraint called the intrinsic hypothesis to

resolve the impossibility of obtaining a probability

distribution. A regionalized variable is called strictly

stationary if it is stationary for any order k = 1, 2, 3,

4, ..., n. If k is equal to one, the regionalized

variable has first-order stationarity. Second-order









stationarity also implies first-order stationarity.

Second-order stationarity signifies that the first two

moments (covariance and variance) of the difference between

two observations are independent of the location and are a

function only of the distance between them.

In general, for a regionalized variable of order k,

all the moments of order k or less are invariant under

translation. For a stationary variable, the covariance has

the following properties:

(i) $\mathrm{COV}(0) > |\mathrm{COV}(X_2 - X_1)|$ (2)

where COV = covariance

(ii) $\lim_{h \to \infty} \mathrm{COV}(h) = 0$ (3)

where LIM = limit

(iii) $\mathrm{COV}(0) = \mathrm{VAR}[Y(X)]$ (4)

where VAR = variance

(iv) $\mathrm{COV}(X_2 - X_1) = \mathrm{COV}(X_1 - X_2)$ (5)

These relations are better visualized in Figure 1.

For second-order stationarity, VAR[Y(X)] must be

finite. Then, according to equation (4) COV(0) must be

finite. However, many phenomena in nature are subject to

unlimited dispersion and cannot correctly be described when

they are assigned a finite variance. Thus, to avoid this

restriction, the intrinsic theory assumes what is called

the intrinsic hypothesis. The intrinsic hypothesis is

satisfied if, for any displacement h the first two

moments of the difference [Z(x) - Z(x + h)] are independent























[Figure: G(h) and COV(h) plotted against lag distance h; axis labels: G(∞) = VAR, COV(∞) = 0, a.]

Figure 1. Relation among variance, covariance, and semi-variance.









of the location x and are a function only of h:


$E[Z(x) - Z(x + h)] = M(h)$ (First moment) (6)

$E[\{Z(x) - Z(x + h) - M(h)\}^2] = 2 G(h)$ (Second moment) (7)

where M(h) and G(h) are referred to as the drift and the

semi-variance or intrinsic function, respectively. The

semi-variance is a measure of the similarity, on the average,

between observations at a given distance apart. The more

alike the observations, the smaller is the semi-variance.

The semi-variogram (Olea, 1975; Journel and

Huijbregts, 1978), which is the plot of semi-variance

against distance h (lag), has all the structural

information needed about a regionalized variable: (i) zone

of influence that provides a precise meaning to the notion

of dependence between samples, (ii) anisotropy when

variability is direction-dependent revealing the different

behavior of the semi-variogram for different directions,

and (iii) continuity of the variable through space, which

is indicated by the shape and the particular

characteristics of the semi-variogram near the origin.
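
As an illustration of equation (7), the semi-variance at each lag can be estimated from observations on a transect. The sketch below is a hedged example rather than a method taken from the sources cited: it assumes evenly spaced observations and no drift, that is, M(h) = 0, and the function name `empirical_semivariance` and the data values are hypothetical.

```python
import numpy as np

def empirical_semivariance(values, max_lag):
    """Semi-variance for lags 1..max_lag on an evenly spaced transect.

    Assumes no drift, so that 2 G(h) is the mean squared difference
    between observations h steps apart.
    """
    values = np.asarray(values, dtype=float)
    gamma = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        diffs = values[lag:] - values[:-lag]   # all pairs separated by `lag` steps
        gamma[lag - 1] = 0.5 * np.mean(diffs ** 2)
    return gamma

# Hypothetical observations of a soil property along a transect.
transect = np.array([5.1, 5.3, 5.2, 5.6, 5.9, 6.0, 6.4, 6.3, 6.8, 7.0])
gamma = empirical_semivariance(transect, max_lag=4)
```

Plotting `gamma` against the lag gives an experimental semi-variogram; the more alike the neighboring observations, the smaller the values at short lags.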

One of the oldest methods of estimating space or time

dependency between neighboring observations is through

autocorrelation (Vieira et al., 1983). Nash (1985) pointed

out that the correlogram (plot of autocorrelation against









distance) is the mirror image of the semi-variogram.

Vieira et al. (1983) indicated that when interpolation

between measurements is needed, the semi-variogram is a

more adequate tool to measure the correlation between

measurements. An infinite dispersion is allowed using

semi-variances.

According to Journel and Huijbregts (1978) the

autocorrelation is equal to


f(h) = C(h)/ C(0) (8)

where f(h) = autocorrelation

C(h) = autocovariance or covariance at distance h

C(0) = variance


The relationship between C(h) and C(0) is expressed by

equation (4). When the semi-variance changes, it is

assumed that its variations are small with respect to the

working scale. This is a condition of quasi or local

stationarity. When the regionalized variable is weakly

stationary, it also obeys the intrinsic hypothesis. The

semi-variance is then given by


$G(h) = \sigma^2 - C(h)$ (9)

where G(h) = semi-variance

$\sigma^2$ = population variance

C(h) = autocovariance









The autocorrelation and the semi-variance are related

by the following equation:



$f(h) = 1 - G(h)/C(0)$ (10)


Burgess and Webster (1980a) pointed out that the

autocorrelation coefficient depends on the variance

(equation 8), and according to equation (4) the variance

must be finite to fulfill the requirement of stationarity.

It was indicated earlier that many phenomena in nature are

subject to unlimited dispersion. The semi-variance is free

of this restriction, and consequently is preferred. They

also indicated that a second advantage of working with

semi-variance is that it is easier to take into account

local trends in the property of interest. Residuals are

used when trends are present. Webster and Burgess (1980)

demonstrated that the variance of the residuals from the

mean is not equal to the variance of the difference between

the values when trends are present. Therefore,

autocorrelation is difficult to use.
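
The relations in equations (8) to (10) can be verified with a short numerical example. The sketch below assumes a second-order stationary variable with a finite variance and, purely for illustration, an exponential covariance function; the parameter values are hypothetical.

```python
import numpy as np

sigma2 = 4.0                      # assumed population variance, C(0)
a = 25.0                          # assumed distance parameter of the covariance
h = np.linspace(0.0, 100.0, 11)   # lag distances

covariance = sigma2 * np.exp(-h / a)                   # C(h), an assumed model
semivariance = sigma2 - covariance                     # equation (9)
autocorr_from_cov = covariance / sigma2                # equation (8)
autocorr_from_semivar = 1.0 - semivariance / sigma2    # equation (10)
```

Both routes give the same autocorrelation, which is why the correlogram mirrors the semi-variogram whenever the variance is finite. When the variance is not finite, only the semi-variance remains defined.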

Webster (1985) classified the semi-variograms into

four groups:

Safe models. They are defined for one dimension but

are safe in the sense that they are conditionally positive

definite in two and three dimensions. These models are









1. The linear model:

G(h) = C0 + wh for h > 0 (11)

G(0) = 0 (12)

where G = semi-variance

C0 = intercept or nugget variance

w = slope

h = lag distance

Equation (11) assumes that h has an exponent a = 1. When

the exponent a = 0.5 the model is called root. When a = 2

the model is parabolic.

2. The spherical model:

$G(h) = c_0 + w [1.5 (h/a) - 0.5 (h/a)^3]$ for 0 < h < a (13)

$G(h) = c_0 + w$ for h > a (14)

$G(0) = 0$ (15)

where a = range

c0 + w = sill


3. The exponential model:

$G(h) = c_0 + w [1 - \exp(-h/a)]$ for h > 0 (16)

G(0) = 0 (17)

4. The DeWijsian model:

G(h) = c0 + a ln(h) for h > 0 (18)

G(0) = 0 (19)

5. The Gaussian model:

$G(h) = c_0 + w [1 - \exp(-(h/a)^2)]$ for h > 0 (20)

G(0) = 0 (21)









6. The hyperbolic model:

$G(h) = h / (\alpha + \beta h)$ (22)

where $\alpha$ and $\beta$ are coefficients of the hyperbola

function.

Risky models. The semi-variogram increases to a sill.

1. The circular model:

$G(h) = c_0 + w [1 - (2/\pi) \cos^{-1}(h/a) + (2h/(\pi a)) \sqrt{1 - h^2/a^2}]$

for 0 < h < a (23)

G(h) = c0 + w for h > a (24)

G(0) = 0 (25)

2. The linear model with a sill:

$G(h) = c_0 + w (h/a)$ for $0 < h \leq a$ (26)

G(h) = c0 + w for h > a (27)

G(0) = 0 (28)


Nested model. The components of variance measure the

amount of variance contributed by each scale.

$G(h) = \frac{1}{2} \mathrm{VAR}[Z(x) - Z(x+h)] = G_0(h) + G_1(h)$ (29)

where G0(h) = pure nugget semi-variance

G1(h) = spatially dependent semi-variance

Anisotropic model. Variability is not equal in all

lateral directions.

$G(h, \theta) = c_0 + u(\theta) |h|$ (30)

where

$u(\theta) = [A^2 \cos^2(\theta - \alpha) + B^2 \sin^2(\theta - \alpha)]^{1/2}$ (31)

where $\theta$ = anisotropy angle

$\alpha$ = direction of maximum variation

A = gradient of semi-variogram in direction of

maximum variation

B = gradient in the direction $\alpha + \frac{1}{2}\pi$

The most common semi-variograms are shown in Figure 2.
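
Several of the models listed above can be written as short functions. The sketch below is illustrative only and follows equations (11) to (17), (20), and (21) as reconstructed from the definitions of nugget variance c0, slope or sill component w, and range a; the function names and the choice of NumPy are assumptions, not part of the sources cited.

```python
import numpy as np

def linear(h, c0, w):
    """Linear model, equations (11) and (12)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, c0 + w * h, 0.0)

def spherical(h, c0, w, a):
    """Spherical model, equations (13) to (15); the sill is c0 + w."""
    h = np.asarray(h, dtype=float)
    rising = c0 + w * (1.5 * (h / a) - 0.5 * (h / a) ** 3)
    g = np.where(h < a, rising, c0 + w)    # constant at the sill beyond the range
    return np.where(h > 0, g, 0.0)         # G(0) = 0 by definition

def exponential(h, c0, w, a):
    """Exponential model, equations (16) and (17)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, c0 + w * (1.0 - np.exp(-h / a)), 0.0)

def gaussian(h, c0, w, a):
    """Gaussian model, equations (20) and (21)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, c0 + w * (1.0 - np.exp(-(h / a) ** 2)), 0.0)

lags = np.linspace(0.0, 30.0, 7)
sph = spherical(lags, c0=1.0, w=4.0, a=10.0)
```

The spherical model reaches its sill exactly at the range a, whereas the exponential and Gaussian models approach the sill only asymptotically.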

Computing a series of semi-variograms and deriving a

model from them is usually not an end in itself. The

objectives of geostatistical studies are to determine the

characteristics of the data and to obtain the best

estimates possible with the available data. The advantage

of using a geostatistical approach is that the computed

values are optimum. The error of estimation is minimized.

The acronym BLUE (best linear unbiased estimation) is

sometimes used to characterize this method (Green, 1985).

Estimation procedures that incorporate regionalized

variable theory were originally known as kriging, a term

named for D.G. Krige (DeGraffenreid, 1982). Kriging is a

distance-weighted moving average estimation procedure that

uses the semi-variogram to determine optimal weights.

Kriging depends on computing an accurate semi-

variogram from which estimates of semi-variance are then

used to obtain the weights applied to the data when

computing the averages, and are presented in the kriging

equation (Burgess and Webster, 1980a).



















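[Figure: four panels plotting semi-variance G(h) against lag distance h. Panels: Spherical; Exponential; Gaussian; Linear, Root, Parabolic. The level C and the distance a are marked on the spherical, exponential, and Gaussian panels.]

Figure 2. Common semi-variogram models.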









When values of soil properties are averaged over point

values, which represent volumes with the same size and

shapes as the volumes of soil on which the original

descriptions were recorded (i.e., pedons), the kriging

procedure is called punctual kriging (Burgess and Webster,

1980a). When an average is made over areas, the procedure

is called block kriging (Burgess and Webster, 1980b).

Block kriging produces smaller estimation variances and

smoother maps.

Burgess and Webster (1980a) and Webster and Burgess

(1983) pointed out that kriging is a means of spatial

prediction that can be used for soil properties. In

kriging, the weights take account of the known spatial

dependence expressed in the semi-variogram and the

geometric relationships among the observed points. Kriging

is optimal in the sense that it provides estimates of

values at unrecorded places without bias and with minimum

known variance.

It has been indicated by several scientists

(Huijbregts, 1975; Olea, 1975; 1984; Trangmar et al., 1985;

Webster and Burgess, 1980) that kriging is used only with

regionalized variables that are first-order stationary. For

variables whose drift is not stationary, but for whose

residuals the intrinsic hypothesis holds, universal kriging

is used.









Webster and Burgess (1980) stated that universal

kriging takes account of local trends in data when

minimizing the error associated with estimation. Universal

kriging can be performed after computing suitable

expressions for the drift and corresponding semi-variograms

of the residuals.

Olea (1984) said that universal kriging is a linear

estimator of the regionalized variable and has the form

$Z^*(x_0) = \sum_{i=1}^{n} r_i Z(x_i)$ (32)

where $Z^*(x_0)$ = unknown parameter at location $x_0$

$r_i$ = weights

$Z(x_i)$ = value of a property at a point $x_i$

Matheron (1963) stated that suitable weights $r_i$

assigned to each sample are determined by two conditions.

The first condition is that $Z^*(x_0)$ and $Z(x_0)$ must have

the same average value within the area of influence, and is

written as

$\sum_{i=1}^{n} r_i = 1$ (33)

The second condition is that the $r_i$ have such values that the

estimation variance (kriging variance) of $Z^*(x_0)$ and $Z(x_0)$

should take the smallest possible value. The unknown $r_i$'s

were found by solution of a system of linear equations

which result from forcing the unbiased estimator to have

minimum variance. The equation is as follows:








AX = B (34)

where A, X, and B are given by equations (35), (36), and

(37) (Figures 3 and 4).
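
The system AX = B can be illustrated for the simplest case. The sketch below is a minimal example under stated assumptions, not the procedure of any author cited: it assumes punctual kriging with a constant mean, so that the drift functions drop out and only the row and column of ones remain in A, and it assumes a spherical semi-variogram with hypothetical parameters. The names `spherical` and `krige_point`, the coordinates, and the data values are all hypothetical.

```python
import numpy as np

def spherical(h, c0, w, a):
    """Spherical semi-variogram with nugget c0, sill c0 + w, and range a."""
    h = np.asarray(h, dtype=float)
    rising = c0 + w * (1.5 * (h / a) - 0.5 * (h / a) ** 3)
    g = np.where(h < a, rising, c0 + w)
    return np.where(h > 0, g, 0.0)

def krige_point(coords, values, target, model):
    """Solve A X = B for the weights and one Lagrange multiplier (constant mean)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    k = len(values)
    # Semi-variances between all pairs of sample points.
    pair_dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.zeros((k + 1, k + 1))
    A[:k, :k] = model(pair_dist)
    A[:k, k] = 1.0                 # column of ones (unbiasedness constraint)
    A[k, :k] = 1.0                 # row of ones; A[k, k] stays zero
    # Semi-variances between each sample point and the target, then the constraint.
    B = np.append(model(np.linalg.norm(coords - target, axis=1)), 1.0)
    X = np.linalg.solve(A, B)
    weights = X[:k]
    estimate = float(weights @ values)
    kriging_variance = float(X @ B)    # sum of weights times semi-variances, plus multiplier
    return estimate, kriging_variance, weights

coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 12.0)]
values = [5.2, 5.8, 6.1, 6.4, 6.0]
model = lambda h: spherical(h, c0=0.05, w=0.5, a=30.0)
estimate, kriging_variance, weights = krige_point(coords, values, (4.0, 6.0), model)
```

The weights sum to one, as required by the first condition, and when the target coincides with a sample point the estimate reproduces the observed value with zero kriging variance.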

In recent years a new method for estimation has been

developed. Vieira et al. (1983) stated that in soil

science, agrometeorology, and remote sensing, very often

some variables are cross-related with others. In addition,

some of those variables are easier to measure than others.

In such situations estimation of one variable using

information about both itself and another cross-correlated,

easier-to-measure variable should be more useful than

the kriging of that variable by itself. This estimation is

easily made using cokriging.

Cokriging has been defined as the estimation of one

spatially distributed variable from values of another

related variable (DeGraffenreid, 1982; Gutjahr, 1984).

Dependence between two variables can be expressed by a

cross semi-variogram (McBratney and Webster, 1983a). For

any pair of variables i and j there is a cross semi-variance

$G_{ij}(h)$ at lag h defined as

$G_{ij}(h) = \frac{1}{2} E[\{Z_i(x) - Z_i(x+h)\} \{Z_j(x) - Z_j(x+h)\}]$ (38)

where $Z_i$ and $Z_j$ are the values of i and j at places x and

x+h. If i = j, then equation (38) represents the auto

semi-variance.
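
An experimental cross semi-variance can be computed in the same way as the auto semi-variance. The sketch below is a hedged illustration: it assumes that both variables are recorded at every point of an evenly spaced transect, and it includes the conventional factor of one half so that the case i = j reduces to the semi-variance of equation (7). The function name `cross_semivariance` and the data are hypothetical.

```python
import numpy as np

def cross_semivariance(zi, zj, max_lag):
    """Cross semi-variance of two co-located variables for lags 1..max_lag."""
    zi = np.asarray(zi, dtype=float)
    zj = np.asarray(zj, dtype=float)
    out = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        di = zi[lag:] - zi[:-lag]      # increments of variable i
        dj = zj[lag:] - zj[:-lag]      # increments of variable j at the same places
        out[lag - 1] = 0.5 * np.mean(di * dj)
    return out

# Hypothetical paired observations along a transect.
clay = np.array([12.0, 13.5, 13.0, 15.2, 16.1, 15.8, 17.4, 18.0])
cec = np.array([4.1, 4.6, 4.4, 5.3, 5.6, 5.4, 6.2, 6.3])
cross = cross_semivariance(clay, cec, max_lag=3)
auto = cross_semivariance(clay, clay, max_lag=3)
```

Unlike the auto semi-variance, the cross semi-variance can be negative, which happens when increases in one variable tend to accompany decreases in the other.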













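\[
A = \begin{bmatrix}
G(x_1,x_1) & G(x_1,x_2) & \cdots & G(x_1,x_k) & 1 & f_1(x_1) & f_2(x_1) & \cdots & f_n(x_1) \\
G(x_2,x_1) & G(x_2,x_2) & \cdots & G(x_2,x_k) & 1 & f_1(x_2) & f_2(x_2) & \cdots & f_n(x_2) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
G(x_j,x_1) & G(x_j,x_2) & \cdots & G(x_j,x_k) & 1 & f_1(x_j) & f_2(x_j) & \cdots & f_n(x_j) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
G(x_k,x_1) & G(x_k,x_2) & \cdots & G(x_k,x_k) & 1 & f_1(x_k) & f_2(x_k) & \cdots & f_n(x_k) \\
1 & 1 & \cdots & 1 & 0 & 0 & 0 & \cdots & 0 \\
f_1(x_1) & f_1(x_2) & \cdots & f_1(x_k) & 0 & 0 & 0 & \cdots & 0 \\
f_2(x_1) & f_2(x_2) & \cdots & f_2(x_k) & 0 & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
f_n(x_1) & f_n(x_2) & \cdots & f_n(x_k) & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\]

$G(x_j, x_k)$ = semi-variance between two sample elements located at a distance $(x_j - x_k)$.

f = function of x, derived from the drift.

Figure 3. Equation number 35.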












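\[
\text{a)}\quad X = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_j \\ \vdots \\ r_k \\ \mu_0 \\ \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix}
\qquad
\text{b)}\quad B = \begin{bmatrix} G(x_1,x_0) \\ G(x_2,x_0) \\ \vdots \\ G(x_j,x_0) \\ \vdots \\ G(x_k,x_0) \\ 1 \\ f_1(x_0) \\ f_2(x_0) \\ \vdots \\ f_n(x_0) \end{bmatrix}
\]

$r_i$ = weights.

$G(x_k, x_0)$ = semi-variance between two sample elements located at a distance $(x_k - x_0)$.

f(x) = function of x, derived from the drift.

$\mu_i$ = Lagrange multipliers.

Figure 4. Equation number 36 (a) and equation number 37 (b).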








The cokriging equation is given by


$Z_j(x_0) = \sum_{j} \sum_{i=1}^{n_j} r_{ij} Z(x_{ij})$, for all j (39)

where i, j = variables

$Z_j(x_0)$ = estimated value of variable j at location $x_0$

$r_{ij}$ = weights

To avoid bias the weights have to fulfill two conditions:

(i) $\sum_{i=1}^{n_j} r_{ij} = 1$ (40)

and

(ii) $\sum_{i=1}^{n_j} r_{ij} = 0$ for all i not equal to j. (41)


The first condition, according to McBratney and

Webster (1983a), implies that there must be at least one

observation of the variable j for cokriging to be possible,

and as in kriging equation (34), cokriging can be expressed

in matrix notation for solving the unknown weights.

Trangmar et al. (1985) indicated that cokriging

requires at least one sample point of both the primary

variable and covariable properties within the estimation

neighborhood. If the primary variable and covariable are

present at all sampling sites in the neighborhood, then

cokriging is considered as an auto-kriging of the primary

variable alone. In such cases, cokriging is unnecessary.









Practical Use

Earlier studies in soil science used time series

analysis in which spatial dependence of soil properties was

considered. Webster and Cuanalo (1975) computed

correlograms for clay, silt, pH, CaCO3, color-value, and

stoniness for three horizons in pedons located at 10 m

interval along a transect in north Oxfordshire, England.

They observed that the relation between sampling points

weakened steadily over distances from 10 m to about 230 m.

The average spacing between geological boundaries on the

transect was also about 230 m. Outcrop bedrock was

inferred as one of the main sources of soil variation.

They concluded that mappable soil boundaries were likely to

occur on the average every 230 m, and sampling at spacing

closer than 115 m would be needed to detect them.

Lanyon and Hall (1980) used morphological, physical,

and chemical soil properties to test the performance and

value of auto-correlation analysis. Spatial dependence was

determined from observations made every 20 m along a

transect for solum thickness; fine-earth fraction of the A,

B, and C horizons; and for soil pH, percent base saturation

(PBS), and exchangeable cations from the deepest horizon.

They found that the range varied from 20 m for solum

thickness and exchangeable K content to 60 m for pH and

exchangeable Mg content. They concluded that auto-

correlation analysis emphasized the continuous, orderly









nature of soils, and the fact that spatially related

observations may be mutually dependent.

Campbell (1978) was one of the first to use

geostatistics in soil science. He studied the spatial

variation of sand and pH measurements employing the semi-

variance. Samples were collected at 10 m intervals on two

sampling grids positioned on two contiguous delineations in

eastern Kansas. There was a contrast in spatial variation

of sand content within the two delineations. Distances of

30 and 40 m were sufficient to encounter full variation of

sand content. Soil pH had a random variation within both

areas. It was concluded that the most important

application of semi-variograms was in determining optimum

sample spacing in the design of efficient sampling

strategies.
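As an illustration of the semi-variance computation underlying studies such as that of Campbell (1978), a minimal Python sketch follows. It assumes observations taken at a regular spacing along a single transect and uses the usual estimator, that is, half the mean squared difference between pairs of observations separated by a given lag. The function name and the sand contents are hypothetical and are not taken from the original study.

```python
import numpy as np

def empirical_semivariogram(z, spacing, max_lag_steps):
    """Semi-variance of regularly spaced transect observations.

    gamma(h) = 1 / (2 N(h)) * sum (z[i] - z[i + k])**2, with h = k * spacing.
    """
    z = np.asarray(z, dtype=float)
    lags, gammas = [], []
    for k in range(1, max_lag_steps + 1):
        diffs = z[k:] - z[:-k]          # all pairs separated by k steps
        lags.append(k * spacing)
        gammas.append(0.5 * np.mean(diffs ** 2))
    return np.array(lags), np.array(gammas)

# Hypothetical sand contents (%) observed every 10 m along a transect.
sand = [62.0, 60.5, 58.0, 55.5, 57.0, 61.0, 64.5, 63.0, 59.5, 56.0]
lags, gammas = empirical_semivariogram(sand, spacing=10.0, max_lag_steps=4)
for h, g in zip(lags, gammas):
    print(f"lag {h:5.1f} m   semi-variance {g:6.2f}")
```

The lag at which the computed semi-variance levels off would be read as the range, which is the quantity used in the studies reviewed here to suggest a sample spacing.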

Gambolati and Volpi (1979) introduced the

determination of the trend a priori, and improved the

process of fitting the observed to a theoretical semi-

variogram. They used kriging to describe ground-water flow

near Venice, Italy. They proposed and used a modification

of the kriging technique developed by Matheron (1970) which

aimed at improving the accuracy of the interpolation

procedure. In Matheron's (1970) basic theory, the trend

was not assessed a priori. The trend was considered as a

linear combination of functions with unknown coefficients.

Gambolati and Volpi (1979) considered the trend a priori;









therefore the trend had to be determined. Also, they

defined the concept of theoretical consistency in kriging

applications. Theoretical consistency was derived from the

validation of the interpretation model. Validation was

made by suppressing each observation point one at a time,

by providing an estimate in that point using the remaining

(n-1) observations, and analyzing the distribution of

errors. They stated that consistency occurred when there

was no systematic error (kriged average error was

approximately zero) and the standard deviation was

consistent with the corresponding error (the average ratio

of theoretical to calculated variance was approximately

equal to one). They found that validation of the

interpretation models selected for study showed that their

approach yielded accurate results, provided the trend was

correctly assessed.
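The validation procedure described above (suppressing each observation in turn and estimating it from the remaining n-1) can be sketched in Python as follows. This is only an illustration under stated assumptions: ordinary kriging without a trend is used rather than the modified technique of Gambolati and Volpi (1979), the exponential semi-variogram model and its parameters are assumed, the data are synthetic, and all function names are hypothetical. The variance ratio is computed here as the mean kriging variance divided by the mean squared error, which is one possible reading of the ratio of theoretical to calculated variance.

```python
import numpy as np

def exponential_model(h, nugget, sill, a):
    """Exponential semi-variogram model (an assumed choice for this sketch)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + (sill - nugget) * (1.0 - np.exp(-h / a)), 0.0)

def ordinary_kriging(xy, z, target, model):
    """Return the ordinary kriging estimate and kriging variance at `target`."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))      # last row/column enforce sum(weights) = 1
    A[:n, :n] = model(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = model(np.linalg.norm(xy - target, axis=1))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]    # mu is the Lagrange multiplier
    return weights @ z, weights @ b[:n] + mu

def leave_one_out(xy, z, model):
    """Suppress each point in turn and estimate it from the remaining n-1."""
    errors, variances = [], []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        est, var = ordinary_kriging(xy[keep], z[keep], xy[i], model)
        errors.append(est - z[i])
        variances.append(var)
    return np.array(errors), np.array(variances)

# Synthetic observations on a 100 x 100 m area (hypothetical values).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(30, 2))
z = np.sin(xy[:, 0] / 20.0) + 0.1 * rng.standard_normal(30)
model = lambda h: exponential_model(h, nugget=0.05, sill=0.5, a=30.0)

errors, variances = leave_one_out(xy, z, model)
print("mean error:", errors.mean())
print("kriging variance / mean squared error:", variances.mean() / np.mean(errors ** 2))
```

In the sense used above, the model would be judged consistent when the first printed value is close to zero and the second is close to one.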

Chirlin and Dagan (1980) modeled water flow through

two-dimensional porous formations as a random process using

an approximate formulation of flow physics to obtain an

expression for the head variogram. The head variogram

proved markedly anisotropic, with heads differing more

widely on average for a fixed lag parallel to the head

gradient than perpendicular to it. Also they examined a

hypothetical case ignoring anisotropy. It was determined

from their experiment that the kriged standard deviation









was overestimated perpendicular to the mean flow and was

underestimated parallel to it.

Hajrasuliha et al. (1980) studied salinity levels in

three different fields in southwest Iran which were

initially sampled on an arbitrarily selected grid of 80 m.

Semi-variances were calculated for all three sites to

determine the degree of dependence between observations.

The results from two fields showed that observations were

spatially dependent. Contour lines of iso-salinity were

obtained by using kriging. In the third field salinity

observations were found to be spatially independent. Thus,

the number of samples necessary to get fiducial limits and

to identify the number of samples to be taken randomly

across the field for a given probability were obtained by

using classical statistical methods.

Luxmoore et al. (1981) used semi-variograms to

characterize spatial variability of infiltration rates into

a weathered shale subsoil. Infiltration rates were

measured using double-ring infiltrometers installed at 48

locations on a 2 x 2 m grid after the removal of 1 to 2 m

of soil. A high degree of variability in infiltration

rates was determined. The test for spatial patterning

using the semi-variogram approach proved negative.

Therefore, they concluded that if patterning existed at

all, it occurred on a spatial scale less than the 2 m used









in the study. As a result of this study, it was determined

that infiltration rate was a randomly distributed property.

Vieira et al. (1981) analyzed the spatial variability

of 1280 field-measured infiltration rates on Typic

Xerorthents. The measurements were made at the nodes of an

irregular grid. The semi-variogram showed a range of 50 m.

It was considered that, on the average, samples separated

by 50 m or more were not correlated to each other.

In addition, they examined the effect of the neighborhood

size on the value kriged and its estimation variance. They

determined that a neighborhood of 14 m was sufficient for

the infiltration data. The estimation variances changed

very little for larger distances. Low mean estimation

error, low variance, and high correlation coefficient

showed that the kriging estimation was exceptionally good.

Finally, it was determined that geostatistics was useful to

redesign the sampling scheme. The large number of measured

values made it possible to calculate the minimum number of

samples necessary to reproduce the infiltration rate

measurements with good precision. It was determined that

128 samples were enough to obtain nearly the same

information as with 1280 samples.

Geostatistics was used for the first time to study soil

variability of large areas in Kigali, Rwanda by Vander Zaag

et al. (1981). They studied the spatial variability of

selected soil properties (pH; exchangeable Ca, Mg, K, and









Na contents; KCl-extractable Al content; percent Al-

saturation; effective CEC; μg P-sorbed at an equilibrium P

concentration of 0.02 and 0.2 μg/g; extractable P content;

P and Si in the saturation extract; total N, NO3, and NH4;

and extractable S contents) in the whole country of Rwanda.

Semi-variograms of soil pH, exchangeable Ca content,

effective CEC, Si in the saturation extract, and

extractable NH4 content showed long range spatial

dependence. The spatial dependence varied from 37.5 km for

soil pH to more than 60 km for extractable NH4. The

information contained in the semi-variogram was used to

estimate values of soil properties at unsampled locations

within the range of the semi-variogram. Maps of estimation

variance of kriged values were also generated. Such maps

showed that estimation variance of kriged values generally

increased with increasing distance from sample points. It

was indicated that geostatistics could be used to make

quick, low cost assessments of soil variability of large

land areas. In addition, the map of estimation variance

gave an indication of the confidence limits of the

estimated values. The map can be used to locate optimum

sampling sites to lower the estimation variance.

McBratney and Webster (1981) computed semi-variograms

of subsoil properties (depth to subsoil, soil color,

particle-size, mineralogy, organic carbon content, total N

content, ratio OC/total N, and pH). Samples were taken on









a transect at 20 m intervals. Semi-variograms showed

spatial dependence extending to about 360 m for some

properties, in particular color and pH. Other subsoil

properties had little or no spatial dependence, notably

particle-size fractions and organic carbon content. The

shape of some semi-variograms indicated presence of

different map units on the transect.

Van Kuilenburg et al. (1982) applied three

interpolation techniques (proximal, weighted average, and

kriging) to point data involving soil moisture supply

capacity on a 2 x 2 km grid of cover sand in the eastern

part of the Netherlands. Survey points used for

interpolation were randomly stratified with an average

density of 1.5 per ha. The root mean squared error was

used as a measure of efficiency. The root mean squared

error was large for the proximal method (less efficient)

and there was a negligible difference between root mean

squared errors for weighted average and kriging. Weighted

average had the weakness that possible clusters of survey

points were weighted too heavily. This was avoided in

kriging. Therefore, kriging proved to be the most

efficient for the survey method used.
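A comparison of interpolation methods by root mean squared error, in the spirit of the study above, can be sketched in Python as follows. The sketch is illustrative only: it covers the proximal method and a weighted average, it assumes inverse-distance-squared weights for the latter (the weighting actually used by Van Kuilenburg et al. is not stated here), it evaluates the error by leaving each survey point out in turn, and the survey data and function names are hypothetical.

```python
import numpy as np

def loo_rmse(xy, z, predictor):
    """Root mean squared error of leave-one-out predictions."""
    errors = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        errors.append(predictor(xy[keep], z[keep], xy[i]) - z[i])
    return float(np.sqrt(np.mean(np.square(errors))))

def proximal(xy, z, target):
    """Proximal method: take the value of the nearest survey point."""
    return z[np.argmin(np.linalg.norm(xy - target, axis=1))]

def weighted_average(xy, z, target, power=2.0):
    """Inverse-distance weighted average (assumes no coincident points)."""
    d = np.linalg.norm(xy - target, axis=1)
    w = 1.0 / d ** power
    return np.sum(w * z) / np.sum(w)

# Hypothetical survey points (m) and moisture supply capacities (mm).
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 2000.0, size=(60, 2))
z = 100.0 + 0.02 * xy[:, 0] + 5.0 * rng.standard_normal(60)

print("proximal RMSE:        ", loo_rmse(xy, z, proximal))
print("weighted average RMSE:", loo_rmse(xy, z, weighted_average))
```

A kriging predictor with the same call signature could be passed to the same function to complete the three-way comparison.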

Yost et al. (1982a) collected samples from 80 sites at

1 to 2 km intervals in Hawaii. Soil samples were taken

from 0 to 15 cm (topsoil) and 30 to 45 cm depths (subsoil).

The former depth represented the nutrient status as









influenced by management and the latter depth represented

the natural conditions. Semi-variograms for soil pH,

exchangeable cations (Ca, Mg, K, Na), sum of cations, P

requirements, Si and P in saturation extract, extractable P

content, and rainfall were calculated. Ranges were much

greater for soil properties in the 0 to 15 cm depth than

for those in the 30 to 45 cm depth. Semi-variograms for

Ca, Mg, K, and P contents based on the 30 to 45 cm depth

samples demonstrated greater variability and had smaller

ranges (Ca, Mg, and K) than those based on the 0 to 15 cm,

or were extremely variable (P). Si in saturation extract

had the same range in the subsoil as in the topsoil.

Subsoil properties were highly variable. Thus, soil

management and rainfall imposed a degree of uniformity on

the surface soil properties not apparent in the subsoil.

Yost et al. (1982a) concluded that soil chemical properties

had spatial dependence and that understanding such

structure may provide new insights into soil behavior over

the landscape. The semi-variograms changed at large

distances. These changes suggested that soils should be

grouped to obtain uniform regions of soil properties

suitable for management regimes.

Yost et al. (1982b) used soil data from transects in

Hawaii for estimating soil P sorption over the entire

island by using kriging. The necessity of considering non-

stationarity and the use of universal kriging were









evaluated. Universal kriging, either by prior polynomial

trend removal or by local polynomial trend removal during

estimation, was not beneficial in spite of widely varying P

sorption and a significant polynomial trend in the data.

The kriged estimates indicated that P sorption properties

of soil obtained from transects could be estimated in an

optimal way and could be displayed in a manner to better

understand the soil properties and genesis, and for

practical purposes, estimating the fertilizer needs and

distribution facilities.

McBratney et al. (1982) sampled 3500 sites to study

the spatial variability of Cu and Co soluble in mild

extractants measured to identify places where these metals

were deficient for animals. Semi-variograms for both Cu

and Co were isotropic and appeared to combine three

components of variation: a short range component extending

up to 3 km, a long range or geological component extending

to 15 km, and a non-spatial or nugget component, which

accounted for 32% and 63% of the total variance of Cu and

Co, respectively. Cu showed a greater degree of spatial

dependence than Co. In addition, isarithmic maps

identified areas where Cu and/or Co were deficient. An

error map showed that precision was generally acceptable.

Also, the map identified a few areas in the region in which

sampling was too sparse for confidence.









Byers and Stephens (1983) sampled an untilled medium-

grained fluvial sand in horizontal and vertical transects

to study the spatial structure of particle size and

saturated hydraulic conductivity. Semi-variogram and

kriging analyses indicated that both hydraulic conductivity

and particle size were relatively isotropic in the

horizontal plane but had marked anisotropy in the vertical

plane. There were marked similarities in spatial structure

in the horizontal plane. The spatial distribution of

saturated hydraulic conductivity in the horizontal plane

was estimated reasonably well using an empirical

relationship between particle size and conductivity along

with kriged estimates of the 10% finer particle size.

Ten Berge et al. (1983) studied the spatial

distribution of selected soil properties (moisture content,

moisture tension, bulk density, texture, temperature, and

equivalent surface temperature). Two transects were

sampled at 4 m intervals. Semi-variograms for moisture

content and bulk density did not show any range but only a

nugget effect. For other soil properties semi-variograms

had a range varying between 80 and more than 120 m (texture

and temperature). Gradual changes in soil characteristics

were expected. The presence of abrupt map unit boundaries

was determined for some properties (e.g., texture). The

spatial structure of the field moisture content was found

only at very shallow depths. Texture introduced








differences in hydraulic conductivity, which were thought

to cause differences in topsoil moisture content.

Vauclin et al. (1983) used geostatistics for studying

the variability of particle-size data, available water

content, and water stored at 1/3 bar. The soil samples

were collected within a 70 x 40 m area at the nodes of a 10

m square grid. All semi-variograms had a nugget effect

which corresponded to the variability that occurred within

distances shorter than the sampling interval and to

experimental uncertainties. The range varied from 26 m for

water stored at 1/3 bar to 50 m for silt content. Cross

semi-variograms were calculated demonstrating that

available water content at 1/3 bar was correlated with sand

content within distances of 43.5 and 30 m. Semi-variograms

and cross semi-variograms were used to krige and cokrige

additional values of available water content and water

stored at 1/3 bar every 5 m. They indicated that the use

of cokriging was a promising tool whether the principal

objective was the reduction of the estimated variance

compared with kriging or the need to estimate an under-

sampled variable by taking into account its spatial

correlation with another more sampled variable.
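The cross semi-variogram mentioned above can be illustrated with a short Python sketch. It assumes two properties measured at the same, regularly spaced sites along a transect and uses the usual estimator, half the mean product of the paired increments of the two properties. The function name and the data are hypothetical and do not come from Vauclin et al. (1983).

```python
import numpy as np

def cross_semivariogram(u, v, spacing, max_lag_steps):
    """Cross semi-variance of two co-located, regularly spaced series.

    gamma_uv(h) = 1 / (2 N(h)) * sum (u[i] - u[i+k]) * (v[i] - v[i+k]).
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    lags, gammas = [], []
    for k in range(1, max_lag_steps + 1):
        du = u[k:] - u[:-k]
        dv = v[k:] - v[:-k]
        lags.append(k * spacing)
        gammas.append(0.5 * np.mean(du * dv))
    return np.array(lags), np.array(gammas)

# Hypothetical sand content (%) and available water content (%) every 10 m.
sand = [70.0, 68.0, 65.0, 61.0, 63.0, 67.0, 72.0, 74.0]
water = [9.0, 9.5, 10.5, 12.0, 11.0, 10.0, 8.5, 8.0]
lags, gammas = cross_semivariogram(sand, water, spacing=10.0, max_lag_steps=3)
for h, g in zip(lags, gammas):
    print(f"lag {h:5.1f} m   cross semi-variance {g:7.3f}")
```

Unlike the semi-variance, the cross semi-variance may be negative, which happens when one property tends to increase where the other decreases.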

Spatial variability of nitrates in cotton petioles

was determined employing semi-variograms and kriging (Tabor

et al., 1984). Sampling of petioles was of two types, on

transects and from randomly selected sites on a rectangular









grid. Nitrates in petioles showed definite spatial

dependence in the field studied. However, for sampling

areas of < 1 m, spatial dependence was insignificant

compared to the inherent variability of the sample and

laboratory analyses. Semi-variograms and kriged maps of

nitrates in petioles suggested a strong influence of the

cultural practices such as direction of rows and

irrigation.

Bos et al. (1984) sampled in a rectangular grid 50 x

200 m at 10 m intervals on sand tailings capped with 0 to

> 2 m of strip-mine overburden. This was done to present

and discuss the use of semi-variograms to study the spatial

variation of extractable P, Na, K, Mg, and Ca contents,

extractable acidity, CEC, total P content, pH-water, pH-KCl,

and soluble salts of the topsoil (0 to 25 cm) and

relative elevation in reclaimed Florida phosphate mine

lands. Semi-variograms were calculated for data taken

along transects in four different sampling directions and a

combined direction. Some properties (CEC and relative

elevation) did not present structure of spatial variation.

The range was approximately 6 m for the combined and E-W

semi-variograms. Also, a nugget effect was observed which

represented variability at distances < 10 m. Presence of

anisotropy could not be established because well-defined

sills and ranges could not be determined for directions N-

S, NE-SW, and NW-SE. The semi-variograms were supported by









too few data points at large distances. It was concluded

that semi-variograms were useful in determining the spatial

variability of soil properties on reclaimed phosphate mine

lands and in improving sampling design for liming and

fertilization needs.

Xu and Webster (1984) tested how geostatistical

techniques could be applied to large areas. The topsoil

of 102 pedons evenly distributed throughout the study

area in China was sampled. Soil pH-water, organic matter,

sand, total N, total P, and total K contents were measured.

Variation of soil properties appeared to be isotropic.

Soil pH showed the strongest spatial dependence.

Isarithmic mapping of local estimates of pH showed zones of

alkaline soils. Because sampling was sparse, on average

one sample for 3.5 km2, the estimation errors were large.

It was suggested that a more intensive sampling scheme

would increase confidence in the maps. This would also

improve the estimation of semi-variograms, especially for

lags in the range of 0.5 to 5 km.

Saddiq et al. (1985) collected data on soil water

tension from 99 tensiometers along a 76 m row planted with

chile pepper and irrigated through trickle tubing placed 5

cm below the soil surface. Semi-variograms indicated a

large variability and little spatial dependence in soil

water tension. The range was < 6 cm. Also, it was

determined that variability and spatial dependence were








functions of the method and timing of water application and

the magnitude of the soil water tension. When water was

applied through a trickle line, variability was greatest

and spatial dependence was smallest. Variability was low

and spatial dependence high after rain or extensive

flooding.

Rogowski et al. (1985) were probably the first to use

geostatistics to estimate erosion at different scales.

Erosion was measured at nodes of three different size

grids: 225 measurements from a 15 x 22.5 km grid, 25 from a

5 x 7.5 km grid, and 150 from a 1 x 1.5 km grid in west

central Pennsylvania. Erosion at each node was computed

using the universal soil loss equation. Kriging was

employed to map potential erosion. It was determined that

the large grid sampling size smoothed out the variability

by assuming that a fixed slope length and gradient were

applicable to the entire area. It was concluded that

estimation of erosion on a 1 ha basis (small grid) would

likely lead to the optimum prediction capability. This

conclusion was based primarily on the results of structural

analysis of soil loss data which suggested a workable

continuity range of about 0.1 km for an exponential semi-

variogram model. The relative dispersion was about the

same for the smaller and the larger areas.

Jim Yeh et al. (1986) measured soil water pressure

with 94 tensiometers permanently installed at 3 m intervals








along a 290 m transect at a 0.3 m depth in New Mexico.

Observations showed a gradual increase of soil water

pressure over time and a high degree of spatial

variability. Variations were spatially correlated over

distances of at least 6 m and they were dependent upon their

mean value. These data supported the hypothesis obtained

from stochastic analysis that the variation of soil water

pressure was mean-dependent.

Phillips (1986) applied geostatistics to determine the

spatial structure of the pattern of variability of shore

erosion to identify the important scale of variation.

Shoreline erosion was measured in terms of recession rates

from two sets of aerial photographs taken in 1940 and 1978.

Statistical analysis indicated that variability of erosion

rates was high. The complex alongshore pattern and the

scale of local variability indicated that, despite

significant long-range differences in erosion rates, short-

range, local factors were more important in determining

differences in erosion rates. It was also concluded that

two major factors accounted for alongshore differences in

erosion rates. These were (i) a complex pattern of

differential resistance related to marsh fringe morphology

and (ii) a crenulated, irregular shoreline configuration

affecting exposure to wave energy.

Several scientists (Burgess et al., 1981; McBratney

and Webster, 1983a, 1983b; Webster and Burgess, 1984;








Webster and Nortcliff, 1984; Russo, 1984) have used

geostatistics for improving sampling techniques. The

classical statistical approach for sampling soils does not

take account of the spatial dependence among the data

within one class. Therefore, it leads to conservative

estimates of precision, with oversampling and unnecessary

cost resulting (Burgess et al., 1981). Burgess et al.

(1981) presented a sampling strategy that depended on

accurately determining the semi-variogram of the property,

and then estimation variances could be calculated for any

combination of block size and sampling density by kriging.

By this sampling method, the sampling density needed to

attain a predetermined precision could be obtained, and the

sampling effort needed to achieve the precision desired was

at a minimum.
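The idea of relating sampling density to estimation variance can be sketched in Python as follows. The sketch rests on several assumptions that are not part of the original work: punctual (not block) ordinary kriging is used, the variance is evaluated only at the centre of a square grid cell from the 16 nearest nodes, and the spherical semi-variogram parameters and the tolerable variance are hypothetical. All function names are illustrative.

```python
import numpy as np

def spherical_model(h, nugget, sill, a):
    """Spherical semi-variogram with total sill `sill` and range `a`."""
    h = np.asarray(h, dtype=float)
    inside = nugget + (sill - nugget) * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h == 0, 0.0, np.where(h < a, inside, sill))

def centre_kriging_variance(spacing, model, n_side=4):
    """Ordinary kriging variance at the centre of a square grid cell,
    using the n_side x n_side nodes surrounding that cell."""
    ticks = (np.arange(n_side) - (n_side - 1) / 2.0) * spacing
    gx, gy = np.meshgrid(ticks, ticks)
    xy = np.column_stack([gx.ravel(), gy.ravel()])   # target is the origin
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = model(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = model(np.linalg.norm(xy, axis=1))
    sol = np.linalg.solve(A, b)
    return sol[:n] @ b[:n] + sol[n]

# Hypothetical semi-variogram and maximum tolerable kriging variance.
model = lambda h: spherical_model(h, nugget=0.1, sill=1.0, a=50.0)
tolerance = 0.5
for spacing in (5.0, 10.0, 20.0, 40.0):
    var = centre_kriging_variance(spacing, model)
    status = "acceptable" if var <= tolerance else "too sparse"
    print(f"spacing {spacing:5.1f} m   kriging variance {var:6.3f}   {status}")
```

Because the kriging variance depends only on the semi-variogram and the sampling configuration, and not on the observed values, a calculation of this kind can be made before any further samples are collected.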

McBratney and Webster (1983b) stated that the number

of observations needed to achieve a particular acceptable

error depends on the variation of the property in the

region concerned. The assumptions of classical statistics

have required more observations than investigators could

afford to attain the desired precision. These authors used

a method for determining the sample size that depended on

knowing the semi-variogram of the property of interest.

The semi-variogram information was used in kriging for

estimation of variance in the neighborhood of each

observation point. Variances were pooled to form the









global variance from which a standard error could be

calculated. The pooled value was minimized for a given

sample size if all neighborhoods were of the same size.

Therefore, the sampling size required to determine the

semi-variogram would be a major part of the task. So, if

the semi-variogram had not been known, then the best

strategy was to sample on a regular grid, with the interval

determined by the number of observations that could be

reasonably obtained.

McBratney and Webster (1983a) extended the sampling

principle for each variable to two or more co-regionalized

variables. The choice of the strategy was complicated

because not only did the sampling intensities of the main

variable and subsidiary variables differ but also their

relative sampling intensities could be changed.

In addition, the maximum kriging variance did not necessarily

occur at the center of the sampling configuration as it did

with a single variable. It was stated that in attempting

to find an optimal strategy, the maximum kriging variance

must be found by first calculating the variance for a range

of sampled spacings and relative sampling intensities.

Those that matched the maximum tolerable variance were

potentially useful. It was suggested that the optimum

scheme was the one that achieved the desired precision for

least cost.








Webster and Burgess (1984) described optimal

rectangular grid sampling configurations by which

estimation variance could be minimized. The geostatistical

approach had the advantage that standard errors would be

much smaller than with the classical approach. It was

stated that even when standard errors were estimated

properly by taking into account known spatial dependence,

the cost of making the desired number of measurements in a

region might still be prohibitive. Under those

circumstances weighting might provide a feasible way of

overcoming this difficulty. The aim of weighting was to

reduce the effort of measuring soil properties within

regions while maintaining the precision of replicated

observations. It was concluded that the most serious

obstacle to using optimal sampling strategies for single

estimates was the need to know the semi-variogram in

advance. The main task was to establish the number of samples needed to

determine the semi-variogram.

Webster and Nortcliff (1984) used measured values of

extractable Fe, Mn, Cu, and Zn contents to calculate the

sampling effort required to estimate mean values with

specified precision. Semi-variograms showed that there was

a substantial dependence for Fe and Mn contents, less for

Zn content, and even less for Cu content. Estimation

variances generated by classical methods and geostatistics

were compared. The largest nugget variance in relation to









the total variance in the sample was for Cu. Classical

statistics slightly exaggerated the estimation variance for

Cu. The over-estimate was more serious for Zn, Mn, and Fe.

However, the major disadvantage is having to sample

intensively to obtain the semi-variogram.

Russo (1984) proposed a method to design an optimal

sampling network for semi-variogram estimation. The method

required an initial sampling network. The location of

points could be either systematically or randomly selected.

For a given sample size (n) and using a constant number of

pairs of points for each lag class, the sampling network

criterion for selecting the location of sampling points was

the uniformity of the values of the separating lag distance

(h) within a given lag class, for each of the lag classes

which covered the area of interest in the field. The

method provided a set of scaling factors which were used to

calculate the new locations of the sampling points by an

iterative procedure. Using the aforementioned criterion,

the best set of sampling points was selected. Analysis of

results indicated that by using the proposed method the

variability within and among lag classes was considerably

reduced relative to the situation where the original

locations were used. In addition, sampling points

generated by the method proposed fitted the theoretical

semi-variograms better than those which were estimated from








data generated on the original coordinates of sampling

points.



Fractals

It has been widely recognized that the perception of

soil variation is a function of the scale of observation.

Fridland (1976) was one of the first soil scientists to

recognize that a series of randomly operating but

interacting spatial processes at different scales could be

combined to give definite soil patterns.

Beckett and Bie (1976) indicated that the variance of

the values of any soil property within a given area is the

sum of all contributions to the soil variability within the

area. Thus, the overall variance within an area of 100 m2

contains contributions from the average variability within

areas of 1 m2, and from that between areas of 1 m2 within

areas of 5 m2, and between areas of 5 m2 within areas of 10

m2, and between areas of 10 m2 within areas of 100 m2. The

partition of the total variance can be performed for any

number of stages.

Wilding and Drees (1978) pointed out that the nature

of soil variability is dependent on the scale of

resolution. They indicated that at a low resolution level

(for example, looking at the earth from the moon) spatial

diversity may be seen as land vs water. With greater

resolution, spatial variability can be recognized








microscopically and submicroscopically in the systematic

organization of biological, chemical, and mineralogical

composition of hand specimens representative of given

horizons.

Burrough (1983a) stated that each cause of soil

variation may not only operate independently or in

combination with other factors, but also over a wide range

of scales.

Soil variation has been considered to be the result of

a systematic and a random component (Fridland, 1976;

Wilding and Drees, 1978; 1983). The former is related to

features such as landform, geomorphic elements, and soil

forming factors. The latter corresponds to those changes

in soil properties that are not related to a known cause.

Random variability is unresolved.

Burrough (1983b) indicated that the distinction

between systematic variation and noise (random variation)

is entirely scale dependent because increasing the scale of

observation almost always reveals structure in the random

component. He also stated that making allowances for the

artifices of map making, several conclusions can be drawn:

(i) pattern structures, and therefore spatial correlations,

have been recognized at all scales; (ii) the detail

resolved is partly the result of the scale of variation

present and partly due to the resolving power of the map at

the given scale; (iii) the intricacy of the drawn








boundaries is not related to scale; and (iv) a feature

regarded as random at one scale can be revealed as

structure at a larger scale. Also, Burrough (1983b)

pointed out that in any given spatial study there may be

many sources and scales of variability present. The

sources and scales of variability come into play

simultaneously and affect observations over all distances

between the resolution of the sampling device and the

largest inter-sample distance. Therefore, it is necessary

to find a substitute for the noise concept that takes into

account the nested, autocorrelated, and scale dependent

character of unresolved variations. Burrough (1983a;

1983b; 1983c) suggested that the concepts embodied in

fractals appear to offer a solution.

The term fractal was introduced by Mandelbrot (1977)

specifically for temporal and spatial phenomena that were

continuous but not differentiable, and exhibited partial

correlations over many scales. A continuous series, such

as a polynomial, is differentiable because it can be split

into an infinite number of absolutely smooth straight

lines. A non-differentiable continuous series cannot be

solved. Every attempt to split a non-differentiable

continuous series into smaller parts results in the

resolution of still more structure or roughness. Fractal

etymologically has the same root as fraction and fragment








and means "irregular or fragmented." It also means "to

break."

Fractals have two important characteristics (Burrough,

1983b). They embody the idea of "self-similarity," that

is, the manner in which variations at one scale are

repeated at another, and the concept of a fractional

dimension. The concept of fractional dimension is the

source of the name "fractal."

Mandelbrot (1977) defined a fractal curve as one where

the Hausdorff-Besicovitch dimension (D) strictly exceeds

the topological dimension. The simplest example is a

continuous linear series such as a polynomial which tends

to look more and more like a straight line as the scale at

which it is examined increases. The D value is calculated

using the following equation:



D = log N/log r (42)

where D = Hausdorff-Besicovitch dimension

N = number of steps used to measure a pattern

r = scale ratio
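As a worked illustration of equation (42), the short Python sketch below evaluates D for two textbook cases that are not taken from the sources reviewed here: a straight line, for which tripling the resolution triples the number of steps, and the Koch curve, for which tripling the resolution quadruples the number of steps. The function name is hypothetical.

```python
import math

def fractal_dimension(n_steps, scale_ratio):
    """D = log N / log r, following equation (42)."""
    return math.log(n_steps) / math.log(scale_ratio)

# Straight line: N = 3 steps at a scale ratio r = 3.
print(fractal_dimension(3, 3))            # 1.0

# Koch curve: N = 4 steps at a scale ratio r = 3.
print(round(fractal_dimension(4, 3), 4))  # 1.2619
```

The straight line gives a D equal to its topological dimension of 1, whereas the Koch curve gives a D that strictly exceeds 1, which is the sense in which it is a fractal curve.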



Burrough (1983a) pointed out that for a linear fractal

curve, D may vary between 1 (completely differentiable) and

2 (noisy). The corresponding range for D lies between 2

(absolutely smooth) and 3 (infinitely crumpled) for

surfaces. It is implicit in the concept of fractal that








when fractals are examined at increasingly large scales

increasing amounts of detail are revealed, while at the

same time vestiges of variations persist on the smaller

scale.

Mandelbrot (1977) developed the fractal theory based

on the physical Brownian motion. Burrough (1983b, 1983c)

extended the fractal theory to soils using Brownian and

non-Brownian fractal models and indicated that soil data

were fractals because increasing the scale of mapping

continued to reveal more and more detail. Soil data were

not "ideal" fractals because the data did not possess the

property of self-similarity at all scales. Pure fractals

are theoretically infinitely nested structures with

infinite variance.

Burrough (1981, 1983a) demonstrated that the double

logarithmic plot of a semi-variogram of a series which can

be represented by a fractional Brownian function was a

straight line of slope:



m = 4 - 2D (43)

where m = slope.

D = Hausdorff-Besicovitch dimension.
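Reading equation (43) as m = 4 - 2D, the relation can be inverted to obtain D directly from a fitted slope. The short rearrangement below is offered as an illustration, not as part of the cited derivation; the Brownian case (a linear semi-variogram, m = 1) is a standard reference point.

```latex
\begin{align*}
m &= 4 - 2D \\
D &= \frac{4 - m}{2} = 2 - \frac{m}{2} \\
m = 1 &\;\Rightarrow\; D = 1.5 \quad \text{(ordinary Brownian motion)} \\
0 < m < 2 &\;\Longleftrightarrow\; 1 < D < 2
\end{align*}
```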



Therefore, semi-variograms are also useful in

computing the fractal dimension, but despite this fact,







fractals have not been used by many scientists, especially

soil scientists.

Burrough (1981) computed D from semi-variograms of

different soil properties. D values varied between 1.1 and

1.9. Low values indicated a predominance of a systematic

variation in soil properties studied. Large values

indicated a random variation of soil properties. Most of

the fractal values were between 1.5 and 1.9. Fractals were

also useful in revealing short- and long-range variation

when the D dimension was used along the semi-variogram

range. Low values of D indicated domination of long-range

variation.

Fractals have also been applied to erosion studies.

Phillips (1986), studying shoreline erosion, used the

methodology proposed by Burrough (1981, 1983b). He

calculated a D value of 1.91. This value indicated a very

complex, irregular pattern of erosion which was

statistically random. It also indicated a pattern

dominated by short-range, local controls which completely

obscured any long-range trends that may have existed.

A negative correlation between adjacent sites was also

found. Phillips (1986) concluded that the complex

landscape revealed by the analysis was probably related to

the dynamic nature of estuaries and coastal wetlands and

the variety of geomorphic, ecological, and human factors

that influenced marsh and shoreline development.













DESCRIPTION OF STUDY AREA



Location

The area studied is located in northwest Florida. It

extends from Santa Rosa County on the west to Madison

County on the east, and comprises most of the Florida

Panhandle (Figure 5).



Physiography, Relief, and Drainage

The study area lies in the Coastal Plain Province

(Duffee et al., 1979, 1984; Sanders, 1981; Sullivan et al.,

1975; Weeks et al., 1980). The landscape is largely the

product of streams and waves acting upon the land surface

over the past 10 to 15 million years (Fernald and Patton,

1984).

The major physiographic divisions in the area are the

Northern Highlands and the Marianna Lowlands. They comprise

the Southern Pine Hills, the Dougherty Karst, the Tifton

Uplands, the Apalachicola Delta, and the Ocala Uplift

physiographic districts. Elevations in the Northern

Highlands range from 16 or less to 114 m above sea level.

Several stream systems have produced a significant

erosional feature called the Marianna Lowlands, which

interrupts the continuous span of the highlands across

northwest Florida. Elevations in the Marianna Lowlands

range from 20 to 80 m above sea level (Brooks, 1981a;

Fernald and Patton, 1984).



[Figure 5 map not reproduced. Study counties labeled on the map: Santa Rosa, Walton, Holmes, Jackson, Bay, Leon, Jefferson, and Madison. Scale bar: 0 to 150 km.]

Figure 5. Location of the counties from which
characterization data were available
for pedons selected for study.

Topography varies from nearly level to gently

undulating, with slopes ranging from 0 to 35%.

Commonly the gentle slopes terminate in sinks or shallow

depressions.

The drainage system is well organized in streams that

flow southward from Alabama and Georgia. The Chattahoochee

and Flint Rivers combine to form the Apalachicola River,

the largest in this southward-flowing group of rivers.

Some of the drainage is disjointed, particularly in the

karst topography of the Marianna Lowlands (Fernald, 1981).



Geology

Soils are mainly underlain by the Citronelle

Formation, the Crystal River Formation, and by

undifferentiated Miocene and Oligocene sediments (Fernald,

1981).

The Citronelle Formation is composed of sand, gravels,

and clays of Pliocene age. The Crystal River Formation

comprises shallow marine limestone of Eocene age. Miocene

and Oligocene sediments are mainly composed of "silty"

sand, clay, dolomitic limestone, and fossiliferous shallow








marine limestone. Some of the materials are part of the

Marianna Limestone Formation.

Climate

The climate of the area is controlled by latitude and

proximity to the Gulf of Mexico. The area studied is

characterized by long, warm summers and short, mild winters

(Bradley, 1972). Maximum and minimum temperatures are

affected by breezes coming from the Gulf of Mexico.

The average annual temperature is approximately 21°C.

Maxima of about 38°C occur in June to August and minima of

about -10°C occur in January and February. The average

growing season is approximately 275 days.

The average annual rainfall ranges between 1400 and

1660 mm. Approximately 50% of the average rainfall falls

during a 4-month rainy season from June to September. A

second period of relatively high rainfall occurs in the

late winter and early spring. Frequently, a short drought

during the late spring causes considerable moisture stress

to trees, crops, and grasses.



Land Use and Vegetation

The area studied has a considerable extent of prime

farmland that is suitable for producing crops and for

sustaining high yields under high levels of

management (Caldwell, 1980). Most of the acreage is used

for urbanization, field crops, pasture, and forestry. The









most common crops are corn (Zea mays), soybean (Glycine

max), peanuts (Arachis hypogaea), watermelon (Citrullus

vulgaris), tobacco (Nicotiana spp), and assorted

vegetables. Livestock operations are also common.

A large part of the area is also covered by forest.

Well drained areas are characterized by the presence of

slash pine (Pinus elliottii var. elliottii Engelm.), blackjack

oak (Quercus marilandica Muenchh.), turkey oak (Quercus

laevis Walt.), bluejack oak (Quercus incana Bartr.), longleaf

pine (Pinus palustris Mill.), and laurel oak (Quercus

hemisphaerica Bartr.). The poorly drained areas,

corresponding to shallow, densely wooded swamps and river

valley lowlands, are characterized by the presence of saw

palmetto (Serenoa repens Bartr.), sweet gum (Liquidambar

styraciflua L.), and cypress (Cupressus sp. L.) (Duffee et

al., 1979, 1984; Sanders, 1981; Sullivan et al., 1975;

Weeks et al., 1980).



Soils

Soils in the area studied have developed from medium-

textured marine sediments. These coastal plain materials

were transported from uplands farther north during

interglacial periods when the present land areas were

inundated by water from the Gulf of Mexico. Most of the

soils in the study area are characterized by a low level of

natural fertility and are susceptible to erosion (Duffee et









al., 1979, 1981; Sanders, 1981; Sullivan et al., 1975;

Weeks et al., 1980).

Approximately 83% of the soils are classified as

Ultisols (Table 1). Complete taxonomic classification is

presented in Appendix A. In general, the Typic Hapludults;

and the Typic, Aquic, Plinthic, and Rhodic Paleudults are

well and moderately well-drained, with moderate to low

available water capacity and with moderate to moderately

slow permeability. These soils are acidic and low in organic

matter and nutrient contents. In gently sloping areas,

limitations are moderate for cultivated crops due to the

erosion hazard.

Arenic Hapludults; Arenic, Grossarenic, Arenic

Plinthic, and Grossarenic Plinthic Paleudults; and Typic

Quartzipsamments commonly are well to excessively drained.

Permeability varies from rapid to moderately rapid, and

available water capacity is low to very low. Droughtiness

and low water retention capacity are among the principal

limitations for cropping on these soils.

Typic Fluvaquents; Typic Humaquepts; Typic

Ochraqualfs; Ultic Haplaquods; Typic, Arenic, Grossarenic,

Aeric, Plinthic, Umbric, and Arenic Umbric Paleaquults;

Typic Albaquults; and Typic and Aeric Ochraquults are

typically poorly drained. Permeability varies from

moderate to slow. Excessive wetness and flooding are among

the most important limitations for growing crops.






Table 1. Order, Great Group, and relative proportion of
         pedons studied.


Order            Great Group          Number of pedons       %
                                      studied

Alfisols         Hapludalfs                   2             1.3
                 Ochraqualfs                  2             1.3

Entisols         Quartzipsamments             5             3.3
                 Others                       2             1.3

Inceptisols      Dystrochrepts                1             0.7
                 Humaquepts                   1             0.7

Spodosols        Haplaquods                   2             1.3

Ultisols         Hapludults                  10             6.6
                 Paleudults                  97            64.5
                 Paleaquults                 15             9.9
                 Others                       3             2.0

Non-designated                               11             7.1
series *

TOTAL                                       151           100.0


* These pedons have not been classified.













MATERIALS AND METHODS



Data Source

Data from 151 pedons (Calhoun et al., 1974; Carlisle et

al., 1978, 1981, 1985; I.F.A.S. Soil Characterization

Laboratory, unpublished data) were used for the study. In

total, 20 soil properties were selected (horizon thickness;

very coarse, coarse, medium, fine, and very fine sand

fractions; total sand, silt, and clay contents; pH-water;

pH-KCl; organic carbon content; Ca, Mg, Na, and K contents

extractable in NH4OAc; total bases; extractable acidity;

CEC; and base saturation). The criterion for selection was

that these soil properties had to have been measured for

each horizon of the pedon. The number of horizons per

pedon varied between 4 and 7. There were 19,820

observations.

Pedon location, description, and sampling were done by

soil scientists from U.S.D.A. Soil Conservation Service and

the I.F.A.S. Soil Science Department. Physical and

chemical analyses of the soils were made by the personnel

of the Soil Characterization Laboratory of the University

of Florida, Gainesville. Procedures used for sampling and





chemical and physical analysis were outlined by Calhoun et

al. (1974) and by Carlisle et al. (1978, 1981, 1985).

Approximately half of the data were already stored on

an IBM XT microcomputer using the database management

software KeepIT (ITsoftware, 1984). It was necessary to

input the remaining half of the data to complete the set of

observations for this study.



Location of Pedons

The pedons selected for study were located for soil

survey purposes using the system of Ranges and Townships

with the Tallahassee Meridian and Base Line as reference.

The program used for spatial analysis requires the location

of pedons expressed by geographic coordinates (Xs and Ys).

Therefore, each pedon was located on topographic maps at

1:24,000 scale according to the system of Ranges and

Townships, and each location was transformed into cartesian

coordinates (longitude and latitude). Elevation above sea

level was also recorded.

The map of physiographic regions of Florida (Brooks,

1981b) at the 1:500,000 scale was used as a base map to

locate the entire set of pedons. Using as a reference the

point 30° 00' 00'' N and 87° 24' 18'' W (X = 0 and Y = 0),

X and Y coordinates were determined. This reference point

was used to allow only positive Xs and Ys in the studied

area.
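The text does not state how map positions were converted to X and Y, so the following is only a sketch of one possible conversion, not the procedure actually used. It assumes an equirectangular (flat-earth) approximation, a mean Earth radius of 6371 km, and distances in kilometers; the function names and the sample location are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def dms_to_degrees(degrees: float, minutes: float, seconds: float) -> float:
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0

# Reference point from the text: 30 00' 00'' N, 87 24' 18'' W (west is negative).
REF_LAT = dms_to_degrees(30, 0, 0)
REF_LON = -dms_to_degrees(87, 24, 18)

def to_local_xy(lat: float, lon: float) -> tuple:
    """Approximate east (X) and north (Y) offsets in km from the reference point."""
    x = EARTH_RADIUS_KM * math.radians(lon - REF_LON) * math.cos(math.radians(REF_LAT))
    y = EARTH_RADIUS_KM * math.radians(lat - REF_LAT)
    return x, y

# Hypothetical pedon location north and east of the reference point.
x_km, y_km = to_local_xy(30.5, -85.0)
print(round(x_km, 1), round(y_km, 1))
```

Because the reference point lies at the southwest corner of the study area, every location to its north and east receives positive X and Y, as the text requires.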








Pedon locations were plotted using the POST command of

Surface II software (Sampson, 1978).



Statistical Analyses

Statistical analyses were performed using an IBM XT

microcomputer and IFAS-VAX and NERDC mainframe computers.

Transfer of data between microcomputer and mainframe

computers was possible by using the public domain

communication programs Kermit (to link with IFAS-VAX) and

YT (to link with CMS-NERDC).

Statistical Analysis System software (SAS Institute

Inc, 1982a, 1982b) was used for the normality and principal

component analyses and for plotting purposes. The Fortran

program written by Skrivan and Karlinger (1979) was used

for the geostatistical analysis. Surface II software

(Sampson, 1978) was employed to generate isarithmic

(contour) maps and surface diagrams.

Normality Analysis

The UNIVARIATE procedure (SAS Institute Inc., 1982a)

was used to test normality. This test was mainly based on

the study of skewness, kurtosis, the Kolmogorov test, and

cumulative plots.

The NORMAL option was employed to compute a test

statistic for the hypothesis that the input data had a

normal distribution. The Kolmogorov D statistic was

computed because the sample size was greater than 50.









The PLOT option was used to plot the data. The CHART

procedure was employed to obtain histograms of the data.
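The quantities examined in the normality analysis can be sketched in a few lines. This is not the SAS UNIVARIATE procedure: the moment formulas below are the plain population versions (SAS may apply small-sample corrections), the D statistic is computed against a normal distribution fitted with the sample mean and standard deviation, and no significance probability is derived. The function names and the sample are hypothetical.

```python
import math

def shape_statistics(values):
    """Return (mean, skewness, excess kurtosis) from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def kolmogorov_d(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(values)):
        cdf = 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
        # Compare the fitted CDF with the empirical step just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Hypothetical, strongly right-skewed sample (not data from this study).
sample = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.9, 1.8, 4.6]
mean, skewness, kurtosis = shape_statistics(sample)
print(f"skewness={skewness:.2f} kurtosis={kurtosis:.2f} D={kolmogorov_d(sample):.2f}")
```

Skewness and excess kurtosis near zero together with a small D are what a normally distributed sample would be expected to show.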

Principal Component Analysis

The PRINCOMP procedure (SAS Institute Inc., 1982b) was

employed for the PCA. Because the soil properties studied

had different measurement scales, there was a risk of

having heterogeneous variances. An important assumption in

this analysis is the homogeneity of variances (Afifi and

Clark, 1984). Therefore, soil properties were standardized

to mean equal to 0 and variance equal to 1. As a result

the PCs were derived from the correlation matrix instead of

the covariance matrix. Eigenvalues (variances) and

eigenvectors (coefficients) of PCs were obtained by using

the PRINCOMP procedure.

The number of PCs was selected by using a rule of

thumb (Afifi and Clark, 1984, p. 322) that the PCs selected

are those that explain at least 100/P percent of the total

variance where P is the number of variables. The PCs

selected had an eigenvalue that represented > 5% of the

total variance. Eigenvectors for each PC were selected on

the basis that they had a value larger than the value

calculated using the following equation:


Sc = 0.5/(PC eigenvalue)^{1/2} (44)

where Sc = Selection criterion
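The two selection rules above can be sketched as follows. The eigenvalues are hypothetical, and equation (44) is read here as 0.5 divided by the square root of the PC eigenvalue, which is an interpretation of the printed formula.

```python
import math

def select_components(eigenvalues):
    """Indices of PCs that explain at least 100/P percent of the total variance."""
    p = len(eigenvalues)
    total = sum(eigenvalues)
    threshold = 100.0 / p
    return [i for i, ev in enumerate(eigenvalues) if 100.0 * ev / total >= threshold]

def coefficient_cutoff(eigenvalue):
    """Selection criterion Sc = 0.5 / sqrt(eigenvalue), read from equation (44)."""
    return 0.5 / math.sqrt(eigenvalue)

# Hypothetical eigenvalues for P = 5 standardized variables (they sum to 5).
eigenvalues = [2.5, 1.2, 0.8, 0.3, 0.2]
selected = select_components(eigenvalues)
print(selected, [round(coefficient_cutoff(eigenvalues[i]), 3) for i in selected])
```

With P = 20 variables, as in this study, the 100/P rule reduces to the 5% threshold quoted in the text.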








The PLOT procedure (SAS Institute Inc., 1982a) was

employed to plot eigenvectors. The larger the value and

the closer the eigenvector to the PC axis, the larger the

contribution of the variable to the total variance. A

varimax rotation (orthogonal rotation of axes) was used

because some of the eigenvectors did not show a clear

contribution to a particular PC.

The FACTOR procedure (SAS Institute Inc., 1982b) was

employed for the varimax rotation and to plot the rotated

eigenvectors.

Each PC is a linear combination of standardized

variables having the eigenvectors as coefficients. Due to

this fact, collinearity between variables can be a problem.

It has been reported (SAS Institute Inc., 1982b) that use

of highly correlated variables produces estimates with high

standard errors. These estimates are very sensitive to

slight changes in the data.

The REG procedure (SAS Institute Inc., 1982b) with the

option COLLIN was used for the analysis of collinearity.

Variables with a tolerance lower than 0.01 were not

considered in the analysis (Afifi and Clark, 1984).

Tolerance is defined as:



T = 1 - R^2 (45)

where T = tolerance

R = coefficient of multiple correlation








Finally, the correlation coefficient between the PCs

and the soil properties was computed using the equation:

r_{ij} = a_{ij} (VAR PC)^{1/2} (46)

where r_{ij} = correlation coefficient

a_{ij} = eigenvector

VAR PC = PC eigenvalue

Soil properties selected for further study were those

having a high (> |0.75|) correlation coefficient.
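A minimal sketch of this last step, reading equation (46) as the eigenvector coefficient multiplied by the square root of the PC eigenvalue (a relation that holds for standardized variables). The property names, coefficients, and eigenvalue are hypothetical, as are the function names.

```python
import math

def pc_correlations(eigenvector, eigenvalue):
    """r_ij = a_ij * sqrt(eigenvalue) for each coefficient of one PC."""
    root = math.sqrt(eigenvalue)
    return [a * root for a in eigenvector]

def select_properties(names, eigenvector, eigenvalue, cutoff=0.75):
    """Keep properties whose absolute correlation with the PC exceeds the cutoff."""
    correlations = pc_correlations(eigenvector, eigenvalue)
    return [name for name, r in zip(names, correlations) if abs(r) > cutoff]

# Hypothetical coefficients of three properties on a PC with eigenvalue 2.0.
names = ["clay", "CEC", "pH"]
coefficients = [0.60, 0.55, 0.10]
print(select_properties(names, coefficients, eigenvalue=2.0))
```

The absolute value is used so that strongly negative correlations are retained as well as strongly positive ones.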

Geostatistical Analysis

A Fortran program written by Skrivan and Karlinger

(1979) was employed. The geostatistical analysis had four

parts.

Semi-variograms. The X, Y, and Z (soil property)

values were used as input in this step. Before a valid

semi-variogram can be calculated, the drift, if present,

must be removed, otherwise the stationarity assumption is

not fulfilled.

Journel and Huijbregts (1978) stated the criterion for

deciding whether drift is absent. They indicated that,

considering the semi-variogram as a positive definite

function, an experimental semi-variogram that increases at

least as rapidly as |h|^2 (where |h| = modulus of the lag distance)

for large distances h is incompatible with the intrinsic

hypothesis. Such an increase in the semi-variogram most

often indicates the presence of a trend or drift. However,

drift can be detected only after the semi-variogram has

been calculated. Thus, an iterative process (trial and

error) was followed to calculate the semi-variogram.

An observed semi-variogram based on the data was

calculated. If drift was present, then the information

contained in the observed semi-variogram was used to

calculate the drift coefficients and residuals of the

observations relative to the drift function. Then, a new

semi-variogram from the residuals could be calculated.

This process was repeated until drift was removed or a

satisfactory semi-variogram was obtained.

Five semi-variograms were calculated for each

variable: direction-independent and direction-dependent

(N-S, E-W, NE-SW, NW-SE). The semi-variogram plots were

obtained by using the Energraphics software (Enertronics,

1983).
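As an illustration of the first step, the sketch below computes a direction-independent experimental semi-variogram as half the mean squared difference between pairs of observations grouped by separation distance. It is not the Skrivan and Karlinger (1979) program; the function name, the lag-class scheme, and the transect data are assumptions made for the example.

```python
import math
from collections import defaultdict

def experimental_semivariogram(points, lag_width, max_lag):
    """Omnidirectional semi-variogram.

    points: list of (x, y, z). Returns {lag class: (mean distance, gamma, pairs)}.
    """
    # Per lag class: sum of distances, sum of squared differences, pair count.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i, (xi, yi, zi) in enumerate(points):
        for xj, yj, zj in points[i + 1:]:
            h = math.hypot(xi - xj, yi - yj)
            if h == 0.0 or h > max_lag:
                continue
            acc = sums[int(h // lag_width)]
            acc[0] += h
            acc[1] += (zi - zj) ** 2
            acc[2] += 1
    return {k: (s[0] / s[2], s[1] / (2 * s[2]), s[2]) for k, s in sorted(sums.items())}

# Hypothetical transect with a pure linear trend z = x (strong drift).
transect = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (3.0, 0.0, 3.0)]
for lag_class, (distance, gamma, pairs) in experimental_semivariogram(transect, 1.0, 3.0).items():
    print(lag_class, distance, gamma, pairs)
```

For this trending transect the semi-variance equals h^2/2 at every lag, the kind of rapid, unbounded increase that signals drift and would call for fitting and removing a trend before recomputing the semi-variogram from the residuals.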

Fitting semi-variograms. In this step the structural

information (range, lag distance, and slope) was used to

adjust the parameters in the semi-variogram until the model

was theoretically consistent (Gambolati and Volpi, 1979).

Consistency occurred when the kriged average error (KAE)

was approximately zero and the average ratio of theoretical

to calculated variance, called the reduced mean square error

(RMSE), was approximately equal to one. These parameters

are represented by the following equations:



(i) KAE = (1/n) \sum_{i=1}^{n} (Z_i - \hat{Z}_i) (47)

where n = number of points

Z_i = measured value

\hat{Z}_i = kriged value

(ii) RMSE = (1/n) \sum_{i=1}^{n} (Z_i - \hat{Z}_i)^2 / \sigma^2 (48)

where \sigma^2 = calculated variance and is equal to

\sigma^2 = K(0) - \sum_{i=1}^{n-1} \lambda_i C(h) - \sum_{i=1}^{n} \mu_i M(h) + \sum_{i=1}^{n-1} \lambda_i^2 S_i^2 (49)

where K(0) = sill

\lambda_i = unknown weighting coefficient

C(h) = covariance based on semi-variance and sill

\mu_i = unknown Lagrangian multiplier

M(h) = drift

S_i^2 = variance of the measurement error
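The two consistency checks can be sketched as below, given measured values, their kriged estimates, and the kriging variance at each point. The use of one variance per point, the function names, and the numbers are assumptions for illustration only.

```python
def kriged_average_error(measured, kriged):
    """KAE: mean of (measured - kriged); expected to be near zero."""
    n = len(measured)
    return sum(z - z_hat for z, z_hat in zip(measured, kriged)) / n

def reduced_mean_square_error(measured, kriged, kriging_variances):
    """RMSE: mean of squared error divided by kriging variance; expected near one."""
    n = len(measured)
    return sum(
        (z - z_hat) ** 2 / var
        for z, z_hat, var in zip(measured, kriged, kriging_variances)
    ) / n

# Hypothetical cross-validation results (not data from this study).
measured = [10.0, 12.0, 9.0, 11.0]
kriged = [11.0, 11.0, 10.0, 10.0]
variances = [1.0, 1.0, 1.0, 1.0]

print(kriged_average_error(measured, kriged))
print(reduced_mean_square_error(measured, kriged, variances))
```

A KAE far from zero would suggest systematic over- or under-estimation, while an RMSE far from one would suggest that the model variance is mis-scaled relative to the observed errors.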


The fitting procedure was based on the jackknife

method developed by Tukey (Sokal and Rohlf, 1981) which is

a useful technique for analyzing statistics if

distributional assumptions are of concern.

The procedure was to split the observed data into

groups (usually of size one) and to compute values of the

statistic with a different group of observations being

ignored each time. The average of these estimates was used

to reduce the bias in the statistic. The variability among

these values was used to estimate the standard error.








Gambolati and Volpi (1979) extended the use of this

technique to geostatistics.
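The leave-one-out idea described above can be sketched generically. This is the textbook pseudo-value form of the jackknife with groups of size one, not the geostatistical implementation used in the fitting program; the function name and the data are hypothetical.

```python
import math

def jackknife(values, statistic):
    """Return (bias-reduced estimate, standard error) using groups of size one."""
    n = len(values)
    full = statistic(values)
    # Recompute the statistic with each observation ignored in turn.
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    pseudo_values = [n * full - (n - 1) * loo for loo in leave_one_out]
    estimate = sum(pseudo_values) / n
    variance = sum((p - estimate) ** 2 for p in pseudo_values) / (n - 1)
    return estimate, math.sqrt(variance / n)

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical observations.
estimate, standard_error = jackknife([2.0, 4.0, 6.0, 8.0], mean)
print(estimate, round(standard_error, 4))
```

For the mean, the pseudo-values reduce to the original observations, so the jackknife reproduces the ordinary mean and its usual standard error; the technique becomes useful for statistics with no simple error formula.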

Kriging. Universal kriging was the method used in

this investigation. Universal kriging takes into account

local trends in data, minimizing the error associated with

estimation. The kriged Z value for each X and Y location and

its associated variance were computed.

The kriged Z values and associated standard errors

were the inputs to the Surface II software to produce

isoline maps of the different values and associated

variances.

Fractals. Statistical Analysis System (SAS Institute

Inc., 1982a, 1982b) was employed for transforming semi-

variance and lag distance values into logarithmic values.

The REG procedure was used to obtain the slope of the

line. The Hausdorff-Besicovitch dimension was computed by

using equation (43).
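The same calculation can be sketched without SAS: regress log semi-variance on log lag distance by ordinary least squares, then convert the slope m to D by reading equation (43) as m = 4 - 2D. The function names and the synthetic semi-variogram are assumptions for illustration.

```python
import math

def loglog_slope(lags, semivariances):
    """Least-squares slope of log(semi-variance) against log(lag distance)."""
    xs = [math.log(h) for h in lags]
    ys = [math.log(g) for g in semivariances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def fractal_dimension(lags, semivariances):
    """D = (4 - m) / 2, where m is the double-logarithmic slope."""
    return (4.0 - loglog_slope(lags, semivariances)) / 2.0

# Synthetic power-law semi-variogram with slope 0.4 (hypothetical values).
lags = [1.0, 2.0, 4.0, 8.0, 16.0]
gammas = [2.0 * h ** 0.4 for h in lags]
print(round(fractal_dimension(lags, gammas), 3))
```

A shallow slope gives D close to 2 (noisy, short-range variation), whereas a steep slope gives D close to 1 (smooth, long-range variation).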

Finally, this dissertation was written using

WordPerfect software (SSI Software, 1985).














RESULTS AND DISCUSSION



Test of Normality

The assumption of normality is important for most

statistical analyses. Mean and standard deviation are

needed to characterize completely the distribution of

values if the data are normally distributed. When data are

normally distributed, approximately 95% of the values fall

within two standard deviations of the mean (Montgomery,

1976; Snedecor and Cochran, 1980; SAS Institute Inc.,

1982a).

Gower (1966), however, pointed out that in PCA, unlike

other forms of multivariate analyses, no assumptions are

needed about the distribution of the variates or about hypothetical

populations, except when significance tests are of

interest. Likewise, Gutjahr (1985) and Olea (1975) have

stated that the assumption of normality is not needed in

geostatistics. Stationarity is the most important

assumption in geostatistics, although Burrough (1983a)

indicated that stationarity is very difficult to achieve.

Normality, therefore, is not required for PCA and

geostatistics. However, the test of normality was

performed because a large number of soil variability



studies have implicitly assumed a normal distribution of

soil properties without using any statistical test to

justify this assumption. Also, a large data base was

available. Thus, a conclusion such as "data were non-

normally distributed because of the small number of

observations" has no validity in this study.

There are two main tests of normality. One is a

graphical method based on histograms or plots of values

measured on probability paper. The other one is based on a

quantitative measure such as the Kolmogorov test. Rao et

al. (1979) indicated that graphical methods have specific

drawbacks. First, they often rely on visual inspection,

and thus are subject to human error. Second, as graphical

methods are not based on quantitative measures, an

objective statistical evaluation of the goodness-of-fit of

the theoretical distribution to the measured data is not

possible. Consequently, the normality analysis was based

more on a quantitative measure than on a graphical

method.

The data were tested against a theoretical normal

distribution with mean and variance equal to the sample

mean and variance. Skewness, kurtosis, the Kolmogorov D

statistic, and plot of data were used to test the null

hypothesis that the input data values were normally

distributed (SAS Institute Inc., 1982a).









When the distribution is not symmetric, the skewness

can be positive (skewed to the right) or negative (skewed

to the left). Kurtosis refers to the degree of peakedness

of a frequency distribution (Silk, 1979). A heavy-tailed

distribution has positive kurtosis. Flat distributions

with short tails, or distributions in which almost all data values lie very

close to the mean, have negative kurtosis. The measure of

skewness and kurtosis for a normally distributed population

is zero (SAS Institute Inc., 1982a).

A significance level (α) of 0.15 was selected as

the criterion for acceptance or rejection of the null

hypothesis (H0 = Normal). When normality is tested, the

interest is in accepting the null hypothesis. This is in

contrast to most situations, in which the interest is in

rejecting the null hypothesis. For this reason, Rao et

al. (1979) proposed an α value between 0.15 and 0.20 in

order to have a balance between type I and II errors.

Statistical moments for each soil property were

computed (Table 2). Most variables had large coefficients

of variation (C.V.). Soil pH (water and KCl) had the

lowest variation, reflecting the uniformly acidic

condition of these soils.

Other soil properties had a large C.V. Most of these

soil properties are naturally related, and the large C.V.s

were mutually influenced. For example, the amount of

extractable cations (Ca, Mg, Na, and K) depends largely on

the CEC, which in turn depends on particle size.



Table 2. Statistical moments of soil properties studied
         and Kolmogorov test.


Property*   Mean    Variance   C.V.    Skewness  Kurtosis  D:Normal  PROB>D
                               (%)

TH          30.2    365.8      63.3     1.44      2.92      0.12     <.01
VC           1.2      5.4     189.6     4.59     29.3       0.30     <.01
C            6.4     39.5      97.5     1.36      1.53      0.15     <.01
M           17.0    125.6      65.9     0.86      1.11      0.06     <.01
F           32.7    209.2      44.2     0.40     -0.10      0.07     <.01
VF          12.9     68.8      64.4     1.11      1.92      0.07     <.01
TS          70.0    354.8      26.9    -1.18      1.63      0.08     <.01
Silt        10.7     78.4      83.0     3.92     27.0       0.16     <.01
Clay        19.4    260.9      83.3     1.33      2.11      0.12     <.01
PH1          5.1      0.35     11.6    -0.67     13.5       0.12     <.01
PH2          4.2      0.28     12.5    -0.19      9.91      0.10     <.01
OC           0.43     0.52    167.8     3.41     14.1       0.28     <.01
Ca           0.94     5.51    250.3     6.15     49.1       0.34     <.01
Mg           0.36     0.81    253.2    10.8     154.3       0.35     <.01
Na           0.03     0.002   130.8     3.62     25.5       0.22     <.01
K            0.06     0.009   170.5     3.72     19.3       0.28     <.01
TB           1.38     8.76    213.7     6.05     49.1       0.32     <.01
EXT          5.61    30.8      98.9     2.79     12.8       0.16     <.01
CEC          7.01    49.1      99.9     2.83     11.0       0.18     <.01
BS          18.8    399.7     106.2     1.78      2.90      0.18     <.01


* See Abbreviations, pp. xii-xiii

TH is expressed in cm; VC, C, M, F, VF, TS, silt, clay,
OC, and BS are expressed as %; Ca, Mg, Na, K, TB, EXT,
and CEC are expressed as cmol/kg.

n = 991

The large variation in particle size (very coarse,

coarse, medium, fine, and very fine sand fractions; silt;

and clay contents) was influenced by the diversity of

Paleudults (Appendix A) and the presence of horizons with

quite different textures. Paleudults had variable

thickness of coarse-textured horizons (Typic, Arenic, and

Grossarenic Subgroups) overlying fine-textured argillic

horizons.

Most of the soil properties studied did not have

skewness and/or kurtosis close to zero. The exception was

fine sand. Also, the histogram and normal probability plot

(Figure 6) indicated that fine sand values were normally

distributed, but when the Kolmogorov test was performed, it

indicated that fine sand had a large probability of being

non-normal. The significance probability (PROB>D) of the

Kolmogorov D statistic (D:Normal) was smaller than

a = 0.15. So, the null hypothesis was rejected for fine

sand.

Results of the Kolmogorov test indicated that the soil

properties studied had a non-normal distribution. Results

of the Kolmogorov test were also supported by the

histograms and normal probability plots. Histograms

revealed that distribution of values by soil property did

not have the characteristic bell-shaped curve of a normal

distribution. In addition, normal distribution plots

indicated a lack of correspondence between the observed and

the theoretical distributions, for example organic carbon

content (Figure 7).



[Figure 6 graphic not reproduced. Panel (a): histogram. Panel (b): normal probability plot, with the horizontal axis given by the inverse of the standard normal distribution function evaluated at (ri - 3/8)/(n + 1/4), where ri = rank of the data value and n = number of non-missing values; + = theoretical distribution; * = sample distribution.]

Figure 6. Histogram (a) and normal probability
plot (b) of fine sand content.

Transformations (logarithmic, arcsine, or square root)

were not made on the original data because the objective

was to accept or reject the normal distribution. In

addition, interpretation of transformed data is complex.

These results are consistent with the presence of

systematic patterns of soil properties; observations were

not independent but associated within a certain distance.

Patterns of soil properties influenced the probability

distribution.

The presence of trends in soil properties associated

with landscape position has been recognized. Walker et al.

(1968) pointed out that such trends suggested that the

analysis of soil data in terms of mean and standard

deviation is questionable, since the assumption of random

variation does not appear valid.

In addition, Hole and Campbell (1985) indicated that

if place-to-place variation occurred at random, without

elements of organization and order, mapping efforts could

proceed only with the greatest difficulty because

information and experience gained at one location would

have little predictive value at new locations. Under such

circumstances each mapping problem would be unique because

of the lack of a consistent geographic order that can be

transferred from previous experience to analogous settings.



[Figure 7 graphic not reproduced. Panel (a): histogram. Panel (b): normal probability plot.]

Figure 7. Histogram (a) and normal probability
plot (b) of organic carbon content.



Principal Component Analysis

Twenty soil properties were initially selected to study

the soil spatial variability using geostatistics.

Geostatistical analysis is time consuming and complex.

Moreover, not all soil properties have the same degree

of importance in quantifying the spatial variability of soils.

Therefore, reduction of soil properties was necessary for

further analysis.

PCA was used as an unbiased method to select the most

important soil properties. Important soil properties were

defined as those that explained a large proportion of the

total variance.

Two sets of data were employed for this analysis. One

set was composed of the weighted averages of selected soil

properties in individual pedons. Horizon thickness was

used as the weighting criterion. Information is lost when

averages are used. Therefore, a second set of data composed

of selected soil properties from the surface A horizon was

used.
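The thickness-weighted average for a single pedon can be sketched as follows; the horizon thicknesses and clay contents are hypothetical, as is the function name.

```python
def thickness_weighted_mean(thicknesses_cm, values):
    """Average a property over horizons, weighting each horizon by its thickness."""
    total_thickness = sum(thicknesses_cm)
    weighted_sum = sum(t * v for t, v in zip(thicknesses_cm, values))
    return weighted_sum / total_thickness

# Hypothetical pedon with three horizons: thickness (cm) and clay content (%).
thicknesses = [20.0, 30.0, 50.0]
clay_percent = [5.0, 10.0, 30.0]
print(thickness_weighted_mean(thicknesses, clay_percent))
```

The single value returned hides the contrast between the sandy surface horizons and the clayey subsoil, which is the loss of information noted above.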

Principal Component Analysis for Standardized Weighted Data

A basic assumption of PCA is that variables have

homogeneous variances (Afifi and Clark, 1984; Webster,

1977). The soil properties studied had different scales of









measurement (thickness was measured in cm; particle size,

organic carbon content, and base saturation in %; and

extractable cations, total bases, extractable acidity, and

CEC in cmol/kg). Therefore, it is difficult to compare

them. For this reason, all soil properties were

standardized to mean zero and variance one.

One measure of the amount of information conveyed by

each PC is in its variance (eigenvalue). For this reason,

the PCs are commonly arranged in order of decreasing

variance (Table 3). The most informative PC is the first

and the least informative is the last.

The criterion for selecting PCs was stated in the

Materials and Methods section. The first five PCs were

selected for further analysis. Each of them explained more

than 5% of the total variance (Table 3). The first five

PCs together explained more than 73% of the total variance.

Different interpretative analyses were performed to

select the soil properties that contributed the most to the

total variance. Plots provided a very informative display of the

relationships between soil properties and PCs

(Figure 8). The most important soil properties were those

with large values located closer to the axis of the PC.

Some properties did not have a clear contribution to

an individual PC, such as coarse and medium sand fractions

and Mg content (Figure 8). The axes of PCs were rotated




SELECTION OF IMPORTANT PROPERTIES TO EVALUATE THE USE OF GEOSTATISTICAL ANALYSIS IN SELECTED NORTHWEST FLORIDA SOILS

BY

FRANCISCO A. OVALLES

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1986


This dissertation is dedicated to my wife, Giordana, and my children, Johanna Fernanda and Pedro Jose.

ACKNOWLEDGMENTS

The author wishes to express his gratitude to Dr. Mary E. Collins, chairman of the supervisory committee, for her continuous help, guidance, patience, and personal friendship throughout the graduate program. Appreciation is also extended to other members of the committee, Dr. Gustavo Antonini, Dr. Richard Arnold, Dr. Randall "Randy" Brown, and Dr. Stewart Fotheringham, for their constructive reviews of this work, participation on the graduate supervisory committee, and personal friendship. Appreciation is expressed to the Consejo Nacional de Investigaciones Cientificas y Tecnologicas (CONICIT), Venezuela, for the scholarship which supported the author. Thanks are extended to Dr. Willie Harris, who introduced me to KeepIT and YT and was always ready to answer any of my questions. Very special thanks are due to Dr. Gregory "Greg" Gensheimer for lending me the geostatistical program and his own computer to type this dissertation. Gratitude is expressed to the staff of the Soil Characterization Laboratory for their friendship and valuable assistance, and to other graduate students, staff, and faculty.

PAGE 4

Appreciation is extended to all my friends from all six continents (America, Africa, Asia, Europe, Oceania, and Florida) whom I had the pleasure of knowing here. Finally, but certainly not least, I thank my wife, Giordana, my daughter, Johanna Fernanda, and my son, Pedro Jose, for their love and continuous help, encouragement and patience during this work.

PAGE 5

TABLE OF CONTENTS

Page
ACKNOWLEDGMENTS iii
LIST OF TABLES vii
LIST OF FIGURES ix
ABBREVIATIONS xii
ABSTRACT xiv
INTRODUCTION 1
LITERATURE REVIEW 4
  Principal Component Analysis 4
  Geostatistics 13
    Historical Development 13
    Theoretical Bases 15
    Practical Use 34
  Fractals 53
DESCRIPTION OF STUDY AREA 59
  Location 59
  Physiography, Relief, and Drainage 59
  Geology 61
  Climate 62
  Land Use and Vegetation 62
  Soils 63
MATERIALS AND METHODS 66
  Data Source 66
  Location of Pedons 67
  Statistical Analyses 68
    Normality Analysis 68
    Principal Component Analysis 69
    Geostatistical Analysis 71
RESULTS AND DISCUSSION 75
  Test of Normality 75

PAGE 6

  Principal Component Analysis 83
    Principal Component Analysis for Standardized Weighted Data 83
    Principal Component Analysis for A horizon Standardized Data 93
    Principal Component Analysis by Soil Series 101
  Geostatistics 113
    Semi-Variograms 114
    Fitting Semi-Variograms 140
    Kriging 143
  Fractals 156
SUMMARY AND CONCLUSIONS 165
APPENDIX
  A CLASSIFICATION OF SOIL SERIES STUDIED 178
  B GEOGRAPHIC COORDINATES OF PEDONS STUDIED 181
  C SEMI-VARIOGRAMS FOR DIRECTIONS WITH LARGEST VARIABILITY 186
  D CONTOUR MAPS FOR DIRECTIONS WITH LARGEST VARIABILITY 192
  E MAP OF PHYSIOGRAPHIC REGIONS IN NORTHWEST FLORIDA 196
LITERATURE CITED 197
BIOGRAPHICAL SKETCH 207

PAGE 7

LIST OF TABLES

Table Page
1 Order, Great Group, and relative proportion of pedons studied 65
2 Statistical moments of soil properties studied and Kolmogorov test 78
3 Proportion of total variance explained by each principal component 85
4 Eigenvectors of correlation matrix for standardized weighted average of soil properties 89
5 Tolerance of standardized weighted average of soil properties by principal component 91
6 Correlation coefficients between standardized weighted average of soil properties and principal components 92
7 Proportion of total variance explained by each principal component for standardized A horizon data 94
8 Eigenvectors of correlation matrix for standardized properties of A horizon 95
9 Tolerance of standardized properties of A horizon by principal component 96
10 Correlation coefficient between standardized properties of A horizon and principal components 97
11 Correlation coefficient between standardized properties of A1 horizon and principal components 99
12 Correlation coefficient between standardized properties of Ap horizon and principal components 100

PAGE 8

13 Variability of studied soil properties within and between soil series and between horizons 107
14 Important semi-variogram parameters of the weighted average of selected soil properties 127
15 Important semi-variogram parameters of the A horizon selected properties 136
16 Goodness-of-fit values of the weighted average of selected soil properties 142
17 Goodness-of-fit values of the A horizon selected properties 144
18 Fractal dimension (D value) derived from selected soil property semi-variograms 158
19 Fractal dimension (D value) derived from selected soil property semi-variograms for a reduced study area 162

PAGE 9

LIST OF FIGURES

Figure Page
1 Relation among variance, covariance, and semi-variance 20
2 Common semi-variogram models 27
3 Equation number 35 31
4 Equation number 36 (a) and Equation number 37 (b) 32
5 Location of the counties from which characterization data were available for pedons selected for study 60
6 Histogram (a) and normal probability plot (b) of fine sand content 80
7 Histogram (a) and normal probability plot (b) of organic carbon content 82
8 Location of standardized weighted average values of soil properties in the plane of the first two principal components 86
9 Location of standardized weighted average values of soil properties in the plane of the rotated first two principal components 88
10 Soil properties with a large contribution to the total variance by county for the Albany series 102
11 Soil properties with a large contribution to the total variance by county for the Dothan series 103
12 Soil properties with a large contribution to the total variance by county for the Orangeburg series 104
13 Location of selected soil series in the plane of the first two principal components 106

PAGE 10

14 Location of selected soil series in the plane of the first two principal components derived from important soil properties 110
15 Location of selected pedons in the studied area 115
16 Weighted average total sand content first direction-independent semi-variogram 120
17 Weighted average clay content first direction-independent semi-variogram 121
18 Weighted average total sand content fitted direction-independent semi-variogram 123
19 Weighted average total sand content direction-dependent semi-variograms 124
20 Weighted average clay content fitted direction-independent semi-variogram 125
21 Weighted average clay content direction-dependent semi-variograms 126
22 Weighted average organic carbon content fitted direction-independent semi-variogram 131
23 Weighted average organic carbon content direction-dependent semi-variograms 132
24 A horizon clay content fitted direction-independent semi-variogram 134
25 A horizon clay content direction-dependent semi-variograms 135
26 A horizon organic carbon content fitted direction-independent semi-variogram 138
27 A horizon organic carbon content direction-dependent semi-variograms 139
28 Contour map (increment is 10.0%) (a) and diagram (vertical exaggeration is 18x, azimuth of viewpoint is 25°) (b) of kriged weighted average total sand content 146
29 Contour map (increment is 10.0%) (a) and diagram (vertical exaggeration is 18x, azimuth of

PAGE 11

viewpoint is 25°) (b) of kriged weighted average clay content 147
30 Contour map (increment is 2.0%) (a) and diagram (vertical exaggeration is 18x, azimuth of viewpoint is 25°) (b) of kriged A horizon clay content 148
31 Diagram (vertical exaggeration is 18x, azimuth of viewpoint is 25°) of standard errors of kriged weighted average total sand content 153
32 Diagram (vertical exaggeration is 18x, azimuth of viewpoint is 25°) of standard errors of kriged weighted average clay content 154
33 Diagram (vertical exaggeration is 18x, azimuth of viewpoint is 25°) of standard errors of kriged A horizon clay content 155
34 Location of reduced study area 161
35 Weighted average total sand content fitted N-S semi-variogram 186
36 Weighted average clay content fitted N-S semi-variogram 187
37 Weighted average organic carbon content fitted N-S semi-variogram 188
38 A horizon clay content fitted NW-SE semi-variogram 189
39 A horizon organic carbon content fitted NW-SE semi-variogram 190
40 Contour map (increment is 10.0%) derived from weighted average total sand content N-S semi-variogram 192
41 Contour map (increment is 10.0%) derived from weighted average clay content N-S semi-variogram 193
42 Contour map (increment is 2.0%) derived from A horizon clay content NW-SE semi-variogram 194
43 Map of physiographic regions in northwest Florida (Source: Brooks, 1981b) 196

PAGE 12

ABBREVIATIONS

a = Range
A-Cl = A horizon clay content
A-OC = A horizon organic carbon content
BS = Base saturation
C = Coarse sand
c = Sill
Ca = Calcium
CEC = Cation exchange capacity
Co = County
  3 = Bay
  30 = Holmes
  32 = Jackson
  33 = Jefferson
  37 = Leon
  40 = Madison
  57 = Santa Rosa
  66 = Walton
COV = Covariance
C.V. = Coefficient of variation
EXT = Extractable acidity
F = Fine sand
G(h) = GAMMA = Semivariance
h = Lag distance
K = Potassium
M = Medium sand

PAGE 13

Mg = Magnesium
Na = Sodium
OC = Organic carbon
PH1 = pH-water
PH2 = pH-KCl
Sc = Selection criterion for eigenvectors
T = Tolerance
TB = Total bases
TH = Horizon thickness
TS = Total sand
VAR = Variance
VC = Very coarse sand
VF = Very fine sand

PAGE 14

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

SELECTION OF IMPORTANT PROPERTIES TO EVALUATE THE USE OF GEOSTATISTICAL ANALYSIS IN SELECTED NORTHWEST FLORIDA SOILS

BY FRANCISCO A. OVALLES

December, 1986

Chairman: M.E. Collins
Major Department: Soil Science

Soil variability is a limiting factor in making accurate predictions of soil performance at any particular position on the landscape. A large number of studies have been made to quantify soil variability, but a large portion of them ignored the multivariate character of soils and the geographic aspect of soil variability. Data from 151 pedons in northwest Florida were selected (i) to determine the important properties affecting soil variability and (ii) to evaluate the soil variability in the area studied using geostatistics. Data were non-normally distributed but statistical techniques employed did not require the assumption of normality. This result could support the presence of systematic patterns of soils.

PAGE 15

Principal component analysis was used to reduce the number of soil properties to study the soil variability. Two sets of data were used: weighted average values of soil properties, and A horizon properties. Horizon thickness was used as the weighting criterion. Variables were standardized to mean zero and variance one. Plots of soil properties in the plane of the principal components, varimax rotation, analysis of eigenvalues, eigenvectors, and collinearity, and calculation of correlation coefficients between soil properties and principal components were used to select important properties for evaluation of soil variability. A nested analysis of variance indicated that properties selected by the principal component analysis were differentiating properties. Geostatistical analysis was applied to the properties selected. The within-soil series variance was used as criterion to assess stationarity. Drift was present. Consequently, residuals were used to compute semi-variograms. Semi-variograms of total sand and clay contents showed structure. Nugget variance was present in all semi-variograms. Ranges varied from 15 to 35 km. Soil variability was direction-dependent. The N-S and NW-SE were the directions of maximum variability. Organic carbon content had a large point-to-point variation.

PAGE 16

All observed semi-variograms had a characteristic wave pattern that indicated a cyclic variation of soil properties. Kriged standard error diagrams were functions of the nugget variance and showed areas where more samples are required to increase the precision of estimates. Fractal dimensions indicated the scale-dependent character of soil variability.

PAGE 17

INTRODUCTION

The fundamental purpose of a soil survey is to estimate the potentials and limitations of soils for many specific uses. Soil delineations are mapped to be as homogeneous as possible in order to correlate the adaptability of soils to various crops, grasses, and trees; and to predict their behavior and productivity under different management practices (Soil Survey Staff, 1951; 1981). Quality of soil surveys has been improved over the years as a result of improved understanding of soil. But soil variability remains as one of the main constraints to reliable soil interpretations and is a limiting factor for making accurate predictions of soil performance at any particular position on the landscape. The study and understanding of soil variability represents a cornerstone for improving soil surveys. Belobrov (1976), a Russian soil scientist, pointed out that "The degree of approximation between the true and the observed soil variability does not depend on the nature of the soil cover, but mainly on the methods of investigation" (p. 147).

PAGE 18

For several years, soil scientists used methods of investigation which did not consider the "real nature" of soils, because they ignored the systematic variation of soils on the landscape and assumed a random variation of soils in space. On the other hand, despite the fact that it has been recognized that a soil map unit is imperfect to varying degrees, depending on the scale of the map and the nature of the soil (Soil Survey Staff, 1975), most soil surveys in the U.S.A. have accepted an unrealistic model in which map units encompass soil bodies that form discrete, internally uniform units, with abrupt boundaries at their edges (Hole and Campbell, 1985). Studies of soil variability have not been consistent. These studies have considered a random variation of soils and at the same time they have used a limited number of observations for characterizing map units to establish the range of variation of observed properties. The assumption has been that properties measured at a point also represent the unsampled neighborhood. The extent to which this assumption is true depends on the degree of spatial dependence among observations. The number of studies for quantifying soil variability has sharply increased in the last 10 years, but quantification still remains a problem. A large proportion of quantitative studies are based on untested assumptions, ignore the multivariate character of soils, or use a biased

PAGE 19

selection of properties to represent the soil variability, increasing the risk of erroneous conclusions. For these reasons a large soil data base was selected in northwest Florida with the following objectives: (i) to discover which soil properties most strongly influence the soil variability in the area studied, and (ii) to study how geostatistics can be used in evaluating soil variability.

PAGE 20

LITERATURE REVIEW

Principal Component Analysis

The multivariate character of soil is well recognized; a large set of measurements of soil properties (morphological, chemical, physical, and mineralogical) can be derived from a single sample. The complete set of available data is not always used for numerical analyses. Hole and Campbell (1985) indicated that the selection of soil properties depends on the objectives of the study, and also reflects the constraints imposed by cost, time, effort, and access. There is no doubt that logically correlated variables, such as soil pH and base saturation, are generally so highly covariant that one or the other should not be included in the analysis. Particle-size fractions (sand, silt, and clay) always add up to 100%, and therefore, the whole set of particle-size data should not be included in the analysis. Consequently, in the process of selecting soil properties, there is an important question to be answered: Are the selected soil properties the most important to represent the variability of the complete set of data?

PAGE 21

Webster (1977) pointed out that when one soil property is measured in a set of individual sampling units, the measured values can be represented by their positions on a single line. The relation between any pair of individuals can be represented by the distance between them and the relations among several individuals can be established simultaneously from their relative positions on the line. At the same time, it is almost impossible to visualize their positions on the line and the relations among more than two individuals simultaneously. Thus, he indicated that an alternative way of dealing with multivariate data is to arrange the individuals along one or more new axes. This reduction of an arrangement in many dimensions to a few dimensions is known as ordination. The two most common methods of ordination are Factor Analysis (FA) and Principal Components Analysis (PCA). Shaw and Wheeler (1985) said that in both techniques new variables are defined as mathematical transformations of the original data. However, FA assumes that the original variable is influenced by various determinants: a part shared by other variables known as the common variance; and a unique variance which consists of both a variance accounted for by influences specific to each variable and also a variance relating to measurement error. In contrast, PCA assumes that statistical variation in the variables is explained by the variables themselves, in this

PAGE 22

case by the common variance. PCA is recommended when there are high correlations between variables, a large number of variables, and a need for only simple data reduction. The major objective in PCA is to select a number of components that explain as much of the total variance as possible, whereas FA is used to explain the interrelationship among the original variables (Afifi and Clark, 1984). PCA has the advantage in that the values of principal components are relatively simple to compute and interpret. PCA is a method that has been used to reduce the number of variables without losing important information (Webster, 1977). In general, the analysis finds the principal axes of a multidimensional configuration and determines the coordinates of each individual in the population relative to those axes. Then, the data can be represented in a few dimensions by projecting the points orthogonally on the principal axes. The basic idea of PCA is to create new variables called the principal components (PC) (Afifi and Clark, 1984). Each new variable is a linear combination of the X variables and can therefore be written as

$$PC_1 = A_{11} X_1 + A_{12} X_2 + \dots \qquad (1)$$

where PC = principal component, A = coefficient (eigenvector), and X = variable.
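The construction of principal components as linear combinations of standardized variables can be illustrated with a short sketch. This is a minimal example in Python with NumPy, not the procedure or software used in this study; the function name `principal_components` and the simulated data are hypothetical. It extracts eigenvalues and eigenvectors from the correlation matrix and returns the component scores and the proportion of total variance explained by each component.

```python
import numpy as np

def principal_components(X):
    """PCA on the correlation matrix of X (rows = samples, columns = properties)."""
    # Standardize each property to mean zero and variance one.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Correlation matrix of the original variables.
    R = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each column of scores is one PC: a linear combination of the standardized X.
    scores = Z @ eigvecs
    explained = eigvals / eigvals.sum()        # proportion of total variance
    return eigvals, eigvecs, scores, explained

# Hypothetical data: 50 samples of 4 partly correlated properties.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * rng.normal(size=50),
    base[:, 1],
    base[:, 1] - base[:, 0] + 0.5 * rng.normal(size=50),
])
eigvals, eigvecs, scores, explained = principal_components(X)
print(np.round(explained, 3))
```

In this sketch the components come out ordered by decreasing variance, their scores are mutually uncorrelated, and each eigenvector has unit length, because the eigenvectors of a symmetric matrix returned by `numpy.linalg.eigh` are orthonormal.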

PAGE 23

Coefficients of these linear combinations are chosen to satisfy the following requirements:
(i) Variance $PC_1$ > variance $PC_2$ > ... > variance $PC_n$.
(ii) The values of any two PCs are uncorrelated.
(iii) For any PC the sum of the squares of the coefficients is one.

Cuanalo and Webster (1970) used PCA in a study of numerical classification and ordination in which morphological, physical, and chemical soil properties (pH, clay, silt, fine sand, proportion of stones, consistence, water tension, color, mottling, and peatiness) were measured at depths of 13 cm and 38 cm at 85 sites and randomly sampled within physiographic units near Oxford, England. The variables were standardized to unit variance and the population was centered at the origin. It was found that the first six PCs represented almost 70% of the total variation presented in the original data. The first three PCs represented more than 50% of the total variance. The first component showed large contributions from water tension and chroma in both the topsoil and the subsoil. In the second component, contribution of fine sand in the topsoil (13 cm) and subsoil (38 cm) was dominant. Hue and value made large contributions to the third component. The projection of the population scatter on the plane defined by the first two PCs gave the most informative display of relations in the whole space. These authors suggested that

PAGE 24

when numerical data are available, the data should be examined first by ordination procedures; then, the data selected by the ordination procedure can be used with a numerical classification to decide if such classification grouped data satisfactorily. Norris (1972) used PCA to study trends in soil variation. He described several morphological soil properties (stage of organic matter decomposition, percentage of stones, structure, consistence, porosity, roots, biological activity, and color in terms of presence or absence of gley or dark colors) in 410 pedons, 307 pedons located in woods and 103 pedons located in farmland. The first PC accounted for 39% of the total variance, and corresponded to a trend from deep, stoneless pedons developed on a clayey formation to pedons developed on shallow limestone on steep slopes. The second PC accounted for 14% of the total variance and separated pedons located on farmland from those located in the woods. He concluded that the PCs served as a summary of soil variation in the area, because they accounted for a known percentage of the soil variation and were correctly defined in terms of the properties used to describe the soil. Webster and Burrough (1972) sampled the first two horizons from 84 soil pedons and recorded selected soil properties (soil color, CaCO3 content, depth to CaCO3, total penetrable soil depth, clay content, organic matter

PAGE 25

content, cation exchange capacity (CEC), pH, exchangeable Mg and K contents, and available P content). They used PCA to reduce the dimensionality of the data, and found that the first two PCs accounted for 55% of the total variance (40% the first component and 15% the second component). Separate contributions to the components were determined by projecting vectors on the components axes. They established that those properties determined in the field (CaCO3 content, depth to CaCO3, clay content, and subsoil color) were closely correlated and well represented in one dimension in the first component. The properties measured in the laboratory (organic matter content, CEC, and exchangeable Mg content) contributed most to the second component, indicating differences in management rather than natural soil differences. Results of the numerical classification were supported by showing the distribution of sampling sites in space projected on to the plane of the first two components and showing the frequency distribution of the first PC. There was a good agreement among the results. Therefore, it was concluded that when PCs represent the variables that explain soil variation the components can be mapped as isarithms and the maps have interpretable meaning. Kyuma and Kawaguchi (1973) employed PCA to grade the chemical potentiality of 41 Malayan paddy soil samples; 23 physical, chemical, and mineralogical properties were

PAGE 26

evaluated. The first four PCs accounted for 75% of the total variance. The first PC was highly positively correlated with electrical conductivity, exchangeable Ca, Mg, Na, and K contents, moisture, CEC, available Si content, and 0.2 M HCl-soluble K. The first PC was highly negatively correlated with the kaolin mineral content. All of these properties were relevant to the chemical potentiality of the soil, thus, the first PC was called the chemical potentiality component. The standardized scores of the first PC were computed. These scores were used for grading soils in terms of the chemical potentiality. The authors stated that the result of grading was reasonable. Placed at the top of the scale were soils developed on juvenile marine sediments. Soils having high sand and/or kaolin content were at the bottom of the scale. The authors concluded that PCA was useful for comparing the soil fertility status among soils. Burrough and Webster (1976) used PCA with Similarity and Canonical Variate Analyses to improve soil classification in eastern Malaysia. Morphological and chemical properties determined by routine analysis were recorded from 66 randomly selected sites. The first nine PCs accounted for more than 70% of the total variance. Scatter diagrams of pairs of components were drawn to elucidate the population structure. Established classes that were originally thought to be desirable overlapped

PAGE 27

almost completely with respect to morphological and chemical properties. Dendrograms derived from similarity analysis confirmed the interpretations drawn from the scatter diagrams. Williams and Rayner (1977) employed PCA as a method for grouping soils based on chemical composition (Fe, Ti, Ca, K, Si, Al, P, Mg, Mn, Ni, Cu, Zn, Ga, As, Br, Rb, Sr, Y, Zr, and Pb total contents) and other soil properties such as particle size (sand, silt, and clay), loss on ignition, CaCO3 content, pH, and soil moisture. The scatter diagram showed that the first two components divided the soils into parent material groups. This grouping was also supported by using dendrograms derived from similarity analysis. It was concluded, on the basis of the PCA, that the soils sampled came from three parent materials of different ages. McBratney and Webster (1981) studied the relationships between sampling points using PCA. A substantial proportion (44%) of the total variance was explained by the first two PCs. The first component represented color. Varimax rotation was employed to obtain a better interpretation of the scatter diagram but it produced no appreciable improvement in interpretability. The scatter diagram of PC allowed the separation of sampled points into five different groups.

PAGE 28

Richardson and Bigler (1984) applied PCA to selected soil properties (clay content, pH, organic carbon content, CaCO3 equivalent, electrical conductivity, and soluble Mg, Ca, and Na contents) which were meaningful to soil development and plant growth in wetlands in North Dakota. Four routine measurements useful for characterizing and classifying wetland soils were identified by PCA (electrical conductivity, organic carbon content, CaCO3 equivalent, and clay content). Electrical conductivity and soluble Mg and Na contents were the most important variables in explaining observable differences in wetland soils. In addition, the use of PCA allowed the examination of the interaction of chemical and physical properties with the landscape position of wetland soils, as well as the variation in properties among vegetation zones, after the data were plotted in the plane of the first two PCs. Edmonds et al. (1985) employed PCA as a first step for using Cluster and Discriminant Analyses to study taxonomic variation within three soil map units. Forty different soil properties were included in the analyses. Variables with low variance were excluded by the analysis. PCA was used to reduce the number of dimensions needed to ordinate pedons in the plane of PCs (character space) and to remove intercorrelation of soil properties. The use of PC scores as data for Cluster Analysis avoided distortions in coordinates of the pedons in the plane of PCs. They

PAGE 29

compared the results with the taxonomic classification of soils, and concluded that grouping of pedons by numerical taxonomy did not correspond to groupings by taxa in Soil Taxonomy.

Geostatistics

Webster and Burgess (1983) pointed out that to describe soil variation two features of soil must be taken into account. The first is that long range trends have no simple mathematical form; usually, there is not any obvious repeating pattern; and the larger the area or the more intensive the sampling the more complex the variation appears. The second is that the point-to-point variation in a sample reflects real soil variation. Only a small part is the measurement error. In addition, the same authors indicated that earlier attempts to describe spatial variation in geology and geography involved fitting deterministic global equations to data, either exactly or by least squares approximation. But the two features mentioned above make the approach inappropriate for soil. Thus, an alternative was to treat the soil as a random function and to describe it using geostatistical techniques.

Historical Development

Etymologically, the term geostatistics designates the statistical study of natural phenomena, and it is defined as the application of the formalism of random functions to

PAGE 30

the reconnaissance and estimation of natural phenomena (Journel and Huijbregts, 1978). Geostatistics was primarily developed for the mining industry (Matheron, 1963). Geostatistics was very useful for engineers and geologists for studying the spatial distribution of important properties such as grade, thickness, or accumulation of mineral deposits. Matheron (1963) considered that, historically, geostatistics was as old as mining itself. He indicated that as soon as mining men concerned themselves with foreseeing results of future work and, in particular, as soon as they started to take and to analyze samples and compute mean values weighted by corresponding thickness of deposits and influence-zones, geostatistics was born. Geostatistics started in the early 1950s in South Africa with D.G. Krige (Olea, 1975). Krige realized that he could not accurately estimate the gold content of mined blocks without considering the geometrical setting (locations and sizes) of the samples. Matheron expanded Krige's empirical observations into a theory of the behavior of spatially distributed variables which was applicable to any phenomenon satisfying certain basic assumptions, and the variables were not limited by their physical nature.

PAGE 31

Theoretical Bases

Classical statistics could not be used for ore estimation because of its inability to take into account the spatial aspect of the phenomenon (Matheron, 1963). An aleatory variable had two essential properties: (i) the possibility, theoretical at least, of repeating indefinitely the test that assigned to the variable a numerical value, and (ii) the independence of each test from the previous and the next tests. A given ore-grade within a deposit would not have those two properties. The content of a block of ore was first of all unique, but on the other hand, two neighboring ore samples were certainly not independent. Earth scientists usually deal with complex phenomena which are the result of the interaction of variables, through relationships which are in part unknown and in part very complex (Olea, 1975). Variations are erratic and often unpredictable from one point to another, but there is usually an underlying trend in the fluctuations which precludes regarding the data as resulting from a completely random process. To characterize variables which are partly stochastic and partly deterministic in their behavior, Matheron (1971) introduced the term regionalized variable. He developed the regionalized variable theory to describe functions which vary in space with some continuity.

PAGE 32

A regionalized variable is a continuously distributed variable having a geographic variation too complex to be represented by a workable mathematical function (Campbell, 1978). Although the precise nature of the variation of a regionalized variable is too complex for a complete description, the average rate of change over distance can be estimated by the semi-variance. Conversely, Olea (1977) stated that a regionalized variable is a function that describes a natural phenomenon which has geographic distribution. The term geostatistics has come to mean the specialized body of statistical techniques developed by Matheron and associates to treat regionalized variables (Olea, 1984). The theory of regionalized variables has two branches: the transitive methods and the intrinsic theory (Matheron, 1969). The first is a highly geometrical abstraction without probabilistic hypothesis and has little practical interest. The practical counterpart of those geometrical abstractions is the intrinsic theory which is a term for the application of the theory of random variables to regionalized variables. Matheron (1969) and Olea (1975) indicated that regionalized variables are characterized by the following properties: (i) localization, a regionalized variable is numerically defined by a value which is associated with a sample of specific size, shape, and orientation which is

PAGE 33

called geometrical support. (ii) Continuity, the spatial variation of a regionalized variable may be extremely large or very small, depending on the phenomenon studied, but despite this fact, an average continuity is generally present; in some cases the average continuity cannot be confirmed, and then a nugget effect is present. (iii) Anisotropy, changes may be gradual in one direction and rapid or irregular in another. These changes are known as zonalities. A basic assumption in the intrinsic theory is that a regionalized variable is a random variate (Matheron, 1969). The observed values are outcomes following some probability density function. Henley (1981) considered a regionalized variable as a random function which may be defined in terms of a probability distribution (i.e., it may be normally distributed with a particular mean and variance). Olea (1984) indicated that a spatial function can either be described by a mathematical model or given by a relative frequency analysis based on experimentation. The former approach is not practical because of the complexity of spatial functions. The latter is seriously limited by the maximum number of samples that can be collected. Olea (1975) stated that the difficulty of the relative frequency approach with a regionalized variable is that a repeated test cannot be run because each outcome is unique.
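The semi-variance mentioned earlier as an estimate of the average rate of change over distance is commonly computed, for a given lag h, as half the mean squared difference between values observed at locations separated by h. The following is a minimal sketch in Python with NumPy, not the geostatistical program used in this study; the function name `empirical_semivariance`, the lag tolerance `tol`, and the transect data are hypothetical.

```python
import numpy as np

def empirical_semivariance(coords, values, lag, tol):
    """Semivariance over all pairs whose separation lies within lag +/- tol."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between sampling locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Use each unordered pair once (upper triangle, no diagonal).
    i, j = np.triu_indices(len(values), k=1)
    mask = np.abs(dist[i, j] - lag) <= tol
    n_pairs = int(mask.sum())
    if n_pairs == 0:
        return np.nan, 0
    squared = (values[i][mask] - values[j][mask]) ** 2
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return squared.sum() / (2.0 * n_pairs), n_pairs

# Hypothetical transect: four points spaced one unit apart.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
values = np.array([1.0, 2.0, 4.0, 7.0])
gamma1, n1 = empirical_semivariance(coords, values, lag=1.0, tol=0.1)
gamma2, n2 = empirical_semivariance(coords, values, lag=2.0, tol=0.1)
print(gamma1, n1, gamma2, n2)
```

Repeating the calculation over a sequence of lags, and optionally restricting pairs to a direction, gives the points of a direction-independent or direction-dependent semi-variogram.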

PAGE 34

Since a large number of samples is essential to any statistical inference, it is not possible to determine the probability density function which rules the occurrence of a regionalized variable. The impossibility of obtaining the probability density function associated with the variable is not a serious limitation. Most of the properties of interest depend only on the structure of the regionalized variable as specified by its first and second moments (Olea, 1975). A key assumption is stationarity. Stationarity is a mathematical way to introduce the restriction that the regionalized variable must be homogeneous. Stationarity permits statistical inference. A test can be repeated by assuming stationarity even though samples must be collected at different points. All samples are assumed to be drawn from populations having the same moments. Several scientists have discussed the assumption of stationarity (Henley, 1981; Huijbregts, 1975; Journel and Huijbregts, 1978; Olea, 1975; 1984; Tipper, 1979; Trangmar et al., 1985; Webster, 1985). Geostatistics invokes a stationarity constraint called the intrinsic hypothesis to resolve the impossibility of obtaining a probability distribution. A regionalized variable is called strictly stationary if it is stationary for any order k = 1, 2, 3, 4, ..., n. If k is equal to one, the regionalized variable has first-order stationarity. Second-order

PAGE 35

stationarity also implies first-order stationarity. Second-order stationarity signifies that the first two moments (covariance and variance) of the difference between two observations are independent of the location and are a function only of the distance between them. In general, for a regionalized variable of order k, all the moments of order k or less are invariant under translation. For a stationary variable, the covariance has the following properties:

(i) $\mathrm{COV}(0) \ge |\mathrm{COV}(X_2 - X_1)|$   (2)
where COV = covariance

(ii) $\lim_{h \to \infty} \mathrm{COV}(h) = 0$   (3)
where lim = limit

(iii) $\mathrm{COV}(0) = \mathrm{VAR}[Y(X)]$   (4)
where VAR = variance

(iv) $\mathrm{COV}(X_2 - X_1) = \mathrm{COV}(X_1 - X_2)$   (5)

These relations are better visualized in Figure 1. For second-order stationarity, VAR[Y(X)] must be finite. Then, according to equation (4), COV(0) must be finite. However, many phenomena in nature are subject to unlimited dispersion and cannot correctly be described when they are assigned a finite variance. Thus, to avoid this restriction, the intrinsic theory assumes what is called the intrinsic hypothesis. The intrinsic hypothesis is satisfied if, for any displacement h, the first two moments of the difference [Z(x) - Z(x + h)] are independent

PAGE 36


PAGE 37

of the location x and are a function only of h:

$E[Z(x) - Z(x + h)] = M(h)$   (First moment)   (6)

$E[\{Z(x) - Z(x + h) - M(h)\}^2] = 2\,G(h)$   (Second moment)   (7)

where M(h) and G(h) are referred to as the drift and the semivariance or intrinsic function, respectively. The semivariance is a measure of the similarity, on the average, between observations at a given distance apart. The more alike the observations, the smaller is the semi-variance. The semi-variogram (Olea, 1975; Journel and Huijbregts, 1978), which is the plot of semi-variance against distance h (lag), has all the structural information needed about a regionalized variable: (i) zone of influence that provides a precise meaning to the notion of dependence between samples, (ii) anisotropy when variability is direction-dependent, revealing the different behavior of the semi-variogram for different directions, and (iii) continuity of the variable through space, which is indicated by the shape and the particular characteristics of the semi-variogram near the origin. One of the oldest methods of estimating space or time dependency between neighboring observations is through autocorrelation (Vieira et al., 1983). Nash (1985) pointed out that the correlogram (plot of autocorrelation against

PAGE 38

distance) is the mirror image of the semi-variogram. Vieira et al. (1983) indicated that when interpolation between measurements is needed, the semi-variogram is a more adequate tool to measure the correlation between measurements. An infinite dispersion is allowed using semi-variances. According to Journel and Huijbregts (1978) the autocorrelation is equal to

$f(h) = C(h) / C(0)$   (8)

where f(h) = autocorrelation, C(h) = autocovariance or covariance at distance h, and C(0) = variance. The relationship between C(h) and C(0) is expressed by equation (4). When the semi-variance changes, it is assumed that its variations are small with respect to the working scale. This is a condition of quasi or local stationarity. When the regionalized variable is weakly stationary, it also obeys the intrinsic hypothesis. The semi-variance is then given by

$G(h) = \sigma^2 - C(h)$   (9)

where G(h) = semi-variance, $\sigma^2$ = population variance, and C(h) = autocovariance.
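In practice the semi-variance G(h) has to be estimated from data. The following is a minimal sketch, assuming observations equally spaced along a transect and no drift (M(h) = 0), so that equation (7) reduces to half the mean squared difference between pairs of observations separated by the lag. The function name and the integer lag convention (lag counted in sampling intervals) are illustrative assumptions, not part of the cited work.

```python
from typing import Sequence


def empirical_semivariance(values: Sequence[float], lag: int) -> float:
    """Estimate G(h) for a regularly spaced transect.

    `lag` is the separation expressed in sampling intervals (assumption:
    equal spacing, no drift). The estimate is half the mean squared
    difference between all pairs of observations `lag` steps apart.
    """
    if lag < 1 or lag >= len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    # One difference per pair (x_i, x_{i+lag}) available on the transect.
    diffs = [values[i] - values[i + lag] for i in range(len(values) - lag)]
    return sum(d * d for d in diffs) / (2 * len(diffs))


if __name__ == "__main__":
    # Hypothetical property measured at five equally spaced points.
    transect = [1.0, 2.0, 3.0, 4.0, 5.0]
    for lag in (1, 2):
        print(lag, empirical_semivariance(transect, lag))
```

Plotting these estimates against the lag gives the experimental semi-variogram to which a model is later fitted.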

PAGE 39

The autocorrelation and the semi-variance are related by the following equation:

$f(h) = 1 - G(h) / C(0)$   (10)

Burgess and Webster (1980a) pointed out that the autocorrelation coefficient depends on the variance (equation 8), and according to equation (4) the variance must be finite to fulfill the requirement of stationarity. It was indicated earlier that many phenomena in nature are subject to unlimited dispersion. The semi-variance is free of this restriction, and consequently is preferred. They also indicated that a second advantage of working with semi-variance is that it is easier to take into account local trends in the property of interest. Residuals are used when trends are present. Webster and Burgess (1980) demonstrated that the variance of the residuals from the mean is not equal to the variance of the difference between the values when trends are present. Therefore, autocorrelation is difficult to use. Webster (1985) classified the semi-variograms into four groups:

Safe models. They are defined for one dimension but are safe in the sense that they are conditional positive definite in two and three dimensions. These models are

PAGE 40

1. The linear model:

$G(h) = c_0 + w h$ for $h > 0$   (11)
$G(0) = 0$   (12)

where G = semi-variance, $c_0$ = intercept or nugget variance, w = slope, and h = lag distance. Equation (11) assumes that h has an exponent $\alpha = 1$. When the exponent $\alpha = 0.5$ the model is called root. When $\alpha = 2$ the model is parabolic.

2. The spherical model:

$G(h) = c_0 + w\,[1.5\,(h/a) - 0.5\,(h/a)^3]$ for $0 < h \le a$   (13)
$G(h) = c_0 + w$ for $h > a$   (14)
$G(0) = 0$   (15)

where a = range and $c_0 + w$ = sill.

3. The exponential model:

$G(h) = c_0 + w\,[1 - \exp(-h/a)]$ for $h > 0$   (16)
$G(0) = 0$   (17)

4. The DeWijsian model:

$G(h) = c_0 + a \ln(h)$ for $h > 0$   (18)
$G(0) = 0$   (19)

5. The Gaussian model:

$G(h) = c_0 + w\,[1 - \exp(-(h/a)^2)]$ for $h > 0$   (20)
$G(0) = 0$   (21)

PAGE 41

6. The hyperbolic model:

$G(h) = h / (\alpha + \beta h)$   (22)

where $\alpha$ and $\beta$ are coefficients of the hyperbola function.

Risky models. The semi-variogram increases to a sill.

1. The circular model:

$G(h) = c_0 + w\,[1 - (2/\pi)\cos^{-1}(h/a) + (2h/(\pi a))\,(1 - h^2/a^2)^{1/2}]$ for $0 < h \le a$   (23)
$G(h) = c_0 + w$ for $h > a$   (24)
$G(0) = 0$   (25)

2. The linear model with a sill:

$G(h) = c_0 + w\,(h/a)$ for $0 < h \le a$   (26)
$G(h) = c_0 + w$ for $h > a$   (27)
$G(0) = 0$   (28)

Nested model. The components of variance measure the amount of variance contributed by each scale.

$G(h) = \tfrac{1}{2}\,\mathrm{VAR}[Z(x) - Z(x + h)] = G_0(h) + G_1(h)$   (29)

where $G_0(h)$ = pure nugget semi-variance and $G_1(h)$ = spatially dependent semi-variance.

Anisotropic model. Variability is not equal in all lateral directions.

$G(h, \theta) = c_0 + u(\theta)\,|h|$   (30)

where

$u(\theta) = [A^2 \cos^2(\theta - a) + B^2 \sin^2(\theta - a)]^{1/2}$   (31)

PAGE 42

where $\theta$ = anisotropy angle, a = direction of maximum variation, A = gradient of semi-variogram in direction of maximum variation, and B = gradient in the direction $a + \tfrac{1}{2}\pi$. The most common semi-variograms are shown in Figure 2. Computing a series of semi-variograms and deriving a model from them is usually not an end in itself. The objectives of geostatistical studies are to determine the characteristics of the data and to obtain the best estimates possible with the available data. The advantage of using a geostatistical approach is that the computed values are optimum. The error of estimation is minimized. The acronym BLUE (best linear unbiased estimation) is sometimes used to characterize this method (Green, 1985). Estimation procedures that incorporate regionalized variable theory were originally known as kriging, a term named for D.G. Krige (DeGraffenreid, 1982). Kriging is a distance-weighted moving average estimation procedure that uses the semi-variogram to determine optimal weights. Kriging depends on computing an accurate semi-variogram from which estimates of semi-variance are then used to obtain the weights applied to the data when computing the averages, and are presented in the kriging equation (Burgess and Webster, 1980a).
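Because kriging draws its semi-variances from a fitted model, it can help to see some of the models above written as plain functions. The following is a sketch of the linear, spherical, exponential, and Gaussian models of equations (11) to (21), using the same symbols (c0 = nugget variance, w = slope or sill minus nugget, a = range). The function names are illustrative assumptions, and the spherical model is taken to reach its sill at h = a.

```python
import math


def linear_model(h: float, c0: float, w: float) -> float:
    """Linear model, equations (11) and (12)."""
    return 0.0 if h == 0 else c0 + w * h


def spherical_model(h: float, c0: float, w: float, a: float) -> float:
    """Spherical model, equations (13) to (15); the sill c0 + w is reached at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return c0 + w
    r = h / a
    return c0 + w * (1.5 * r - 0.5 * r ** 3)


def exponential_model(h: float, c0: float, w: float, a: float) -> float:
    """Exponential model, equations (16) and (17); the sill is approached asymptotically."""
    return 0.0 if h == 0 else c0 + w * (1.0 - math.exp(-h / a))


def gaussian_model(h: float, c0: float, w: float, a: float) -> float:
    """Gaussian model, equations (20) and (21)."""
    return 0.0 if h == 0 else c0 + w * (1.0 - math.exp(-((h / a) ** 2)))


if __name__ == "__main__":
    # Hypothetical parameters: nugget 1, sill 5, range 10 (arbitrary units).
    for h in (0.0, 5.0, 10.0, 20.0):
        print(h, spherical_model(h, 1.0, 4.0, 10.0), exponential_model(h, 1.0, 4.0, 10.0))
```

Each function returns zero at h = 0 and jumps to the nugget variance for any positive lag, which is how the nugget effect appears in these models.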

PAGE 43

Figure 2. Common semi-variogram models. (Surviving panel labels: Gaussian; Linear, Root, Parabolic.)

PAGE 44

When values of soil properties are averaged over point values, which represent volumes with the same size and shapes as the volumes of soil on which the original descriptions were recorded (i.e., pedons), the kriging procedure is called punctual kriging (Burgess and Webster, 1980a). When an average is made over areas, the procedure is called block kriging (Burgess and Webster, 1980b). Block kriging produces smaller estimation variances and smoother maps. Burgess and Webster (1980a) and Webster and Burgess (1983) pointed out that kriging is a means of spatial prediction that can be used for soil properties. In kriging, the weights take account of the known spatial dependence expressed in the semi-variogram and the geometric relationships among the observed points. Kriging is optimal in the sense that it provides estimates of values at unrecorded places without bias and with minimum known variance. It has been indicated by several scientists (Huijbregts, 1975; Olea, 1975; 1984; Trangmar et al., 1985; Webster and Burgess, 1980) that kriging is used only with regionalized variables that are first-order stationary. For variables whose drift is not stationary, but for whose residuals the intrinsic hypothesis holds, universal kriging is used.

PAGE 45

Webster and Burgess (1980) stated that universal kriging takes account of local trends in data when minimizing the error associated with estimation. Universal kriging can be performed after computing suitable expressions for the drift and corresponding semi-variograms of the residuals. Olea (1984) said that universal kriging is a linear estimator of the regionalized variable and has the form

$Z(x_0) = \sum_{i=1}^{n} \Gamma_i\, Z(x_i)$   (32)

where $Z(x_0)$ = unknown parameter at location $x_0$, $\Gamma_i$ = weights, and $Z(x_i)$ = value of a property at a point $x_i$. Matheron (1963) stated that suitable weights $\Gamma_i$ assigned to each sample are determined by two conditions. The first condition is that $Z(x_0)$ and $Z(x_i)$ must have the same average value within the area of influence, and is written as

$\sum_{i=1}^{n} \Gamma_i = 1$   (33)

The second condition is that the $\Gamma_i$ have such values that the estimation variance (kriging variance) of $Z(x_0)$ and $Z(x_i)$ takes the smallest possible value. The unknown $\Gamma_i$'s were found by solution of a system of linear equations which result from forcing the unbiased estimator to have minimum variance. The equation is as follows:

PAGE 46

$A X = B$   (34)

where A, X, and B are given by equations (35), (36), and (37), respectively (Figures 3 and 4). In recent years a new method for estimation has been developed. Vieira et al. (1983) stated that in soil science, agrometeorology, and remote sensing, very often some variables are cross-related with others. In addition, some of those variables are easier to measure than others. In such situations estimation of one variable using information about both itself and another cross-correlated, easier-to-measure variable should be more useful than the kriging of that variable by itself. This estimation is easily made using cokriging. Cokriging has been defined as the estimation of one spatially distributed variable from values of another related variable (DeGraffenreid, 1982; Gutjahr, 1984). Dependence between two variables can be expressed by a cross semi-variogram (McBratney and Webster, 1983a). For any pair of variables i and j there is a cross semi-variance $G_{ij}(h)$ at lag h defined as

$G_{ij}(h) = \tfrac{1}{2}\,E[\{Z_i(x) - Z_i(x + h)\}\,\{Z_j(x) - Z_j(x + h)\}]$   (38)

where $Z_i$ and $Z_j$ are the values of i and j at places x and x + h. If i = j, then equation (38) represents the auto semi-variance.
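The cross semi-variance of equation (38) can be estimated in the same way as the ordinary semi-variance. The sketch below assumes both variables were measured at the same equally spaced points of a transect and that neither has a drift. It uses the factor of one half so that the result reduces to the semi-variance of equation (7) when the two variables are the same. The function name and the integer lag convention are illustrative assumptions.

```python
from typing import Sequence


def cross_semivariance(zi: Sequence[float], zj: Sequence[float], lag: int) -> float:
    """Estimate G_ij(h) for two variables sampled at the same transect points.

    `lag` is the separation in sampling intervals (assumption: equal spacing,
    no drift). The estimate is half the mean product of the paired increments.
    """
    if len(zi) != len(zj):
        raise ValueError("both variables must be observed at the same points")
    n_pairs = len(zi) - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be between 1 and len(zi) - 1")
    total = sum(
        (zi[k] - zi[k + lag]) * (zj[k] - zj[k + lag]) for k in range(n_pairs)
    )
    return total / (2 * n_pairs)


if __name__ == "__main__":
    # Hypothetical values of two properties that vary together along a transect.
    first_property = [1.0, 2.0, 3.0, 4.0]
    second_property = [2.0, 4.0, 6.0, 8.0]
    print(cross_semivariance(first_property, second_property, 1))
```

Unlike the semi-variance, the cross semi-variance can be negative, which happens when one variable tends to increase where the other decreases.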

PAGE 47

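$$
A = \begin{bmatrix}
G(x_1,x_1) & G(x_1,x_2) & \cdots & G(x_1,x_k) & 1 & f^1(x_1) & f^2(x_1) & \cdots & f^n(x_1) \\
G(x_2,x_1) & G(x_2,x_2) & \cdots & G(x_2,x_k) & 1 & f^1(x_2) & f^2(x_2) & \cdots & f^n(x_2) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
G(x_j,x_1) & G(x_j,x_2) & \cdots & G(x_j,x_k) & 1 & f^1(x_j) & f^2(x_j) & \cdots & f^n(x_j) \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
G(x_k,x_1) & G(x_k,x_2) & \cdots & G(x_k,x_k) & 1 & f^1(x_k) & f^2(x_k) & \cdots & f^n(x_k) \\
1 & 1 & \cdots & 1 & 0 & 0 & 0 & \cdots & 0 \\
f^1(x_1) & f^1(x_2) & \cdots & f^1(x_k) & 0 & 0 & 0 & \cdots & 0 \\
f^2(x_1) & f^2(x_2) & \cdots & f^2(x_k) & 0 & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
f^n(x_1) & f^n(x_2) & \cdots & f^n(x_k) & 0 & 0 & 0 & \cdots & 0
\end{bmatrix}
$$

$G(x_j, x_k)$ = semi-variance between two sample elements located at a distance $(x_j - x_k)$.
$f^1$ = function of x, derived from the drift.

Figure 3. Equation number 35.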

PAGE 48

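a) X = vector of unknowns, containing the weights $\Gamma_j$ and the Lagrange multipliers $\mu_i$.

b)
$$
B = \begin{bmatrix}
G(x_1,x_0) \\ G(x_2,x_0) \\ \vdots \\ G(x_j,x_0) \\ \vdots \\ G(x_k,x_0) \\ 1 \\ f^1(x_0) \\ f^2(x_0) \\ \vdots \\ f^n(x_0)
\end{bmatrix}
$$

$\Gamma_j$ = weights.
$G(x_k, x_0)$ = semi-variance between two sample elements located at a distance $(x_k - x_0)$.
$f^1(x)$ = function of x, derived from the drift.
$\mu_i$ = Lagrange multipliers.

Figure 4. Equation number 36 (a) and equation number 37 (b).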
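To make the system AX = B of equation (34) concrete, the sketch below builds and solves it for the simplest case. It assumes a constant mean (no drift functions, so A keeps only the row and column of ones, which is usually called ordinary kriging rather than universal kriging), sample points on a line, and a caller-supplied semi-variogram function. All function names are illustrative, and the small Gaussian elimination routine stands in for any linear solver.

```python
from typing import Callable, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Pivoting is required: the diagonal of A holds G(0) = 0.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("kriging matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def kriging_weights(
    positions: Sequence[float],
    target: float,
    semivariogram: Callable[[float], float],
) -> List[float]:
    """Return the weights for estimating the value at `target`."""
    k = len(positions)
    # A: semi-variances among samples, bordered by ones for the constraint (33).
    a = [[semivariogram(abs(xi - xj)) for xj in positions] + [1.0] for xi in positions]
    a.append([1.0] * k + [0.0])
    # B: semi-variances between each sample and the target, then 1.
    b = [semivariogram(abs(xi - target)) for xi in positions] + [1.0]
    solution = solve_linear_system(a, b)
    return solution[:k]  # the last entry is the Lagrange multiplier


def krige(positions, values, target, semivariogram) -> float:
    """Weighted average of the observations, as in equation (32)."""
    weights = kriging_weights(positions, target, semivariogram)
    return sum(w * z for w, z in zip(weights, values))


if __name__ == "__main__":
    # Hypothetical transect: three samples and a linear semi-variogram G(h) = h.
    print(krige([0.0, 10.0, 20.0], [2.0, 4.0, 9.0], 5.0, lambda h: h))
```

With a model that has no nugget, an estimate requested exactly at a sample point returns the observed value, since B then equals one column of A.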

PAGE 49

The cokriging equation is given by

$Z_j(x_0) = \sum_{j} \sum_{i=1}^{n_j} \Gamma_{ij}\, Z(x_{ij})$, for all j   (39)

where i, j = variables, $Z_j(x_0)$ = estimated value of variable j at location $x_0$, and $\Gamma_{ij}$ = weights. To avoid bias the weights have to fulfill two conditions:

(i) $\sum_{i=1}^{n_j} \Gamma_{ij} = 1$   (40)

and

(ii) $\sum_{i=1}^{n_j} \Gamma_{ij} = 0$ for all i not equal to j   (41)

The first condition, according to McBratney and Webster (1983a), implies that there must be at least one observation of the variable j for cokriging to be possible, and as in kriging equation (34), cokriging can be expressed in matrix notation for solving the unknown weights. Trangmar et al. (1985) indicated that cokriging requires at least one sample point of both the primary variable and covariable properties within the estimation neighborhood. If the primary variable and covariable are present at all sampling sites in the neighborhood, then cokriging is considered as an auto-kriging of the primary variable alone. In such cases, cokriging is unnecessary.

PAGE 50

Practical Use

Earlier studies in soil science used time series analysis in which spatial dependence of soil properties was considered. Webster and Cuanalo (1975) computed correlograms for clay, silt, pH, CaCO3, color-value, and stoniness for three horizons in pedons located at 10 m intervals along a transect in north Oxfordshire, England. They observed that the relation between sampling points weakened steadily over distances from 10 m to about 230 m. The average spacing between geological boundaries on the transect was also about 230 m. Outcrop bedrock was inferred as one of the main sources of soil variation. They concluded that mappable soil boundaries were likely to occur on the average every 230 m, and sampling at spacing closer than 115 m would be needed to detect them. Lanyon and Hall (1980) used morphological, physical, and chemical soil properties to test the performance and value of autocorrelation analysis. Spatial dependence was determined from observations made every 20 m along a transect for solum thickness; fine-earth fraction of the A, B, and C horizons; and for soil pH, percent base saturation (PBS), and exchangeable cations from the deepest horizon. They found that the range varied from 20 m for solum thickness and exchangeable K content to 60 m for pH and exchangeable Mg content. They concluded that autocorrelation analysis emphasized the continuous, orderly

PAGE 51

nature of soils, and the fact that spatially related observations may be mutually dependent. Campbell (1978) was one of the first to use geostatistics in soil science. He studied the spatial variation of sand and pH measurements employing the semi-variance. Samples were collected at 10 m intervals on two sampling grids positioned on two contiguous delineations in eastern Kansas. There was a contrast in spatial variation of sand content within the two delineations. Distances of 30 and 40 m were sufficient to encounter full variation of sand content. Soil pH had a random variation within both areas. It was concluded that the most important application of semi-variograms was in determining optimum sample spacing in the design of efficient sampling strategies. Gambolati and Volpi (1979) introduced the determination of the trend a priori, and improved the process of fitting the observed to a theoretical semi-variogram. They used kriging to describe ground-water flow near Venice, Italy. They proposed and used a modification of the kriging technique developed by Matheron (1970) which aimed at improving the accuracy of the interpolation procedure. In Matheron's (1970) basic theory, the trend was not assessed a priori. The trend was considered as a linear combination of functions with unknown coefficients. Gambolati and Volpi (1979) considered the trend a priori;

PAGE 52

therefore the trend had to be determined. Also, they defined the concept of theoretical consistency in kriging applications. Theoretical consistency was derived from the validation of the interpretation model. Validation was made by suppressing each observation point one at a time, by providing an estimate at that point using the remaining (n-1) observations, and analyzing the distribution of errors. They stated that consistency occurred when there was no systematic error (kriged average error was approximately zero) and the standard deviation was consistent with the corresponding error (the average ratio of theoretical to calculated variance was approximately equal to one). They found that validation of the interpretation models selected for study showed that their approach yielded accurate results, provided the trend was correctly assessed. Chirlin and Dagan (1980) modeled water flow through two-dimensional porous formations as a random process using an approximate formulation of flow physics to obtain an expression for the head variogram. The head variogram proved markedly anisotropic, with heads differing more widely on average for a fixed lag parallel to the head gradient than perpendicular to it. Also they examined a hypothetical case ignoring anisotropy. It was determined from their experiment that the kriged standard deviation

PAGE 53

was overestimated perpendicular to the mean flow and was underestimated parallel to it. Hajrasuliha et al. (1980) studied salinity levels in three different fields in southwest Iran which were initially sampled on an arbitrarily selected grid of 80 m. Semi-variances were calculated for all three sites to determine the degree of dependence between observations. The results from two fields showed that observations were spatially dependent. Contour lines of iso-salinity were obtained by using kriging. In the third field salinity observations were found to be spatially independent. Thus, the number of samples necessary to get fiducial limits and to identify the number of samples to be taken randomly across the field for a given probability were obtained by using classical statistical methods. Luxmoore et al. (1981) used semi-variograms to characterize spatial variability of infiltration rates into a weathered shale subsoil. Infiltration rates were measured using double-ring infiltrometers installed at 48 locations on a 2 x 2 m grid after the removal of 1 to 2 m of soil. A high degree of variability in infiltration rates was determined. The test for spatial patterning using the semi-variogram approach proved negative. Therefore, they concluded that if patterning existed at all, it occurred on a spatial scale less than the 2 m used

PAGE 54

in the study. As a result of this study, it was determined that infiltration rate was a randomly distributed property. Vieira et al. (1981) analyzed the spatial variability of 1280 field-measured infiltration rates on Typic Xerorthents. The measurements were made at the nodes of an irregular grid. The semi-variogram showed a range of 50 m. It was considered that, on the average, samples separated by 50 m or more were not correlated with each other. They also examined the effect of the neighborhood size on the value kriged and its estimation variance. They determined that a neighborhood of 14 m was sufficient for the infiltration data. The estimation variances changed very little for larger distances. Low mean estimation error, low variance, and high correlation coefficient showed that the kriging estimation was exceptionally good. Finally, it was determined that geostatistics was useful to redesign the sampling scheme. The large number of measured values made it possible to calculate the minimum number of samples necessary to reproduce the infiltration rate measurements with good precision. It was determined that 128 samples were enough to obtain nearly the same information as with 1280 samples. Geostatistics was used for the first time to study soil variability of large areas in Kigali, Rwanda, by Vander Zaag et al. (1981). They studied the spatial variability of selected soil properties (pH; exchangeable Ca, Mg, K, and

PAGE 55

Na contents; KCl-extractable Al content; percent Al saturation; effective CEC; µg P-sorbed at an equilibrium P concentration of 0.02 and 0.2 µg/g; extractable P content; P and Si in the saturation extract; total N, NO3 and NH4; and extractable S contents) in the whole country of Rwanda. Semi-variograms of soil pH, exchangeable Ca content, effective CEC, Si in the saturation extract, and extractable NH4 content showed long range spatial dependence. The spatial dependence varied from 37.5 km for soil pH to more than 60 km for extractable NH4. The information contained in the semi-variogram was used to estimate values of soil properties at unsampled locations within the range of the semi-variogram. Maps of estimation variance of kriged values were also generated. Such maps showed that estimation variance of kriged values generally increased with increasing distance from sample points. It was indicated that geostatistics could be used to make quick, low cost assessments of soil variability of large land areas. In addition, the map of estimation variance gave an indication of the confidence limits of the estimated values. The map can be used to locate optimum sampling sites to lower the estimation variance. McBratney and Webster (1981) computed semi-variograms of subsoil properties (depth to subsoil, soil color, particle-size, mineralogy, organic carbon content, total N content, ratio OC/total N, and pH). Samples were taken on

PAGE 56

a transect at 20 m intervals. Semi-variograms showed spatial dependence extending to about 360 m for some properties, in particular color and pH. Other subsoil properties had little or no spatial dependence, notably particle-size fractions and organic carbon content. The shape of some semi-variograms indicated presence of different map units on the transect. Van Kuilenburg et al. (1982) applied three interpolation techniques (proximal, weighted average, and kriging) to point data involving soil moisture supply capacity on a 2 x 2 km grid of cover sand in the eastern part of the Netherlands. Survey points used for interpolation were randomly stratified with an average density of 1.5 per ha. The root mean squared error was used as a measure of efficiency. The root mean squared error was large for the proximal method (less efficient) and there was a negligible difference between root mean squared errors for weighted average and kriging. Weighted average had the weakness that possible clusters of survey points were weighted too heavily. This was avoided in kriging. Therefore, kriging proved to be the most efficient for the survey method used. Yost et al. (1982a) collected samples from 80 sites at 1 to 2 km intervals in Hawaii. Soil samples were taken from 0 to 15 cm (topsoil) and 30 to 45 cm depths (subsoil). The former depth represented the nutrient status as

PAGE 57

influenced by management and the latter depth represented the natural conditions. Semi-variograms for soil pH, exchangeable cations (Ca, Mg, K, Na), sum of cations, P requirements, Si and P in saturation extract, extractable P content, and rainfall were calculated. Ranges were much greater for soil properties in the 0 to 15 cm depth than for those in the 30 to 45 cm depth. Semi-variograms for Ca, Mg, K, and P contents based on the 30 to 45 cm depth samples demonstrated greater variability and had smaller ranges (Ca, Mg, and K) than those based on the 0 to 15 cm, or were extremely variable (P). Si in saturation extract had the same range in the subsoil as in the topsoil. Subsoil properties were highly variable. Thus, soil management and rainfall imposed a degree of uniformity on the surface soil properties not apparent in the subsoil. Yost et al. (1982a) concluded that soil chemical properties had spatial dependence and that understanding such structure may provide new insights into soil behavior over the landscape. The semi-variograms changed at large distances. These changes suggested that soils should be grouped to obtain uniform regions of soil properties suitable for management regimes. Yost et al. (1982b) used soil data from transects in Hawaii for estimating soil P sorption over the entire island by using kriging. The necessity of considering nonstationarity and the use of universal kriging were

PAGE 58

evaluated. Universal kriging, either by prior polynomial trend removal or by local polynomial trend removal during estimation, was not beneficial in spite of widely varying P sorption and a significant polynomial trend in the data. The kriged estimates indicated that P sorption properties of soil obtained from transects could be estimated in an optimal way and could be displayed in a manner to better understand the soil properties and genesis, and for practical purposes, estimating the fertilizer needs and distribution facilities. McBratney et al. (1982) sampled 3500 sites to study the spatial variability of Cu and Co soluble in mild extractants measured to identify places where these metals were deficient for animals. Semi-variograms for both Cu and Co were isotropic and appeared to combine three components of variation: a short range component extending up to 3 km, a long range or geological component extending to 15 km, and a non-spatial or nugget component, which accounted for 32% and 63% of the total variance of Cu and Co, respectively. Cu showed a greater degree of spatial dependence than Co. In addition, isarithmic maps identified areas where Cu and/or Co were deficient. An error map showed that precision was generally acceptable. Also, the map identified a few areas in the region in which sampling was too sparse for confidence.

PAGE 59

Byers and Stephens (1983) sampled an untilled medium-grained fluvial sand in horizontal and vertical transects to study the spatial structure of particle size and saturated hydraulic conductivity. Semi-variogram and kriging analyses indicated that both hydraulic conductivity and particle size were relatively isotropic in the horizontal plane but had marked anisotropy in the vertical plane. There were marked similarities in spatial structure in the horizontal plane. The spatial distribution of saturated hydraulic conductivity in the horizontal plane was estimated reasonably well using an empirical relationship between particle size and conductivity along with kriged estimates of the 10% finer particle size. Ten Berge et al. (1983) studied the spatial distribution of selected soil properties (moisture content, moisture tension, bulk density, texture, temperature, and equivalent surface temperature). Two transects were sampled at 4 m intervals. Semi-variograms for moisture content and bulk density did not show any range but only a nugget effect. For other soil properties semi-variograms had a range varying between 80 and more than 120 m (texture and temperature). Gradual changes in soil characteristics were expected. The presence of abrupt map unit boundaries was determined for some properties (e.g., texture). The spatial structure of the field moisture content was found only at very shallow depths. Texture introduced

PAGE 60

differences in hydraulic conductivity, which were thought to cause differences in topsoil moisture content. Vauclin et al. (1983) used geostatistics for studying the variability of particle-size data, available water content, and water stored at 1/3 bar. The soil samples were collected within a 70 x 40 m area at the nodes of a 10 m square grid. All semi-variograms had a nugget effect which corresponded to the variability that occurred within distances shorter than the sampling interval and to experimental uncertainties. The range varied from 26 m for water stored at 1/3 bar to 50 m for silt content. Cross semi-variograms were calculated demonstrating that available water content at 1/3 bar was correlated with sand content within distances of 43.5 and 30 m. Semi-variograms and cross semi-variograms were used to krige and cokrige additional values of available water content and water stored at 1/3 bar every 5 m. They indicated that the use of cokriging was a promising tool whether the principal objective was the reduction of the estimated variance compared with kriging or the need to estimate an undersampled variable by taking into account its spatial correlation with another more sampled variable. Spatial variability of nitrates in cotton petioles was determined employing semi-variograms and kriging (Tabor et al., 1984). Sampling of petioles was of two types, on transects and from randomly selected sites on a rectangular

PAGE 61

grid. Nitrates in petioles showed definite spatial dependence in the field studied. However, for sampling areas of < 1 m, spatial dependence was insignificant compared to the inherent variability of the sample and laboratory analyses. Semi-variograms and kriged maps of nitrates in petioles suggested a strong influence of cultural practices such as direction of rows and irrigation. Bos et al. (1984) sampled in a rectangular grid 50 x 200 m at 10 m intervals on sand tailings capped with 0 to > 2 m of strip-mine overburden. This was done to present and discuss the use of semi-variograms to study the spatial variation of extractable P, Na, K, Mg, and Ca contents, extractable acidity, CEC, total P content, pH-water, pH-KCl, and soluble salts of the topsoil (0 to 25 cm) and relative elevation in reclaimed Florida phosphate mine lands. Semi-variograms were calculated for data taken along transects in four different sampling directions and a combined direction. Some properties (CEC and relative elevation) did not present structure of spatial variation. The range was approximately 6 m for the combined and E-W semi-variograms. Also, a nugget effect was observed which represented variability at distances < 10 m. Presence of anisotropy could not be established because well-defined sills and ranges could not be determined for directions N-S, NE-SW, and NW-SE. The semi-variograms were supported by

PAGE 62

too few data points at large distances. It was concluded that semi-variograms were useful in determining the spatial variability of soil properties on reclaimed phosphate mine lands and in improving sampling design for liming and fertilization needs. Xu and Webster (1984) used geostatistics to test how these techniques could be applied for large areas. Topsoil of 102 pedons evenly distributed throughout the studied area in China were sampled. Soil pH-water, organic matter, sand, total N, total P, and total K contents were measured. Variation of soil properties appeared to be isotropic. Soil pH showed the strongest spatial dependence. Isarithmic mapping of local estimates of pH showed zones of alkaline soils. Because sampling was sparse, on average one sample for 3.5 km², the estimation errors were large. It was suggested that a more intensive sampling scheme would increase confidence in the maps. This would also improve the estimation of semi-variograms, especially for lags in the range of 0.5 to 5 km. Saddiq et al. (1985) collected data on soil water tension from 99 tensiometers along a 76 m row planted with chile pepper and irrigated through trickle tubing placed 5 cm below the soil surface. Semi-variograms indicated a large variability and little spatial dependence in soil water tension. The range was < 6 cm. Also, it was determined that variability and spatial dependence were

PAGE 63

functions of the method and timing of water application and of the magnitude of the soil water tension. When water was applied through a trickle line, variability was greatest and spatial dependence was smallest. Variability was low and spatial dependence high after rain or extensive flooding.

Rogowski et al. (1985) were probably the first to use geostatistics to estimate erosion at different scales. Erosion was measured at nodes of three different size grids: 225 measurements from a 15 x 22.5 km grid, 25 from a 5 x 7.5 km grid, and 150 from a 1 x 1.5 km grid in west central Pennsylvania. Erosion at each node was computed using the universal soil loss equation. Kriging was employed to map potential erosion. It was determined that the large grid sampling size smoothed out the variability by assuming that a fixed slope length and gradient were applicable to the entire area. It was concluded that estimation of erosion on a 1 ha basis (small grid) would likely lead to the optimum prediction capability. This conclusion was based primarily on the results of structural analysis of soil loss data, which suggested a workable continuity range of about 0.1 km for an exponential semi-variogram model. The relative dispersion was about the same for the smaller and the larger areas.

Jim Yeh et al. (1986) measured soil water pressure with 94 tensiometers permanently installed at 3 m intervals

PAGE 64

along a 290 m transect at a 0.3 m depth in New Mexico. Observations showed a gradual increase of soil water pressure over time and a high degree of spatial variability. Variations were spatially correlated over distances of at least 6 m and were dependent upon their mean value. These data supported the hypothesis obtained from stochastic analysis that the variation of soil water pressure was mean-dependent.

Phillips (1986) applied geostatistics to determine the spatial structure of the pattern of variability of shore erosion and to identify the important scale of variation. Shoreline erosion was measured in terms of recession rates from two sets of aerial photographs taken in 1940 and 1978. Statistical analysis indicated that variability of erosion rates was high. The complex alongshore pattern and the scale of local variability indicated that, despite significant long-range differences in erosion rates, short-range, local factors were more important in determining differences in erosion rates. It was also concluded that two major factors accounted for alongshore differences in erosion rates. These were (i) a complex pattern of differential resistance related to marsh fringe morphology and (ii) a crenulated, irregular shoreline configuration affecting exposure to wave energy.

Several scientists (Burgess et al., 1981; McBratney and Webster, 1983a, 1983b; Webster and Burgess, 1984;

PAGE 65

Webster and Nortcliff, 1984; Russo, 1984) have used geostatistics for improving sampling techniques. The classical statistical approach for sampling soils does not take account of the spatial dependence among the data within one class. Therefore, it leads to conservative estimates of precision, with oversampling and unnecessary cost resulting (Burgess et al., 1981).

Burgess et al. (1981) presented a sampling strategy that depended on accurately determining the semi-variogram of the property, after which estimation variances could be calculated by kriging for any combination of block size and sampling density. By this method, the sampling density needed to attain a predetermined precision could be obtained, and the sampling effort needed to achieve the desired precision was kept to a minimum.

McBratney and Webster (1983b) stated that the number of observations needed to achieve a particular acceptable error depends on the variation of the property in the region concerned. The assumptions of classical statistics have required more observations than investigators could afford in order to attain the desired precision. These authors used a method for determining the sample size that depended on knowing the semi-variogram of the property of interest. The semi-variogram information was used in kriging to estimate the variance in the neighborhood of each observation point. Variances were pooled to form the

PAGE 66

global variance from which a standard error could be calculated. The pooled value was minimized for a given sample size if all neighborhoods were of the same size. Therefore, the sampling required to determine the semi-variogram would be a major part of the task. If the semi-variogram was not known, then the best strategy was to sample on a regular grid, with the interval determined by the number of observations that could reasonably be obtained.

McBratney and Webster (1983a) extended the sampling principle for a single variable to two or more co-regionalized variables. The choice of strategy was complicated because not only did the sampling intensities of the main variable and subsidiary variables differ, but their relative sampling intensities could also be changed. Moreover, maximum kriging variance did not necessarily occur at the center of the sampling configuration as it did with a single variable. It was stated that, in attempting to find an optimal strategy, the maximum kriging variance must be found by first calculating the variance for a range of sample spacings and relative sampling intensities. Those that matched the maximum tolerable variance were potentially useful. It was suggested that the optimum scheme was the one that achieved the desired precision for least cost.

PAGE 67

Webster and Burgess (1984) described optimal rectangular grid sampling configurations by which estimation variance could be minimized. The geostatistical approach had the advantage that standard errors would be much smaller than with the classical approach. It was stated that even when standard errors were estimated properly by taking into account known spatial dependence, the cost of making the desired number of measurements in a region might still be prohibitive. Under those circumstances weighting might provide a feasible way of overcoming this difficulty. The aim of weighting was to reduce the effort of measuring soil properties within regions while maintaining the precision of replicated observations. It was concluded that the most serious obstacle to using optimal sampling strategies for single estimates was the need to know the semi-variogram in advance. The main task was determining the number of samples needed to estimate the semi-variogram.

Webster and Nortcliff (1984) used measured values of extractable Fe, Mn, Cu, and Zn contents to calculate the sampling effort required to estimate mean values with specified precision. Semi-variograms showed that there was substantial spatial dependence for Fe and Mn contents, less for Zn content, and even less for Cu content. Estimation variances generated by classical methods and geostatistics were compared. The largest nugget variance in relation to

PAGE 68

the total variance in the sample was for Cu. Classical statistics slightly exaggerated the estimation variance for Cu. The over-estimate was more serious for Zn, Mn, and Fe. However, the major disadvantage of the geostatistical approach is having to sample intensively to obtain the semi-variogram.

Russo (1984) proposed a method to design an optimal sampling network for semi-variogram estimation. The method required an initial sampling network. The location of points could be either systematically or randomly selected. For a given sample size (n) and using a constant number of pairs of points for each lag class, the criterion for selecting the location of sampling points was the uniformity of the values of the separating lag distance (h) within a given lag class, for each of the lag classes which covered the area of interest in the field. The method provided a set of scaling factors which were used to calculate the new locations of the sampling points by an iterative procedure. Using the aforementioned criterion, the best set of sampling points was selected. Analysis of results indicated that by using the proposed method the variability within and among lag classes was considerably reduced relative to the situation where the original locations were used. In addition, semi-variograms estimated from sampling points generated by the proposed method fitted the theoretical semi-variograms better than those which were estimated from

PAGE 69

data generated on the original coordinates of sampling points.

Fractals

It has been widely recognized that the perception of soil variation is a function of the scale of observation. Fridland (1976) was one of the first soil scientists to recognize that a series of randomly operating but interacting spatial processes at different scales could be combined to give definite soil patterns.

Beckett and Bie (1976) indicated that the variance of the values of any soil property within a given area is the sum of all contributions to the soil variability within the area. Thus, the overall variance within an area of 100 m² contains contributions from the average variability within areas of 1 m², from that between areas of 1 m² within areas of 5 m², from that between areas of 5 m² within areas of 10 m², and from that between areas of 10 m² within areas of 100 m². The partition of the total variance can be performed for any number of stages.

Wilding and Drees (1978) pointed out that the nature of soil variability is dependent on the scale of resolution. They indicated that at a low resolution level (for example, looking at the earth from the moon) spatial diversity may be seen as land vs. water. With greater resolution, spatial variability can be recognized

PAGE 70

microscopically and submicroscopically in the systematic organization of biological, chemical, and mineralogical composition of hand specimens representative of given horizons.

Burrough (1983a) stated that each cause of soil variation may not only operate independently or in combination with other factors, but also over a wide range of scales. Soil variation has been considered to be the result of systematic and random components (Fridland, 1976; Wilding and Drees, 1978, 1983). The former is related to features such as landform, geomorphic elements, and soil forming factors. The latter corresponds to those changes in soil properties that are not related to a known cause. Random variability is unresolved.

Burrough (1983b) indicated that the distinction between systematic variation and noise (random variation) is entirely scale dependent because increasing the scale of observation almost always reveals structure in the random component. He also stated that, making allowances for the artifices of map making, several conclusions can be drawn: (i) pattern structures, and therefore spatial correlations, have been recognized at all scales; (ii) the detail resolved is partly the result of the scale of variation present and partly due to the resolving power of the map at the given scale; (iii) the intricacy of the drawn

PAGE 71

boundaries is not related to scale; and (iv) a feature regarded as random at one scale can be revealed as structure at a larger scale. Also, Burrough (1983b) pointed out that in any given spatial study there may be many sources and scales of variability present. The sources and scales of variability come into play simultaneously and affect observations over all distances between the resolution of the sampling device and the largest inter-sample distance. Therefore, it is necessary to find a substitute for the noise concept that takes into account the nested, autocorrelated, and scale dependent character of unresolved variations. Burrough (1983a, 1983b, 1983c) suggested that the concepts embodied in fractals appear to offer a solution.

The term fractal was introduced by Mandelbrot (1977) specifically for temporal and spatial phenomena that were continuous but not differentiable and exhibited partial correlations over many scales. A continuous series, such as a polynomial, is differentiable because it can be split into an infinite number of absolutely smooth straight lines. A non-differentiable continuous series cannot be solved. Every attempt to split a non-differentiable continuous series into smaller parts results in the resolution of still more structure or roughness. Fractal etymologically has the same root as fraction and fragment

PAGE 72

and means "irregular or fragmented." It also means "to break." Fractals have two important characteristics (Burrough, 1983b). They embody the idea of "self-similarity," that is, the manner in which variations at one scale are repeated at another, and the concept of a fractional dimension. The concept of fractional dimension is the source of the name "fractal."

Mandelbrot (1977) defined a fractal curve as one where the Hausdorff-Besicovitch dimension (D) strictly exceeds the topological dimension. The simplest example is a continuous linear series such as a polynomial, which tends to look more and more like a straight line as the scale at which it is examined increases. The D value is calculated using the following equation:

\[ D = \frac{\log N}{\log r} \tag{42} \]

where
D = Hausdorff-Besicovitch dimension
N = number of steps used to measure a pattern
r = scale ratio

Burrough (1983a) pointed out that for a linear fractal curve, D may vary between 1 (completely differentiable) and 2 (noisy). The corresponding range for D lies between 2 (absolutely smooth) and 3 (infinitely crumpled) for surfaces. It is implicit in the concept of fractal that

PAGE 73

when fractals are examined at increasingly large scales increasing amounts of detail are revealed, while at the same time vestiges of variations persist on the smaller scale.

Mandelbrot (1977) developed the fractal theory based on physical Brownian motion. Burrough (1983b, 1983c) extended the fractal theory to soils using Brownian and non-Brownian fractal models and indicated that soil data were fractals because increasing the scale of mapping continued to reveal more and more detail. Soil data were not "ideal" fractals because the data did not possess the property of self-similarity at all scales. Pure fractals are theoretically infinitely nested structures with infinite variance.

Burrough (1981, 1983a) demonstrated that the double logarithmic plot of a semi-variogram of a series which can be represented by a fractional Brownian function was a straight line of slope:

\[ m = 4 - 2D \tag{43} \]

where
m = slope
D = Hausdorff-Besicovitch dimension

Therefore, semi-variograms are also useful in computing the fractal dimension, but despite this fact,

PAGE 74

fractals have not been used by many scientists, especially soil scientists. Burrough (1981) computed D from semi-variograms of different soil properties. D values varied between 1.1 and 1.9. Low values indicated a predominance of systematic variation in the soil properties studied. Large values indicated random variation of soil properties. Most of the fractal values were between 1.5 and 1.9. Fractals were also useful in revealing short- and long-range variation when the D dimension was used along the semi-variogram range. Low values of D indicated domination of long-range variation.

Fractals have also been applied to erosion studies. Phillips (1986), studying shoreline erosion, used the methodology proposed by Burrough (1981, 1983b). He calculated a D value of 1.91. This value indicated a very complex, irregular pattern of erosion which was statistically random. It also indicated a pattern dominated by short-range, local controls which completely obscured any long-range trends that may have existed. A negative correlation between adjacent sites was also found. Phillips (1986) concluded that the complex landscape revealed by the analysis was probably related to the dynamic nature of estuaries and coastal wetlands and to the variety of geomorphic, ecological, and human factors that influenced marsh and shoreline development.

PAGE 75

DESCRIPTION OF STUDY AREA

Location

The area studied is located in northwest Florida. It extends from Santa Rosa County on the west to Madison County on the east, and comprises most of the Florida Panhandle (Figure 5).

Physiography, Relief, and Drainage

The study area lies in the Coastal Plain Province (Duffee et al., 1979, 1984; Sanders, 1981; Sullivan et al., 1975; Weeks et al., 1980). The landscape is largely the product of streams and waves acting upon the land surface over the past 10 to 15 million years (Fernald and Patton, 1984). The major physiographic divisions in the area are the Northern Highlands and the Marianna Lowlands. They comprise the Southern Pine Hills, the Dougherty Karst, the Tifton Uplands, the Apalachicola Delta, and the Ocala Uplift physiographic districts.

Elevations in the Northern Highlands range from 16 or less to 114 m above sea level. Several stream systems have produced a significant erosional feature called the Marianna Lowlands, which

PAGE 76

Figure 5. Location of the counties from which characterization data were available for pedons selected for study.

PAGE 77

interrupts the continuous span of the highlands across northwest Florida. Elevations in the Marianna Lowlands range from 20 to 80 m above sea level (Brooks, 1981a; Fernald and Patton, 1984).

Topography varies from nearly level to gently undulating, with slopes ranging from 0 to 35%. Commonly the gentle slopes terminate in sinks or shallow depressions.

The drainage system is well organized in streams that flow southward from Alabama and Georgia. The Chattahoochee and Flint Rivers combine to form the Apalachicola River, the largest in this southward-flowing group of rivers. Some of the drainage is disjointed, particularly in the karst topography of the Marianna Lowlands (Fernald, 1981).

Geology

Soils are mainly underlain by the Citronelle Formation, the Crystal River Formation, and undifferentiated Miocene and Oligocene sediments (Fernald, 1981). The Citronelle Formation is composed of sand, gravels, and clays of Pliocene age. The Crystal River Formation comprises shallow marine limestone of Eocene age. Miocene and Oligocene sediments are mainly composed of "silty" sand, clay, dolomitic limestone, and fossiliferous shallow

PAGE 78

marine limestone. Some of the materials are part of the Marianna Limestone Formation.

Climate

The climate of the area is controlled by latitude and proximity to the Gulf of Mexico. The area studied is characterized by long, warm summers and short, mild winters (Bradley, 1972). Maximum and minimum temperatures are affected by breezes coming from the Gulf of Mexico. The average annual temperature is approximately 21°C. Maxima of about 38°C occur in June to August and minima of about -10°C occur in January and February. The average growing season is approximately 275 days.

The average annual rainfall ranges between 1400 and 1660 mm. Approximately 50% of the average rainfall falls during a 4-month rainy season from June to September. A second period of relatively high rainfall occurs in the late winter and early spring. Frequently, a short drought during the late spring causes considerable moisture stress to trees, crops, and grasses.

Land Use and Vegetation

The area studied has a considerable extent of prime farmland that is suited to producing crops and sustaining high yields under high levels of management (Caldwell, 1980). Most of the acreage is used for urbanization, field crops, pasture, and forestry. The

PAGE 79

most common crops are corn (Zea mays), soybean (Glycine max), peanuts (Arachis hypogaea), watermelon (Citrullus vulgaris), tobacco (Nicotiana spp.), and assorted vegetables. Livestock operations are also common. A large part of the area is also covered by forest. Well drained areas are characterized by the presence of slash pine (Pinus ellioti var. ellioti Engelm.), black jack oak (Quercus marilandica Munch.), turkey oak (Quercus laevis Walt.), blue jack oak (Quercus incana Bartr.), long leaf pine (Pinus palustris Mill.), and laurel oak (Quercus hemiphaerica Bartr.). The poorly drained areas, corresponding to shallow, densely wooded swamps and river valley lowlands, are characterized by the presence of saw palmetto (Serenoa repens Bartr.), sweet gum (Liquidambar styraciflua L.), and cypress (Cupressus sp. L.) (Duffee et al., 1979, 1984; Sanders, 1981; Sullivan et al., 1975; Weeks et al., 1980).

Soils

Soils in the area studied have developed from medium-textured marine sediments. These coastal plain materials were transported from uplands farther north during interglacial periods when the present land areas were inundated by water from the Gulf of Mexico. Most of the soils in the study area are characterized by a low level of natural fertility and are susceptible to erosion (Duffee et

PAGE 80

al., 1979, 1981; Sanders, 1981; Sullivan et al., 1975; Weeks et al., 1980).

Approximately 83% of the soils are classified as Ultisols (Table 1). Complete taxonomic classification is presented in Appendix A. In general, the Typic Hapludults and the Typic, Aquic, Plinthic, and Rhodic Paleudults are well and moderately well drained, with moderate to low available water capacity and moderate to moderately slow permeability. These soils are acidic and low in organic matter and nutrient contents. In gently sloping areas, limitations are moderate for cultivated crops due to the erosion hazard.

Arenic Hapludults; Arenic, Grossarenic, Arenic Plinthic, and Grossarenic Plinthic Paleudults; and Typic Quartzipsamments commonly are well to excessively drained. Permeability varies from rapid to moderately rapid, and available water capacity is low to very low. Droughtiness and low water retention capacity are among the principal limitations for cropping on these soils.

Typic Fluvaquents; Typic Humaquepts; Typic Ochraqualfs; Ultic Haplaquods; Typic, Arenic, Grossarenic, Aeric, Plinthic, Umbric, and Arenic Umbric Paleaquults; Typic Albaquults; and Typic and Aeric Ochraquults are typically poorly drained. Permeability varies from moderate to slow. Excessive wetness and flooding are among the most important limitations for growing crops.

PAGE 81

Table 1. Order, Great Group, and relative proportion of pedons studied.

Order                    Great Group          Number of pedons studied       %
Alfisols                 Hapludalfs                       2                 1.3
                         Ochraqualfs                      2                 1.3
Entisols                 Quartzipsamments                 5                 3.3
                         Others                           2                 1.3
Inceptisols              Dystrochrepts                    1                 0.7
                         Humaquepts                       1                 0.7
Spodosols                Haplaquods                       2                 1.3
Ultisols                 Hapludults                      10                 6.6
                         Paleudults                      97                64.5
                         Paleaquults                     15                 9.9
                         Others                           3                 2.0
Non-designated series*                                   11                 7.1
TOTAL                                                   151               100.0

* These pedons have not been classified.

PAGE 82

MATERIALS AND METHODS

Data Source

Data from 151 pedons (Calhoun et al., 1974; Carlisle et al., 1978, 1981, 1985; I.F.A.S. Soil Characterization Laboratory, unpublished data) were used for the study. In total, 20 soil properties were selected (horizon thickness; very coarse, coarse, medium, fine, and very fine sand fractions; total sand, silt, and clay contents; pH-water; pH-KCl; organic carbon content; Ca, Mg, Na, and K contents extractable in NH4OAc; total bases; extractable acidity; CEC; and base saturation). The criterion for selection was that these soil properties had to have been measured for each horizon of the pedon. The number of horizons per pedon varied between 4 and 7. There were 19,820 observations.

Pedon location, description, and sampling were done by soil scientists from the U.S.D.A. Soil Conservation Service and the I.F.A.S. Soil Science Department. Physical and chemical analyses of the soils were made by the personnel of the Soil Characterization Laboratory of the University of Florida, Gainesville. Procedures used for sampling and

PAGE 83

chemical and physical analysis were outlined by Calhoun et al. (1974) and by Carlisle et al. (1978, 1981, 1985).

Approximately half of the data was already stored in an IBM XT microcomputer using the database management software KeepIT (ITsoftware, 1984). The remaining half had to be entered to complete the set of observations for this study.

Location of Pedons

The pedons selected for study were located for soil survey purposes using the system of Ranges and Townships with the Tallahassee Meridian and Base Line as reference. The program used for spatial analysis requires the location of pedons expressed by coordinates (Xs and Ys). Therefore, each pedon was located on topographic maps at 1:24,000 scale according to the system of Ranges and Townships, and each location was transformed into cartesian coordinates (longitude and latitude). Elevation above sea level was also recorded.

The map of physiographic regions of Florida (Brooks, 1981b) at the 1:500,000 scale was used as a base map to locate the entire set of pedons. Using as a reference the point 30° 00' 00" N and 87° 24' 18" W (X = 0 and Y = 0), X and Y coordinates were determined. This reference point was chosen so that all Xs and Ys in the studied area would be positive.
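
The shift of origin described above can be sketched as follows. This is only an illustration: the text does not state the units of X and Y, so the sketch assumes offsets in decimal degrees, and the function names and the example pedon are hypothetical rather than taken from the data set.

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, and seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


# Reference point given in the text: 30° 00' 00" N, 87° 24' 18" W.
REF_LAT_NORTH = dms_to_decimal(30, 0, 0)
REF_LON_WEST = dms_to_decimal(87, 24, 18)


def pedon_xy(lat_north, lon_west):
    """Return (X, Y) offsets from the reference point.

    Assumption: offsets are expressed in decimal degrees. X grows
    eastward and Y grows northward, so pedons east and north of the
    reference point receive positive coordinates.
    """
    x = REF_LON_WEST - lon_west  # west longitude decreases toward the east
    y = lat_north - REF_LAT_NORTH
    return x, y


# Hypothetical pedon at 30° 30' 00" N, 85° 00' 00" W (not from the data set).
x, y = pedon_xy(dms_to_decimal(30, 30, 0), dms_to_decimal(85, 0, 0))
print(round(x, 3), round(y, 3))
```

Placing the origin at the southwest corner of the region is what guarantees positive coordinates for every pedon.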

PAGE 84

Pedon locations were plotted using the POST command of Surface II software (Sampson, 1978).

Statistical Analyses

Statistical analyses were performed using an IBM XT microcomputer and the IFAS-VAX and NERDC mainframe computers. Transfer of data between the microcomputer and the mainframe computers was possible by using the public domain communication programs Kermit (to link with IFAS-VAX) and YT (to link with CMS-NERDC).

Statistical Analysis System software (SAS Institute Inc., 1982a, 1982b) was used for the normality and principal component analyses and for plotting purposes. The Fortran program written by Skrivan and Karlinger (1979) was used for the geostatistical analysis. Surface II software (Sampson, 1978) was employed to generate isarithmic (contour) maps and surface diagrams.

Normality Analysis

The UNIVARIATE procedure (SAS Institute Inc., 1982a) was used to test normality. This test was mainly based on the study of skewness, kurtosis, the Kolmogorov test, and cumulative plots. The NORMAL option was employed to compute a test statistic for the hypothesis that the input data had a normal distribution. The Kolmogorov D statistic was computed because the sample size was greater than 50.
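
The quantities examined in the normality analysis can be illustrated with a short sketch. It is not the UNIVARIATE procedure: it assumes simple moment estimators for skewness and kurtosis (SAS applies small-sample corrections) and computes the Kolmogorov D against a normal distribution whose mean and standard deviation are estimated from the sample. The function names and the data are hypothetical.

```python
import math


def skewness_kurtosis(values):
    """Return moment-based skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def kolmogorov_d(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(values)):
        cdf = 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical, perfectly symmetric sample (not soil data).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
skew, excess_kurtosis = skewness_kurtosis(sample)
print(skew, excess_kurtosis, kolmogorov_d(sample))
```

Skewness and excess kurtosis near zero, together with a small D, are consistent with a normal distribution; judging significance requires the critical values of the test, which the sketch does not compute.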

PAGE 85

The PLOT option was used to plot the data. The CHART procedure was employed to obtain histograms of the data.

Principal Component Analysis

The PRINCOMP procedure (SAS Institute Inc., 1982b) was employed for the PCA. Because the soil properties studied had different measurement scales, there was a risk of having heterogeneous variances. An important assumption in this analysis is the homogeneity of variances (Afifi and Clark, 1984). Therefore, soil properties were standardized to a mean equal to 0 and a variance equal to 1. As a result, the PCs were derived from the correlation matrix instead of the covariance matrix.

Eigenvalues (variances) and eigenvectors (coefficients) of PCs were obtained by using the PRINCOMP procedure. The number of PCs was selected by using a rule of thumb (Afifi and Clark, 1984, p. 322) that the PCs selected are those that explain at least 100/P percent of the total variance, where P is the number of variables. With the 20 soil properties studied, the PCs selected had an eigenvalue that represented > 5% of the total variance. Eigenvectors for each PC were selected on the basis that they had a value larger than the value calculated using the following equation:

\[ S_c = \frac{0.5}{(\text{PC eigenvalue})^{1/2}} \tag{44} \]

where
Sc = selection criterion
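
The two selection rules can be sketched as follows. The function names and the eigenvalues are hypothetical and serve only to show the arithmetic of the 100/P rule and of equation (44); they are not results from this study.

```python
import math


def select_components(eigenvalues):
    """Indices of PCs explaining at least 100/P percent of total variance."""
    p = len(eigenvalues)
    total = sum(eigenvalues)
    threshold = 100.0 / p
    return [i for i, ev in enumerate(eigenvalues)
            if 100.0 * ev / total >= threshold]


def selection_criterion(eigenvalue):
    """Equation (44): smallest eigenvector coefficient considered important."""
    return 0.5 / math.sqrt(eigenvalue)


# Hypothetical eigenvalues of a 4-variable correlation matrix (they sum to 4).
eigenvalues = [2.0, 1.2, 0.5, 0.3]
kept = select_components(eigenvalues)
print(kept)
print([round(selection_criterion(eigenvalues[i]), 3) for i in kept])
```

Because the PCs come from the correlation matrix, the eigenvalues sum to P, so the 100/P rule amounts to keeping PCs with an eigenvalue of at least 1.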

PAGE 86

The PLOT procedure (SAS Institute Inc., 1982a) was employed to plot eigenvectors. The larger the value and the closer the eigenvector to the PC axis, the larger the contribution of the variable to the total variance.

A varimax rotation (orthogonal rotation of axes) was used because some of the eigenvectors did not show a clear contribution to a particular PC. The FACTOR procedure (SAS Institute Inc., 1982b) was employed for the varimax rotation and to plot the rotated eigenvectors.

Each PC is a linear combination of standardized variables having the eigenvectors as coefficients. Due to this fact, collinearity between variables can be a problem. It has been reported (SAS Institute Inc., 1982b) that use of highly correlated variables produces estimates with high standard errors. These estimates are very sensitive to slight changes in the data. The REG procedure (SAS Institute Inc., 1982b) with the option COLLIN was used for the analysis of collinearity. Variables with a tolerance lower than 0.01 were not considered in the analysis (Afifi and Clark, 1984). Tolerance is defined as:

\[ T = 1 - R^2 \tag{45} \]

where
T = tolerance
R = coefficient of multiple correlation
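
A minimal sketch of the tolerance screen of equation (45) is given below. It assumes the special case of a single predictor, in which R² reduces to the squared Pearson correlation; the REG procedure regresses each variable on all of the others. The function names and the data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tolerance(r_squared):
    """Equation (45): T = 1 - R^2."""
    return 1.0 - r_squared


# Hypothetical values for two nearly collinear properties (not soil data).
total_bases = [1.0, 2.0, 3.0, 4.0, 5.0]
extractable_ca = [0.9, 2.1, 2.9, 4.2, 4.9]

t = tolerance(pearson_r(extractable_ca, total_bases) ** 2)
print(round(t, 4), t < 0.01)  # below 0.01 the variable would be dropped
```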

PAGE 87

Finally, the correlation coefficient between the PCs and the soil properties was computed using the equation:

\[ r_{ij} = a_{ij} (\text{VAR PC})^{1/2} \tag{46} \]

where
r_ij = correlation coefficient
a_ij = eigenvector
VAR PC = PC eigenvalue

Soil properties selected for further study were those having a high (> |0.75|) correlation coefficient.

Geostatistical Analysis

A Fortran program written by Skrivan and Karlinger (1979) was employed. The geostatistical analysis had four parts.

Semi-variograms

The X, Y, and Z (soil property) values were used as input in this step. Before a valid semi-variogram can be calculated, the drift, if present, must be removed; otherwise the stationarity assumption is not fulfilled. Journel and Huijbregts (1978) stated the criterion for deciding whether drift is absent. They indicated that, considering the semi-variogram as a positive definite function, an experimental semi-variogram with an increase equal to or greater than |h|² (where h = modulus of the lag distance) for large distances h is incompatible with the intrinsic hypothesis. Such an increase in the semi-variogram most often indicates the presence of a trend or drift. However, drift can be determined only if the semi-variogram has already

PAGE 88

been calculated. Thus, an iterative process (trial and error) was followed to calculate the semi-variogram. An observed semi-variogram based on the data was calculated. If drift was present, then the information contained in the observed semi-variogram was used to calculate the drift coefficients and the residuals of the observations relative to the drift function. Then, a new semi-variogram was calculated from the residuals. This process was repeated until drift was removed or a satisfactory semi-variogram was obtained. Five semi-variograms were calculated for each variable: one direction-independent and four direction-dependent (N-S, E-W, NE-SW, NW-SE). The semi-variogram plots were obtained by using the Energraphics software (Enertronics, 1983).

Fitting semi-variograms

In this step the structural information (range, lag distance, and slope) was used to adjust the parameters in the semi-variogram until the model was theoretically consistent (Gambolati and Volpi, 1979). Consistency occurred when the kriged average error (KAE) was approximately zero and the average ratio of theoretical to calculated variance, called the reduced mean square error (RMSE), was approximately equal to one. These parameters are represented by the following equations:

(i)
\[ \text{KAE} = \frac{1}{n} \sum_{i=1}^{n} (Z_i - \hat{Z}_i) \tag{47} \]

PAGE 89

where
n = number of points
Z_i = measured value
Ẑ_i = kriged value

(ii)
\[ \text{RMSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{(Z_i - \hat{Z}_i)^2}{\sigma^2} \tag{48} \]

where σ² = calculated variance, which is equal to

\[ \sigma^2 = K(0) - \sum_{i=1}^{n-1} \Gamma_i C(h) - \sum_{i=1}^{n} \mu_i M(h) + \sum_{i=1}^{n-1} \Gamma_i^2 S_i^2 \tag{49} \]

where
K(0) = sill
Γ_i = unknown weighting coefficient
C(h) = covariance based on semi-variance and sill
μ_i = unknown Lagrangian multiplier
M(h) = drift
S_i² = variance of the measurement error

The fitting procedure was based on the jackknife method developed by Tukey (Sokal and Rohlf, 1981), which is a useful technique for analyzing statistics if distributional assumptions are of concern. The procedure was to split the observed data into groups (usually of size one) and to compute values of the statistic with a different group of observations being ignored each time. The average of these estimates was used to reduce the bias in the statistic. The variability among these values was used to estimate the standard error.
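
The two consistency checks of equations (47) and (48) can be sketched as follows, assuming that a leave-one-out (jackknife) pass has already produced a kriged value and a kriging variance at each observation point. The function names and the numbers are hypothetical; the sketch does not reproduce the Skrivan and Karlinger (1979) program.

```python
def kriged_average_error(measured, kriged):
    """Equation (47): mean difference between measured and kriged values."""
    n = len(measured)
    return sum(z - z_hat for z, z_hat in zip(measured, kriged)) / n


def reduced_mean_square_error(measured, kriged, variances):
    """Equation (48): mean squared error scaled by the kriging variance.

    Assumption: one calculated variance is supplied per observation point.
    """
    n = len(measured)
    return sum((z - z_hat) ** 2 / var
               for z, z_hat, var in zip(measured, kriged, variances)) / n


# Hypothetical leave-one-out results for four points (not soil data).
measured = [5.0, 6.0, 7.0, 8.0]
kriged = [5.5, 5.5, 7.5, 7.5]
variances = [0.25, 0.25, 0.25, 0.25]

kae = kriged_average_error(measured, kriged)
rmse = reduced_mean_square_error(measured, kriged, variances)
print(kae, rmse)  # a consistent model gives KAE near 0 and RMSE near 1
```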

PAGE 90

Gambolati and Volpi (1979) extended the use of this technique to geostatistics.

Kriging

Universal kriging was the method used in this investigation. Universal kriging takes into account local trends in the data, minimizing the error associated with estimation. The kriged Z value for each X and Y location and its associated variance were computed. The kriged Z values and associated standard errors were the inputs to the Surface II software to produce isoline maps of the different values and associated variances.

Fractals

The Statistical Analysis System (SAS Institute Inc., 1982a, 1982b) was employed for transforming semi-variance and lag distance values into logarithmic values. The REG procedure was used to obtain the slope of the line. The Hausdorff-Besicovitch dimension was computed by using equation (43).

Finally, this dissertation was written using WordPerfect software (SSI Software, 1985).
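The log-log regression step described under Fractals can be sketched as follows. This is a minimal sketch, not the SAS REG run used in the study. Equation (43) is given earlier in the dissertation and is not reproduced in this section, so the relation D = 2 - m/2 (with m the slope of log semi-variance on log lag) is an assumption here; the function names and the sample semi-variogram are hypothetical.

```python
import math

def loglog_slope(lags, semivariances):
    """Ordinary least-squares slope of log(semi-variance) on log(lag)."""
    xs = [math.log(h) for h in lags]
    ys = [math.log(g) for g in semivariances]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def fractal_dimension(lags, semivariances):
    """ASSUMPTION: D = 2 - m/2, taken here as the form of equation (43)."""
    return 2.0 - loglog_slope(lags, semivariances) / 2.0

# Hypothetical semi-variogram that grows linearly with lag (slope m = 1).
lags = [1.0, 2.0, 4.0, 8.0]
semivariances = [0.5 * h for h in lags]
d = fractal_dimension(lags, semivariances)
```

Under the assumed relation, a slope of 1 gives D = 1.5, and flatter log-log lines give values of D closer to 2.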

PAGE 91

RESULTS AND DISCUSSION

Test of Normality

The assumption of normality is important for most statistical analyses. Mean and standard deviation are needed to characterize completely the distribution of values if the data are normally distributed. When data are normally distributed, approximately 95% of the values fall within two standard deviations of the mean (Montgomery, 1976; Snedecor and Cochran, 1980; SAS Institute Inc., 1982a). Gower (1966), however, pointed out that in PCA, unlike other forms of multivariate analysis, no assumptions are needed about the distribution of the variates or about hypothetical populations, except when significance tests are of interest. Likewise, Gutjahr (1985) and Olea (1975) have stated that the assumption of normality is not needed in geostatistics. Stationarity is the most important assumption in geostatistics, although Burrough (1983a) indicated that stationarity is very difficult to achieve.

Normality, therefore, is not required for PCA and geostatistics. However, the test of normality was performed because a large number of soil variability

PAGE 92

studies have implicitly assumed a normal distribution of soil properties without using any statistical test to justify this assumption. Also, a large data base was available. Thus, a conclusion such as "data were non-normally distributed because of the small number of observations" has no validity in this study.

There are two main tests of normality. One is a graphical method based on histograms or on plots of the measured values on probability paper. The other is based on a quantitative measure such as the Kolmogorov test. Rao et al. (1979) indicated that graphical methods have specific drawbacks. First, they often rely on visual inspection and thus are subject to human error. Second, as graphical methods are not based on quantitative measures, an objective statistical evaluation of the goodness-of-fit of the theoretical distribution to the measured data is not possible. Consequently, the normality analysis was based on a quantitative measure rather than on a graphical method alone.

The data were tested against a theoretical normal distribution with mean and variance equal to the sample mean and variance. Skewness, kurtosis, the Kolmogorov D statistic, and plots of the data were used to test the null hypothesis that the input data values were normally distributed (SAS Institute Inc., 1982a).
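The three quantitative measures named above can be sketched with the Python standard library. This is an illustration only and does not reproduce the SAS output: simple moment estimators are assumed for skewness and kurtosis (SAS applies small-sample adjustments), kurtosis is reported as excess kurtosis so that a normal distribution gives zero, and no significance probability for D is computed. All function names and the sample values are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def skewness_and_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal population)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def kolmogorov_d(values):
    """Largest gap between the empirical CDF and a normal CDF that uses
    the sample mean and the sample standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    d = 0.0
    for i, v in enumerate(sorted(values), start=1):
        f = normal_cdf(v, mu, sigma)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical, perfectly symmetric sample.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
skew, kurt = skewness_and_kurtosis(sample)
d_stat = kolmogorov_d(sample)
```

A symmetric sample gives a skewness of zero, while a flat, short-tailed sample such as this one gives a negative excess kurtosis.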

PAGE 93

When the distribution is not symmetric, the skewness can be positive (skewed to the right) or negative (skewed to the left). Kurtosis refers to the degree of peakedness of a frequency distribution (Silk, 1979). A heavy-tailed distribution has positive kurtosis. Flat distributions with short tails, or distributions in which almost all data values lie very close to the mean, have negative kurtosis. The measure of skewness and kurtosis for a normally distributed population is zero (SAS Institute Inc., 1982a).

A significance level (α) of 0.15 was selected as the criterion for acceptance or rejection of the null hypothesis (H0 = Normal). When normality is tested, the interest is in accepting the null hypothesis. This is in contrast to most situations, in which the interest is in rejecting the null hypothesis. For this reason, Rao et al. (1979) proposed an α value between 0.15 and 0.20 in order to have a balance between type I and type II errors.

Statistical moments for each soil property were computed (Table 2). Most variables had large coefficients of variation (C.V.). Soil pH (water and KCl) had the lowest variation, reflecting uniform conditions of pH, in this case acidity. Other soil properties had a large C.V. Most of these soil properties are naturally related, and the large C.V.s were mutually influenced. For example, the amount of

PAGE 94

Table 2. Statistical moments of soil properties studied and Kolmogorov test.

Property  Mean          Variance      C.V. (%)  Skewness  Kurtosis  D:Normal     PROB>D
TH        30.2          365.8          63.3      1.44       2.92    0.12         <.01
VC        [illegible]   [illegible]   189.6      4.59      29.3     0.30         <.01
C          6.4           39.5          97.5      1.36       1.53    0.15         <.01
M         17.0          125.6          65.9      0.86       1.11    [illegible]  <.01
F         32.7          209.2          44.2      0.40      -0.10    0.07         <.01
VF        12.9           68.8          64.4      1.11       1.92    0.07         <.01
TS        70.0          354.8          26.9     -1.18       1.63    0.08         <.01
Silt      10.7           78.4          83.0      3.92      27.0     0.16         <.01
Clay      19.4          260.9          83.3      1.33       2.11    0.12         <.01
PH1        5.1            0.35         11.6     -0.67      13.5     0.12         <.01
PH2        4.2            0.28         12.5     -0.19       9.91    0.10         <.01
OC         0.43           0.52        167.8      3.41      14.1     0.28         <.01
Ca         0.94           5.51        250.3      6.15      49.1     0.34         <.01
Mg         0.36           0.81        253.2     10.8      154.3     0.35         <.01
Na         0.03           0.002       130.8      3.62      25.5     0.22         <.01
K          0.06           0.009       170.5      3.72      19.3     0.28         <.01
TB         1.38           8.76        213.7      6.05      49.1     0.32         <.01
EXT        5.61          30.8          98.9      2.79      12.8     0.16         <.01
CEC        7.01          49.1          99.9      2.83      11.0     0.18         <.01
BS        18.8          399.7         106.2      1.78       2.90    0.18         <.01

See Abbreviations, pp. xii-xiii.
TH is expressed in cm; VC, C, M, F, VF, TS, silt, clay, OC, and BS are expressed as %; Ca, Mg, Na, K, TB, EXT, and CEC are expressed as cmol/kg.
n = 991
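The C.V. column of Table 2 follows from the mean and variance columns as 100 times the standard deviation divided by the mean. The short sketch below recomputes two entries from the tabulated means and variances; the function name is hypothetical, and small differences for other rows are to be expected because the table reports rounded means and variances.

```python
import math

def coefficient_of_variation(mean, variance):
    """C.V. in percent: 100 * standard deviation / mean."""
    return 100.0 * math.sqrt(variance) / mean

# Mean and variance taken from the TH and TS rows of Table 2.
cv_th = coefficient_of_variation(30.2, 365.8)
cv_ts = coefficient_of_variation(70.0, 354.8)
```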

PAGE 95

extractable cations (Ca, Mg, Na, and K) depends largely on the CEC, which in turn depends on particle size. The large variation in particle size (very coarse, coarse, medium, fine, and very fine sand fractions; silt; and clay contents) was influenced by the diversity of Paleudults (Appendix A) and the presence of horizons with quite different textures. Paleudults had variable thicknesses of coarse-textured horizons (Typic, Arenic, and Grossarenic Subgroups) overlying fine-textured argillic horizons.

Most of the soil properties studied did not have skewness and/or kurtosis close to zero. The exception was fine sand. Also, the histogram and normal probability plot (Figure 6) suggested that fine sand values were normally distributed, but when the Kolmogorov test was performed, it indicated that fine sand had a large probability of being non-normal. The significance probability (PROB>D) of the Kolmogorov D statistic (D:Normal) was smaller than α = 0.15, so the null hypothesis was rejected for fine sand. Results of the Kolmogorov test indicated that the soil properties studied had a non-normal distribution.

Results of the Kolmogorov test were also supported by the histograms and normal probability plots. Histograms revealed that the distribution of values by soil property did not have the characteristic bell-shaped curve of a normal

PAGE 96

[Panels: HISTOGRAM; NORMAL PROBABILITY PLOT]
Figure 6. Histogram (a) and normal probability plot (b) of fine sand content.

PAGE 97

distribution. In addition, normal probability plots indicated a lack of correspondence between the observed and the theoretical distributions, for example for organic carbon content (Figure 7).

Transformations (logarithmic, arcsine, or square root) were not made on the original data because the objective was to accept or reject the normal distribution. In addition, interpretation of transformed data is complex.

These results could support the view that there were systematic patterns of soil properties; observations were not independent but were associated within a certain distance. Patterns of soil properties influenced the probability distribution. The presence of trends in soil properties associated with landscape position has been recognized. Walker et al. (1968) pointed out that such trends suggested that the analysis of soil data in terms of mean and standard deviation is questionable, since the assumption of random variation does not appear valid. In addition, Hole and Campbell (1985) indicated that if place-to-place variation occurred at random, without elements of organization and order, mapping efforts could proceed only with the greatest difficulty because information and experience gained at one location would have little predictive value at new locations. Under such circumstances each mapping problem would be unique because

PAGE 98

[Panels: HISTOGRAM; NORMAL PROBABILITY PLOT]
Figure 7. Histogram (a) and normal probability plot (b) of organic carbon content.

PAGE 99

of the lack of a consistent geographic order that can be transferred from previous experience to analogous settings.

Principal Component Analysis

Twenty soil properties were initially selected to study the spatial variability of soils using geostatistics. Geostatistical analysis is time consuming and complex. Moreover, not all soil properties have the same degree of importance in quantifying the spatial variability of soils. Therefore, a reduction in the number of soil properties was necessary for further analysis. PCA was used as an unbiased method to select the most important soil properties. Important soil properties were defined as those that explained a large proportion of the total variance.

Two sets of data were employed for this analysis. One set was composed of the weighted averages of selected soil properties in individual pedons. Horizon thickness was used as the weighting criterion. Information is lost when averages are used. Therefore, a second set of data, composed of selected soil properties from the surface A horizon, was also used.

Principal Component Analysis for Standardized Weighted Data

A basic assumption of PCA is that variables have homogeneous variances (Afifi and Clark, 1984; Webster, 1977). The soil properties studied had different scales of

PAGE 100

measurement (thickness was measured in cm; particle size, organic carbon content, and base saturation in %; and extractable cations, total bases, extractable acidity, and CEC in cmol/kg). Therefore, it is difficult to compare them. For this reason, all soil properties were standardized to mean zero and variance one.

One measure of the amount of information conveyed by each PC is its variance (eigenvalue). For this reason, the PCs are commonly arranged in order of decreasing variance (Table 3). The most informative PC is the first and the least informative is the last. The criterion for selecting PCs was stated in the Materials and Methods section. The first five PCs were selected for further analysis. Each of them explained more than 5% of the total variance (Table 3). The first five PCs together explained more than 73% of the total variance.

Different interpretative analyses were performed to select the soil properties that contributed the most to the total variance. Plots provided a very informative display of the relationships between soil properties and PCs (Figure 8). The most important soil properties were those with large values located closer to the axis of the PC. Some properties, such as coarse and medium sand fractions and Mg content, did not have a clear contribution to an individual PC (Figure 8). The axes of PCs were rotated

PAGE 101

Table 3. Proportion of total variance explained by each principal component.

Principal                  Proportion   Cumulative
Component    Eigenvalue    (%)          Proportion
 1           5.9119        29.56         29.56
 2           3.0450        15.23         44.79
 3           2.5310        12.65         57.44
 4           1.9153         9.57         67.01
 5           1.2385         6.19         73.20
 6           0.8040         4.02         77.22
 7           0.7824         3.91         81.13
 8           0.6933         3.47         84.60
 9           0.6382         3.19         87.79
10           0.5872         2.94         90.73
11           0.4871         2.44         93.17
12           0.4377         2.19         95.36
13           0.3458         1.73         97.09
14           0.2393         1.20         98.29
15           0.2020         1.01         99.30
16           0.1209         0.60         99.90
17           0.0168         0.08         99.98
18           0.0037         0.02        100.00
19           0.0002         0.00        100.00
20           0.0000         0.00        100.00

Proportion of the total variance.
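The proportions in Table 3 follow directly from the eigenvalues: with 20 standardized variables the total variance is 20, so each proportion is the eigenvalue divided by 20. The sketch below recomputes the proportions and applies a 5% cut-off, which is assumed here from the statement that each selected PC explained more than 5% of the total variance; the selection criterion itself is stated in the Materials and Methods section. Variable names are hypothetical.

```python
# Eigenvalues copied from Table 3 (standardized weighted data).
eigenvalues = [5.9119, 3.0450, 2.5310, 1.9153, 1.2385, 0.8040, 0.7824,
               0.6933, 0.6382, 0.5872, 0.4871, 0.4377, 0.3458, 0.2393,
               0.2020, 0.1209, 0.0168, 0.0037, 0.0002, 0.0000]

n_variables = len(eigenvalues)   # total variance of 20 standardized variables

# Percentage of the total variance explained by each PC.
proportions = [100.0 * ev / n_variables for ev in eigenvalues]

# ASSUMPTION: keep the PCs that explain more than 5% of the total variance.
selected = [i + 1 for i, p in enumerate(proportions) if p > 5.0]
cumulative_selected = sum(proportions[i - 1] for i in selected)
```

With these eigenvalues the rule keeps PCs 1 through 5, which together account for a little over 73% of the total variance.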

PAGE 102

[Figure 8. Plot with axes labeled PRINCIPAL COMPONENT; caption illegible.]

PAGE 103

toward clusters of those soil properties with no clear contribution to an individual PC. An orthogonal rotation (varimax rotation) was employed (Figure 9). For this specific example, varimax rotation showed that those soil properties with initially no clear contribution to an individual PC were closer to the axis of principal component 1 (PC1).

This analysis was complemented with a quantitative selection of eigenvectors (coefficients of the linear combination of soil variables). Eigenvectors were calculated for each PC (Table 4). The criterion for selecting important eigenvectors was also stated in the Materials and Methods section. Selected eigenvectors for PC1 had an absolute value larger than the selection criterion value (Sc) of 0.2056. Soil properties selected as important constituents of PC1, based on the Sc value, were medium and total sand contents; clay content; Ca, Mg, Na, and K contents; total bases; extractable acidity; and CEC. Eigenvectors selected for principal component 2 (PC2), principal component 3 (PC3), principal component 4 (PC4), and principal component 5 (PC5) had absolute values larger than 0.2865, 0.3143, 0.3613, and 0.4493, respectively.

Each PC is defined as a linear combination of the standardized variables, but collinearity among variables may be a problem. An analysis of collinearity was

PAGE 104

[Figure 9. Plot with axes labeled PRINCIPAL COMPONENT; caption illegible.]
PAGE 105

Table 4. Eigenvectors of correlation matrix for standardized weighted soil properties.

Soil                     Principal Component
Property      1         2         3         4         5
TH         [row illegible]
VC         [row illegible]
C          [row illegible]
M          [row illegible]
F          [row illegible]
VF         [row illegible]
TS         [row illegible]
Silt        0.1623   -0.1796   -0.0478    0.4111   -0.3917
Clay        0.3109   -0.0604   -0.1687   -0.3341   -0.1507
PH1        -0.1816    0.3867    0.1860   -0.0321   -0.0877
PH2        -0.1018    0.4156    0.1352    0.1275   -0.1812
OC          0.1027   -0.0814    0.0655    0.5700    0.1678
Ca          0.2654    0.3372    0.0558    0.0649   -0.0258
Mg          0.2552    0.2061   -0.0143   -0.0635    0.1237
Na          0.2590    0.0205    0.0103    0.0525    0.2423
K           0.2577    0.1255    0.0119    0.0861    0.1200
TB          0.3007    0.3353    0.0407    0.0343    0.0267
EXT         0.3131   -0.2098   -0.0584    0.0920    0.2795
CEC         0.3755   -0.0274   -0.0307    0.0981    0.2194
BS          0.0829    0.4214    0.0785    0.0198   -0.2434
Sc**        0.2056    0.2865    0.3143    0.3613    0.4493

See Abbreviations, pp. xii-xiii.
** Sc = 0.5 / (Principal Component eigenvalue)^(1/2)
All underlined values had an absolute value larger than its corresponding Sc. Underlined values were selected for further study.
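The Sc row of Table 4 is reproduced by dividing 0.5 by the square root of each eigenvalue in Table 3. The sketch below recomputes the five Sc values and applies the selection rule to a few PC1 coefficients copied from the legible rows of Table 4. Function and variable names are hypothetical, and only a subset of properties is shown.

```python
import math

def selection_criterion(eigenvalue):
    """Sc = 0.5 / sqrt(eigenvalue of the principal component)."""
    return 0.5 / math.sqrt(eigenvalue)

# Eigenvalues of the first five PCs (Table 3).
eigenvalues = [5.9119, 3.0450, 2.5310, 1.9153, 1.2385]
sc_values = [round(selection_criterion(ev), 4) for ev in eigenvalues]

# A few PC1 eigenvector coefficients from Table 4 (subset, for illustration).
pc1_coefficients = {"Silt": 0.1623, "Clay": 0.3109, "OC": 0.1027, "Ca": 0.2654}

# Keep the properties whose absolute coefficient exceeds Sc for PC1.
selected = [name for name, coef in pc1_coefficients.items()
            if abs(coef) > sc_values[0]]
```

Because Sc shrinks as the eigenvalue grows, the first PC admits more properties than the later, less informative PCs.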

PAGE 106

performed for those soil properties previously selected. Soil properties with a tolerance (T) < 0.01 were considered to be highly intercorrelated and were also excluded (Table 5). PC5 was not included in the analysis of collinearity because all eigenvectors had an absolute value smaller than Sc. According to this criterion, Ca, Mg, Na, and K contents; total bases; extractable acidity; and CEC were highly intercorrelated for PC1. A similar reduction of variables was applied to the other PCs.

A final reduction was made by calculating correlation coefficients between soil properties and PCs (Table 6). A large correlation coefficient (absolute value > 0.75) was initially selected as the criterion to reduce the number of soil properties even further. Based on the correlation coefficient, fine sand, total sand, clay, and organic carbon contents were selected. Other soil properties also had a large correlation coefficient, but they had previously been eliminated because of small eigenvectors or low tolerance.

In summary, fine sand, total sand, clay, and organic carbon contents were selected for further analysis. The selection was based on analyses of PC plots, plots of rotated PC axes, quantitative selection of the larger eigenvectors, collinearity tests, and computation of correlation coefficients between soil properties and PCs. The selected

PAGE 107

Table 5. Tolerance of standardized weighted soil properties by principal component.

                         Principal Component
     1                2                    3                    4
Property  T      Property  T          Property  T          Property  T
M         0.69   PH1       0.61       VC        0.45       TH        0.90
TS        0.11   PH2       0.54       C         0.22       Silt      0.73
Clay      0.14   Ca        0.08       M         0.36       OC        0.73
Ca        <.01   TB        0.08       F         0.78
Mg        <.01   BS        [illegible] VF       [illegible]
Na        <.01
K         <.01
TB        <.01
EXT       <.01
CEC       <.01

See Abbreviations, pp. xii-xiii.
T = 1 - R^2 (R = coefficient of multiple correlation)
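Tolerance is one minus the squared multiple correlation of a variable with the other variables in its group, so values near zero flag strong collinearity. One way to obtain it, sketched below, uses the identity that the tolerance of variable j equals the reciprocal of the j-th diagonal element of the inverse correlation matrix. This is an illustration with hypothetical correlation values and helper names, not the procedure used in the study.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def tolerances(correlation_matrix):
    """T_j = 1 - R_j^2 = 1 / (j-th diagonal element of the inverse)."""
    inverse = invert(correlation_matrix)
    return [1.0 / inverse[j][j] for j in range(len(inverse))]

# Hypothetical correlations: variables 1 and 2 correlate at 0.6,
# variable 3 is uncorrelated with both.
corr = [[1.0, 0.6, 0.0],
        [0.6, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
t_values = tolerances(corr)
```

Here the two correlated variables each have T = 1 - 0.6^2 = 0.64 and the independent variable has T = 1; a property with T < 0.01 would be almost fully predictable from the others in its group.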

PAGE 108

Table 6. Correlation coefficients between standardized weighted soil properties and principal components.

Soil                     Principal Component
Property      1         2         3         4         5
TH          0.0686    0.0173   -0.2157   -0.6008   -0.1473
VC         -0.2538    0.0998   -0.5878    0.3856   -0.2911
C          -0.4513    0.3069   -0.7374    0.2173   -0.0599
M          -0.5821    0.3464   -0.5143    0.1246    0.3183
F          -0.4119    0.0274    0.7694   -0.0624    0.3212
VF          0.0294   -0.3422    0.6391    0.2145   -0.4315
TS         -0.8357    0.2183    0.2606    0.1648    0.3058
Silt        0.3946   -0.3134   -0.0757    0.5689   -0.4359
Clay        0.7559   -0.1054   -0.2684   -0.4624   -0.1677
PH1        -0.0443    0.6748    0.2959   -0.0444   -0.0976
PH2         0.2475    0.7252    0.2151    0.1765   -0.2017
OC          0.2497   -0.1420    0.1042    0.7889    0.1867
Ca          0.6453    0.5884    0.0888    0.0898   -0.0287
Mg          0.6205    0.3596   -0.0227   -0.0879    0.1377
Na          0.6297    0.0358    0.0164    0.0727    0.2697
K           0.6266    0.2188    0.0191    0.1192    0.1335
TB          0.7311    0.5851    0.0647    0.0475    0.0297
EXT         0.7613   -0.3661   -0.0929    0.1275    0.3109
CEC         0.9130   -0.0478   -0.0488    0.1358    0.2442
BS          0.2016    0.7353    0.1249    0.0274   -0.2708

See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.
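For standardized variables, the correlation between a soil property and a PC equals the eigenvector coefficient multiplied by the square root of that PC's eigenvalue. The sketch below applies this relation to four PC1 coefficients from Table 4 and the PC1 eigenvalue from Table 3, then applies the |r| > 0.75 rule. Names are hypothetical, and only a subset of properties is shown.

```python
import math

def loading(coefficient, eigenvalue):
    """Correlation between a standardized variable and a PC."""
    return coefficient * math.sqrt(eigenvalue)

pc1_eigenvalue = 5.9119                       # Table 3
pc1_coefficients = {"Silt": 0.1623, "Clay": 0.3109,
                    "EXT": 0.3131, "CEC": 0.3755}   # Table 4, PC1 column

pc1_correlations = {name: round(loading(coef, pc1_eigenvalue), 4)
                    for name, coef in pc1_coefficients.items()}

# Properties passing the correlation criterion used in the text.
important = [name for name, r in pc1_correlations.items() if abs(r) > 0.75]
```

The recomputed values match the PC1 column of Table 6 for these four properties, which is a useful consistency check between Tables 3, 4, and 6.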

PAGE 109

soil properties were those that explained most of the variance of the total set of data.

Principal Component Analysis for A Horizon Standardized Data

The first five PCs explained approximately 74% of the total variance for the A horizon (Table 7). Analyses of plots similar to those described earlier for the standardized weighted average values were used. Eigenvectors with absolute values larger than the Sc values of 0.2126, 0.2553, 0.3070, 0.3857, and 0.4605 for PC1, PC2, PC3, PC4, and PC5, respectively, were selected (Table 8).

Analysis of collinearity showed that only total sand had a T value < 0.01 (Table 9); therefore, A horizon total sand was eliminated from further analysis. Silt and clay also had low T values, indicating some correlation among those properties.

After computing the correlation coefficients between soil properties and PCs (Table 10), clay content and CEC were selected. They had a correlation coefficient with an absolute value larger than 0.75. While organic carbon content in the A horizon is an important property, it was not selected by the PCA. Therefore, it may be concluded that organic carbon content was not as important as clay and CEC in explaining the total variance.

Two kinds of A horizons were present (Ap and A1). The Ap horizon is influenced by management conditions and the

PAGE 110

Table 7. Proportion of total variance explained by each principal component for standardized A horizon data.

Principal                  Proportion   Cumulative
Component    Eigenvalue    (%)          Proportion
 1           5.5308        27.65         27.65
 2           3.8348        19.17         46.82
 3           2.6524        13.26         60.08
 4           1.6807         8.40         68.48
 5           1.0856         5.43         73.91
 6           0.9913         4.96         78.87
 7           0.9273         4.64         83.51
 8           0.7856         3.93         87.44
 9           0.6752         3.08         90.52
10           0.3676         1.84         92.36
11           0.3144         1.57         93.93
12           0.2913         1.46         95.39
13           0.2078         1.04         96.43
14           0.1913         0.96         97.39
15           0.1796         0.90         98.29
16           0.1335         0.72         99.01
17           0.0783         0.62         99.63
18           0.0653         0.33         99.96
19           0.0073         0.04        100.00
20           0.0000         0.00        100.00

Proportion of the total variance.

PAGE 111

Table 8. Eigenvectors of correlation matrix for standardized properties of A horizon.

Soil                     Principal Component
Property      1         2         3         4         5
TH         -0.0509    0.0935   -0.0047   -0.2287    0.0146
VC         -0.1167    0.2425   -0.3346    0.0244   -0.1879
C          -0.1566    0.2945   -0.3872    0.0516   -0.0386
M          -0.2206    0.2533   -0.2541    0.1803    0.2296
F          -0.1573   -0.2129    0.4110    0.2365    0.0394
VF          0.0461   -0.2672    0.2912   -0.1316   -0.5020
TS         -0.3654    0.0104    0.1294    0.3020   -0.0999
Silt        0.3093   -0.1332   -0.1973   -0.2671   -0.0016
Clay        0.3280    0.1182   -0.0222   -0.2658    0.1782
PH1        [row illegible]
PH2        -0.0958    0.3251    0.2977   -0.1084    0.1614
OC          0.2990   -0.1663   -0.0832    0.0868    0.1278
Ca          0.2397    0.2909    0.2653   -0.0275    0.1489
Mg          0.2157    0.2739    0.2156   -0.1050    0.0385
Na         -0.0251   -0.1989    0.0893    0.2417    0.6454
K           0.2722    0.0417    0.0078    0.4142    0.0353
TB          0.1644    0.2427    0.0529    0.4085   -0.3198
EXT         0.2051    0.3312    0.1259    0.0017   -0.0302
CEC         0.3238   -0.1661   -0.1199    0.2168    0.0189
BS          0.2971    0.0946   -0.0441    0.3565   -0.1662
Sc**        0.2126    0.2553    0.3070    0.3857    0.4605

See Abbreviations, pp. xii-xiii.
** Sc = 0.5 / (Principal Component eigenvalue)^(1/2)
All underlined values had an absolute value larger than its corresponding Sc. Underlined values were selected for further study.

PAGE 112

Table 9. Tolerance of standardized properties of A horizon by principal component.

                                  Principal Component
     1                   2                3                4                5
Property  T           Property  T      Property  T      Property  T      Property  T
M         [illegible] C         0.61   VC        0.27   K         0.64   VF        0.99
TS        <.01        VF        0.67   C         0.24   TB        0.64   Na        0.99
Silt      0.09        PH1       0.38   F         0.73
Clay      0.03        PH2       0.36   PH1       0.96
OC        0.23        Ca        0.23
Ca        0.28        Mg        0.39
Mg        0.35        EXT       0.36
K         0.56
CEC       0.14
BS        0.32

See Abbreviations, pp. xii-xiii.
T = 1 - R^2 (R = coefficient of multiple correlation).

PAGE 113

Table 10. Correlation coefficients between standardized properties of A horizon and principal components.

Soil                     Principal Component
Property      1         2         3         4         5
TH         -0.1197    0.1831   -0.0077   -0.2964    0.0152
VC         -0.2744    0.4748   -0.5450    0.0316   -0.1957
C          -0.3683    0.5768   -0.6306    0.0669   -0.0403
M          -0.5189    0.4961   -0.4138    0.2337    0.2393
F          -0.3699   -0.4169    0.6693    0.3066    0.0411
VF          0.1084   -0.5233    0.4743   -0.1706   -0.5230
TS         -0.8593    0.0203    0.2107    0.3915   -0.1042
Silt        0.7274   -0.2609   -0.3213   -0.3462   -0.0017
Clay        0.7714    0.2315   -0.0362   -0.3446    0.1857
PH1        -0.2025    0.5988    0.5358   -0.0751   -0.0010
PH2        -0.2254    0.6367    0.4849   -0.1406    0.1681
OC          0.7031   -0.3256   -0.1355    0.1126    0.1332
Ca          0.5639    0.5696    0.4321    0.0357    0.1551
Mg          0.5073    0.5364    0.3511   -0.1362    0.0402
Na         -0.0589   -0.3895    0.1454    0.3134    0.6724
K           0.6403    0.0817    0.0127    0.5371    0.0368
TB          0.3865    0.4753    0.0861    0.5296   -0.3332
EXT         0.4824    0.6487    0.2051    0.0023   -0.0315
CEC         0.7615    0.3253   -0.1954    0.2811    0.0197

See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

PAGE 114

A1 horizon is found in relatively natural conditions. Thus, the PCA was employed separately on these two classes of A horizons. All steps previously described were followed in the PCA for these two groups of A horizons.

The final selection of soil properties by correlation coefficients (Tables 11 and 12) revealed that organic carbon content and extractable acidity were two important properties of the A1 horizon for PC1 (these soil properties represented approximately 39% of the total variance). The PCA revealed the importance of organic carbon content and of the naturally acidic conditions reflected by the extractable acidity values.

Base saturation was the most important property of the Ap horizon for PC1 (Table 12). Base saturation represented approximately 24% of the total variance. PCA revealed, therefore, the influence of management conditions (liming) on the Ap horizons. The PC2 for the Ap horizon also indicated that organic carbon content was an important property. For this reason organic carbon content was also selected. Other soil properties were not selected because they had previously been excluded by the PCA. Organic carbon and clay contents were selected as important soil properties for both sets of data, weighted average and A horizon values.

PAGE 115

Table 11. Correlation coefficients between standardized properties of A1 horizon and principal components.

Soil                     Principal Component
Property      1         2         3         4         5
TH         [row illegible]
VC         [row illegible]
C          [row illegible]
M          [row illegible]
F          [row illegible]
VF          0.1727   -0.3194   -0.6310    0.2714   -0.5134
TS         -0.9064   -0.0054   -0.1940   -0.0038    0.1558
Silt        0.7851   -0.2490    0.1689   -0.0236   -0.3054
Clay        0.8310    0.2766    0.1819    0.0906    0.0348
PH1        -0.2737    0.6579   -0.2146    0.2636    0.0405
PH2        -0.3744    0.6891   -0.0656    0.3259    0.1152
OC          0.7648   -0.3696    0.0709   -0.0410    0.1346
Ca          0.6051    0.7386   -0.0351    0.0318   -0.0022
Mg          0.6332    0.5895   -0.0881   -0.1155    0.0607
Na          0.7245   -0.0781    0.0625   -0.0307    0.1225
K           0.8072    0.3281    0.0261   -0.1239    0.1096
TB          0.6458    0.7187   -0.0411    0.0035    0.0138
EXT         0.7792   -0.4500    0.1489    0.1139    0.1982
CEC         0.8902   -0.2024    0.1232    0.1042    0.1845
BS          0.0278    0.7952   -0.1907    0.0247   -0.2019

See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

PAGE 116

Table 12. Correlation coefficients between standardized properties of Ap horizon and principal components.

Soil                     Principal Component
Property      1         2         3         4         5
TH          0.0867   -0.2165   -0.1594   -0.0704    0.8748
VC          0.2069   -0.6088   -0.1908    0.3882   -0.1432
C           0.2684   -0.6478   -0.3617    0.5061   -0.0446
M           0.0718   -0.6448   -0.2155    0.5817    0.0229
F          [row illegible]
VF         -0.2499    0.4575    0.3277   -0.5642   -0.1930
TS         -0.5737    0.0372    0.6245    0.4727    0.0997
Silt        0.2532    0.0964   -0.6603   -0.4205   -0.1907
Clay        0.6416    0.0231   -0.4217   -0.3842    0.0219
PH1         0.4305   -0.1677    0.6459    0.1684   -0.1371
PH2         0.5942   -0.1764    0.5754    0.0675   -0.1712
OC          0.2859    0.7581   -0.2388    0.3986   -0.0544
Ca          0.8269    0.3231    0.2853    0.1766    0.1710
Mg          0.8090    0.1242    0.2352    0.0532   -0.0411
Na         -0.1085    0.4697   -0.0495    0.5145   -0.2272
K           0.6178    0.2221   -0.1573   -0.1801    0.0943
TB          0.8668    0.3063    0.2655    0.1490    0.1304
EXT         0.0919    0.7915   -0.3943    0.3386    0.0087
CEC         0.2515    0.8271   -0.2509    0.3608    0.0581

See Abbreviations, pp. xii-xiii.
All underlined values had an absolute value > 0.75.

PAGE 117

Principal Component Analysis by Soil Series

This analysis was included to determine how this technique can be used to select important soil properties by soil series, to evaluate the variability of similar soils, and to evaluate the correct placement of pedons within the soil classification system.

Theoretically, each PC may explain approximately the same proportion of the total variance for similar soils. To evaluate this assumption, soil series with the largest number of observations were selected and analyzed. These were the Albany, Dothan, and Orangeburg series. Results of this analysis are presented in Figures 10, 11, and 12. The proportion of the total variance explained by the first PC varied widely. The proportion varied between 35.7% and 71.5% for the Albany series, from 30.3% to 66.3% for the Dothan series, and from 39.3% to 82.8% for the Orangeburg series. There was a wide average difference (38%) between the minimum and maximum values for the three soils.

The degree of importance of the soil properties varied from one county to another for the same soil series. For example, total sand content was an important property to explain the total variation of the Albany series in Jackson and Leon Counties but was not important in Santa Rosa County. Similar examples can be observed with other soil properties between different counties in each soil series selected.
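The 38% average difference quoted above can be checked from the three ranges given in the text. The short sketch below does only that arithmetic; the variable names are hypothetical.

```python
# Minimum and maximum proportion (%) of total variance explained by the
# first PC across counties, as quoted in the text.
first_pc_range = {
    "Albany": (35.7, 71.5),
    "Dothan": (30.3, 66.3),
    "Orangeburg": (39.3, 82.8),
}

# Difference between maximum and minimum for each series.
differences = {series: high - low
               for series, (low, high) in first_pc_range.items()}

average_difference = sum(differences.values()) / len(differences)
```

The individual differences are about 35.8, 36.0, and 43.5 percentage points, so the average is about 38.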

PAGE 118

[Figure 10. Caption illegible.]

PAGE 119

[Figure 11. Caption illegible.]

PAGE 120

[Figure 12. Caption illegible.]

PAGE 121

A large variation existed in the proportion of the total variance explained by the first PC between counties in each soil series. In addition, the degree of importance of each soil property varied between counties in each soil series. For these reasons, the three soil series were plotted in the plane of the first two PCs (Figure 13) to visualize the relationship between pedons in each soil series. A large degree of dispersion was observed in each soil series. A clear grouping of pedons by individual soil series did not exist. Thus, a nested analysis of variance was used to create a clearer understanding of the variation among pedons within each series.

Soil series, pedons within each series, and horizons within each pedon were considered as sources of soil variation (Table 13). Theoretically, a larger variation may occur between soil series (e.g., between Albany and Dothan) and between horizons within pedons belonging to the same series (e.g., between A and B horizons in the Albany series).

A large part of the total variation was explained by differences between pedons belonging to the same series. More than 30% of the variability in all sand fractions (except total sand), silt, pH-water, K content, and CEC was explained by the differences between pedons within the same soil series.

PAGE 122

[Figure 13. Plot with axes labeled PRINCIPAL COMPONENT; caption illegible.]

PAGE 123

Table 13. Variability of studied soil properties within and between soil series and between horizons.

Soil                    Source of Variation
Property    Soil Series   Pedon**   Horizon   Error
                              %
TH          [row illegible]
VC          [row illegible]
C           [row illegible]
M           [row illegible]
F           [row illegible]
VF          [row illegible]
TS          51.8           0.0      34.0      14.2
Silt         7.0          34.6      21.5      36.9
Clay        36.7           0.0      43.1      20.2
PH1          7.2          36.9       4.5      51.4
PH2          0.0          26.0       9.2      64.8
OC           1.4           0.0      95.0       3.6
Ca          12.7           0.0      77.5       9.7
Mg          26.7           0.0      61.5      11.8
Na           7.0           0.0      88.8       4.2
K            0.0          30.2      25.6      44.2
TB           5.7          23.9      27.9      42.4
EXT          0.0           0.0      81.0      19.0
CEC          0.0          37.3      48.5      14.2
BS           9.8          22.4      23.4      44.3

See Abbreviations, pp. xii-xiii.
** Pedon within soil series.
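The percentages in Table 13 are the kind of output a nested analysis of variance produces: the total variance is split into components attributed to each level of the hierarchy. The sketch below shows the idea for a simplified, balanced design with only two nested levels plus a residual (series, pedons within series, and observations within pedons). It is an illustration under stated assumptions, not the analysis run in the study: the study used one more nested level, the data here are hypothetical, and setting negative component estimates to zero is a common convention that is assumed rather than taken from the source.

```python
def nested_variance_percentages(data):
    """Variance components (%) for a balanced nested design.

    data[i][j][k] is observation k in pedon j of series i.
    Returns (% between series, % between pedons within series, % residual).
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    pedon_means = [[sum(pedon) / n for pedon in series] for series in data]
    series_means = [sum(means) / b for means in pedon_means]
    grand_mean = sum(series_means) / a

    # Mean squares for each level of the hierarchy.
    ms_series = b * n * sum((m - grand_mean) ** 2
                            for m in series_means) / (a - 1)
    ms_pedon = n * sum((pm - sm) ** 2
                       for means, sm in zip(pedon_means, series_means)
                       for pm in means) / (a * (b - 1))
    ms_error = sum((y - pm) ** 2
                   for series, means in zip(data, pedon_means)
                   for pedon, pm in zip(series, means)
                   for y in pedon) / (a * b * (n - 1))

    # ASSUMPTION: negative component estimates are set to zero.
    var_error = ms_error
    var_pedon = max(0.0, (ms_pedon - ms_error) / n)
    var_series = max(0.0, (ms_series - ms_pedon) / (b * n))

    total = var_series + var_pedon + var_error
    if total == 0.0:
        return 0.0, 0.0, 0.0
    return tuple(100.0 * v / total for v in (var_series, var_pedon, var_error))

# Hypothetical data: all variation lies between the two series.
between_series = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
# Hypothetical data: all variation lies between pedons within a series.
between_pedons = [[[1.0, 1.0], [3.0, 3.0]], [[1.0, 1.0], [3.0, 3.0]]]

pct_series_case = nested_variance_percentages(between_series)
pct_pedon_case = nested_variance_percentages(between_pedons)
```

In the first toy case the whole variance is assigned to the series level, and in the second to the pedon level, which mirrors how rows of Table 13 with a 0.0 entry concentrate their variability in the remaining sources.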

A large part of the variability in total sand and clay contents was explained by the differences between soil series. More than 40% of the variability in clay, organic carbon, Ca, Mg, and Na contents; extractable acidity; and CEC was explained by the difference between horizons. Some soil properties (horizon thickness, very coarse sand and silt contents, pH-KCl, K content, total bases, and base saturation) had a large unexplained variability (error). Total sand and clay contents fulfilled the initial hypothesis which stated that a large part of the variability was explained by differences among soil series. Organic carbon content also fulfilled the initial hypothesis that a large part of the variability was explained by differences among soil horizons within similar soil series. These results validated the conclusions of the PCA, for both standardized weighted data and standardized A horizon data. Total sand, clay, and organic carbon contents were selected by the PCA as soil properties which were important in explaining the total variance. Fine sand content and CEC were also selected by PCA, but according to the nested analysis of variance, a large part of their variability was explained by the differences among pedons within soil series. Therefore, fine sand and CEC were not included in the geostatistical analysis.

In addition, these results also validated the use of a large correlation coefficient (|0.75|) because this coefficient allowed the selection of those variables with large variability between soil series and horizons. Both PCA and nested analysis of variance were very useful in selecting important soil properties (total sand, clay, and organic carbon contents) for further analysis. PCA reduced the large number of soil properties selected initially. The nested analysis of variance demonstrated that most of the soil properties selected by the PCA were important as differentiating properties between soil series and/or horizons. Likewise, the selected soil properties are important to determine specific soil potentials (e.g., fertility and irrigation). Thus, the variability of the selected soil properties affects the accuracy of the predictions for these specific performances. For a final validation, soils were plotted in the plane of the first two PCs, considering only the important selected soil properties (Figure 14). In this plot, a slightly better grouping of soils by series was observed compared to Figure 13. An important conclusion from these analyses is that because of the multivariate character of soils, the selection of variables must be based on some quantitative method. Otherwise the biased selection of variables can introduce a large source of error in the results.
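The selection rule mentioned above (keep properties whose correlation with a principal component is at least |0.75|) can be illustrated with a minimal sketch. It assumes that the "correlation coefficient" is the loading of a standardized variable on a PC, computed from the correlation matrix; the function name `select_by_loading` and its parameters are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def select_by_loading(data, names, n_components=2, threshold=0.75):
    """Return the variables whose absolute loading on any retained PC
    reaches the threshold (assumed reading of the |0.75| criterion)."""
    data = np.asarray(data, dtype=float)
    # PCA on standardized data is an eigen-decomposition of the correlation matrix.
    corr = np.corrcoef(data, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)      # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]           # re-sort to descending order
    eigval, eigvec = eigval[order], eigvec[:, order]
    # Loading = correlation between each variable and each retained PC.
    loadings = eigvec[:, :n_components] * np.sqrt(eigval[:n_components])
    keep = np.abs(loadings).max(axis=1) >= threshold
    explained = eigval / eigval.sum()          # proportion of variance per PC
    selected = [name for name, flag in zip(names, keep) if flag]
    return selected, loadings, explained
```

The sign of a loading is arbitrary, which is why the absolute value is compared with the threshold.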

[Figure 14. Soils plotted in the plane of the first two principal components, considering only the selected soil properties.]

Conversely, the use of the complete set of data would add more complexity to the analysis. A large number of soil properties had a large proportion of the variability either explained by differences among pedons belonging to the same soil series and/or unexplained soil variability. It is believed that the possible causes of the variability are:

(i) Soil properties relevant to define series, such as morphological properties, were not considered in the analyses. Variability of total sand, clay, and organic carbon contents was successfully explained by differences between soil series and/or horizons because these soil properties are related to morphological properties of a given horizon. Total sand content is related to the coarse-textured surface horizon, clay content is related to the argillic horizon, and organic carbon content is related to the surface A horizon.

(ii) Sampling errors by assuming an erroneous concept of soil variability. Sampling errors are introduced if soil scientists assume that the sampling unit is completely uniform when it is not so. It seems very difficult to have a completely uniform sampling unit. Variability has been recognized at all scales. Soil variability has been widely recognized at macroscopic scale (Beckett and Webster, 1971; Beckett and

Bie, 1976). In addition, variability can be recognized microscopically and submicroscopically (Wilding and Drees, 1978, 1983). If soil variability is considered as the sum of variability at all scales, then it is very difficult to have uniform soils. The reality is that "uniform" soils are those in which the internal variability ("within" variability) is lower than the variability compared to the surrounding soils ("between" variability).

(iii) A large source of variation was introduced because of lack of emphasis by soil scientists on soil and landscape relationships. Descriptions of the geomorphic environment are very ambiguous for some soil series. For example, the geographic setting of Orangeburg series is described as follows: Orangeburg soils are on nearly level to strongly sloping uplands of the Coastal Plain. Slopes range from 0 to 20% (National Cooperative Soil Survey, 1982). The geomorphic environment was described as gently sloping uplands with 4% gradient for an individual pedon of Orangeburg series (Carlisle et al., 1985; p. 192). Many soil investigations in the U.S. have involved geomorphic surfaces. Ruhe (1969) defined a geomorphic surface as a portion of the landscape specifically defined in space and time. The surface is a mappable unit that has no size limit and may include a number of landforms and landscapes. According to this concept only time for a

geomorphic surface is uniform; other geomorphic features related to space (i.e., physiography) can have large variations. In addition, a low degree of accuracy in the pedon location descriptions was evident while locating the selected pedons on topographic maps. More emphasis has been placed on the descriptive aspect of the soil series than on the geographical aspect. Importance of the geographical aspect of soil was pointed out by Bie (1984). He indicated that the accurate location of pedons by X and Y coordinates would be a great contribution to soil science.

(iv) Possible errors in soil correlation. The large degree of pedon dispersion within individual soil series may be the result of incorrect placement of individual pedons into the soil classification system. Soil correlation was beyond the scope of this investigation, but PCA may be a useful quantitative method to indicate problems in soil correlation.

Geostatistics

The variability of soil properties is a limiting factor for reliable soil interpretations and for making accurate predictions of soil performance at any particular location on the landscape.

A large number of studies have been made to quantify soil variability, but they have not taken into consideration the geographic character of soil variability. Conversely, geostatistical analysis is based on the geographic location of the individual observations. Therefore, geostatistical techniques can offer a solution to some of the unsolved problems of spatial variability of soils. The 151 pedons studied were located by a system of X and Y coordinates (Appendix C). Pedons were irregularly distributed in an approximately 380 x 100 km grid (Figure 15). Geostatistical analysis can be used for horizontal and vertical directions, but using both these directions adds more complexity to the analysis. Therefore, the data were selected to represent the variability of soils in the horizontal plane. Two sets of data were analyzed. One set was composed of weighted average values of total sand, clay, and organic carbon contents. The other set of data included clay and organic carbon contents from the A horizon.

Semi-Variograms

The first step in the geostatistical analysis was to calculate the semi-variance. The number of pedons provided sufficient pairs of observations for reliable estimates of semi-variograms. The total number of pairs was calculated

from the combinatorial equation:

$$\text{No. of combinations} = \frac{n!}{r!\,(n-r)!} \qquad (48)$$

where n = total number of pedons and r = number of pedons taken at one time. When r = 2, equation (48) reduces to

$$\text{No. of pairs} = \frac{n(n-1)}{2} \qquad (49)$$

According to equation (49), 151 pedons provided 11,325 pairs. Direction-independent and direction-dependent semi-variances were calculated for each soil property studied. Semi-variances for direction-independent and E-W, NE-SW, NW-SE, and N-S directions were supported by 11,325; 8,298; 924; 1,450; and 653 pairs of observations, respectively. A reliable semi-variogram is obtained when intervals are chosen such that the number of pairs is large enough to ensure accurate definition of each point on the semi-variogram. A rule of thumb is to use intervals such that the minimum number of pairs of observations in each interval is about 50 (Skrivan and Karlinger, 1979). Likewise, the maximum lag distance to provide reliable semi-variograms is half of the total length (Journel and Huijbregts, 1978). The total length was approximately 380 km in the E-W direction and 100 km in the N-S direction. A

lag distance of 10 km was selected to calculate the semi-variance up to 190 km in the E-W direction and 50 km in the N-S direction. Stationarity is an important assumption to consider in geostatistics. The criterion used to determine the validity of this assumption was explained by Journel and Huijbregts (1978). They indicated that when the semi-variance increase is larger than $|h|^2$ (|h| = modulus of the lag distance) for large distances h, the increase is incompatible with the intrinsic hypothesis. Such an increase in the semi-variance often indicates the presence of drift. Statisticians established the constraint of stationarity because each sample was considered unique when geostatistics was developed. Consequently, statistical inference about the population could not be made. Geostatistics was developed in the mining industry. The sampling procedure in mining is quite different from sampling procedures applied in soils. Sampling ore deposits involves large volumes of individual samples, long sampling time, and high costs. It is very difficult to take sample replications in mining. Sampling soils is a completely different situation. Most soil samples are taken within 2 m from the soil surface. In addition, soil samples can be taken at

distances varying from a few cm to several km apart at relatively low cost. Soil stationarity was assumed before geostatistics could be used in soil science. Soil scientists have assumed stationarity when they take replications to increase the precision of the results. Stationarity of soils is assumed when the placement of soils in the classification system is tested. Stationarity of soils has also been implicitly assumed when a map unit is delineated by a soil survey. Observations are not unique in soil science. It is possible to take relatively homogeneous replications of soil samples. Stationarity is not as serious a problem in soils as it is in mining. Therefore, the criterion to determine soil stationarity needs to be defined. Stationarity is important within the area in which a large degree of similarity and dependence in soil property values exists. The similarity and dependence of soil property values are large within a map unit. The degree of dependence decreases when soil properties are measured in different map units up to the point in which soil property values are no longer related. The within-unit (WU) variability and the between-unit (BU) variability are important in order to know the degree of uniformity of map units, of individual pedons, or of soil properties. The variability WU is expected to be

smaller than the variability BU. However, where different levels of management have been applied, WU variability may exceed the BU variability (Beckett and Webster, 1971; McCormack and Wilding, 1969). In general, the WU variability gives us the degree of uniformity or variability of a map unit or individual soil properties. The WU variance could then be used as a criterion to establish stationarity for those soil properties less affected by management (e.g., total sand and clay contents). When twice the value of semi-variance (G) is larger than the WU variance (2G = variance) in the area in which the soil properties are supposed to be related, then stationarity is absent. Data were grouped by soil series. The WU variability was represented by the WU variance of the soil series. Total sand and clay contents had a WU variance of 33.8 and 14.3, respectively. The first semi-variogram for total sand content (Figure 16) had a G value that increased from 198.4 at 5 km distance to 249.1 at 15 km distance. The increase in distance represented an increase of 2 in the modulus of the lag distance. The semi-variogram for clay content had a G value that increased from 153.5 at 5 km distance to 180.1 at 15 km distance (Figure 17). Therefore, semi-variograms of weighted average total sand and clay contents had drift. If the information contained in the semi-variogram is to be used for making optimal unbiased

[Figure 16. Semi-variogram for weighted average total sand content.]

[Figure 17. Semi-variogram for weighted average clay content.]

estimates (kriging) of the selected soil properties at unsampled locations, the drift must be removed. An iterative procedure, explained in the Materials and Methods section, was used to remove the drift. The observed drift in total sand and clay content semi-variograms was reduced, but it was not completely removed. A reason for this may be the presence of a short-range variability in the soil properties. Pedons with large differences in soil properties were located at short distances. Different pedons were compared when semi-variograms were calculated with lag increments of 10 km. These results are supported by previous works. Burrough (1983a) pointed out that it seems impossible to achieve stationarity. Olea (1975) could not eliminate the drift in the data; he then used universal kriging to produce maps that indicated trends in data variability. Total sand and clay content semi-variograms were characterized by the presence of structure. Semi-variogram structure occurs when there is an increase of the semi-variance to a maximum value (Figures 18, 19, 20, and 21). These semi-variograms had characteristic nugget variances (intercept), ranges, and sills (Table 14). Theoretically, the semi-variogram should pass through the origin when the distance h = 0 (h is the lag distance). However, total sand and clay contents had non-zero semi-variances as h decreased to zero. This is called nugget

[Figures 18 to 21. Semi-variograms of weighted average total sand and clay contents.]

Table 14. Important semi-variogram parameters of the weighted average of selected soil properties.

Semi-variogram            Range (km)    Sill     Nugget variance    %*

Total sand content
  Direction-independent      34.7      324.0        173.6          53.6
  East-West                  24.8      295.8        160.1          54.1
  Northeast-Southwest        34.7      292.4        182.7          62.5
  Northwest-Southeast        25.4      287.3        220.8          76.9
  North-South                30.0      352.5        120.5          34.2

Clay content
  Direction-independent      34.7      230.2        135.9          59.0
  East-West                  34.6      211.3        123.2          58.3
  Northeast-Southwest        34.7      206.5        143.3          69.4
  Northwest-Southeast        15.6      210.3        153.2          72.9
  North-South                34.7      299.5         99.7          33.3

Organic carbon content
  Direction-independent     <10.0      0.120        0.120         100.0
  East-West                 <10.0      0.119        0.119         100.0
  Northeast-Southwest       <10.0      0.069        0.069         100.0
  Northwest-Southeast       <10.0      0.052        0.052         100.0
  North-South               <10.0      0.201        0.201         100.0

* % of the sill represented by the nugget variance.

variance or nugget effect (Journel and Huijbregts, 1978). Nugget variance represents the unexplained variance, often caused by measurement error or variability of the soil properties that could not be identified at the scale employed. The intercept, which is the estimate of G at h = 0, provided an indication of the variation at a distance shorter than 10 km. The range of the semi-variogram is the distance at which G attains the maximum value (sill). The range can be interpreted as the diameter of the zone of influence, which represents the average maximum distance over which observations are related; within it, observations are dependent. At a distance larger than the range, observations are no longer related; they are independent. At distances less than the range, measured properties (e.g., total sand and clay contents) of two samples become more alike with decreasing distance between them. Thus, the range provides an estimate of the areas of similarity. The range also represents the average minimum distance at which maximum variation occurs. The maximum semi-variance value is called the sill. The sill is equal to the sum of the nugget variance and the spatial covariance (C0 + C). Often, the sill is approximately equal to the sample variance (Journel and Huijbregts, 1978).
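The pair count of equation (49) and the direction-independent semi-variance, from which the nugget, range, and sill are read, can be sketched as follows. This is a minimal illustration and not the program used in this study: the function names are hypothetical, lag classes are assumed to be consecutive intervals of equal width (so a 10 km class is plotted at its 5 km midpoint), and direction-dependent classes are not handled.

```python
import math
import numpy as np

def n_pairs(n):
    """Number of distinct pairs among n observations, n(n - 1)/2."""
    return math.comb(n, 2)

def empirical_semivariogram(coords, values, lag_width, max_lag):
    """Direction-independent semi-variance per lag class.

    Returns the class midpoints, the semi-variance G for each class
    (NaN where a class holds no pairs), and the number of pairs per class.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n_bins = int(math.ceil(max_lag / lag_width))
    i, j = np.triu_indices(len(values), k=1)          # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    bins = np.floor(dist / lag_width).astype(int)
    inside = bins < n_bins                            # drop pairs beyond max_lag
    sums = np.zeros(n_bins)
    counts = np.zeros(n_bins, dtype=int)
    np.add.at(sums, bins[inside], sq_diff[inside])
    np.add.at(counts, bins[inside], 1)
    gamma = np.full(n_bins, np.nan)
    filled = counts > 0
    gamma[filled] = sums[filled] / (2.0 * counts[filled])
    midpoints = (np.arange(n_bins) + 0.5) * lag_width
    return midpoints, gamma, counts

print(n_pairs(151))  # 11325, the number of pairs reported for 151 pedons
```

The returned pair counts make it possible to check the rule of thumb of about 50 pairs per lag class before interpreting any point of the semi-variogram.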

Total sand and clay content semi-variograms were anisotropic, indicating that the variability of selected soil properties changed with direction (Table 14). The longest range was approximately 35 km for both total sand and clay contents. The longest range was for direction-independent and NE-SW semi-variograms for total sand content; and for direction-independent, NE-SW, and N-S semi-variograms for clay content. The largest variation (sill) occurred in the N-S direction for both total sand and clay contents. The largest proportion of the unexplained variation occurred in the NW-SE direction for both total sand and clay contents. Differences between direction-dependent semi-variograms for the soil properties selected by the PCA could be the result of differences in geology and topography. Organic carbon content is a soil property influenced by management. The WU variance was smaller than the BU variance (0.0 and 0.07, respectively). Organic carbon content had very low WU and BU variances because the largest variation occurred between horizons (Table 13). Stationarity in organic carbon values was present when the WU variance was used to determine stationarity; but stationarity was absent if the semi-variance increment compared to the lag distance increment was used. This situation resulted because semi-variance organic carbon values were very small (generally less than one),

therefore, any increment in lag distance always resulted in the absence of drift. The absence of drift, when lag distance was used as the criterion, was a problem related to the measurement scale. This was supported by the fact that organic carbon had the largest C.V. (Table 2) among the three soil properties used for the geostatistical analysis. Consequently, the WU variance was used as the criterion to determine stationarity. The semi-variogram for organic carbon content did not have any structure (Figures 22 and 23); there was no increase to a maximum value. A pure nugget effect was observed, indicating a short-range variability in organic carbon content. Organic carbon content had a large point-to-point variation at short distances of separation and an absence of spatial correlation at the scale used. The range of the organic carbon content semi-variogram was a distance smaller than 10 km. Direction-dependent semi-variograms (Figure 23) showed an anisotropic variation in organic carbon content. The largest variation occurred in the N-S direction (Table 14). The anisotropic variation of organic carbon indicated that the factors which influence the organic carbon content (e.g., vegetation, moisture, drainage, relief, management) are different in different directions, with the largest variability in the N-S direction.
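The pure nugget effect described above corresponds to a nugget variance equal to the sill, that is, 100% in the last column of Table 14. A minimal sketch of that ratio follows; the function names and the tolerance are illustrative assumptions.

```python
def nugget_percent(nugget, sill):
    """Percentage of the sill represented by the nugget variance."""
    if sill <= 0:
        raise ValueError("sill must be positive")
    return 100.0 * (nugget / sill)

def is_pure_nugget(nugget, sill, tol=1e-9):
    """True when the semi-variogram shows no structure (nugget equals sill)."""
    return abs(sill - nugget) <= tol * max(abs(sill), 1.0)

# Direction-independent values from Table 14.
print(round(nugget_percent(173.6, 324.0), 1))  # total sand: 53.6
print(is_pure_nugget(0.120, 0.120))            # organic carbon: True
```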

[Figure 22. Direction-independent semi-variogram of weighted average organic carbon content.]

[Figure 23. Direction-dependent semi-variograms of weighted average organic carbon content.]

Semi-variograms of A horizon soil properties (clay and organic carbon contents) also indicated presence of drift. The WU variances were used as the criterion to determine stationarity. WU variances were 10.0 and 0.0 for the A horizon clay and organic carbon contents, respectively. A large part of the drift was removed for semi-variograms of the A horizon soil properties by using the residuals, but it was not completely removed. A reason for this can be related to the presence of different pedons within short distances. Semi-variograms of the A horizon clay content indicated presence of structure (Figures 24 and 25). The maximum variance (sill) was reached within distances varying from 20 to 35 km (Table 15). The maximum variation occurred in the NW-SE direction. Variation of the A horizon clay content was smaller than the variation of the weighted average clay content. This was due to the fact that weighted average data included horizons contrasting in clay content, such as A and B horizons. The direction of maximum variation of the A horizon clay content corresponded to the direction in which the weighted average clay content had the largest nugget variance. Therefore, it is probable that the large variation in the A horizon clay content in the NW-SE direction was one of the causes of the large unexplained

[Figures 24 and 25. Semi-variograms of A horizon clay content.]

Table 15. Important semi-variogram parameters of the A horizon selected properties.

Semi-variogram            Range (km)    Sill     Nugget variance    %*

Clay content
  Direction-independent      34.8       75.8         24.4          32.2
  East-West                  45.0       96.9         25.8          26.6
  Northeast-Southwest        25.0       55.6         28.6          51.4
  Northwest-Southeast        35.1      118.0          7.5           6.4
  North-South                20.1       53.0         22.3          42.1

Organic carbon content
  Direction-independent     <10.0      1.048        1.048         100.0
  East-West                 <10.0      1.045        1.045         100.0
  Northeast-Southwest       <10.0      1.014        1.014         100.0
  Northwest-Southeast       <10.0      1.089        1.089         100.0
  North-South               <10.0      0.955        0.955         100.0

* % of the sill represented by the nugget variance.

variation in the same direction for the weighted average clay content. The semi-variogram of A horizon organic carbon indicated, as did the semi-variogram of weighted average organic carbon content, a pure nugget effect (Figures 26 and 27). A reason for this is that the A horizon organic carbon content had a large point-to-point variation at short distances. Variation of the A horizon organic carbon content was larger than that for the weighted average organic carbon content. This could be due to the fact that some of the A horizons were affected by management conditions (Ap) and other A horizons were under relatively natural conditions (A1). This result seems to be in contradiction with results obtained with the nested analysis of variance (Table 13), but two aspects need to be considered. First, the nested analysis of variance did not consider the pedon location. Second, the nested analysis of variance included surface and subsurface A horizons. Therefore, a masking of the differences between surface A horizons (Ap and A1) could have occurred when the nested analysis of variance was used. All observed semi-variograms had a characteristic wave pattern, indicating a cyclic variation in the studied soil properties.

[Figures 26 and 27. Semi-variograms of A horizon organic carbon content.]

Fitting Semi-Variograms

The process of fitting the observed semi-variogram to a theoretical model is another important step in the geostatistical analysis. It is important to choose the appropriate model for estimating the semi-variogram because each model yields quite different values for the nugget variance and range, both of which are critical parameters for kriging. The process of fitting observed semi-variograms to theoretical models was time consuming. Therefore, only the direction-independent semi-variograms and the direction-dependent semi-variograms with the largest variation were selected. Olea (1984) stated that there is no single solution to curve fitting. The user must decide what part of the semi-variogram should be fitted and what part should be regarded as anomalous. Points located within distances varying from 0 to 50 km were selected because the range was included and there was a large semi-variogram reliability within these distances. The choice of the model was governed by the general graphic appearance of the observed semi-variogram. The curve fitting procedure described in the Materials and Methods section was used. The objective of the fitting procedure was to adjust the parameters in the semi-variogram until the model was theoretically consistent. Consistency is reached when the kriged average error (KAE)

is approximately zero and the kriged reduced mean square error (KRMSE) is approximately equal to one. KAE and KRMSE are defined by equations (47) and (48). KAE gives the average of the difference between the observed and the theoretical (estimated) values; KRMSE represents the ratio between the theoretical and the calculated variance (sill). Models selected had KAE and KRMSE values very close to zero and one, respectively. Kriged mean square errors (KMSE) were computed according to the following equation:

$$\mathrm{KMSE} = \left[\frac{1}{n}\sum (Z_j - Z_i)^2\right]^{1/2} \qquad (52)$$

where n = number of points, Z_j = measured value, and Z_i = kriged value. KMSE gives an idea of the dispersion of the measured values with respect to the kriged values. Because of these values (Table 16), direction-independent semi-variograms for weighted average values of total sand and clay contents were fitted by the DeWijsian (logarithmic) model (Figures 18 and 20). Total sand and clay content N-S semi-variograms were fitted by the Spherical model (Figures 35 and 36). Organic carbon content direction-independent and N-S semi-variograms were fitted by the Linear model (Figures 22 and 37).
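The three goodness-of-fit values can be sketched as below. KMSE follows equation (52). Equations (47) and (48) are not reproduced in this section, so the forms used here for KAE (mean of measured minus kriged values) and KRMSE (mean squared error scaled by the kriging variance) are the commonly used definitions and are an assumption; the function and argument names are hypothetical.

```python
import numpy as np

def goodness_of_fit(measured, kriged, kriging_std):
    """Return (KAE, KMSE, KRMSE) for a set of cross-validated points.

    measured    : observed values at the sampled points
    kriged      : values estimated by kriging at the same points
    kriging_std : kriging standard errors at the same points
    """
    measured = np.asarray(measured, dtype=float)
    kriged = np.asarray(kriged, dtype=float)
    kriging_std = np.asarray(kriging_std, dtype=float)
    error = measured - kriged
    kae = error.mean()                           # assumed form of equation (47)
    kmse = np.sqrt(np.mean(error ** 2))          # equation (52)
    krmse = np.mean((error / kriging_std) ** 2)  # assumed form of equation (48)
    return kae, kmse, krmse
```

Under these definitions a consistent model gives KAE near zero and KRMSE near one, which is the criterion stated in the text.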

Table 16. Goodness-of-fit values of the weighted average of selected soil properties.

Semi-variogram            Model        KAE*       KMSE+      KRMSE**

Total sand content
  Direction-independent   DeWijsian   -0.0589    16.8347     1.0670
  North-South             Spherical   -0.0699    16.7292     1.3826

Clay content
  Direction-independent   DeWijsian    0.0666    13.7246     1.0001
  North-South             Spherical    0.0363    13.6306     1.1479

Organic carbon content
  Direction-independent   Linear      -0.0001     0.3754     1.0535
  North-South             Linear       0.0004     0.3778     0.8321

*  KAE = Kriged Average Error
+  KMSE = Kriged Mean Square Error
** KRMSE = Kriged Reduced Mean Square Error

The KAE and KRMSE values for the A horizon soil properties (Table 17) indicated that the direction-independent and the NW-SE semi-variograms for clay content were fitted by the Spherical and Root models, respectively (Figures 24 and 38). The A horizon direction-independent and the NW-SE semi-variograms for organic carbon content were both fitted by the Linear model (Figures 26 and 39).

Kriging

One of the prime reasons for obtaining a semi-variogram is to use it for estimation. Soil survey recognizes two main kinds of estimates (Webster, 1985). One is the average value of a soil property within some defined region. The other is the prediction of values of a property at unsampled places (interpolation). The information derived from the fitted semi-variograms was used to generate contour maps of kriged values of soil properties (interpolated values). Contour maps were produced by using universal kriging because trends were present in the data. Kriging is a technique of making optimal, unbiased estimates of regionalized variables at unsampled locations using the information contained in the semi-variogram (range, nugget effect, theoretical model). Kriging is optimal because it reduces the estimation variance and is unbiased because KAE is zero.
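How a fitted semi-variogram (nugget, sill, range, model) turns into an estimate and an estimation variance at an unsampled location can be sketched as follows. The sketch uses ordinary kriging with a spherical model, which is simpler than the universal kriging applied in this study because it ignores the trend; all function names and the parameter values in the usage lines are illustrative assumptions.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical semi-variogram: 0 at h = 0, nugget + structure up to the range, sill beyond."""
    h = np.asarray(h, dtype=float)
    ratio = h / rng
    rising = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    gamma = np.where(h < rng, rising, sill)
    return np.where(h == 0.0, 0.0, gamma)

def ordinary_kriging(coords, values, target, model):
    """Return (estimate, estimation variance, weights) at one target location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Semi-variances between all pairs of sampled points, bordered by the
    # row and column of ones that force the weights to sum to one.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = model(dist)
    lhs[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = model(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    variance = weights @ rhs[:n] + lagrange
    return estimate, variance, weights

# Hypothetical parameters (nugget 20, sill 100, range 35 km) on a 10 km square.
model = lambda h: spherical(h, 20.0, 100.0, 35.0)
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
obs = [10.0, 20.0, 30.0, 40.0]
print(ordinary_kriging(pts, obs, (5.0, 5.0), model)[:2])
```

Because the model is defined as zero at h = 0, the estimate at a sampled point reproduces the observed value there with zero estimation variance, while the nugget still raises the variance everywhere else.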

Table 17. Goodness-of-fit values for the A horizon selected properties.

Semi-variogram            Model        KAE*       KMSE+      KRMSE**

Clay content
  Direction-independent   Spherical    0.1071     6.3258     1.0175
  Northwest-Southeast     Root         0.2497     8.1069     1.4189

Organic carbon content
  Direction-independent   Linear       0.0000     1.0212     0.9883
  Northwest-Southeast     Linear       0.0003     1.0336     1.0017

*  KAE = Kriged Average Error
+  KMSE = Kriged Mean Square Error
** KRMSE = Kriged Reduced Mean Square Error

Contour maps were generated for total sand and clay (weighted average and A horizon) contents, derived from the direction-independent semi-variograms and from the direction-dependent semi-variograms with the largest variability (Figures 28, 29, and 30). Contour maps were better interpreted when used with diagrams. Organic carbon content (weighted average and A horizon values) was not used for contouring maps because of the large nugget variance that produced large variance estimates influencing the reliability of the map. Areas with discontinuous contour lines (i.e., lower left hand side and upper right hand side of the Figures used) corresponded to zones with no sampled pedons (Figure 15). Contour maps were influenced by the nugget and the minimum variance criterion (within-unit variance). The latter ensured that the interpolated value at a sampling point was the observed value there. The presence of a nugget variance indicated that the semi-variogram was composed of two functions (except organic carbon), one describing the spatial dependence, and the other a purely random variation that influenced the boundary between delineations. Clay content weighted average had a nugget variance larger in proportion to the sill than that for total sand content (Table 14). This situation could indicate that the map of kriged values of clay content weighted average encompassed more variable units than the map of kriged

[Figures 28 to 30. Contour maps and diagrams of kriged values of total sand content (weighted average), clay content (weighted average), and A horizon clay content.]

values of total sand content. In general, there was a very good correspondence between the two maps. This was because both soil properties were important components of the same PC; in addition, the silt content was very low. Another aspect is that PCA not only allowed the selection of important soil properties for separating different soil series, but also the soil properties selected were spatially related. An important objective was to find a physical meaning of the contour maps and diagrams generated by the geostatistical analysis. For this reason contour maps and diagrams were compared with the map of physiographic regions of Florida (Figure 43). All contour maps have X and Y axes represented by the values of the geographic coordinates, which were very useful in locating the physiographic regions. Pedons studied were located in five physiographic regions: Southern Pine Hills, located between the 0 and 29 X coordinates and the 8 and 22 Y coordinates; Dougherty Karst, located between the 29 and 49 X coordinates and the 8 and 22 Y coordinates; Apalachicola Delta, located between the 24 and 58 X coordinates and the 0 and 12 Y coordinates; Tifton Uplands, located between the 48 and 58 X coordinates and the 11 and 15 Y coordinates; and Ocala Uplift, located between the 58 and 80 X coordinates and the 0 and 15 Y coordinates.
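The coordinate limits listed above can be kept as a small lookup when comparing kriged values with physiographic regions. This is a sketch only: the limits are treated as rectangular boxes, which overlap in places, so a location may fall in more than one box; the names `REGIONS` and `regions_at` are hypothetical.

```python
# (x_min, x_max), (y_min, y_max) in the map coordinates used for the contour maps.
REGIONS = {
    "Southern Pine Hills": ((0, 29), (8, 22)),
    "Dougherty Karst": ((29, 49), (8, 22)),
    "Apalachicola Delta": ((24, 58), (0, 12)),
    "Tifton Uplands": ((48, 58), (11, 15)),
    "Ocala Uplift": ((58, 80), (0, 15)),
}

def regions_at(x, y):
    """Return every region whose coordinate box contains the point (x, y)."""
    return [
        name
        for name, ((x_min, x_max), (y_min, y_max)) in REGIONS.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    ]

print(regions_at(10, 15))  # ['Southern Pine Hills']
print(regions_at(40, 10))  # ['Dougherty Karst', 'Apalachicola Delta']
```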

In general, maximum total sand content corresponded to minimum clay content (Figures 28 and 29). These soil properties were naturally related because of the presence of argillic horizons. Total sand content had a large depression between the 47 and 54 X coordinates, which corresponded to an area among the Apalachicola Delta, the Dougherty Karst, and the Tifton Uplands physiographic regions. The total sand content diagram indicated predominance of high values across the Panhandle, but there was a break in the continuity of the high values because of the presence of the Apalachicola Delta. Diagrams of total sand and clay contents (Figures 28 and 29) also indicated a large number of small depressions between the 22 and 45 X coordinates and the 58 and 70 X coordinates; these areas corresponded to the Dougherty Karst and the Ocala Uplift, respectively. The presence of depressions seems to indicate a large variability in total sand and clay contents. The A horizon clay content (Figure 30) also had large variability in the Dougherty Karst physiographic region. The variability in soil properties in the Dougherty Karst is apparently related to the large variability in the karst topography. The variability in the Ocala Uplift can be the result of local differences in geology and topography, including karst. Diagrams of clay content (Figures 29 and 30) indicated that clay content tended to decrease from the north to the south.

PAGE 167

151

Contour maps derived from direction-dependent semi-variograms with maximum variation (Figures 40, 41, and 42) did not differ from the maps derived using direction-independent semi-variograms. This indicated that the geostatistical program was not capable of generating contour maps derived from direction-dependent semi-variograms. A reason for this is that the kriging subroutine of the geostatistical program did not require the direction of the semi-variogram as input data. Therefore, improvement of the geostatistical program to take direction-dependent semi-variograms into account is recommended.

Although the program was not capable of generating contour maps derived from direction-dependent semi-variograms, one question that needs to be answered is: Are there significant differences among the direction-dependent variances? If there are no significant differences among direction-dependent variances, there should be no differences between contour maps derived from direction-independent and direction-dependent semi-variograms for individual soil properties.

An important advantage of kriging was that this interpolation technique provided an estimate of the estimation variance for each estimated value. These estimates can be displayed in the form of standard error (reliability) maps or diagrams. Reliability estimates indicate the precision of the kriged values and alternately

PAGE 168

152

could indicate where more samples would provide more information. Reliability diagrams (Figures 31, 32 and 33) were produced based on the standard errors of the kriged values. The standard errors varied from 14.95 to 21.11 for total sand and from 13.10 to 16.33 for clay content weighted averages, and from 4.54 to 6.05 for A horizon clay content. The standard error is a function of the nugget variance. The larger the nugget variance compared to the sill value, the larger the standard error compared to the kriged value. Clay content weighted average had the largest standard error and A horizon clay content had the smallest.

Reliability diagrams indicated areas of large and small standard error. Generally, areas with the largest standard errors corresponded to areas with no sampled pedons (Figure 15). All three reliability diagrams coincided in indicating the eastern part of the Dougherty Karst physiographic region as the area with the largest standard error. These reliability diagrams can be very useful in the design of new sampling strategies. Areas with large standard error require an increase in sampling intensity to increase the precision of the estimates.

Geostatistical techniques were useful in evaluating the spatial variability of soils and in indicating zones where more intensive sampling is required. Geostatistical techniques require more investigations in order to better

PAGE 169

153

PAGE 170

154


PAGE 171

155
PAGE 172

156

define the assumption of stationarity and anisotropy. Anisotropy indicated that the variability of soil properties was direction-dependent. But an important question to be answered is: Is there a significant difference among the direction-dependent variances?

Fractals

Soil variation can also be a function of the scale of observation. The components of the variance measure the amount of variance contributed by each scale, and by accumulating them it may be possible to show how variance increases with increasing distance.

The ranges of the semi-variograms studied varied from 15 to 35 km. The random variation almost always corresponded to a large proportion of the total variance within these range distances (Tables 14 and 15). Therefore, the objective of this section was to evaluate quantitatively how semi-variograms can be used to indicate the scale dependence of soil variability in the study area.

All semi-variograms had a nugget component, which represented the random variability. Organic carbon content (weighted average and A horizon values) had a pure nugget effect, indicating a short-range variation. The other soil properties studied had long-range variation, with nugget variances that varied from approximately 6% to about 73% of the total variance (Tables 14 and 15).
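The semi-variogram quantities discussed above (semivariance per lag class, from which nugget, sill, and range are read) can be illustrated with a minimal sketch. It is not the geostatistical program used in this study. The function name, the lag-class rule (class k holds pairs separated by k to k+1 lag widths), and the four-point transect are assumptions made only for illustration.

```python
import math

def empirical_semivariogram(points, lag_km, n_lags):
    """Return (mean distance, semivariance, pair count) for each non-empty lag class.

    points: list of (x_km, y_km, value) tuples.
    Lag class k collects pairs separated by [k*lag_km, (k+1)*lag_km).
    """
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    pairs = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_km)
            if k >= n_lags:
                continue  # pair is farther apart than the last lag class
            sq_diff[k] += (zi - zj) ** 2
            dist_sum[k] += d
            pairs[k] += 1
    result = []
    for k in range(n_lags):
        if pairs[k] > 0:
            gamma = sq_diff[k] / (2 * pairs[k])  # half the mean squared difference
            result.append((dist_sum[k] / pairs[k], gamma, pairs[k]))
    return result

# Hypothetical transect: values rise steadily, so semivariance grows with distance.
transect = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (3.0, 0.0, 3.0)]
classes = empirical_semivariogram(transect, lag_km=1.0, n_lags=4)
```

A property with a pure nugget effect would instead give roughly the same semivariance in every lag class.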

PAGE 173

157

The Hausdorff-Besicovitch dimension, or fractal dimension (D), was calculated according to equation (43) to determine whether the scale of study was appropriate to resolve the random component in the variability of the soil properties studied. Theoretically, the D value ranges from 1 to 2. A value of 1 indicates a systematic variation of soil properties and an appropriate scale of study. A value of 2 indicates a random variation and the need to increase the scale of the study.

If soil variability is scale-dependent, a decrease in the D value is expected when the distance between sampling points is decreased. For this reason, fractal dimensions were computed from the semi-variograms studied using lag distances of 10 and 5 km (Table 18).

Organic carbon content (weighted average and A horizon values) had a D value of 2 in all directions and for both lag distances. This result was supported by the semi-variogram analysis, which indicated a pure nugget effect. In addition, organic carbon content also had the largest C.V. (Table 2). These results indicated the short-range variation in organic carbon content. The variation in organic carbon content was random at the scale used; thus, an increase in the scale of study is recommended to explain the variability of this soil property.

Weighted average total sand content had D values that varied from 1.80 to 1.92 (Table 18) for the 10 km lag

PAGE 174

158

Table 18. Fractal dimension (D value) derived from selected soil property semi-variograms.

                         Lag                   Soil property (D value)
Semi-variogram           distance (km)   TS     CL            OC     A-CL   A-OC

Direction-independent    10              1.87   1.90          2.00   1.77   2.00
                          5              1.84   1.72          2.00   1.74   2.00
E-W                      10              1.87   1.90          2.00   1.76   2.00
                          5              1.86   1.77          2.00   1.85   2.00
NE-SW                    10              1.90   1.93          2.00   1.86   2.00
                          5              1.88   1.73          2.00   1.80   2.00
NW-SE                    10              1.92   [illegible]   2.00   1.63   2.00
                          5              1.92   1.80          2.00   1.66   2.00
N-S                      10              1.80   1.82          2.00   1.84   2.00
                          5              1.66   1.51          2.00   1.77   2.00

See Abbreviations, pp. xii-xiii.

PAGE 175

159

distance. The high D values indicated that it is necessary to increase the scale of the study to explain the variability better and to reduce the random component in the variability of the total sand content.

The D values of total sand content calculated for a lag distance of 5 km were always smaller than or equal to the D values calculated for the 10 km lag distance. Therefore, it can be concluded that the variability in total sand content is scale-dependent. The decrease in the D value was a function of the proportion of the nugget variance (random variability) present. There was no decrease for the NW-SE direction, which had the largest proportion of nugget variance (Table 14). This may indicate a complex variability in the total sand content in the NW-SE direction because of a complex variability in geology and topography in this direction. It is necessary to increase the scale of the study not only to reduce the random variability but also to find a physical meaning for the variability.

Fractal analysis of weighted average clay content (Table 18) gave results similar to those of the analysis of weighted average total sand content.

The D values calculated for the A horizon clay content were almost always smaller than the D values calculated for the other soil properties studied. This can be related to the fact that the A horizon clay content had smaller

PAGE 176

160

proportions of nugget variance (Table 15) than those for the other soil properties studied (Tables 14 and 15).

The results of the fractal analysis did not contradict the results obtained with the total sand and clay (weighted average and A horizon) content semi-variograms. Semi-variograms indicated the presence of a systematic and a random variability within the range. The D values suggested that the random component of the variance was large because of the small scale of the study. The long distance between observations could contribute to the large random variation. When the distance between pedons is long, local variations in parent materials and/or topography may increase the complexity of the variability of the soil properties studied. Therefore, an increase in scale and a smaller distance between sampling locations (pedons) are necessary to reduce the random variability.

Although the D values for the 5 km lag distance were smaller than those for the 10 km lag distance, the semi-variograms for the 5 km lag distance were only reliable for short distances (1/3 to 1/2 of the total length). Therefore, a smaller area with greater pedon density, shorter distances between pedons, and a more uniform physiography was selected (Figure 34). The reduced area was located in the Ocala Uplift physiographic region.

In general, the D values were reduced (Table 19) for weighted average total sand and clay contents if compared

PAGE 177

161

PAGE 178

162

Table 19. Fractal dimension (D value) derived from selected soil property semi-variograms for a reduced study area.

                         Lag                   Soil property (D value)
Semi-variogram           distance (km)   TS     CL     OC     A-CL   A-OC

Direction-independent    10              1.80   1.60   2.00   1.74   2.00
                          5              1.76   1.68   1.97   1.72   2.00
E-W                      10              1.87   1.86   2.00   1.75   2.00
                          5              1.95   1.66   2.00   1.72   1.96
NE-SW                    10              1.66   1.70   1.90   1.86   2.00
                          5              1.22   1.33   1.90   1.86   2.00
NW-SE                    10              1.96   1.91   2.00   1.62   2.00
                          5              2.00   1.98   2.00   1.78   2.00
N-S                      10              1.76   1.85   1.59   1.74   2.00
                          5              1.68   1.79   1.59   1.40   2.00

See Abbreviations, pp. xii-xiii.

PAGE 179

to the D values for the entire studied area (Table 18). Some of the D values for organic carbon content decreased. The D values for the A horizon clay content remained approximately the same or increased. These results indicate that when the scale of study is increased, soil properties with a large proportion of nugget variance (e.g., total sand) have a larger decrease in the random variation than soil properties with a small proportion of nugget variance (e.g., the A horizon clay content).

Some D values for weighted average total sand and clay contents and A horizon clay content increased for the 5 km lag distance, indicating a complex variation within this distance. Thus, if random variation is to be reduced, the distance between sampling locations has to be smaller than 5 km.

Large D values (larger than 1.5) have been found to be common in soils (Burrough, 1983b). For future planning of small scale studies in the area, it is necessary to consider that even a 5 km distance between sampling locations gives large D values, which indicate a large proportion of random variability.

The geostatistical analysis allowed the separation of the systematic and the random components of the variance. The fractal analysis indicated that it is necessary to increase the scale of study if the random component is to be reduced. Then, the variability of the soil properties

PAGE 180

164 studied could be explained better as a result of their specific location with respect to their parent material, topography, vegetation, and climate.
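The link between a semi-variogram and its D value can be illustrated with a short sketch. Equation (43) is not reproduced in this section, so the relation used here, D = 2 - m/2 with m the slope of log semivariance against log lag, is an assumption taken from the usual treatment of soil transects (Burrough, 1983b). The function name and the sample values are hypothetical.

```python
import math

def fractal_dimension(lags_km, semivariances):
    """Estimate D from the least-squares slope m of log(semivariance) on log(lag).

    Assumed relation: D = 2 - m / 2, so a flat (pure nugget) semi-variogram
    gives D = 2 and a steeply rising one gives D closer to 1.
    """
    xs = [math.log(h) for h in lags_km]
    ys = [math.log(g) for g in semivariances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return 2.0 - slope / 2.0

lags = [5.0, 10.0, 15.0, 20.0]
pure_nugget = [4.0, 4.0, 4.0, 4.0]      # hypothetical flat semi-variogram
structured = [h ** 0.4 for h in lags]   # hypothetical semi-variogram with log-log slope 0.4

d_nugget = fractal_dimension(lags, pure_nugget)
d_structured = fractal_dimension(lags, structured)
```

Under this assumed relation, the flat case reproduces the D value of 2 reported for organic carbon, and a log-log slope of 0.4 gives a D of about 1.8, within the range reported for total sand.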

PAGE 181

SUMMARY AND CONCLUSIONS

One hundred fifty-one pedons were selected to determine the important soil properties affecting the spatial variability of soils in northwest Florida. Each pedon was located by a system of geographic coordinates (X and Y).

Twenty soil properties (horizon thickness; very coarse, coarse, medium, fine, and very fine sand fractions; total sand, silt, and clay contents; pH-water; pH-KCl; organic carbon content; Ca, Mg, Na, and K contents extractable in NH4OAc; total bases; extractable acidity; cation exchange capacity; and base saturation) were initially selected for this study.

Principal component analysis (PCA) and geostatistics were used in addition to other statistical analyses.

All properties were non-normally distributed, based on the Kolmogorov test. This result could be influenced by systematic patterns of soil properties. Observations are not independent. For example, the clay content in the argillic horizon is not independent of the clay content in the eluvial horizon. Argillic horizons develop because clay is translocated from the upper horizons and deposited in the lower horizons. Argillic horizons are

165

PAGE 182

166

developed under specific conditions. The process is not random. Values of soil properties were not independent but associated, and this fact could influence the probability distribution.

Individual soil properties have different degrees of importance in influencing the spatial variability of soils. In addition, geostatistical analysis is time consuming and complex. Therefore, PCA was used as an unbiased method to reduce the number of soil properties initially selected for study with geostatistics.

Two sets of data were used. One set was composed of weighted averages of soil properties of individual pedons. Horizon thickness was used as the weighting criterion. Information is lost when averages are used; therefore, a second set of data composed of soil properties from the surface A horizon was used.

All soil properties were standardized to mean zero and variance one. Each principal component (PC) selected explained at least 5% of the total variance of each set of data. Selection of soil properties was based on plots of soil properties in the plane of the first two PCs, orthogonal rotation of the PC axes, quantitative selection of large eigenvectors, analysis of collinearity, and the correlation coefficients between soil properties and PCs.
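The data preparation described above can be sketched in a few lines. This is an illustration, not the procedure actually coded for the study. The function names, the horizon data, and the eigenvalues are hypothetical, and the retention rule assumes the eigenvalues come from standardized data, so that their sum equals the total variance.

```python
def thickness_weighted_mean(horizons):
    """Average a property over a pedon, weighting each horizon by its thickness.

    horizons: list of (thickness_cm, value) tuples.
    """
    total_thickness = sum(t for t, _ in horizons)
    return sum(t * v for t, v in horizons) / total_thickness

def standardize(values):
    """Rescale values to mean zero and (sample) variance one."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = variance ** 0.5
    return [(v - mean) / sd for v in values]

def retained_components(eigenvalues, min_share=0.05):
    """Indices of PCs whose share of the total variance is at least min_share."""
    total = sum(eigenvalues)
    return [i for i, ev in enumerate(eigenvalues) if ev / total >= min_share]

# Hypothetical pedon: 20 cm A horizon with 5% clay over an 80 cm horizon with 30% clay.
pedon_clay = thickness_weighted_mean([(20.0, 5.0), (80.0, 30.0)])
z_scores = standardize([1.0, 2.0, 3.0])
kept = retained_components([6.0, 3.0, 0.6, 0.4])
```

The weighted mean of 25% clay in this made-up pedon hides the contrast between the two horizons, which is the loss of information that motivated the separate A horizon data set.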

PAGE 183

167

Weighted average total sand, clay, and organic carbon contents and A horizon clay and organic carbon contents were selected by the PCA for the geostatistical analysis.

Results of the PCA were supported by a nested analysis of variance. Soil properties selected by the PCA were important in explaining the variability within and between soil series and between horizons, as shown by the nested analysis of variance.

PCA and the nested analysis of variance proved to be useful statistical techniques to select important soil properties to study soil variability. The nested analysis of variance not only validated the results of the PCA but also indicated that the selected soil properties were differentiating properties. Therefore, both analyses can be used together for a quantitative determination of differentiating soil properties. PCA can also be useful to determine the correct placement of pedons in the soil classification system.

Selected soil properties were employed to study soil variability using geostatistical analysis. The geostatistical analysis had four parts: semi-variogram calculation, fitting of semi-variograms, kriging, and use of fractals.

Direction-independent and direction-dependent (E-W, NE-SW, NW-SE, and N-S) semi-variograms were calculated for each selected soil property on a 380 x 100 km irregular grid.
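A direction-dependent semi-variogram uses only pairs of pedons whose separation falls in a given direction class. The sketch below shows one way such a class could be assigned. It assumes X increases eastward and Y northward, and a tolerance of 22.5 degrees on either side of each direction; neither the tolerance nor the function name comes from the geostatistical program used in this study.

```python
import math

# Axial directions in degrees counter-clockwise from east (the X axis).
DIRECTIONS = {"E-W": 0.0, "NE-SW": 45.0, "N-S": 90.0, "NW-SE": 135.0}

def direction_class(dx_km, dy_km, tolerance_deg=22.5):
    """Return the direction class of a separation vector, or None if outside all classes.

    dx_km, dy_km: eastward and northward separation between two pedons.
    Directions are axial, so a vector and its opposite fall in the same class.
    """
    angle = math.degrees(math.atan2(dy_km, dx_km)) % 180.0
    for name, centre in DIRECTIONS.items():
        diff = abs(angle - centre)
        diff = min(diff, 180.0 - diff)  # wrap around so 179 degrees counts as near E-W
        if diff <= tolerance_deg:
            return name
    return None
```

For example, `direction_class(5.0, 5.0)` returns `"NE-SW"` and `direction_class(5.0, -5.0)` returns `"NW-SE"`. Pairs assigned to one class would then be passed to the semi-variogram calculation for that direction.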

PAGE 184

168

Within-series variance was used as the criterion to assess stationarity of soil property values.

The first calculated semi-variograms (direction-dependent and direction-independent) indicated the presence of drift in the soil property values. Drift was reduced by using residuals, but was not completely removed. A reason for this may be the presence of a short-range or a cyclic variation in soil properties.

Weighted average total sand and clay contents and A horizon clay content were characterized by the presence of structure. A nugget variance was also present. The semi-variogram range varied from 15 to 35 km.

Variability of soil properties was direction-dependent. Weighted average values had the largest variability in the N-S direction. The A horizon clay content had the largest variability in the NW-SE direction. Differences between direction-dependent semi-variograms could be the result of differences in geology and topography.

Weighted average and A horizon organic carbon contents had pure nugget effects, indicating that organic carbon contents had a large point-to-point variation at short distances.

All observed semi-variograms had a wave pattern that indicated the presence of cyclic variations in the studied soil properties.

PAGE 185

169

Observed semi-variograms (direction-independent and direction-dependent with the largest sill) were fitted to theoretical models. Spherical, De Wijsian, linear, and root models were selected.

The information derived from the fitted semi-variograms was used to produce contour maps and diagrams of kriged soil properties. Contour maps were generated using universal kriging because of the presence of drift in the data.

Contour maps for the direction with the largest variation did not differ from those derived from direction-independent semi-variograms. A reason for this is that the geostatistical program does not take into account the direction of semi-variograms.

Contour maps of weighted average clay and sand contents were similar. This similarity was due to the low silt content and to the fact that these two variables were members of the same principal component. Therefore, the use of principal components as variables in the geostatistical analysis can generate individual contour maps that would represent all soil properties included in the principal component. More investigations in the use of principal components as individual variables for the geostatistical analysis are recommended. It is also recommended that such results be compared with those obtained using cokriging. The use of principal components as individual

PAGE 186

170

variables may have the advantage that a single map can represent the variability of a group of soil properties which have two characteristics. First, they are important in explaining the total soil variability. Second, they are differentiating properties.

Diagrams of kriged standard errors were also produced. They indicated that soil properties with large nugget variance had large standard errors. Likewise, kriged standard error diagrams can be very useful in the design of new sampling strategies. These diagrams identified areas that require an increase in sampling intensity to improve the precision of the estimates. The diagrams indicated an area located in the northeastern part of the Dougherty Karst as the one with the largest standard errors. A possible reason for this may be the irregular topography of the limestone in the area.

Kriged standard error diagrams and the plot of pedon location can be very useful for planning future sampling strategies. Standard error diagrams indicate areas that require an increase in sampling. A plot of pedon location indicates specific places within the areas with large standard error where additional samples need to be taken.

Results of the geostatistical analysis supported the fact that values of soil properties are not independent. Total sand and clay content semi-variograms had ranges that varied from 10 to 35 km. Values of studied soil properties

PAGE 187

171

were related (i.e., they were not independent) within the range distances. Organic carbon content semi-variograms did not have any range, but had a characteristic wave pattern that also indicated a degree of dependence among the values of organic carbon content.

Finally, the fractal dimension was derived from the semi-variograms. In general, the fractal dimension was large (larger than 1.5). These large values indicated that the scale of the study needs to be increased.

A fractal dimension was also calculated for a reduced area. The reduced area had a larger pedon density, and therefore shorter distances between pedons, than the area initially studied. The fractal dimension of soil properties was reduced, indicating the scale-dependent character of soil variability.

Results of the fractal analysis were as expected because the studied area is a large area with a large variation in geology and topography.

Ranges obtained from semi-variograms can be used to determine the grid size necessary to study the spatial variability of new areas in northwest Florida.

The quantification of soil variability has two aspects. First, the conditions in which statistical analyses are used. Second, the statistical analyses that are employed.

PAGE 188

172

Soils in their natural environment do not follow randomized block or Latin square patterns, but soil scientists have used statistical experimental designs to study specific soil properties in laboratory or greenhouse experiments. Greenhouse experiments have been performed to study how management (fertilizer, liming, or irrigation) influences the soil properties of interest. Results of greenhouse experiments have been validated by specific experimental designs under field conditions.

For soil scientists involved in the study of pedogenesis, soil variability, or soil geography, it is very difficult to apply the same statistical experimental designs or similar statistical analyses because of the difficulty of fulfilling the required assumptions. Soil properties that are non-normally distributed cannot be forced to normality. Dependent values of soil properties cannot be forced to be independent. A biased sampling of typical pedons cannot be forced to be a random sampling procedure. But controlled conditions are required to guarantee some standard conditions for quantitative analysis and for further application in field conditions.

The opinion of this author is that soils have two scales of variability. One, the large scale (vertical direction), is limited by the root system of crops, grasses, trees, or specific engineering uses (e.g., septic tanks). The other, the small scale (horizontal direction), is

PAGE 189

173

limited by the extent of the study or the land use.

Soil scientists have been relatively successful in studying the vertical variability because they have been capable of recognizing soil horizons. Soil properties are then related to specific horizons. But soil scientists have been less successful in studying the variability in the horizontal direction because of the lack of emphasis on the geographic aspect of soils. Polypedons have a geographic connotation, but more emphasis has been placed on their morphologic description than on their geographic relations. Therefore, the use of polypedons as geographic entities is limited.

External environmental features (e.g., landform, vegetation) which are consistently recognized and mappable have been helpful in delineating soils. Therefore, this author believes that landscape position can be used as a geographic entity to study soil variability. Landscape position takes into account geographic aspects. The close relationship between soil and landscape position has been discussed by many pedologists. Landscape position can represent the "greenhouse" in which to test quantitative methods to study the variability of soil in natural conditions.

The other important aspect of the quantitative analysis of soil variability is related to the statistical analyses themselves. Normality, independence, and

PAGE 190

174

homogeneous variance are basic assumptions for classical statistical analyses. This study showed that the set of data selected for this study, sampled in northwest Florida since 1967, was non-normally distributed, and that values of individual soil properties were not independent. In addition, sampling locations were not selected randomly. They were selected, as is often the case in a soil survey, to be modal pedons. This sampling procedure contradicts the criterion of randomness that is important in determining the degrees of freedom in classical statistics.

Statistical analyses need to be divided into two groups: (i) those methods that can be used to analyze data in an artificial context (i.e., laboratory or greenhouse), and (ii) those techniques that can be used to analyze data derived from studies in the natural environment (i.e., data obtained from a soil survey).

In natural conditions it may be difficult to satisfy the assumptions required for classical statistical analyses. Testing the assumptions is recommended; otherwise erroneous conclusions can be stated. When assumptions are not met, it is necessary to study how this fact affects the results of the analysis, to employ alternative analyses (e.g., nonparametric analysis), or not to use inferential statistics.

PAGE 191

175

For example, in this study a classical statistical technique, nested analysis of variance, was employed to support results of the soil survey. But hypothesis testing was not performed; thus, the assumptions of this analysis were not required. The assumption of PCA is related to homogeneity of the variances. This assumption was fulfilled by standardizing the data. Results of the PCA were validated by the results of the nested analysis of variance on the raw data. The geostatistical analysis has the assumption of stationarity. This assumption was not fulfilled. Thus, universal kriging, which takes into account the presence of drift, was used as an interpolation method.

Statistical analyses are needed to support and to improve the results of soil surveys. The nested analysis of variance is a technique that can be used very easily to determine the extent of the within-map unit variance.

PCA is useful to select the important soil properties to study the soil variability. PCA reduced the number of variables selected for additional study. But a hiatus remains between the results of the PCA and the results of the soil survey. Soil scientists classified several pedons to a specific soil series, but the PCA indicated a large degree of dispersion within the soil series. This author believes that this result is mainly due to the lack of emphasis on

PAGE 192

176

soil and landscape relationships, and to the fact that morphological soil properties were not considered in the PCA. Therefore, more investigations are recommended in the use of morphological properties, not only in PCA but also in common statistical techniques used in quantitative analysis.

The other statistical analysis used, geostatistics, has two advantages over classical statistical techniques. First, geostatistics takes into account the location of the observations. Second, geostatistics considers not only the observation values but also their geometric support (i.e., soil as a volume). Results of the geostatistical analysis need to be validated by comparison of the isarithmic map with the soil survey map. The use of individual PCs to obtain the isarithmic map is one way to do so. More investigations are recommended.

PAGE 193

APPENDIX A CLASSIFICATION OF SERIES STUDIED

PAGE 194

Soil Series     Taxonomic classification (Soil temperature is thermic)*

Alaga           Coated Typic Quartzipsamments.
Albany          Loamy, siliceous Grossarenic Paleudults.
Angie           Clayey, mixed Aquic Paleudults.
Apalachee       Very-fine, montmorillonitic Fluvaquentic Dystrochrepts.
Ardilla         Fine-loamy, siliceous Fragiaquic Paleudults.
Bethera         Clayey, mixed Typic Paleaquults.
Blanton         Loamy, siliceous Grossarenic Paleudults.
Bonifay         Loamy, siliceous Grossarenic Plinthic Paleudults.
Bonneau         Loamy, siliceous Arenic Paleudults.
Cantey          Clayey, kaolinitic Typic Albaquults.
Chipley         Coated Typic Quartzipsamments.
Chipola         Loamy, siliceous Arenic Hapludults.
Compass         Coarse-loamy, siliceous Plinthic Paleudults.
Cowarts         Fine-loamy, siliceous Typic Hapludults.
Coxville        Clayey, kaolinitic Typic Paleaquults.
Dothan          Fine-loamy, siliceous Plinthic Paleudults.
Duplin          Clayey, kaolinitic Aquic Paleudults.
Esto            Clayey, kaolinitic Typic Paleudults.
Escambia        Coarse-loamy, siliceous Plinthaquic Paleudults.
Faceville       Clayey, kaolinitic Typic Paleudults.
Fuquay          Loamy, siliceous Arenic Plinthic Paleudults.
Garcon          Loamy, siliceous Arenic Hapludults.
Greenville      Clayey, kaolinitic Rhodic Paleudults.
Goldsboro       Fine-loamy, siliceous Aquic Paleudults.
Hornsville      Clayey, kaolinitic Typic Hapludults.
Iuka            Coarse-loamy, siliceous, acid Aquic Udifluvents.
Kenansville     Loamy, siliceous Arenic Hapludults.
Kinston         Fine-loamy, siliceous, acid Typic Fluvaquents.
Lakeland        Uncoated Typic Quartzipsamments.
Leefield        Loamy, siliceous Arenic Plinthaquic Paleudults.
Lucy            Loamy, siliceous Arenic Paleudults.
Lyerly          Very-fine, montmorillonitic Vertic Hapludalfs.
Lynchburg       Fine-loamy, siliceous Aeric Paleaquults.
Malbis          Fine-loamy, siliceous Plinthic Paleudults.
Mulat           Coarse-loamy, siliceous Typic Ochraquults.
Nankin          Clayey, kaolinitic Typic Hapludults.
Norfolk         Fine-loamy, siliceous Typic Paleudults.
Ocilla          Loamy, siliceous Aquic Arenic Paleudults.
Oktibbeha       Very-fine, montmorillonitic Vertic Hapludalfs.

178

PAGE 195

179

Orangeburg      Fine-loamy, siliceous Typic Paleudults.
Pansey          Fine-loamy, siliceous Plinthic Paleaquults.
Pantego         Fine-loamy, siliceous Umbric Paleaquults.
Pelham          Loamy, siliceous Arenic Paleaquults.
Plummer         Loamy, siliceous Grossarenic Paleaquults.
Rains           Fine-loamy, siliceous Typic Paleaquults.
Redbay          Fine-loamy, siliceous Rhodic Paleudults.
Rutlege         Sandy, siliceous Typic Humaquepts.
Sapelo          Sandy, siliceous Ultic Haplaquods.
Shubuta         Clayey, mixed Typic Paleudults.
Stilson         Loamy, siliceous Arenic Plinthic Paleudults.
Surrency        Loamy, siliceous Arenic Umbric Paleaquults.
Tifton          Coarse-loamy, siliceous Plinthic Paleudults.
Troup           Loamy, siliceous Grossarenic Paleudults.
Wagram          Loamy, siliceous Arenic Paleudults.
Yemassee        Fine-loamy, mixed Aeric Ochraquults.
Yonges          Fine-loamy, mixed Typic Ochraqualfs.

*Source: Calhoun et al., 1974; Carlisle et al., 1978, 1981, 1985; I.F.A.S. Soil Characterization Laboratory, unpublished data.

PAGE 196

APPENDIX B GEOGRAPHIC COORDINATES OF PEDONS STUDIED

PAGE 197

Pedon     Laboratory     X Coordinate     Y Coordinate
number    number         (4.8 km/X)       (4.6 km/Y)

  1       153-159        42.50            18.75
  2       140-146        43.10            17.40
  3       133-139        43.80            21.85
  4       147-152        42.00            19.00
  5       160-166        40.50            13.10
  6       167-172         3.60            17.15
  7       101-106         5.25            16.60
  8       107-111         3.40            16.70
  9       397-401         5.80            17.45
 10       409-414         3.90            17.45
 11       402-408        29.90            19.15
 12       83-89          34.20            19.20
 13       90-95           9.40            21.55
 14       389-396        10.10            21.60
 15       96-99          33.70            19.20
 16       379-383        43.80            15.75
 17       384-388         7.50            17.85
 18       327-332        43.50            14.50
 19       271-277        32.90            17.30
 20       295-301        42.55            16.45
 21       1072-1080      42.20            17.55
 22       322-326        43.35            17.60
 23       333-337         7.65            21.50
 24       338-342         7.25            21.60
 25       278-281         7.50            21.60
 26       282-284         9.90            21.60
 27       285-287         7.40            21.60
 28       288-291        41.90            16.50
 29       292-294         5.20            16.80
 30       1065-1071      40.50            13.85
 31       316-321        41.40            16.30
 32       1056-1064      43.30            18.30
 33       302-307        18.20            43.00
 34       1895-1899      64.25             9.85
 35       1475-1479      59.25            11.55
 36       1427-1433      46.80            20.15
 37       1446-1453      64.80            10.95
 38       1454-1460      37.50            20.70
 39       1469-1474      37.25            20.50
 40       1887-1894      42.60            17.40
 41       1880-1886      37.90            20.35
 42       1872-1879      62.65             8.90
 43       1434-1441      59.00            11.40
 44       1900-1905      44.50             0.30
 45       1461-1468      59.30            10.85
 46       1480-1484      37.80            20.10

181

PAGE 198

182

Pedon     Laboratory     X Coordinate     Y Coordinate
number    number         (4.8 km/X)       (4.6 km/Y)

 47       2049-2055      58.90            11.50
 48       1375-1379      65.00             9.30
 49       1575-1581      21.15            21.35
 50       628-633        22.40            21.20
 51       2056-2064      20.90            21.25
 52       2065-2071      23.10            20.00
 53       1561-1567      23.95            18.40
 54       1609-1614      22.00            21.10
 55       1582-1588      23.15            20.85
 56       1597-1602      65.10            10.15
 57       634-637        64.90             9.55
 58       1906-1910      68.60            10.95
 59       1603-1608      68.85            12.70
 60       2262-2268      66.90            10.40
 61       2823-2827      25.25            18.00
 62       2382-2387      23.55            20.70
 63       2299-2306      66.50            12.80
 64       2842-2847      65.20             8.70
 65       3283-3288      38.10             7.00
 66       2360-2364      65.35            10.15
 67       2365-2369      68.50            11.95
 68       2377-2381      67.75             8.75
 69       2354-2359      65.30             9.45
 70       3322-3328      68.70            13.50
 71       2866-2874      64.90            10.15
 72       2370-2376      79.70            11.55
 73       2835-2841      75.30            10.85
 74       3268-3273      67.30            13.35
 75       2291-2298      67.95             9.70
 76       2857-2865      66.65            11.45
 77       2347-2353      72.45            13.40
 78       2875-2880      65.70             4.60
 79       2276-2283      43.95            21.80
 80       2617-2623      40.40            18.15
 81       2307-2312      40.45            11.30
 82       2828-2834      43.70            15.90
 83       2624-2630      40.10            11.25
 84       2284-2290       6.40            15.00
 85       2637-2644      40.05            19.70
 86       2810-2817      40.80            15.60
 87       2388-2392      43.95            15.85
 88       2631-2636       9.35            17.45
 89       2645-2652       3.00            17.20
 90       2653-2658       5.80            14.85
 91       3274-3282      10.95            18.10
 92       3417-3425      11.95            14.25
 93       4303-4309       3.10            16.40
 94       4064-4069       3.40            14.00
 95       4070-4077       5.55            16.00
 96       4276-4283       5.45            16.80
 97       4056-4063      34.60            15.50

PAGE 199

98 3395 -3401 99 4086 -4092 100 3388 -3394 101 4101 -4107 102 4270 -4275 103 3410 -3416 104 4284 -4291 105 4108 -4114 106 4078 -4085 107 4263 -4269 108 4298 -4302 109 4050 -4055 110 4619 -4623 111 4756 -4763 112 4847 -4853 113 4478 -4486 114 4788 -4795 115 4874 -4881 116 4510 -4516 117 5214 -5217 118 4796 -4803 119 4504 -4509 120 4991 -4997 121 4517 -4522 122 4750 -4755 123 4764 -4773 124 4499 -4503 125 5131 -5138 126 4493' -4498 127 5139' -5147 128 5148' -5157 129 4898' -4903 130 5027' -5032 131 4470' -4477 132 5062' -5066 133 4804' -4810 134 4836-4839 135 5021-5026 136 4632-4639 137 4998-5003 138 5207-5213 139 4488-4492 140 4892-4897 141 5726-5731 142 5732-5738 143 5714-5719 144 5511-5518 145 5720-5725 146 5493-5498 147 5499-5504 148 55055510 183 28.00 15.25 7.20 17.70 6.60 20.60 5.10 16.50 5.15 14.10 42.10 18.40 11.45 21.55 49.15 14.75 64.70 10.00 65.15 10.55 41.75 18.55 58.70 11.65 56.60 8.50 62.60 12.20 46.60 19.80 46.95 20.10 65.00 11.35 7.35 12.60 47.00 19.85 65.10 12.70 7.65 19.95 60.90 10.90 60.80 11.40 59.40 8.40 60.60 10.50 24.00 14.55 21.00 15.70 22.05 21.35 29.80 14.70 23.15 16.10 60.60 13.20 60.75 10.80 22.30 21.35 63.70 12.55 26.60 19.20 36.95 4.90 20.25 20.90 65.60 7.10 27.00 15.80 38.50 8.60 23.90 14.70 67.55 12.50 25.15 20.60 28.80 10.20 29.00 9.60 67.40 9.20 20.20 14.60 68.30 12.15 38.75 4.00 38.45 11.55 78.10 10.00

PAGE 200

184 149 5396-5402 77.80 10.70 150 5524-5528 27.65 14.45 151 5403-5410 26.55 14.00 Coordinates are relative to a point of origin 30s 00' 00'' N and 87s 24' 18'' W (X = 0 and Y = 0) chosen to ensure that all coordinates would be positive.
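The pedon coordinates in this table can be turned into separation distances, the quantity on which a semi-variogram lag is based. The following is only a minimal sketch, not the procedure used in this study. It assumes that the column headings "4.8 km/X" and "4.6 km/Y" mean kilometres per coordinate unit, and the names `KM_PER_X_UNIT`, `KM_PER_Y_UNIT`, `PEDONS`, `to_km`, and `separation_km` are hypothetical, introduced here for illustration.

```python
import math

# Scale factors taken from the table header. Reading them as
# "kilometres per coordinate unit" is an assumption.
KM_PER_X_UNIT = 4.8
KM_PER_Y_UNIT = 4.6

# A few (X, Y) pairs copied from the table, keyed by pedon number.
PEDONS = {
    1: (42.50, 18.75),
    2: (43.10, 17.40),
    4: (42.00, 19.00),
}


def to_km(x_units, y_units):
    """Convert table coordinates to kilometres east and north of the origin."""
    return x_units * KM_PER_X_UNIT, y_units * KM_PER_Y_UNIT


def separation_km(pedon_a, pedon_b):
    """Straight-line distance in kilometres between two pedons."""
    ax, ay = to_km(*PEDONS[pedon_a])
    bx, by = to_km(*PEDONS[pedon_b])
    return math.hypot(bx - ax, by - ay)


if __name__ == "__main__":
    print(f"Pedons 1 and 4 are {separation_km(1, 4):.2f} km apart")
```

Scaling each axis before taking the distance matters because the two axes use different factors; a distance computed directly on the unscaled coordinates would be distorted.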

PAGE 201

APPENDIX C SEMI-VARIOGRAMS FOR DIRECTIONS WITH LARGEST VARIABILITY


APPENDIX D CONTOUR MAPS FOR DIRECTIONS WITH LARGEST VARIABILITY


APPENDIX E MAP OF PHYSIOGRAPHIC REGIONS IN NORTHWEST FLORIDA



BIOGRAPHICAL SKETCH

Francisco Antonio Ovalles Viani was born in Caracas, Venezuela, on August 1, 1950. He is the son of the late Dr. Pedro Jose Ovalles and Mrs. Alba Viani de Ovalles. He received the degree of Ingeniero Agronomo from the Universidad Central de Venezuela in February, 1976. In March, 1976, he joined the Ministerio de Obras Publicas, Direccion General de Recursos Hidraulicos, in the Division de Edafologia, Region Central. In April, that institution became the Ministerio del Ambiente y de los Recursos Naturales Renovables. From January, 1978, to December, 1980, he was also a member of the Soil Science Department of the Facultad de Agronomia, Universidad Central de Venezuela.

In August, 1982, he received a scholarship from the Consejo Nacional de Investigaciones Cientificas y Tecnologicas (CONICIT) to pursue graduate studies. He was accepted for a graduate program in the Soil Science Department, University of Florida, in August, 1982, under the guidance of Dr. Mary E. Collins. He received the degree of Master of Science from the University of Florida in December, 1984.

He is a member of the Colegio de Ingenieros de Venezuela, Sociedad Venezolana de Ingenieros Agronomos, Sociedad Venezolana de la Ciencia del Suelo, American Society of Agronomy, Soil Science Society of America, International Soil Science Society, and the honor societies Sigma Xi and Gamma Sigma Delta. He served as secretary of the Sociedad Venezolana de la Ciencia del Suelo in 1982.

He is married to Giordana de Ovalles; they have two children, Johanna Fernanda and Pedro Jose.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

M. E. Collins, Chairman
Associate Professor of Soil Science

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Antonini
Professor of Geography

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

R. W. Arnold
Professor of Soil Science

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

R. B. Brown
Associate Professor of Soil Science

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

A. S. Fotheringham
Associate Professor of Geography

This dissertation was submitted to the Graduate Faculty of the College of Agriculture and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December, 1986

Dean, College of Agriculture

Dean, Graduate School

