Using a Knowledge-Based System to Test the Transferability of a Soil-Landscape Model in Northeastern Vermont

Permanent Link: http://ufdc.ufl.edu/UFE0022712/00001

Material Information

Title: Using a Knowledge-Based System to Test the Transferability of a Soil-Landscape Model in Northeastern Vermont
Physical Description: 1 online resource (83 p.)
Language: english
Creator: Mckay, Jessica
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2008

Subjects

Subjects / Keywords: digital, fuzzy, inference, knowledge, mapping, soil
Soil and Water Science -- Dissertations, Academic -- UF
Genre: Soil and Water Science thesis, M.S.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Knowledge-based digital soil mapping has been used extensively to predict soil taxonomic and physico-chemical soil characteristics. Fuzzy logic knowledge-based models allow explicit integration of knowledge and expertise from soil mappers familiar with a region. Questions remain about the transferability of soil-landscape models developed in one region to other regions. Objectives of this study were to develop and evaluate a knowledge-based model to predict soil series and fuzzy drainage classes and assess its transferability potential between similar soil landscapes in Essex County, Vermont. Two study areas, W1 (3.5 km²) and W2 (1.9 km²), were sampled at 128 and 42 sites, respectively. Both study areas are located in Essex County, Vermont. The bedrock in the area is phyllite and schist. Vegetation is spruce-fir and mixed northern-hardwood forests. The topography of the study areas is a series of hills and narrow valleys. Deep, loamy basal till covers the modeled area. Rule-based fuzzy inference was used based on fuzzy membership functions characterizing soil-environment relationships to create a model derived from expert knowledge (soil scientists) using 70 percent of sampled sites in W1. The model was implemented using the Soil Inference Engine (SIE), which provides tools and a user-friendly interface for soil scientists to prepare environmental data, define soil-environment models, run soil inference, and compile final map products. The soil prediction model was created and evaluated in W1 using 38 validation sites and transferred and validated in W2 using 42 validation sites. Defuzzified raster predictions were compared to field mapped soil series and fuzzy drainage class properties to assess their accuracy. The model was found to be highly transferable between the two areas.
In W1 the model was 73.7 and 88.8 percent accurate in predicting soil series and fuzzy drainage classes using an independent validation set, respectively. In W2, similar results were achieved, with 71.4 and 89.9 percent accuracy in predicting soil series and drainage class. With more research into pre-processing tools to enhance the knowledge being fed into the inference engine, these accuracy numbers may be improved in the future. It was shown that the prediction model was transferable to a landscape with similar soil characteristics; however, it is critical to identify constraints and thresholds that limit transferability of prediction models to other soil-landscapes.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Jessica Mckay.
Thesis: Thesis (M.S.)--University of Florida, 2008.
Local: Adviser: Grunwald, Sabine.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2010-12-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2008
System ID: UFE0022712:00001

USING A KNOWLEDGE-BASED SYSTEM TO TEST THE TRANSFERABILITY OF A SOIL-LANDSCAPE MODEL IN NORTHEASTERN VERMONT

By
JESSICA MCKAY

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA
2008

© 2008 Jessica McKay

To my parents, especially my dad, the only person other than my advisors who even tried to read this whole thesis. Also to my husband, because even though he has no idea what this is about, he did cook me dinner many nights while I was in between work and school.

ACKNOWLEDGMENTS

I thank my advisory committee: Dr. Sabine Grunwald and Dr. Willie Harris of the University of Florida, and Dr. Xun Shi of Dartmouth College, who all offered important insight into what needed to be in this document. I also thank Roger DeKett and Tom Burke, two members of our team at the NRCS who dug and described many of the holes for this study. Finally, I thank Robert Long, who I work next to every day. Not only did he help me dig holes and describe soils for this project, he has been a valuable source of knowledge and support since day one.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
ABSTRACT

CHAPTER

1 INTRODUCTION
    Traditional Soil Mapping
    Predictive Modeling
    Digital Soil Mapping
    Fuzzy Logic
    Digital Elevation Models
    Digital Modeling Approaches and Methods
    Knowledge-Based Models
    Soil Inference Engine
    Model Transferability

2 OBJECTIVES AND HYPOTHESIS

3 METHODOLOGY
    Study Area
    Field Sampling
    Model Development
    Data Preparation
    Rules
    Evaluation

4 RESULTS AND DISCUSSION
    Final Predictions
    Evaluation of Predicting Soil Series
    Fuzzy Drainage Class
    Discussion

5 SUGGESTIONS FOR FURTHER RESEARCH

APPENDIX

A DOCUMENTATION EXAMPLES
B VEGETATIVE ARTIFACTS IN DIGITAL ELEVATION DATA
C FUZZY DRAINAGE CLASS DESIGNATIONS
D PREDICTION RESULTS FROM W1 (MULTIPLE SAMPLE CONFIGURATIONS)

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

3-1 Study area comparison
3-2 Soil series modeled in W1 and W2
3-3 Rules for Cabot, Colonel, and Dixfield soils
3-4 Evaluation criteria for fuzzy drainage class
3-5 Matrix of fuzzy membership designations comparing SIE results and fuzzy drainage classes
4-1 Confusion table that compares calibration prediction results based on SIE to observed soil series including most similar soil series using 90 model development sites in W1
4-2 Confusion table that compares validation prediction results based on SIE to observed soil series including most similar soil series using 38 independent evaluation sites in W1
4-3 Confusion table that compares validation prediction results based on SIE to observed soil series including most similar soil series using 42 validation independent evaluation sites in W2
4-4 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 using 9 calibration runs
4-5 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 using 9 validation runs
4-6 Confusion table that compares calibration prediction results based on SIE to observed drainage classes using 90 model development sites in W1
4-7 Confusion table that compares validation prediction results based on SIE to observed drainage classes using 38 independent evaluation sites in W1
4-8 Confusion table that compares validation prediction results based on SIE to observed drainage classes using 42 independent evaluation sites in W2
4-9 Percent accuracy overall based on fuzzy drainage class membership (validation)
C-1 Study area W1 fuzzy drainage class designations (validation)
C-2 Study area W2 fuzzy drainage class designations (validation)
C-3 Study area W1 fuzzy drainage class designations (calibration)
D-1 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 2)
D-2 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 2)
D-3 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 3)
D-4 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 3)
D-5 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 4)
D-6 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 4)
D-7 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 5)
D-8 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 5)
D-9 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 6)
D-10 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 6)
D-11 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 7)
D-12 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 7)
D-13 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 8)
D-14 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 8)
D-15 Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 9)
D-16 Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 9)

LIST OF FIGURES

3-1 Essex County, Vermont and study areas W1 and W2
3-2 Study area W1 sample points
3-3 Study area W2 sample points
3-4 Elevation for study area W1
3-5 Elevation for study area W2
3-6 Slope for study area W1
3-7 Slope for study area W2
3-8 Wetness index for study area W1
3-9 Wetness index for study area W2
3-10 Inference interface for Colonel (ArcSIE). A) bell-shaped curve for wetness index, B) Z-shaped curve for slope
4-1 Fuzzy prediction map of Cabot soil series for study area W1
4-2 Fuzzy prediction map of Colonel soil series for study area W1
4-3 Fuzzy prediction map of Dixfield soil series for study area W1
4-4 Fuzzy prediction map of Cabot soil series for study area W2
4-5 Fuzzy prediction map of Colonel soil series for study area W2
4-6 Fuzzy prediction map of Dixfield soil series for study area W2
4-7 Final prediction maps of soil series for study area W1
4-8 Final prediction map of soil series for study area W2
A-1 Sample point 127 description
A-2 Sample point 127 profile photo
A-3 Sample point 127 landscape photo
B-1 Vegetative artifacts in digital elevation data

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

USING A KNOWLEDGE-BASED SYSTEM TO TEST THE TRANSFERABILITY OF A SOIL-LANDSCAPE MODEL IN NORTHEASTERN VERMONT

By
Jessica McKay

December 2008

Chair: Sabine Grunwald
Major: Soil and Water Science

Knowledge-based digital soil mapping has been used extensively to predict soil taxonomic and physico-chemical soil characteristics. Fuzzy logic knowledge-based models allow explicit integration of knowledge and expertise from soil mappers familiar with a region. Questions remain about the transferability of soil-landscape models developed in one region to other regions. Objectives of this study were to develop and evaluate a knowledge-based model to predict soil series and fuzzy drainage classes and assess its transferability potential between similar soil landscapes in Essex County, Vermont. Two study areas, W1 (3.5 km²) and W2 (1.9 km²), were sampled at 128 and 42 sites, respectively. Both study areas are located in Essex County, Vermont. The bedrock in the area is phyllite and schist. Vegetation is spruce-fir and mixed northern-hardwood forests. The topography of the study areas is a series of hills and narrow valleys. Deep, loamy basal till covers the modeled area. Rule-based fuzzy inference was used based on fuzzy membership functions characterizing soil-environment relationships to create a model derived from expert knowledge (soil scientists) using 70 percent of sampled sites in W1. The model was implemented using the Soil Inference Engine (SIE), which provides tools and a user-friendly interface for soil scientists to prepare environmental data, define soil-environment models, run soil inference, and compile final map products. The soil prediction model was created and evaluated in W1 using 38 validation sites and transferred and validated in W2 using 42 validation sites. Defuzzified raster predictions were compared to field mapped soil series and fuzzy drainage class properties to assess their accuracy. The model was found to be highly transferable between the two areas. In W1 the model was 73.7 and 88.8 percent accurate in predicting soil series and fuzzy drainage classes using an independent validation set, respectively. In W2, similar results were achieved, with 71.4 and 89.9 percent accuracy in predicting soil series and drainage class. With more research into pre-processing tools to enhance the knowledge being fed into the inference engine, these accuracy numbers may be improved in the future. It was shown that the prediction model was transferable to a landscape with similar soil characteristics; however, it is critical to identify constraints and thresholds that limit transferability of prediction models to other soil-landscapes.

CHAPTER 1
INTRODUCTION

Traditional Soil Mapping

Conventional soil survey methods are relatively expensive in terms of the time and cost required to complete them. According to Cook et al. (1996), soil survey consists of three main steps. The first step consists of observing ancillary data such as aerial photography, geology, and vegetation, along with soil profile characteristics. The second step incorporates these observations into an implicit conceptual model that is used to infer the variation of soils. The third step applies the conceptual model to the survey area in order to predict soil variation and occurrence at unobserved sites. Commonly, soil scientists develop soil-landscape relationships using site-specific information that is translated to unsampled locations across a landscape. The survey process relies on tacit knowledge that is passed from surveyor to surveyor through training and experience and is never fully captured in documentation.

Traditional soil mapping products utilize polygons, or crisp map units, which suggest abrupt changes from one map unit or soil type to another. Each location on the landscape can therefore belong to only one map unit, which does not accurately reflect the soil landscape. One way that scientists attempt to remedy this is to use a continuous field model, which uses pixels or voxels rather than polygons to reflect the gradual change of soil attributes across the landscape (Grunwald, 2006).

Predictive Modeling

For years, soil scientists have been working to build quantitative predictive models based to a large extent on the five factors of soil formation as described by Jenny (1941):

    S = f(Cl, O, R, P, T)    (1-1)

where S = soil, Cl = climate, O = organisms, R = relief, P = parent material, and T = time.

The most prominent soil-landscape model that underlies research studies was set forth by McBratney et al. (2003) and is known as the SCORPAN model. This model can be written as either

    S_c = f(s, c, o, r, p, a, n)    (1-2)

or

    S_a = f(s, c, o, r, p, a, n)    (1-3)

where S_c is soil class and S_a is a soil attribute. The SCORPAN model is unique in that it includes s, soil, and n, spatial position, as factors. McBratney et al. (2003) pointed out that soil can be predicted from its properties, and that soil properties can be predicted from soil classes or from other soil properties. The reason s can be part of the model is that soil properties and classes are correlated with each other. For example, drainage class depends on other soil properties such as soil texture, porosity, and organic matter content. Soil properties can be derived from remote or proximal sensing or from expert knowledge. Also implicit in the SCORPAN model are the spatial coordinates x, y and an approximate time coordinate ~t (McBratney et al., 2003).

Often, soil variability is primarily controlled by topography (Thompson et al., 2006), while in some landscapes other factors such as land use and land cover control soil variability. Predictive models are generally based on this concept, or, more specifically, on the catena concept (Milne, 1935), which indicates that soil profiles that occur on topographically associated landscapes will be repeated on similar landscapes. A catena is a sequence of soils that are about the same age, are derived from similar parent material, and occur under similar climatic conditions, but that have different characteristics due to variation in relief and drainage (Grunwald, 2006). A catena model developed on one hillslope has the potential to be transferred to adjacent hillslopes with similar landscape characteristics.

Digital Soil Mapping

Digital soil mapping techniques that take advantage of the vast quantity of information technologies available to the soils discipline are rapidly being developed. Digital soil mapping is defined by the International Working Group on Digital Soil Mapping as the creation and population of spatial soil information systems by the use of field and laboratory observational methods coupled with spatial and non-spatial soil inference systems (McBratney, 2006). The concept of soil inference systems was introduced by McBratney et al. (2002) as a way of using pedotransfer functions as knowledge rules for inference engines. Soil inference systems take information that is known with a given level of (un)certainty and use pedotransfer functions to infer data that is unknown.

A pedotransfer function (PTF), according to Bouma (1989), is a process of translating data we have into what we need. There are two types of PTFs based on the amount of information that is available. Class PTFs predict soil properties based on the class to which the soil sample belongs (such as textural class, or any other class that the soil scientist defines). Continuous PTFs, on the other hand, predict certain soil properties as a continuous function of one or more measured variables (Wösten et al., 1995). Another classification of PTFs has been given by McBratney et al. (2002) as single point regressions, parametric and physico-empirical PTFs.
Single point PTFs predict a single soil property, while parametric PTFs predict parameters of a model. This is similar to the idea of a soil inference engine that creates a soil-landscape model.
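
The distinction between class and continuous PTFs described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the function names, the choice of bulk density as the predicted property, the lookup values, and the regression coefficients are hypothetical placeholders, not values taken from Bouma (1989), Wösten et al. (1995), or this thesis.

```python
# Hypothetical lookup table for a class PTF. The bulk density values (g/cm^3)
# are illustrative placeholders, not measured or published data.
CLASS_PTF_BULK_DENSITY = {"sand": 1.6, "loam": 1.4, "clay": 1.3}


def class_ptf(textural_class: str) -> float:
    """Class PTF: the prediction depends only on the class of the sample."""
    return CLASS_PTF_BULK_DENSITY[textural_class]


def continuous_ptf(clay_pct: float, organic_matter_pct: float) -> float:
    """Continuous PTF: the prediction is a continuous function of measured
    variables. The linear form and coefficients are assumptions."""
    intercept, b_clay, b_om = 1.60, -0.004, -0.05
    return intercept + b_clay * clay_pct + b_om * organic_matter_pct


if __name__ == "__main__":
    print(class_ptf("loam"))
    print(continuous_ptf(clay_pct=20.0, organic_matter_pct=2.0))
```

A class PTF needs only a class label, so it suits areas where little has been measured; a continuous PTF needs measured inputs but can vary smoothly from sample to sample.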

These soil inference systems are tools used for environmental soil-landscape modeling, which Grunwald (2006) describes as a science devoted to understanding the spatial distribution of soils and coevolving landscapes as part of ecosystems that change dynamically through time. McSweeney et al. (2004) describe the methodology of soil-landscape modeling based on (i) characterization of the local physiographic domain through analysis of digital elevation model (DEM) data, (ii) collection of georeferenced soil samples and compiling desired soil property data, and (iii) development of explicit, quantitative, and usually simple empirical models. As Grunwald (2006) points out, soil-landscape modeling depends greatly on soil and ancillary variables. Multiple factors affect soil-landscape modeling, including attribute type (Boolean, categorical, ordinal, interval, or continuous); content of attributes (soil attributes, topographic attributes and classes, parent material, land cover and land use, or time); sample support; geographic extent of observations; total number of observations; density of observations; and sampling design.

Fuzzy Logic

Soil-landscape models may include some form of fuzzy logic. Zadeh (1965) introduced the idea of fuzzy sets, which can be used to quantify the imprecision and uncertainty that are an inherent part of soil mapping. While soils are traditionally mapped with crisp borders between map units, and there are technically specific boundaries defined between soil series in terms of the attributes that make each series unique, it is commonly understood among soil scientists that each soil type has a range of characteristics and that soils vary continuously across the landscape. Fuzzy logic can be used to try to show the variation of soils as they actually occur while possibly moving away from the soil series concept employed in traditional soil survey.

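
As a minimal sketch of how fuzzy membership differs from crisp map units, the Python below evaluates two membership curves at one location and then hardens the result to a single class. The curve types (bell-shaped and Z-shaped) are the ones named in the caption of Figure 3-10; the formulas, the class names, the parameter values, and the use of the minimum to combine conditions are assumptions for illustration and are not the rules or the internal functions of SIE.

```python
import math


def bell(x: float, center: float, half_width: float) -> float:
    """Bell-shaped membership: 1.0 at center, 0.5 at center +/- half_width."""
    return math.exp(-math.log(2.0) * ((x - center) / half_width) ** 2)


def z_shaped(x: float, threshold: float, half_width: float) -> float:
    """Z-shaped membership: full membership at or below threshold,
    declining along a bell curve above it."""
    return 1.0 if x <= threshold else bell(x, threshold, half_width)


def infer(wetness_index: float, slope_pct: float) -> dict:
    """Membership of one location in each class. A rule is satisfied only as
    well as its weakest condition (fuzzy AND taken as the minimum).
    Class names and parameters are hypothetical."""
    return {
        "class_a": min(bell(wetness_index, 8.0, 2.0),
                       z_shaped(slope_pct, 5.0, 5.0)),
        "class_b": min(bell(wetness_index, 5.0, 2.0),
                       bell(slope_pct, 15.0, 10.0)),
    }


def harden(memberships: dict) -> str:
    """Defuzzify by assigning the class with the highest membership."""
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    m = infer(wetness_index=8.0, slope_pct=3.0)
    print(m, harden(m))
```

The location in the usage example belongs to both classes to different degrees, which is the information a crisp polygon map discards; hardening recovers a conventional single-class map when one is needed.
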
McBratney and Odeh (1997) point out that fuzzy set theory can be useful in dealing with uncertainty that arises due to imprecise boundaries between categories. Zhu (1999) explains that under fuzzy logic, 15

PAGE 16

each unit on a map (be it defined as a soil or simply as a pixel) can be assigned to more than one class with varying degrees of class assignment, or differing membership values. Digital Elevation Models Bishop and Minasny (2006) pointed out that historically, a major limitation to soil landscape modeling has been the quality of available elevation data. In recent years, much more detailed elevation data has become available and the use of geographic information systems (GIS) as a tool for modeling has risen dramatically. McBratney et al. (2003) found that a DEM was the most common source of secondary information in published soil mapping studies. They also found that a terrain attribute was used in 80% of the studies as part of the final prediction model. This, as Bishop and Minasny (2006) articulate, illustrates the importance of ensuring the accuracy of the DEM. If the DEM is inaccurate, it likely leads to uncertainty in the model output. The availability of high quality DEMs has vastly improved the outlook for soil-landscape modeling. For example, Thomson et al. (2006) used a high resolution DEM and resulting empirical quantitative models to predict patterns of soil properties. Digital Modeling Approaches and Methods Quite a bit of research has been done on the topic of predictive mapping of soil properties, while little has been done on the topic of mapping broad soil types or map units. Lagacherie and Voltz (2000) pointed out that mapping of soil properties in large areas is challenging to accomplish with acceptable precision and cost. Therefore, methods must be employed that utilize available information and minimize sampling. They also mention that predictions are often refined using secondary data, such as attributes derived from DEMs. In multiple case studies in Southern France, Lagacherie and Voltz (2000) and Voltz et al. (1997) used a method of first 16

PAGE 17

modeling the soil-landscape relationships in the area, and then using those to improve spatial predictions. To model the soil-landscape relationships, a conditional probability approach was used as described in Lagacherie et al. (1995). This approach represents the soil patterns and how they depend on landform features by computing the probability of a soil class occurring at a site given the soil classes, the geographical location, and the relative elevation of neighboring sites (Lagacherie and Voltz, 2000).

Scull et al. (2003), McBratney et al. (2000) and Grunwald (2006) provide an overview of predictive soil mapping methods, including geostatistical methods, statistical methods (such as decision tree analysis), and knowledge-based models. Geostatistics has emerged as an especially popular approach to mapping soil properties because all soil and landscape properties show more or less spatial autocorrelation. Kriging is the geostatistical method of spatial interpolation (McBratney et al., 2000). According to McBratney et al. (2000), kriging has some major limitations due to its assumptions of stationarity and spatial autocorrelation, which can be a problem in complex terrain such as that of northeastern Vermont, where abrupt changes in soil-forming factors occur in many areas. Zhu (1999) also pointed out that these techniques require a large amount of field data in order to extract the relationships between soil properties and landscapes, which is a limiting factor when aiming to increase efficiency in a survey area. Moreover, as McBratney et al. (2002) pointed out, the most difficult and expensive step in environmental modeling is the collection of data.

Statistical methods can also be used to describe the relationships between quantifiable landscape indices and soil properties, and regression analysis has been successfully performed to account for variation in various soil characteristics using multiple predictor variables (Scull et al., 2003). There have been many advances in spatial statistics which provide multiple tools for pedologists to quantify and model the nature of soils in the landscape (Pennock and Veldkamp, 2006). The main drawback of any statistical method is that standard statistical procedures are not flexible enough to allow much integration with new data sources, such as expert knowledge (Scull et al., 2003). Decision tree analysis is new to the field of soil science, but essentially it uses soil-landscape correlation in model development by designing a set of predictive rules developed from training data, which are then applied to a geographic database to predict the value of a response variable (Michaelsen et al., 1994; Scull et al., 2003).

Three main goals of predictive soil mapping are defined by Scull et al. (2003): (1) to exploit the relationship between environmental variables and soil properties in order to more efficiently collect soil data; (2) to produce and present models that better represent soil-landscape continuity; and (3) to explicitly incorporate expert knowledge in model design. Knowledge-based models have the potential to satisfy all three of these goals and, until recently, have been underrepresented in the research.

Knowledge-Based Models

Knowledge-based models are composed of three main elements: environmental data, a knowledge base, and an inference engine which combines the data and the knowledge base to infer logically valid conclusions about the soil (Skidmore et al., 1996). Davis (1993) reviewed knowledge-based models and their applications to environmental modeling research and found that while a possible absence of fundamental knowledge for rule generation would be a constraint on the application of the systems, they were becoming more widely accepted as a technique, even over a decade ago. Traditional soil survey has been the most popular form of soil mapping for many years, incorporating the knowledge of soil surveyors with extensive soil mapping
experience. Incorporating soil mappers' expertise into soil knowledge-based models has the potential to improve soil predictive models.

Soil Inference Engine

The Soil Inference Engine (SIE) is an expert knowledge-based inference engine designed for creating soil maps under fuzzy logic. There are two main types of knowledge that SIE uses: rules, which are defined in parametrical space, and cases, which are defined in geographical space. Both rule-based reasoning (RBR) and case-based reasoning (CBR) can be used to perform inference. Case-based reasoning aims to use the knowledge represented in specific cases to help solve a problem in a different area (Shi et al., 2004). The Soil Inference Engine also provides tools for result validation, terrain analysis, pre- and post-processing for raster data, and data format conversion (Shi, 2006). The Soil Inference Engine performs fuzzy soil mapping based on the concept of fuzzy soil classification, which assigns fuzzy membership values for different soil types to each location. Rule-based reasoning and CBR are used by SIE to calculate these fuzzy membership values. The values are meant to represent the similarity of the soil at a given location to each of the soil types defined within the inference engine (Shi, 2006).

Model Transferability

One major question that remains in the field of soil-landscape modeling is that of model transferability, especially when it comes to modeling of soil types and not just one or two soil properties. It has been speculated by Lagacherie and Voltz (2000) that predictive capabilities are limited, especially over large areas, because the relationships between soil properties and landscapes are either nonlinear or unknown. Prediction becomes even more difficult when factors other than topography begin to play more of a role, such as different parent materials or changes in climate (Thompson et al., 2006). These other factors increasingly influence the soil environment as the distance between soils grows, especially in a varied landscape such as the glaciated region of northeastern Vermont. Pedotransfer functions that are developed in one geomorphic region and applied to another region may show larger uncertainties due to extrapolation (McBratney et al., 2002). This is likely true also for soil inference models. This study aims to take a soil prediction model developed for a relatively small study area in a complex landscape and test how well it transfers to another, similar study area a few kilometers away.

CHAPTER 2
OBJECTIVES AND HYPOTHESIS

This study had two main objectives, the first of which was to develop a model to predict soils occurring in dense till in a study area in Essex County, Vermont. The second objective was to test the transferability of that model to a second study area with similar landscape characteristics in the same county. Specific steps were:

(1) To predict which soils (soil series; drainage classes) occur across the landscape in the study area (W1) using the SIE model.
(2) To evaluate the completed soil model within W1 using an independent validation set.
(3) To run the same model in study area W2.
(4) To assess the transferability of the model by running transects in W2 similar to a random catena sampling strategy and comparing the field results with the SIE results.

The hypothesis was that the model will transfer well between similar landscapes to predict soil series and drainage classes.

CHAPTER 3
METHODOLOGY

Study Area

The W1 study area is in Essex County, Vermont (Figure 3-1). It is about 3.5 km² and the elevation ranges from 479 m at the outlet of the East Branch of the Nulhegan River to 853 m at the summit of Sable Mountain. The study area lies within the U.S. Geological Survey (USGS) Averill Lake topographic quadrangle. The bedrock is mainly phyllite and schist of the Gile Mountain formation, with some granite on the upper elevations of Sable Mountain. Vegetation is mainly spruce-fir forests on the mountain summit and poorly drained lower slopes and mixed northern-hardwood and spruce-fir forests on middle slopes. The general topography of the area is a series of hills and narrow valleys. Deep loamy basal till covers most of the middle and low elevations of the study area, while some very poorly drained organic materials occur on broad flats and in depressions.

The W2 study area is also in Essex County, Vermont. It is 1.9 km² surrounding an unnamed stream and the elevation ranges from 373 m to 619 m. The study area is completely within the USGS Bloomfield topographic quadrangle. The bedrock and vegetation are similar to those of the W1 study area, and the soil landscapes are also alike. The two study areas share a comparable climate, with a mean annual temperature of about 6 degrees Celsius and total annual precipitation of about 97 centimeters. The land use is also the same, with both study areas being managed long-term by a large timber company (Table 3-1).

Table 3-1. Study area comparison

Characteristic               W1                                  W2
USGS quad                    Averill Lake                        Bloomfield
Size                         3.5 km²                             1.9 km²
Elevation (m), min           468                                 375
Elevation (m), max           833                                 618
Elevation (m), mean          664                                 475
Elevation (m), std. dev.     51.9                                49.67
Geology                      Phyllite and schist                 Phyllite and schist
                             (Gile Mountain formation)           (Gile Mountain formation)
Vegetation                   Mixed northern-hardwood and         Mixed northern-hardwood and
                             spruce-fir forests                  spruce-fir forests
Topography                   Hills and narrow valleys            Hills and narrow valleys
Slope (%), min               0.02                                0.10
Slope (%), max               86.08                               54.82
Slope (%), mean              15.42                               12.93
Slope (%), std. dev.         12.02                               7.38
Mean annual temperature      6 degrees Celsius                   6 degrees Celsius
Mean annual precipitation    97 cm                               97 cm
Land use                     Long-term timber management         Long-term timber management
Soils (general knowledge)    Deep, loamy basal till; some        Deep, loamy basal till; some
                             very poorly drained organic         very poorly drained organic
                             materials in depressions            materials in depressions

Figure 3-1. Essex County, Vermont and study areas W1 and W2.

Essex County is the last county in Vermont to undergo an initial soil survey, and therefore no official soils data are available for the county. However, Essex County is part of a larger region known as the Northeast Kingdom, which also includes Orleans and Caledonia counties. These areas have already been mapped and the data are available through the USDA-Natural Resources Conservation Service Web Soil Survey and Soil Data Mart. Given the experience in the rest of the Northeast Kingdom, it is reasonable to assume that in these mainly wooded areas, the basal till areas will be dominated by one catena of soils, and the model for this study reflects this assumption. The three soil series that dominate these and similar areas are known as Cabot, Colonel, and Dixfield (Table 3-2). In general, Dixfield soils are found highest on the landscape and on the steepest and most convex slopes, and Cabot soils are found lowest on the landscape and on the flattest and most concave slopes. Colonel soils occur in between Cabot and Dixfield in terms of both hillslope position and slope shape. Other soils occur to a lesser extent on the landscapes evaluated in this study as well. These soils, for the purpose of validation, were designated based on which of the three dominant series they most closely resembled morphologically.

Table 3-2. Soil series modeled in W1 and W2.

Series name    Drainage class     Taxonomic class
Cabot          Poorly             Loamy, mixed, active, nonacid, frigid, shallow Typic Humaquepts
Colonel        Somewhat poorly    Loamy, isotic, frigid, shallow Aquic Haplorthods
Dixfield       Moderately well    Coarse-loamy, isotic, frigid Aquic Haplorthods

Field Sampling

The field sampling in W1 consisted of 157 soil pits dug as part of a separate (related) project. The 157 sites were laid out in a 150 m grid design throughout the entire study area. Detailed profile descriptions were written at each site (including documentation on soil series
and drainage class properties), and landscape and profile photographs were also taken for later use. For use in this study, those 157 sample points were pared down in two steps. First, areas of W1 that are known to be bedrock-controlled were masked out, because this particular model is not designed to map bedrock-controlled soils. This process left 128 sample points. Second, of those points, seventy percent (90 points) were used to aid model development and thirty percent (38 points) were used for model validation within W1. The assignment of points to the seventy-thirty split within the W1 study area was random (Figure 3-2).
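The random seventy-thirty split described above can be sketched as follows. This is a minimal illustration only: the site identifiers, the seed, and the function name `split_sites` are hypothetical, and the thesis does not state which tool was actually used to draw the random selection.

```python
import random


def split_sites(site_ids, n_calibration, seed):
    """Randomly assign sites to a calibration set and a validation set.

    The seed is an assumption added here so the split is repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(site_ids)
    rng.shuffle(shuffled)
    calibration = sorted(shuffled[:n_calibration])
    validation = sorted(shuffled[n_calibration:])
    return calibration, validation


# Hypothetical identifiers for the 128 sites left after masking bedrock areas.
site_ids = list(range(1, 129))
calibration, validation = split_sites(site_ids, n_calibration=90, seed=0)
print(len(calibration), len(validation))
```

Calling `split_sites` with different seeds would give different arrangements of calibration and validation points, which is one way the repeated runs reported later in the evaluation could be reproduced.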


Figure 3-2. Study area W1 sample points

In order to validate the model in W2, a sampling design similar to a random catena sampling strategy was used. Six sampling points were dug along each of seven catenas, for a total of 42 sampling points (Figure 3-3). Detailed profile descriptions were written at each site and photographs were taken of both the landscape and the soil profiles.

Figure 3-3. Study area W2 sample points

See Appendix A for examples of the documentation gathered during the course of this study.

Model Development

There are eight basic steps included in the RBR-CBR process used by SIE (Shi et al., 2007), and these are the steps that were followed to develop the model for the study area. They are as follows:

(1) The soil scientist (the author, with guidance from two senior soil scientists) provided global knowledge. This includes the soils expected to be found in the area as well as the typical environmental conditions in which these soils occur. This global knowledge was supplemented in this study by the data obtained from the 90 sample sites in W1. There were also several hundred other known points that were not a part of the study but are in the same county and are the same soil types. These were not formally used in this study but are considered to be supplemental knowledge previously gained by the soil scientists. Environmental conditions that are defined by environmental values are formalized into rules, while those represented by geographical locations are formalized into cases.

(2) The soil scientist prepared data layers such as slope and wetness index from the DEM to be used for characterizing the previously defined environmental conditions.

(3) The Soil Inference Engine was used to perform RBR or global CBR, using both the global knowledge and the GIS layers. An output map was generated which shows the general pattern of soils on the landscape, based on the input information.

(4) The soil scientist verified the initial round of output maps by comparing the results to knowledge of the area and any known points (in this case the 90 points), and adjusted them. This can be done by either adjusting the rules or global cases, or by fine-tuning the maps by using the following steps. For this study, the rules were adjusted multiple times
in an attempt to gain inference results that showed high accuracy matches to the 90 points within the first study area. This turned out to be a challenge, but looking at the study area as a whole, the results seemed reasonable and therefore the process was moved forward to validation.

(5) The soil scientist could have provided local knowledge, in the form of cases, to address local exceptions. These occur when the results make sense from the inputs, but for some reason it is known that a different soil may actually occur at a specific location. This knowledge can only be gained by either (a) field sampling or (b) extensive experience and knowledge of landforms. In this study, no local exceptions were addressed.

(6) The Soil Inference Engine would then be used to perform local CBR using the local knowledge and the GIS layers.

(7) The soil scientist verifies the next round of output maps. The cases can be adjusted and the CBR can be run again. Running the inference is a very quick process (a matter of seconds) that can be repeated easily until the results are satisfactory.

(8) The soil scientist used the post-processing tools and other GIS tools (in this study, ArcGIS (Environmental Systems Research Institute, Redlands, CA) was used extensively, specifically Spatial Analyst) to integrate the results and generate final maps.

Once the model was fully developed for W1, it was run on the W2 study area as well. The model developed by the soil scientist was then evaluated for the purpose of this study using an independent validation dataset consisting of 42 sample points from W2.

Data Preparation

In basal till soils, the two main factors that have proven to provide a good basis for rules are slope and compound topographic wetness index. Other layers, such as vegetation, landform, and relative position, were investigated and ultimately not used in this study. Both of the layers used in the study are derived from a DEM (Figures 3-4 and 3-5), which was in turn derived from Light Detection and Ranging (LiDAR) data. The LiDAR data were originally provided at 1 m resolution, which was too fine a resolution for this purpose due partly to vegetative artifacts (see Appendix B) that affect inference results. The data were therefore filtered using a 9 x 9 rectangular neighborhood and then resampled to a 5 m pixel size using the Resample tool in ArcToolbox (ArcGIS). The DEM used for this study has this resulting 5 m pixel size as well as approximately 30 cm vertical accuracy.

The terrain attributes (slope and wetness index) were derived using SIE. The tools for deriving both layers are found under the Terrain Attributes menu of SIE. The slope layer (Figures 3-6 and 3-7) was created using the Evans-Young algorithm (Pennock et al., 1987), a neighborhood size of 30, and a square neighborhood shape. The wetness index (Figures 3-8 and 3-9) is calculated as

w = ln(Flow Accumulation / Slope Gradient)    (3-1)

with the DEM as the input. This study used a multi-path wetness index algorithm (Shi, 2007), which represents water flowing into all neighboring pixels that are lower than the center pixel. The amount of flow to each lower pixel is proportional to the steepness in that direction. This is in contrast to a uni-path wetness index algorithm, which only allows flow in the steepest direction.
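The two ideas in this paragraph, Equation 3-1 and the proportional partitioning of flow among lower neighbors, can be illustrated with a short sketch. This is not the SIE implementation: the function names, the `min_slope` guard for flat pixels, the 3 x 3 window, and the example elevations are all assumptions made for illustration, and the units of flow accumulation and slope gradient are not specified in the text.

```python
import math


def wetness_index(flow_accumulation, slope_gradient, min_slope=0.001):
    """Equation 3-1: w = ln(flow accumulation / slope gradient).

    min_slope is an assumed guard that avoids division by zero on flat pixels.
    """
    return math.log(flow_accumulation / max(slope_gradient, min_slope))


def multipath_fractions(window, cell_size):
    """Share of the center pixel's flow sent to each lower neighbor.

    window is a 3 x 3 list of elevations. Each lower neighbor receives flow
    in proportion to its downslope gradient (elevation drop / distance).
    """
    center = window[1][1]
    gradients = {}
    for row in range(3):
        for col in range(3):
            if (row, col) == (1, 1):
                continue
            distance = cell_size * math.hypot(row - 1, col - 1)
            drop = center - window[row][col]
            if drop > 0:
                gradients[(row, col)] = drop / distance
    total = sum(gradients.values())
    if total == 0:
        return {}
    return {cell: g / total for cell, g in gradients.items()}


# Hypothetical 3 x 3 elevation window (meters) on a 5 m grid.
window = [[10.0, 10.0, 10.0],
          [10.0, 9.0, 8.0],
          [10.0, 8.0, 7.0]]
fractions = multipath_fractions(window, cell_size=5.0)
print(fractions)
print(wetness_index(flow_accumulation=100.0, slope_gradient=0.1))
```

A uni-path variant would instead send all of the flow to the single neighbor with the largest gradient, which is the contrast drawn in the text.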

Figure 3-4. Elevation for study area W1

Figure 3-5. Elevation for study area W2

Figure 3-6. Slope for study area W1

Figure 3-7. Slope for study area W2

Figure 3-8. Wetness index for study area W1

Figure 3-9. Wetness index for study area W2

Rules

The rules developed for the three soil series in this study are relatively straightforward and represent the understanding of the soils as they occur on the landscape in relation to one another. The final rules are shown in Table 3-3, below. Figure 3-10 illustrates an example of the inference interface which shows the membership function.

Table 3-3. Rules for Cabot, Colonel, and Dixfield soils.

            Full membership at        0.5 membership at         Curve shape               P function
Series      Slope %   Wetness index   Slope %   Wetness index   Slope       Wetness       Slope             Wetness
Cabot       8         6.3             20        4.8             Z-shaped    S-shaped      Limiting factor   Limiting factor
Colonel     15        3.9             35        2.4, 5.4        Z-shaped    Bell-shaped   Limiting factor   Limiting factor
Dixfield    15        3.4             8         4.9             S-shaped    Z-shaped      Limiting factor   Limiting factor
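To make the rules in Table 3-3 concrete, the sketch below evaluates them for a single pixel. It rests on several assumptions that are not stated in the text: the curve equation (a Gaussian-type form with exponent 2 that passes through 0.5 at the listed crossover value) is assumed, "limiting factor" is interpreted as taking the minimum of the slope and wetness memberships, and the names `curve`, `RULES`, and `memberships` are hypothetical. Only the full-membership values, 0.5-membership values, and curve shapes come from Table 3-3.

```python
def curve(x, full, half, shape):
    """Fuzzy membership of x for one environmental variable.

    full  -- value with membership 1.0 (from Table 3-3)
    half  -- value with membership 0.5 (from Table 3-3)
    shape -- "S": full membership at and above `full`
             "Z": full membership at and below `full`
             "bell": full membership only at `full`
    The Gaussian-type falloff is an assumed curve form.
    """
    if shape == "S" and x >= full:
        return 1.0
    if shape == "Z" and x <= full:
        return 1.0
    width = abs(half - full)
    return 0.5 ** (((x - full) / width) ** 2)


# (full, half, shape) per variable, taken from Table 3-3.
RULES = {
    "Cabot":    {"slope": (8.0, 20.0, "Z"),  "wetness": (6.3, 4.8, "S")},
    "Colonel":  {"slope": (15.0, 35.0, "Z"), "wetness": (3.9, 5.4, "bell")},
    "Dixfield": {"slope": (15.0, 8.0, "S"),  "wetness": (3.4, 4.9, "Z")},
}


def memberships(slope_pct, wetness):
    """Fuzzy membership in each series, combining variables by the minimum."""
    result = {}
    for series, rule in RULES.items():
        slope_m = curve(slope_pct, *rule["slope"])
        wetness_m = curve(wetness, *rule["wetness"])
        result[series] = min(slope_m, wetness_m)  # limiting factor (assumed)
    return result


print(memberships(slope_pct=5.0, wetness=7.0))   # gentle, wet pixel
print(memberships(slope_pct=25.0, wetness=3.0))  # steep, dry pixel
```

Under these assumptions a gentle, wet pixel receives full membership in Cabot and a steep, dry pixel receives full membership in Dixfield, which matches the catena relationship the rules are meant to express.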

Figure 3-10. Inference interface for Colonel (ArcSIE). A) Bell-shaped curve for wetness index. B) Z-shaped curve for slope.

Evaluation

The original output maps are fuzzy maps, with each pixel having an assigned fuzzy value for each soil series. In order to have a concrete way to validate results, a single value must be assigned to each pixel, which is what a hardened (defuzzified) map accomplishes. Using the post-processing tools from SIE, hardened maps of the W1 and W2 study areas were created. The results were evaluated in two ways.

First, a simple, one-to-one comparison of the hardened map and the soil series name at the validation points in each study area was done in the form of confusion matrices. To account for bias in splitting the whole dataset into calibration and validation sets, the procedure was repeated a total of 9 times to capture some of the uncertainty in predictions associated with selecting calibration/validation samples. Prediction performance on the multiple model runs is presented in the form of confusion matrices.

Second, a process was developed for evaluating the results based on fuzzy drainage class. One of the questions that came up during the course of this study was that of typical soils versus soils that remain in a series but are not so typical of that series. This led to the development of fuzzy boundaries for soil series based on drainage class. For example, the Dixfield series falls into the moderately well drained drainage class, which has a defined range of characteristics that allows all soils with redoximorphic features between 41 and 102 cm to be grouped in the same category. Some soils that are classified as Dixfield are more typical of Dixfield, while some are still Dixfield but are on the dry fringe and others are on the wet fringe. A set of criteria (Table 3-4) was developed which allows the illustration of this differentiation between what is typical in a soil series (based on drainage class) and what is not. Since this model was developed for three soils in one catena, each belonging to a different drainage class, and the properties measured in the field were consistent with those that can be used to determine drainage class, this was deemed a reasonable evaluation characteristic.
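The one-to-one comparison can be sketched as a confusion matrix with predictions as rows and observations as columns, expressed as row percentages, plus an overall percent agreement. The six prediction/observation pairs below are invented purely for illustration and are not data from this study; the function names are likewise hypothetical.

```python
from collections import Counter

SERIES = ["Cabot", "Colonel", "Dixfield"]


def confusion_counts(predicted, observed):
    """Counts of sites by predicted series (rows) and observed series (columns)."""
    counts = Counter(zip(predicted, observed))
    return {p: {o: counts[(p, o)] for o in SERIES} for p in SERIES}


def row_percentages(matrix):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for p, row in matrix.items():
        total = sum(row.values())
        result[p] = {o: (100.0 * n / total if total else 0.0) for o, n in row.items()}
    return result


def overall_accuracy(predicted, observed):
    """Percent of sites where the hardened prediction equals the field series."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Hypothetical validation points, not taken from the thesis.
predicted = ["Cabot", "Cabot", "Colonel", "Colonel", "Dixfield", "Dixfield"]
observed = ["Cabot", "Colonel", "Colonel", "Colonel", "Dixfield", "Colonel"]

table = row_percentages(confusion_counts(predicted, observed))
print(table)
print(overall_accuracy(predicted, observed))
```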


Table 3-4. Evaluation criteria for fuzzy drainage class.

Drainage class (soil series)          Typical characteristics       Wetter fringe characteristics   Drier fringe characteristics
Poorly drained (Cabot)                O horizon 0-15 cm;            O horizon 15-20 cm              Chroma 3 within 76 cm of top
                                      chroma 2 in profile                                           of mineral soil; must be
                                                                                                    chroma 2 somewhere
Somewhat poorly drained (Colonel)     Redox between 23 and 36 cm    Redox between 0 and 23 cm       Redox between 36 and 41 cm
Moderately well drained (Dixfield)    Redox between 56 and 86 cm    Redox between 41 and 56 cm      Redox between 86 and 102 cm

It may be noted here that the wetter fringe characteristics of Colonel are outside the range in characteristics listed in the Official Series Description for the Colonel Series (N.C.S.S., 2008). This is because this study was developed to test the transferability of a simple model with only three soils, and once the study was underway, it was discovered that in places in both study areas there are soils occurring between Cabot and Colonel on the drainage class profile. These soils are Spodosols that are morphologically more similar to Colonel than to Cabot, so they were counted as Colonel (most like Colonel) for the purpose of this study. Also, drainage class evaluation criteria were defined such that they captured these intermediate soils as somewhat poorly drained. Specifically, a reduced matrix was made a requirement for poorly drained soils.

Every validation point was then assigned a fuzzy value (Table 3-5) based on a comparison of the SIE results and the evaluation of whether the field results were typical for the series drainage class. For example, if SIE predicted Colonel, and field results yielded a wet-fringed Dixfield, a fuzzy membership value of 0.75 was assigned. A high fuzzy membership number means the field results more closely match the central concept of the drainage class associated with the predicted soil.

Table 3-5. Matrix of fuzzy membership designations comparing SIE results and fuzzy drainage classes.

                                                    Field results
              Cabot                        Colonel                      Dixfield
              (poorly drained)             (somewhat poorly drained)    (moderately well drained)
SIE output    Wet      Typical   Dry       Wet      Typical   Dry       Wet      Typical   Dry
              fringe             fringe    fringe             fringe    fringe             fringe
Cabot         1        1         1         0.75     0.5       0.25      0        0         0
Colonel       0.25     0.5       0.75      1        1         1         0.75     0.5       0.25
Dixfield      0        0         0         0.25     0.5       0.75      1        1         1

Accuracy numbers were then determined based on these fuzzy membership designations by adding up all the fuzzy drainage class memberships in a given drainage class set and dividing by the number of sample sites in that set.
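The scoring described above amounts to a table lookup followed by an average. The sketch below encodes Table 3-5 directly and computes the percent accuracy for a set of sites. The four example records are hypothetical, as are the names `FUZZY`, `fuzzy_membership`, and `fuzzy_accuracy`; how sites were grouped into "a given drainage class set" is not spelled out in the text, so the function simply averages whatever set it is given.

```python
# Table 3-5: FUZZY[SIE output][field series] = (wet fringe, typical, dry fringe)
FUZZY = {
    "Cabot": {
        "Cabot": (1.0, 1.0, 1.0),
        "Colonel": (0.75, 0.5, 0.25),
        "Dixfield": (0.0, 0.0, 0.0),
    },
    "Colonel": {
        "Cabot": (0.25, 0.5, 0.75),
        "Colonel": (1.0, 1.0, 1.0),
        "Dixfield": (0.75, 0.5, 0.25),
    },
    "Dixfield": {
        "Cabot": (0.0, 0.0, 0.0),
        "Colonel": (0.25, 0.5, 0.75),
        "Dixfield": (1.0, 1.0, 1.0),
    },
}
FRINGE = {"wet": 0, "typical": 1, "dry": 2}


def fuzzy_membership(predicted, observed, fringe):
    """Fuzzy value for one site from the SIE output and the field result."""
    return FUZZY[predicted][observed][FRINGE[fringe]]


def fuzzy_accuracy(records):
    """Sum of fuzzy values divided by the number of sites, as a percentage."""
    scores = [fuzzy_membership(*record) for record in records]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical records: (SIE output, field series, fringe of field result).
records = [
    ("Colonel", "Dixfield", "wet"),
    ("Colonel", "Colonel", "typical"),
    ("Cabot", "Colonel", "dry"),
    ("Dixfield", "Dixfield", "dry"),
]
print(fuzzy_accuracy(records))
```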


CHAPTER 4
RESULTS AND DISCUSSION

Final Predictions

The initial output maps from SIE show the fuzzy results for each soil series (Figures 4-1 through 4-6). On each of these maps, darker colors mean higher fuzzy memberships for that soil. The final prediction maps (Figures 4-7 and 4-8) for each study area are hardened maps of the SIE results, and also serve as a proxy for drainage class maps, because each soil type has a drainage class associated with it. The hardened maps are created by aggregating all three of the fuzzy membership maps for each study area using SIE to assign, at each pixel, the soil series with the highest fuzzy membership.
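Hardening as described here is a per-pixel selection of the series with the highest membership. The sketch below shows the idea on a tiny 2 x 2 grid with invented membership values. It is not the SIE post-processing tool; the function name `harden` is hypothetical, and the handling of ties (the first series listed wins) is an assumption, since the text does not say how ties are resolved.

```python
def harden(fuzzy_maps):
    """Assign to each pixel the series with the highest fuzzy membership.

    fuzzy_maps maps a series name to a 2-D list of memberships; all grids
    must have the same shape. Ties go to the first series listed (assumed).
    """
    series = list(fuzzy_maps)
    first_grid = fuzzy_maps[series[0]]
    n_rows, n_cols = len(first_grid), len(first_grid[0])
    return [
        [max(series, key=lambda s: fuzzy_maps[s][r][c]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2 x 2 membership grids, not values from the study.
fuzzy_maps = {
    "Cabot": [[0.9, 0.2], [0.1, 0.4]],
    "Colonel": [[0.3, 0.8], [0.5, 0.6]],
    "Dixfield": [[0.0, 0.1], [0.7, 0.3]],
}
hardened = harden(fuzzy_maps)
print(hardened)
```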

Figure 4-1. Fuzzy prediction map of Cabot soil series for study area W1

Figure 4-2. Fuzzy prediction map of Colonel soil series for study area W1

Figure 4-3. Fuzzy prediction map of Dixfield soil series for study area W1

Figure 4-4. Fuzzy prediction map of Cabot soil series for study area W2

Figure 4-5. Fuzzy prediction map of Colonel soil series for study area W2

Figure 4-6. Fuzzy prediction map of Dixfield soil series for study area W2

Figure 4-7. Final prediction map of soil series for study area W1.

Figure 4-8. Final prediction map of soil series for study area W2.

Evaluation of Predicting Soil Series

The one-to-one comparison of the hardened map to the soil series as found in the field yielded low (42.6 percent) accuracy for the calibration sites in W1. The same comparison yielded 73.7 percent accuracy overall in W1 (validation sites) and 71.4 percent accuracy overall in W2. The confusion tables below show the breakdown of percent accuracy results by series name.

Table 4-1. Confusion table that compares calibration prediction results based on SIE to observed soil series including most similar soil series using 90 model development sites in W1

Calibration sites (n = 90), percent

                       Observations
Predictions    Cabot    Colonel    Dixfield
Cabot          42       25         33
Colonel        21       47         33
Dixfield       9        52         39

Table 4-2. Confusion table that compares validation prediction results based on SIE to observed soil series including most similar soil series using 38 independent evaluation sites in W1

Validation sites (n = 38), percent

                       Observations
Predictions    Cabot    Colonel    Dixfield
Cabot          73       27         0
Colonel        15       77         8
Dixfield       0        30         70

Table 4-3. Confusion table that compares validation prediction results based on SIE to observed soil series including most similar soil series using 42 independent evaluation sites in W2

Validation sites (n = 42), percent

                       Observations
Predictions    Cabot    Colonel    Dixfield
Cabot          69       31         0
Colonel        11       63         26
Dixfield       0        10         90

Since the accuracy for the calibration points was so low compared to the validation points in W1, the statistics were recomputed over multiple iterations using different assignments of points to the calibration and validation sets within W1 (Tables 4-4 and 4-5). A breakdown of these results can be seen in Appendix D.

Table 4-4. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 using 9 calibration runs

Calibration sites (n = 90), percent

                                              Observations
Predictions    Cabot                  Colonel                Dixfield
Cabot          42 to 56 (mean: 51)    18 to 32 (mean: 26)    18 to 33 (mean: 23)
Colonel        18 to 27 (mean: 22)    40 to 58 (mean: 50)    21 to 40 (mean: 28)
Dixfield       0 to 10 (mean: 7)      36 to 55 (mean: 47)    35 to 55 (mean: 47)

Table 4-5. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 using 9 validation runs

Validation sites (n = 38), percent

                                              Observations
Predictions    Cabot                  Colonel                Dixfield
Cabot          50 to 73 (mean: 60)    9 to 45 (mean: 24)     0 to 27 (mean: 16)
Colonel        6 to 31 (mean: 19)     38 to 77 (mean: 52)    8 to 40 (mean: 29)
Dixfield       0 to 18 (mean: 5)      30 to 64 (mean: 43)    36 to 70 (mean: 52)

Fuzzy Drainage Class

The fuzzy drainage class results show an overall average between classes of 88.8 percent accuracy in W1 and 89.9 percent accuracy in W2 (validation sets). The calibration points were 62.6 percent accurate overall when comparing fuzzy drainage class prediction results. While the calibration points still had lower accuracy numbers than the validation points, the drainage class results show higher accuracy (Tables 4-6, 4-7, and 4-8) than the one-to-one soil series comparison seen in the above confusion tables.

Table 4-6. Confusion table that compares calibration prediction results based on SIE to observed drainage classes using 90 model development sites in W1

Calibration sites (n = 90), percent

                                                     Observations
Predictions                Poorly drained    Somewhat poorly drained    Moderately well drained
Poorly drained             68                32                         0
Somewhat poorly drained    17                54                         29
Moderately well drained    0                 33                         66

Table 4-7. Confusion table that compares validation prediction results based on SIE to observed drainage classes using 38 independent evaluation sites in W1

Validation sites (n = 38), percent

                                                     Observations
Predictions                Poorly drained    Somewhat poorly drained    Moderately well drained
Poorly drained             78                21                         0
Somewhat poorly drained    7                 87                         6
Moderately well drained    0                 15                         85

Table 4-8. Confusion table that compares validation prediction results based on SIE to observed drainage classes using 42 independent evaluation sites in W2

Validation sites (n = 42), percent

                                                     Observations
Predictions                Poorly drained    Somewhat poorly drained    Moderately well drained
Poorly drained             76                24                         0
Somewhat poorly drained    6                 73                         21
Moderately well drained    0                 5                          95

The overall accuracy ratings for each study area, distributed by drainage class, are presented in Table 4-9.

Table 4-9. Percent accuracy overall based on fuzzy drainage class membership (validation)

Drainage class (soil series)          W1    W2
Poorly drained (Cabot)                93    90
Somewhat poorly drained (Colonel)     88    83
Moderately well drained (Dixfield)    87    95

Poorly drained and somewhat poorly drained soils had field results that more closely matched the central concept of the predicted drainage class in W1 than in W2. However,
moderately well drained soils showed higher accuracy in W2 than in W1. Overall, assigning fuzzy memberships to the validation point field data brought accuracy ratings up from the raw comparison between the hardened SIE results and soil series. Discussion The results from both the direct comparison between the hardened map and field results and the fuzzy drainage class comparison show that the model is highly transferable between the two study areas, specifically looking at the validation points. The calibration points showed lower accuracy compared to the validation points, which could be a result of the fact that the calibration set is so much bigger than the validation set in W1 and thus captures more variability in the landscape. The results from different configurations of points show that the model is sensitive to the selection of sample and observation sites for calibration and validation. This is illustrated by the fact that the accuracy numbers change, at times dramatically, between soil series and point selections. It must be considered that the validation set is small relative to the calibration set and a random selection of points can skew the results one way or another. If it is considered that the even though the calibration points resulted in low accuracy numbers, the overall results for W1 looked reasonable according to expert soil scientists (myself included), and the study was pushed forward to the validation stage, it can be seen that the result for the validation sets showed high accuracy numbers and thus good transferability between similar areas. The model should therefore transfer well to other areas that are similar to these study areas. As one or more environmental factors change, the transferability of the model will go down. 57

Assigning fuzzy drainage class memberships not only brings accuracy numbers up, but also points to the concept of a continuous field model, with soils changing gradually across the landscape rather than having discrete boundaries between one another.

This model represented the basic soil landscape relationship of drier soils occurring higher on the landscape and wetter soils occurring lower on the landscape. The curves (rules) were designed with the two environmental variables (slope and wetness index) to reflect this relationship. As the slope increased and wetness decreased, drier soils took over. Wetness index served as a proxy for landscape position because it is a function of landscape position. The resulting maps reflected the modeled soil landscape relationships in that the driest soil, Dixfield, generally occurred highest on the landscape and the wettest soil, Cabot, occurred lowest on the landscape, in the drainageways and flat areas. Colonel, which is between Cabot and Dixfield both in relative slope position and drainage class, occurred on middle slopes, generally in between Cabot and Dixfield soils.

Knowledge-based prediction models have previously been compared to traditional soil mapping (Zhu et al., 2001). The model tested in that study was SoLIM, which was found to be correct 81 percent of the time at one site, compared to 61 percent for the soil map; the corresponding numbers at another site were 83.8 percent and 66.7 percent, respectively. The areas in this study have not been mapped traditionally, so such a comparison cannot be made; however, the SIE accuracy numbers were slightly lower, at 73.7 percent in W1 and 71.4 percent in W2. The fact that the county has never been mapped could be one reason for the lower accuracy numbers; it is reasonable to assume that a soil scientist creating a model for an area that has already been worked in extensively would create a more accurate model.
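The slope and wetness-index rules described in this discussion can be sketched roughly as follows. This is only an illustration under stated assumptions: the bell-shaped curve, the use of the minimum to combine the two variables, and every center and width value are hypothetical placeholders, not the curves actually defined in SIE for this study.

```python
import math


def bell(x, center, width):
    """Bell-shaped membership: 1.0 at `center`, 0.5 at `center` +/- `width`."""
    return math.exp(math.log(0.5) * ((x - center) / width) ** 2)


# Hypothetical (center, width) pairs for each rule; NOT the thesis values.
RULES = {
    "Cabot":    {"slope": (2.0, 3.0),  "wetness": (12.0, 2.0)},
    "Colonel":  {"slope": (8.0, 4.0),  "wetness": (9.0, 2.0)},
    "Dixfield": {"slope": (15.0, 6.0), "wetness": (6.0, 2.0)},
}


def infer(slope_pct, wetness_index):
    """Return (hardened series, fuzzy memberships) for a single pixel."""
    memberships = {}
    for series, rule in RULES.items():
        by_variable = [
            bell(slope_pct, *rule["slope"]),
            bell(wetness_index, *rule["wetness"]),
        ]
        # Assumed limiting-factor rule: the weakest variable sets the membership.
        memberships[series] = min(by_variable)
    # Hardening: keep only the series with the highest fuzzy membership.
    hardened = max(memberships, key=memberships.get)
    return hardened, memberships


series, memberships = infer(slope_pct=3.0, wetness_index=11.0)
print(series, {name: round(value, 2) for name, value in memberships.items()})
```

With these placeholder curves, flat and wet pixels harden to Cabot and steep, dry pixels to Dixfield, while the retained memberships still show how close a pixel sits to the neighboring class.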

There is discussion in the soil mapping community (undocumented meeting discussions) about raster versus vector mapping. The output from SIE is pixel, or raster, based, and this could have some benefits for users of the soil information. Traditionally, soil maps have been distributed in polygon format, with each polygon representing a map unit labeled with one or two named soil types; a customer would have to look at the metadata to find out that other soils may also occur within that polygon. With raster output, it is much easier to create a map that shows the continuous distribution of those so-called inclusions within the map units. The accuracy of the raster soils data depends on the accuracy of the inputs, right down to the DEM. For this study, a very accurate one-meter DEM was available, which is not the case in most places. This raster resolution could affect the spatial resolution of the soil prediction maps. For this study, as for the rest of the county that is currently being mapped, the scale is 1:24,000.

There are constraints to this model. Three soil series were modeled, with accuracies between 70 and 80 percent. That leaves 20 to 30 percent unexplained. Of the five CLORPT factors (climate, organisms, relief, parent material, and time), the one that most likely plays the biggest role in variability in this region is relief. This corresponds to the environmental factors that were used to create the model, slope and wetness, in that the catena concept holds that as topography varies, so do drainage and wetness. Variability in topography can lead to variability in drainage, though the catena concept would suggest that if the topography varies in the same manner, the drainage would change accordingly. Sudden changes in land surface occur indiscriminately across the landscape in both study areas. Tied in to this is the fact that evaluation of results relied partly on the accuracy of GPS readings. Most of the study areas are forested, and even with extra backpack antennas, there is the possibility that sample holes were dug outside of the correct pixel, on a slightly different landscape position.

Many more than three soils will need to be modeled at a time in the future. This model was limited to three soils as a test of one catena. If a soil scientist can conceptualize a soil landscape model and has the data layers needed to transfer that concept into a rule, SIE can be used to model that soil type or class. For this model, multiple other data layers were tested and ultimately not used because they did not add any benefit to the model's outcome and only served to complicate things. This is not always the case: as more, similar soils are added to the mix, it becomes necessary to add more environmental layers in order to differentiate between soil types.

Soil properties are of interest to consultants, researchers, and agencies for multiple uses. A knowledge-based model such as SIE has the potential to predict continuous soil properties in the same manner as described above for soil classes. Zhu et al. (2001) modeled soil properties in two study areas using fuzzy logic knowledge-based modeling. If a soil property model can be conceptualized and environmental data layers are available that allow that knowledge to be transferred to the model, inference should be possible. However, SIE has not been tested as a tool for modeling soil properties, so more research and development would be needed to investigate the question of continuous soil property prediction.

CHAPTER 5
SUGGESTIONS FOR FURTHER RESEARCH

There are multiple related issues that directly impact soil scientists working with SIE. The first is data manipulation: at what point has the DEM been manipulated enough to accurately reflect what is on the ground while still allowing relatively flawless rule development? There are countless possibilities for data manipulation built into not only the SIE software but also the other GIS software packages that soil scientists use every day.

A second issue is the method of evaluating results. For this study, the results were evaluated at the pixel level, which is a basic step that must be completed before a model can be considered useful. However, the soil scientists who use SIE for soil mapping are more concerned with an end product that fits the concept of map units. The new concept of map units could be raster data, though there is currently still a need for vectors due to SSURGO (Soil Survey Geographic) Database standards. It becomes important to know not only whether the level of detail that SIE provides is accurate, but also whether it translates to map unit composition concepts. High resolution soil maps are easy for soil scientists to understand and appeal to them, but other users such as conservation planners and farmers find them daunting and wonder whether the detail really reflects how soils occur across the landscape. Questions remain about the validity of creating vector maps for map units from the raster data while preserving the raster data for later use. The results of this study can be built upon to move into a map unit discussion, where applicable.

A third question is that of the limits of transferability. This study demonstrates that models are transferable between similar landscapes, but there is sure to come a point at which they are not. Where is this point? Can it be defined within certain types of landscapes? Soil variability is linked to variability within CLORPT factors (climate, organisms, relief, parent material, time, as well as spatial position), and if the CLORPT factors in two soil regions differ, it is reasonable to believe that transferability will be limited.

All these questions are important to the study of soils and soil landscape analysis, and can readily be investigated.
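The question of whether pixel-level predictions translate to map unit composition could be explored with a calculation along the following lines. This is a minimal sketch, not part of the study: the small raster, the polygon mask, and the function name are all hypothetical, and a real workflow would read both layers from GIS data.

```python
from collections import Counter


def map_unit_composition(series_raster, polygon_mask):
    """Percent of each predicted series among pixels inside one polygon."""
    counts = Counter(
        series
        for series_row, mask_row in zip(series_raster, polygon_mask)
        for series, inside in zip(series_row, mask_row)
        if inside
    )
    n_pixels = sum(counts.values())
    if n_pixels == 0:
        return {}  # polygon covers no pixels
    return {series: 100.0 * n / n_pixels for series, n in counts.items()}


# Hypothetical 3 x 4 hardened raster and polygon mask (1 = inside polygon).
raster = [
    ["Dixfield", "Dixfield", "Colonel", "Colonel"],
    ["Dixfield", "Colonel",  "Colonel", "Cabot"],
    ["Colonel",  "Colonel",  "Cabot",   "Cabot"],
]
mask = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]

composition = map_unit_composition(raster, mask)
print(composition)
```

A composition summary of this kind keeps the raster intact while giving each vector map unit a list of named components and percentages, which is closer to how map units are conventionally described.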

APPENDIX A
DOCUMENTATION EXAMPLES

Figure A-1. Sample point 127 description

Figure A-2. Sample point 127 profile photo

Figure A-3. Sample point 127 landscape photo

APPENDIX B
VEGETATIVE ARTIFACTS IN DIGITAL ELEVATION DATA

Figure B-1. Vegetative artifacts in digital elevation data

APPENDIX C
FUZZY DRAINAGE CLASS DESIGNATIONS

Table C-1. Study area W1 fuzzy drainage class designations (validation)
Point # | Described as | SIE result (1 = Cabot, 2 = Colonel, 3 = Dixfield) | Drainage class features | Typical of described drainage class? | Fuzzy membership
8 | Colonel | 2 | redox at 34 cm | Yes | 1
9 | Colonel | 2 | redox at 31 cm | Yes | 1
14 | Colonel | 1 | redox at 6 cm | fringe toward pd | 0.75
18 | Dixfield | 2 | redox at 54 cm | fringe toward spd | 0.75
21 | Colonel | 1 | redox at 19 cm | fringe toward pd | 0.75
23 | Colonel | 1 | redox at 0 cm | fringe toward pd | 0.75
29 | Cabot | 1 | O horizon 12 cm, depleted matrix with redox | Yes | 1
30 | Dixfield | 3 | redox at 62 cm | Yes | 1
31 | Dixfield | 3 | redox at 44 cm | fringe toward spd | 1
32 | Dixfield | 3 | redox at 46 cm | fringe toward spd | 1
38 | Cabot | 1 | O horizon 3 cm, depleted matrix with redox | Yes | 1
42 | Colonel | 2 | redox at 17 cm | fringe toward pd | 1
47 | Cabot | 1 | O horizon 4 cm, depleted matrix with redox | Yes | 1
48 | Cabot | 1 | O horizon 8 cm, depleted matrix with redox | Yes | 1
49 | Cabot | 1 | O horizon 4 cm, depleted matrix with redox | Yes | 1
51 | Colonel | 2 | redox at 37 cm | fringe toward mwd | 1
54 | Cabot | 1 | O horizon 16 cm, depleted matrix with redox | fringe toward vpd | 1
56 | Cabot | 1 | O horizon 2 cm, depleted matrix with redox | Yes | 1
58 | Cabot | 1 | O horizon 3 cm, depleted matrix with redox | Yes | 1
59 | Cabot | 1 | O horizon 10 cm, chroma 4 above 76 cm | fringe toward spd | 1
64 | Colonel | 2 | redox at 25 cm | Yes | 1
66 | Colonel | 2 | redox at 15 cm | fringe toward pd | 1
71 | Colonel | 2 | redox at 3 cm | fringe toward pd | 1
80 | Colonel | 2 | redox at 9 cm | fringe toward pd | 1
87 | Dixfield | 3 | redox at 58 cm | Yes | 1
90 | Colonel | 2 | redox at 20 cm | fringe toward pd | 1
91 | Cabot | 1 | O horizon 4 cm, chroma 3 within 76 cm | fringe toward spd | 1
98 | Dixfield | 3 | redox at 45 cm | fringe toward spd | 1
121 | Colonel | 3 | redox at 36 cm | Yes | 0.5
126 | Cabot | 2 | O horizon 15 cm, depleted matrix with redox | Yes | 0.5
127 | Colonel | 2 | redox at 34 cm | Yes | 1
128 | Colonel | 1 | redox at 0 cm | fringe toward pd | 0.75
135 | Cabot | 2 | O horizon 20 cm, depleted matrix with redox | fringe toward vpd | 0.25
137 | Dixfield | 3 | redox at 68 cm | Yes | 1
141 | Colonel | 3 | redox at 7 cm | fringe toward pd | 0.25
144 | Dixfield | 3 | redox at 48 cm | fringe toward spd | 1
154 | Cabot | 1 | O horizon 19 cm | Yes | 1
158 | Colonel | 3 | redox at 27 cm | Yes | 0.5

Table C-2. Study area W2 fuzzy drainage class designations (validation)
Point # | Described as | SIE result (1 = Cabot, 2 = Colonel, 3 = Dixfield) | Drainage class features | Typical of described drainage class? | Fuzzy membership
1 | Colonel | 1 | redox at 24 cm | Yes | 0.5
2 | Dixfield | 3 | only faint redox at 70 cm | fringe toward wd | 1
3 | Dixfield | 2 | redox at 48 cm | fringe toward spd | 0.75
4 | Colonel | 1 | redox at 16 cm | fringe toward pd | 0.75
5 | Colonel | 1 | redox at 23 cm | fringe toward pd | 0.75
6 | Colonel | 2 | redox at 12 cm | fringe toward pd | 1
7 | Dixfield | 3 | redox at 30 cm but really Sunapee for model | | 1
8 | Dixfield | 3 | redox at 63 cm | Yes | 1
9 | Dixfield | 3 | redox at 82 cm | Yes | 1
10 | Colonel | 2 | redox at 35 cm | Yes | 1
11 | Dixfield | 2 | redox at 54 cm | fringe toward spd | 0.75
12 | Dixfield | 2 | redox at 63 cm | Yes | 0.5
13 | Dixfield | 3 | redox at 33 cm but really Monadnock for model | | 1
14 | Colonel | 2 | redox at 23 cm | Yes | 1
15 | Cabot | 1 | O horizon 8 cm, depleted matrix with redox | Yes | 1
16 | Cabot | 1 | O horizon 8 cm, depleted matrix with redox | Yes | 1
17 | Dixfield | 3 | no redox | fringe toward wd | 1
18 | Cabot | 1 | O horizon 9 cm, depleted matrix with redox | Yes | 1
19 | Colonel | 2 | redox at 24 cm | Yes | 1
20 | Cabot | 1 | O horizon 12 cm, depleted matrix with redox | Yes | 1
21 | Colonel | 2 | redox at 32 cm | Yes | 1
22 | Colonel | 2 | redox at 14 cm | fringe toward pd | 1
23 | Colonel | 3 | redox at 35 cm | Yes | 0.5
24 | Dixfield | 3 | redox at 53 cm | fringe toward spd | 1
25 | Cabot | 1 | O horizon 15 cm, depleted matrix with redox | Yes | 1
26 | Colonel | 2 | redox at 0 cm | fringe toward pd | 1
27 | Colonel | 2 | redox at 0 cm | fringe toward pd | 1
28 | Cabot | 1 | O horizon 16 cm | fringe toward vpd | 1
29 | Cabot | 2 | O horizon 4 cm, depleted matrix with redox | Yes | 0.5
30 | Colonel | 2 | redox at 8 cm | fringe toward pd | 1
31 | Dixfield | 2 | redox at 44 cm | fringe toward spd | 0.75
32 | Colonel | 1 | redox at 8 cm | fringe toward pd | 0.75
33 | Colonel | 2 | redox at 10 cm | fringe toward pd | 1
34 | Dixfield | 3 | redox at 48 cm | fringe toward spd | 1
35 | Dixfield | 2 | redox at 46 cm | fringe toward spd | 0.75
36 | Dixfield | 3 | redox at 54 cm | fringe toward spd | 1
37 | Cabot | 1 | O horizon 12 cm, chroma 3 above 76 cm | fringe toward spd | 1
38 | Colonel | 2 | redox at 20 cm | fringe toward pd | 1
39 | Cabot | 1 | O horizon 13 cm, reduced matrix with redox | Yes | 1
40 | Colonel | 2 | redox at 0 cm | fringe toward pd | 1
41 | Cabot | 2 | O horizon 14 cm, reduced matrix with redox | Yes | 0.5
42 | Cabot | 1 | O horizon 9 cm, reduced matrix with redox | Yes | 1

Table C-3. Study area W1 fuzzy drainage class designations (calibration)
Point # | Described as | SIE result (1 = Cabot, 2 = Colonel, 3 = Dixfield) | Drainage class features | Typical of described drainage class? | Fuzzy membership
1 | Cabot | 1 | O horizon 20 cm, depleted matrix with redox | fringe toward vpd | 1
2 | Cabot | 1 | O horizon 18 cm, depleted matrix with redox | fringe toward vpd | 1
4 | Dixfield | 2 | redox at 59 cm | Yes | 0.5
5 | Colonel | 3 | redox at 7 cm | fringe toward pd | 0.25
6 | Cabot | 2 | O horizon 19 cm, depleted matrix with redox | fringe toward vpd | 0.25
10 | Dixfield | 1 | redox at 12 cm | fringe toward pd | 0
11 | Cabot | 1 | O horizon 27 cm but really Peacham | fringe toward vpd | 1
12 | Colonel | 2 | redox at 15 cm | fringe toward pd | 1
13 | Cabot | 2 | O horizon 17 cm, depleted matrix with redox | fringe toward vpd | 0.25
17 | Colonel | 3 | redox at 14 cm | fringe toward pd | 0.25
19 | Cabot | 1 | O horizon 4 cm, depleted matrix with redox | Yes | 1
20 | Colonel | 2 | redox at 39 cm | fringe toward mwd | 1
22 | Colonel | 2 | redox at 28 cm | Yes | 1
24 | Colonel | 1 | redox at 28 cm | Yes | 0.5
25 | Colonel | 1 | redox at 7 cm | fringe toward pd | 0.75
26 | Colonel | 2 | redox at 0 cm | fringe toward pd | 1
28 | Cabot | 2 | O horizon 8 cm, depleted matrix with redox | Yes | 0.5
34 | Cabot | 1 | O horizon 10 cm, depleted matrix with redox | Yes | 1
35 | Cabot | 1 | O horizon 2 cm, depleted matrix with redox | Yes | 1
36 | Colonel | 1 | redox at 12 cm | fringe toward pd | 0.75
37 | Colonel | 2 | redox at 15 cm | fringe toward pd | 1
39 | Dixfield | 2 | redox at 57 cm | Yes | 0.5
41 | Dixfield | 2 | redox at 0 cm but really Lyman for model | | 0.5
43 | Colonel | 3 | redox at 37 cm | fringe toward mwd | 0.75
44 | Colonel | 3 | redox at 9 cm | fringe toward pd | 0.25
45 | Cabot | 3 | chroma 3 | fringe toward swpd | 0
46 | Cabot | 2 | O horizon 38 cm, but really Peacham | fringe toward vpd | 0.25
50 | Colonel | 1 | redox at 7 cm | fringe toward pd | 0.75
52 | Dixfield | 2 | redox at 50 | fringe toward spd | 0.75
53 | Dixfield | 2 | redox at 45 | fringe toward spd | 0.75
55 | Colonel | 2 | redox at 25 cm | Yes | 1
57 | Colonel | 2 | redox at 14 cm | fringe toward pd | 1
60 | Dixfield | 2 | redox at 46 cm | fringe toward spd | 0.75
62 | Dixfield | 1 | redox at 59 cm | Yes | 0
63 | Dixfield | 2 | redox at 33 cm | fringe toward spd | 0.75
65 | Colonel | 3 | redox at 25 cm | Yes | 0.5
67 | Colonel | 2 | redox at 7 cm | fringe toward pd | 1
68 | Cabot | 2 | O horizon 5 cm, depleted matrix with redox | Yes | 0.5
69 | Colonel | 1 | redox at 5 cm | fringe toward pd | 0.75
70 | Colonel | 2 | redox at 7 cm | fringe toward pd | 1
72 | Dixfield | 3 | redox at 12 cm but really Tunbridge for model | | 1
73 | Dixfield | 1 | redox at 45 cm | fringe toward spd | 0
76 | Colonel | 2 | redox at 40 cm | fringe toward mwd | 1
77 | Cabot | 2 | chroma 3 | fringe toward spd | 0.75
78 | Colonel | 3 | redox at 23 cm | Yes | 0.5
79 | Cabot | 1 | O horizon 1 cm, depleted matrix with redox | Yes | 1
81 | Dixfield | 1 | redox at 49 cm | fringe toward spd | 0
82 | Cabot | 2 | O horizon 3 cm, depleted matrix with redox | Yes | 0.5
86 | Dixfield | 2 | redox at 23 cm, but really Tunbridge for model | | 0.5
88 | Colonel | 3 | redox at 21 cm | fringe toward pd | 0.25
89 | Cabot | 2 | O horizon 8 cm, depleted matrix with redox | Yes | 0.5
92 | Dixfield | 2 | no redox, but really Tunbridge for model | | 0.5
95 | Dixfield | 3 | redox at 0 cm, but really Tunbridge for model | | 1
96 | Cabot | 2 | O horizon 4 cm, depleted matrix with redox | Yes | 0.5
97 | Cabot | 3 | O horizon 9 cm, depleted matrix with redox | Yes | 0
99 | Colonel | 1 | redox at 28 cm | Yes | 0.5
104 | Dixfield | 3 | redox at 46 cm | fringe toward spd | 1
105 | Colonel | 2 | redox at 11 cm | fringe toward pd | 1
106 | Cabot | 1 | chroma 3 | fringe toward spd | 1
107 | Dixfield | 3 | no redox but really Abram for model | | 1
113 | Cabot | 2 | chroma 3 | fringe toward spd | 0.75
114 | Dixfield | 3 | no redox but really Tunbridge for model | | 1
120 | Dixfield | 2 | redox at 49 cm | fringe toward spd | 0.75
122 | Cabot | 1 | chroma 3 | fringe toward spd | 1
123 | Dixfield | 2 | no redox but really Berkshire for model | | 0.5
128 | Colonel | 1 | redox at 0 cm | fringe toward pd | 0.75
130 | Dixfield | 1 | redox at 37 cm but really Sunapee | fringe toward spd | 0
131 | Colonel | 2 | redox at 28 cm | Yes | 1
132 | Dixfield | 3 | redox at 74 cm | Yes | 1
133 | Colonel | 3 | redox at 26 cm | Yes | 0.5
134 | Colonel | 3 | redox at 36 cm | Yes | 0.5
136 | Colonel | 3 | redox at 15 cm | fringe toward pd | 0.25
138 | Dixfield | 1 | redox at 54 cm | fringe toward spd | 0
139 | Dixfield | 2 | redox at 55 cm | fringe toward spd | 0.75
140 | Colonel | 2 | redox at 14 cm | fringe toward pd | 1
142 | Dixfield | 3 | redox at 9 cm but really Tunbridge for model | | 1
143 | Colonel | 3 | redox at 9 cm | fringe toward pd | 0.25
146 | Colonel | 2 | redox at 16 cm | fringe toward pd | 1
147 | Cabot | 1 | O horizon 5 cm, depleted matrix with redox | Yes | 1
148 | Colonel | 2 | redox at 17 cm | fringe toward pd | 1
149 | Colonel | 3 | redox at 16 cm | fringe toward pd | 0.25
150 | Dixfield | 3 | redox at 37 cm but really Sunapee | fringe toward spd | 1
152 | Dixfield | 1 | redox at 42 cm | fringe toward spd | 0
155 | Dixfield | 1 | no redox but really Abram for model | | 0
156 | Dixfield | 3 | redox at 33 cm but really Sunapee | fringe toward spd | 1
159 | Dixfield | 2 | redox at 23 but really Sheepscot | fringe toward spd | 0.75

APPENDIX D
PREDICTION RESULTS FROM W1 (MULTIPLE SAMPLE CONFIGURATIONS)

Table D-1. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 2)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 50 | 27 | 23
Colonel | 18 | 53 | 30
Dixfield | 10 | 55 | 35

Table D-2. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 2)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 67 | 22 | 11
Colonel | 31 | 38 | 31
Dixfield | 0 | 31 | 69

Table D-3. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 3)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 52 | 28 | 20
Colonel | 21 | 48 | 31
Dixfield | 9 | 48 | 43

Table D-4. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 3)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 57 | 21 | 21
Colonel | 21 | 50 | 29
Dixfield | 0 | 40 | 60

Table D-5. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 4)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 50 | 32 | 18
Colonel | 27 | 45 | 28
Dixfield | 9 | 36 | 55

Table D-6. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 4)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 64 | 9 | 27
Colonel | 6 | 56 | 38
Dixfield | 0 | 64 | 36

Table D-7. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 5)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 54 | 18 | 29
Colonel | 26 | 43 | 31
Dixfield | 10 | 45 | 45

Table D-8. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 5)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 55 | 45 | 0
Colonel | 7 | 64 | 29
Dixfield | 0 | 46 | 54

Table D-9. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 6)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 55 | 23 | 22
Colonel | 22 | 57 | 21
Dixfield | 0 | 45 | 55

Table D-10. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 6)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 53 | 29 | 18
Colonel | 20 | 40 | 40
Dixfield | 18 | 45 | 36

Table D-11. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 7)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 54 | 25 | 21
Colonel | 21 | 55 | 24
Dixfield | 5 | 45 | 50

Table D-12. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 7)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 55 | 27 | 18
Colonel | 21 | 50 | 29
Dixfield | 8 | 46 | 46

Table D-13. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 8)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 56 | 26 | 19
Colonel | 18 | 58 | 24
Dixfield | 8 | 48 | 44

Table D-14. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 8)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 50 | 25 | 25
Colonel | 28 | 44 | 28
Dixfield | 0 | 37 | 63

Table D-15. Confusion table that compares calibration prediction results based on SIE to observed soil series using 90 model development sites in W1 (configuration 9)
Calibration sites (n: 90), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 50 | 32 | 18
Colonel | 20 | 55 | 25
Dixfield | 0 | 45 | 55

Table D-16. Confusion table that compares validation prediction results based on SIE to observed soil series using 38 independent evaluation sites in W1 (configuration 9)
Validation sites (n: 38), percent
Predictions \ Observations | Cabot | Colonel | Dixfield
Cabot | 64 | 9 | 27
Colonel | 25 | 50 | 25
Dixfield | 18 | 45 | 36
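The sensitivity to point selection shown in these tables can be summarized by taking the range and mean of the agreeing (diagonal) cells across configurations. The sketch below does this for the eight validation tables in this appendix only (configurations 2 through 9); the first configuration is not part of this appendix, so the results need not match a summary that includes all nine runs. Variable and function names are illustrative.

```python
# Diagonal cells (prediction and observation agree), in percent, read from
# the validation tables D-2, D-4, D-6, D-8, D-10, D-12, D-14, and D-16.
DIAGONAL = {
    "Cabot":    [67, 57, 64, 55, 53, 55, 50, 64],
    "Colonel":  [38, 50, 56, 64, 40, 50, 44, 50],
    "Dixfield": [69, 60, 36, 54, 36, 46, 63, 36],
}


def summarize(values):
    """Range and mean of one series' agreement across configurations."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": round(sum(values) / len(values), 1),
    }


summary = {series: summarize(values) for series, values in DIAGONAL.items()}
for series, stats in summary.items():
    print(series, stats)
```

For these eight configurations, agreement spans roughly 14 to 33 percentage points depending on the series, which illustrates how strongly a small validation set responds to the choice of points.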

LIST OF REFERENCES

Bishop, T.F.A., and B. Minasny. 2006. Digital soil-terrain modeling: the predictive potential and uncertainty. p. 185-213. In Grunwald, S. (ed.) Environmental Soil Landscape Modeling, Geographic Information Technologies and Pedometrics. Taylor and Francis, New York.

Bouma, J. 1989. Using soil survey data for quantitative land evaluation. Advances in Soil Science 9:177-213.

Cook, S.E., R.J. Corner, G. Grealish, P.E. Gessler, and C.J. Chartres. 1996. A rule-based system to map soil properties. Soil Sci. Soc. Am. J. 60:1893-1900.

Grunwald, S. 2006. What do we really know about the space-time continuum of soil-landscapes? p. 3-36. In Grunwald, S. (ed.) Environmental Soil Landscape Modeling, Geographic Information Technologies and Pedometrics. Taylor and Francis, New York.

Jenny, H. 1941. Factors of Soil Formation, A System of Quantitative Pedology. McGraw-Hill, New York.

Lagacherie, P., J.P. Legros, and P.A. Burrough. 1995. A soil survey procedure using the knowledge on soil pattern of a previously mapped reference area. Geoderma 65:283-301.

Lagacherie, P., and M. Voltz. 2000. Predicting soil properties over a region using sample information from a mapped reference area and digital elevation data: a conditional probability approach. Geoderma 97:187-208.

Lamsal, S., S. Grunwald, G.L. Bruland, C.M. Bliss, and N.B. Comerford. 2006. Regional hybrid geospatial modeling of soil nitrate-nitrogen in the Santa Fe River Watershed. Geoderma 135:233-247.

McBratney, A.B. 2006. Background to digital soil mapping. International Working Group on Digital Soil Mapping. Retrieved April 23, 2007 from http://www.digitalsoilmapping.org/DSM_Background.html

McBratney, A.B., and I.O.A. Odeh. 1997. Application of fuzzy sets in soil science: fuzzy logic, fuzzy measurements and fuzzy decisions. Geoderma 77:85-113.

McBratney, A.B., I.O.A. Odeh, T.F.A. Bishop, M.S. Dunbar, and T.M. Shatar. 2000. An overview of pedometric techniques for use in soil survey. Geoderma 97:293-327.

McBratney, A.B., B. Minasny, S.R. Cattle, and R.W. Vervoort. 2002. From pedotransfer functions to soil inference systems. Geoderma 109:41-73.

McBratney, A.B., M.L. Mendonca Santos, and B. Minasny. 2003. On digital soil mapping. Geoderma 117:3-52.

McSweeney, K., P.E. Gessler, B.K. Slater, R.D. Hammer, J.C. Bell, and G.W. Petersen. 1994. Towards a new framework for modeling the soil-landscape continuum. p. 127-145. In R.G. Amundson et al. (ed.) Factors of soil formation: A fiftieth anniversary retrospective. SSSA Spec. Publ. no. 33. SSSA, Madison, WI.

Michaelsen, J., D.S. Schimel, M.A. Friedl, F.W. Davis, and R.C. Dubayah. 1994. Regression tree analysis of satellite and terrain data to guide vegetation sampling and surveys. J. of Vegetation Science 5:673-686.

Milne, G. 1935. Some suggested units of classification and mapping particularly for East African soils. Soil Res. 4:183-198.

National Cooperative Soil Survey. 2008. Retrieved July 24, 2008 from http://www2.ftw.nrcs.usda.gov/osd/dat/C/COLONEL.html

Pennock, D.J., and A. Veldkamp. 2006. Advances in landscape-soil research. Geoderma 133:1-5.

Pennock, D.J., B.J. Zebarth, and E. De Jong. 1987. Landform classification and soil distribution in hummocky terrain, Saskatchewan, Canada. Geoderma 40:297-315.

Scull, P., J. Franklin, O.A. Chadwick, and D. McArthur. 2003. Predictive soil mapping: a review. Progress in Physical Geography 27:171-197.

Shi, X. 2006. Soil inference engine users guide. In progress, unpublished.

Shi, X. 2007. ArcSIE Help Document. In progress, unpublished.

Shi, X., R. Long, R. DeKett, and J. Philippe. 2008. Integrating different types of knowledge for digital soil mapping. Soil Sci. Soc. Am. J. Accepted.

Shi, X., A.X. Zhu, J.E. Burt, F. Qi, and D. Simonson. 2004. A case-based reasoning approach to fuzzy soil mapping. Soil Sci. Soc. Am. J. 68:885-894.

Skidmore, A.K., F. Watford, P. Luckananurug, and P.J. Ryan. 1996. An operational GIS expert system for mapping forest soil. Photogrammetric Engineering and Remote Sensing 62:501-511.

Thompson, J.A., E.M. Pena-Yewtukhiw, and J.H. Grove. 2006. Soil-landscape modeling across a physiographic region: topographic patterns and model transportability. Geoderma 133:57-70.

Voltz, M., P. Lagacherie, and X. Louchart. 1997. Predicting soil properties over a region using sample information from a mapped reference area. Eur. J. Soil Sci. 48:19-30.

Wösten, J.H.M., P.S. Finke, and M.J.W. Jansen. 1995. Comparison of class and continuous pedotransfer functions to generate soil hydraulic characteristics. Geoderma 66:227-237.

Zadeh, L.A. 1965. Fuzzy sets. Information and Control 8:338-353.

BIOGRAPHICAL SKETCH

Jessica (McKay) Philippe received a Bachelor of Science degree in 2005 from the University of Vermont in natural resources planning with a minor in plant and soil science. She is employed as a soil scientist with the USDA-Natural Resources Conservation Service in Saint Johnsbury, Vermont. She lives with her husband and two cats in Newport, Vermont.