



LAND COVER MAPPING: A COMPARISON BETWEEN MANUAL DIGITIZING AND AUTOMATED CLASSIFICATION OF BLACK AND WHITE HISTORICAL AERIAL PHOTOGRAPHY

By

WALEED ABDULAZIZ AWWAD

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2003


Copyright 2003 by Waleed Abdulaziz Awwad


I dedicate this to my wife Nora and our children Khalid, Waad and Dania for all the sacrifices they made in order to have my goal accomplished, for all the times of joy, meals, and weekends I was not there with them, and for all their encouragement to complete this work.


ACKNOWLEDGMENTS

Praise and thanks go to Allah who blessed me with his mercy and guidance and by surrounding me with people of love and support. Thanks go to my parents and family members for their continuous prayers, concern and support. I extend my appreciation to individuals at the St. Johns River Water Management District, Environmental Science Division, John Stenberg and Walter Godwin, for their support in explaining the need for this study and providing the data. My appreciation goes to committee members Dr. Bon Dewitt and Dr. Ilir Bejleri for encouragement and guidance to complete this work. Special thanks go to my chairman, Dr. Scot E. Smith, for dedicating his time to review my work and for being there for me. He was always concerned about my research progress and academic development.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER
1 INTRODUCTION
   Purpose of Study
   Study Area
2 BACKGROUND
   Need for Historical Aerial Photography
   Scanning and Rectification
   Digitizing
   Automatic Image Classification
   Classification Systems
3 DATA PREPARATION
   Texture
   Principal Components Analysis
4 ANALYSIS
   Manual Digitizing
   Automated Classification
      Unsupervised Classification
      Supervised Classification


5 DISCUSSION AND COMPARISON
6 CONCLUSIONS
LIST OF REFERENCES
BIOGRAPHICAL SKETCH


LIST OF TABLES

3-1 Matrix of the correlation coefficient between bands in the image
3-2 Percentage of variance explained by eigenvalues of the components
5-1 Classification error matrix showing areas of pixels (hectares)
5-2 Classification error matrix showing amount of accuracy (%)


LIST OF FIGURES

1-1 Study area selected from Lake County mosaic
3-1 Procedural flowchart
3-2 Study site. a) 1941 initial aerial photograph, b) subset image
3-3 Skewness textures of the image
3-4 Variance textures of the image
3-5 Mean Euclidean distance textures of the image
3-6 A 3-band combination of original and texture layers stacked
3-7 A 3-band combination of the first five components
4-1 Image classified using on-screen digitizing method
4-2 Unsupervised classification results
4-3 Supervised classification results
5-1 Change detection assessment showing uncommon/common pixels


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

LAND COVER MAPPING: A COMPARISON BETWEEN MANUAL DIGITIZING AND AUTOMATED CLASSIFICATION OF BLACK AND WHITE HISTORICAL AERIAL PHOTOGRAPHY

By Waleed Abdulaziz Awwad

May 2003

Chair: Dr. Scot E. Smith
Major Department: Civil and Coastal Engineering

This research compared the results of automated (supervised) classification and manual digitizing. The input data were a single black and white aerial photograph taken in 1941. The image was scanned and georectified. An automated classification technique was performed on the image and compared to the manual digitizing method in terms of the number of pixels recognized. Traditional on-screen digitizing was performed first. Features extracted by this method were categorized into nine classes: water, aquatic vegetation, terrestrial vegetation, barren, citrus fields, cropland, rangeland, scrubland, and urban. Texture analysis was applied to the aerial photograph in order to reconstruct the original image as a multi-band image. Texture analysis was performed through three measures using five different size moving windows, or kernels. The three texture measures were skewness, variance and mean Euclidean distance. Five different


kernels were applied to each measure to create fifteen texture layers. A very detailed multi-band image was created. The study found that using different size moving windows, or kernels, for three texture measures was beneficial for this type of radiometric enhancement of historical photographs. Principal components analysis (PCA) was performed on that image to reduce redundancy in the data and remove noise. A new multi-band image was then created that included the original image and the most useful spectral information for image processing and automated classification. The automated classifications applied to the new data set were unsupervised and supervised. Each method classified the image into the same nine classes used for the manual method. The accuracy of the supervised classification was compared to the traditional on-screen digitizing.


CHAPTER 1
INTRODUCTION

There has always been a need for efficient, cost-effective, accurate and automated map updating. Without a means to produce these maps, agencies that rely on maps are subject to project delays and other problems. Remote sensing imagery (aerial photographs, satellite imagery, etc.) is an important source for updating maps, and historical imagery is an important source of information for documenting land use/land cover change over time. This study examined three options for using aerial photography taken in 1941 to create land use/land cover maps of a rural part of central Florida. The objective of the study was to determine whether there were suitable alternatives to screen digitizing for land use/land cover classification. Creation of accurate land cover maps using black and white aerial photography taken 60 years ago faces a number of potential problems, including: (1) poor quality of the nitrate-based film, (2) relatively poor lens quality, (3) little or no ground-truth information and (4) limited geographic references for image geo-referencing. Limited ground-truthing information tends to result in errors in land cover classification and spatial positioning. Traditional map creation from aerial photography involves manual digitizing of recognizable spectral and spatial features followed by review and editing. More modern techniques involve automated feature recognition. This approach saves time, and therefore money, but are the results accurate enough for the purposes of a water management district and other users?


This thesis compares the digitization methods in order to determine the feasibility of using automated signature classification to shorten the time required to create, review and edit the map.

Purpose of Study

The objective of this research was to compare three approaches to land use and land cover mapping: (1) manual on-screen digitizing, (2) unsupervised classification, and (3) supervised classification. A detailed comparison between the classes in each approach was done at the end of this study. The manual on-screen digitizing method was used as the reference data for the summary statistics.

Study Area

The study area was located in central Florida in portions of two of the seven lakes (Griffin, Harris/Little Harris, Eustis, Yale, Beauclair, Dora, and Weir) in the Upper Ocklawaha River Basin (Figure 1-1). This study focused on portions of lakes Yale and Eustis. The photograph selected for this study was visually the best quality scanned photograph of the 11 1:20,000-scale photographs required to cover the lakes. The original photograph was a 9 x 9 inch black and white vertical aerial contact print taken on February 16, 1941. The photograph was scanned and geo-referenced. The area represented in the image was approximately 18 sq km. I attempted to locate the photo log for the aerial photographic mission in order to get more information about the camera, lens, film type, etc., but the company that flew the flight lines could not be located and may no longer be in business.


Figure 1-1. Study area selected from Lake County mosaic.


CHAPTER 2
BACKGROUND

Need for Historical Aerial Photography

Historical aerial photography is a major source of information about the way in which land use and land cover have developed over time. The most obvious way of using this imagery is to view it, either on a computer screen or as a printed copy, using image software. Most imagery software permits processing the image in a variety of ways, including scale change, filtering and color conversion. Further processing is required to interpret and quantify the information. Because most historic maps are inaccurate and difficult to work with, an initiative of the St. Johns River Water Management District (SJRWMD) resulted in a project in which a series of black and white aerial photographs taken between 1937 and 1947 were scanned and orthorectified. The purpose was to give planners and researchers access to a digital database.

Scanning and Rectification

The scanning was done in 8-bit TIFF grayscale at a resolution of 500 dpi. The scanning equipment was an Anatech Eagle 4080 large format black and white scanner running I-Scan software on a Unix computer. The geometric accuracy of the scanner was 10 thousandths of an inch over a length of 40 inches. The scanned images had a ground cell size of one meter. Geometric rectification was done to place the images in a defined coordinate system. It was then possible to work with the images as digital maps: scale, overlay, etc.
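As a consistency check (not part of the SJRWMD report), the one-meter ground cell follows directly from the 1:20,000 photo scale and the 500 dpi scan resolution:

```python
# Ground cell size implied by scan resolution and photo scale.
# At 500 dpi one scanned pixel spans 1/500 inch on the print; at a
# 1:20,000 scale that corresponds to 20,000/500 inch on the ground.
INCH_M = 0.0254          # meters per inch
dpi = 500                # scan resolution
scale = 20_000           # photo scale denominator (1:20,000)

pixel_on_print_m = INCH_M / dpi
ground_cell_m = pixel_on_print_m * scale
print(round(ground_cell_m, 3))   # ~1.016 m, consistent with the stated 1 m cell
```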


The company responsible for image rectification wrote a summary report to SJRWMD. The report describes the process used to rectify the aerial photos using a derived camera model, ground control points (GCPs) and digital orthophoto quarter quadrangles (DOQQs) taken in 1995. ERDAS Imagine's OrthoBASE (version 8.4) was used to rectify a block of photos. An index of photos for each county was assembled and then rectified. Composition of the photographs introduced a large initial error because of the accumulation of misalignment. Corresponding GCPs on both the DOQQs and the unrectified images were located first. A standard rectification procedure was used to find the most accurate points for use as GCPs, such as road intersections and buildings. Change of features in the landscape between 1941 and 1995 presented a challenge in the collection of GCPs. After collecting an even distribution of GCPs (about 20 throughout the photos), automatic tie point collection followed for the entire block of photos. Some points were also tied manually. Finally, the projection parameters used were as follows:

Projection: UTM, Zone 17
Spheroid: GRS 1980
Datum: NAD83

Digitizing

Digitizing is the process of converting paper maps to digital form. It has developed as geographic information systems (GISs) have grown and, consequently, required new techniques for computer mapping. The digitizing tablet is the electronic form of the drafting tablet (Clarke, 1992). It is used in the geocoding process to register a paper map in a real-world coordinate system.


Another digitizing technique is scanning. Scanners turn map sheets into digital images that display on computer screens. Different types of scanners are available according to the map size (small or large), the print type (paper or film), the output resolution (high or low), etc. These map sheets are then made ready for geocoding. Today, on-screen digitizing is a widely used technique. It is an increasingly common way to get vector (point, line or polygon) information from existing scanned images, and it is used to acquire new data from imagery and to compare and detect changes over time. By displaying images on the screen, features are drawn to represent those on the image. ArcMap (version 8.1), GIS software from ESRI, was used in this thesis to digitize the historical aerial photograph.

Automatic Image Classification

There are two approaches employed in image processing for identifying features in an image: unsupervised and supervised. Both are used to classify the image for thematic interpretation. According to Lillesand and Kiefer (2000), the objective of these operations is to replace visual analysis of the image data with quantitative techniques for automating the identification of features in a scene. The idea is to categorize all pixels with the same spectral response into one class. In an unsupervised classification, the spectral response of pixels in the image is located automatically. In this process pixels are assigned to clusters that correspond to a feature, or class, on the ground. Lillesand and Kiefer (2000) suggest comparing the classified data with reference data to learn more about the properties of the spectral classes. The number of classes is determined by the analyst, depending on visual interpretation and/or ground truthing. Clusters from the classified image consist of digital number (DN) values for the pixels.


Supervised classification, on the other hand, is semi-automated, since the analyst must collect numerical data from the image in the form of training sets. Training sets represent areas of the land use/land cover (LULC). The software is thus trained to classify the image. Each pixel is categorized into a LULC class assigned by the analyst. All pixels in a category describe the spectral response of their category (Lillesand and Kiefer, 2000).

Classification Systems

To classify any map or digital image, one must develop or apply a classification scheme. A large number of land use and land cover (LULC) classification schemes have been developed. The United States Geological Survey (USGS), for example, adopted a widely used system developed by Anderson et al. (1976) for remotely sensed data (Lillesand and Kiefer, 2000). The Anderson (1976) system has a hierarchical framework for the levels it contains. The Florida Land Use, Land Cover Classification System (FLUCCS) is a hierarchical classification system based on the Anderson system. In this study, classes were defined as follows:

1. Urban or Built-Up Land
   Residential
   Commercial Services
   Industrial
   Transportation, Communication
2. Agricultural Land
   Cropland and Pasture
   Orchard, Groves, Vineyards, Nurseries
3. Rangeland
   Herbaceous Rangeland
   Shrub and Brush Rangeland
   Mixed Rangeland


4. Forest Land
   Deciduous Forest Land
   Evergreen Forest Land
   Mixed Forest Land
5. Water
   Streams and Canals
   Lakes
6. Wetland
   Forested Wetlands
   Non-forested Wetlands
7. Barren Land
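The hierarchical scheme above maps naturally onto a nested structure. As a minimal sketch (class names taken from the list above; the lookup helper is illustrative, not part of FLUCCS):

```python
# Level I -> Level II classes from the FLUCCS-style scheme used in this study.
scheme = {
    "Urban or Built-Up Land": ["Residential", "Commercial Services",
                               "Industrial", "Transportation, Communication"],
    "Agricultural Land": ["Cropland and Pasture",
                          "Orchard, Groves, Vineyards, Nurseries"],
    "Rangeland": ["Herbaceous Rangeland", "Shrub and Brush Rangeland",
                  "Mixed Rangeland"],
    "Forest Land": ["Deciduous Forest Land", "Evergreen Forest Land",
                    "Mixed Forest Land"],
    "Water": ["Streams and Canals", "Lakes"],
    "Wetland": ["Forested Wetlands", "Non-forested Wetlands"],
    "Barren Land": [],
}

def level_one(subclass):
    """Find the Level I category that contains a Level II subclass."""
    for major, subs in scheme.items():
        if subclass in subs:
            return major
    return None

print(level_one("Lakes"))   # -> Water
```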


CHAPTER 3
DATA PREPARATION

A flow chart was created to illustrate the steps taken in this study (Figure 3-1). It consists of four major steps: 1) pre-processing, 2) image construction techniques, 3) image classification and 4) manual digitizing. This chapter focuses on pre-processing and image construction techniques. The other two steps are discussed in Chapter 4.

Figure 3-1. Procedural flowchart.


Most aerial photographs are indexed by a naming convention; for example, the date they were taken appears in the upper left corner, and a three-letter code of the county name and a photo index number appear on the right. The TIFF-format aerial photo was imported into ERDAS Imagine (version 8.5) in order to apply image-processing techniques (Figure 3-2a). Before applying image processing, the photo was trimmed by selecting an area of interest (AOI) (Figure 3-2b). A contrast stretch was then applied to the image using a histogram equalizer.

Figure 3-2. Study site. a) 1941 initial aerial photograph, b) subset image.
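A minimal version of such a contrast stretch, here a simple min-max linear stretch rather than the exact ERDAS operation, could look like:

```python
import numpy as np

def linear_stretch(band, out_min=0, out_max=255):
    """Linearly rescale a grayscale band to the full 8-bit range."""
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    if hi == lo:                      # flat image: nothing to stretch
        return np.full_like(band, out_min, dtype=np.uint8)
    scaled = (band - lo) / (hi - lo) * (out_max - out_min) + out_min
    return scaled.round().astype(np.uint8)

dark = np.array([[60, 80], [100, 120]])   # low-contrast sample patch
print(linear_stretch(dark))               # values now span 0..255
```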


Older aerial photography was the first generation of remotely sensed data and, thus, was not digital. The original films and paper prints must be scanned in order to produce computerized data. Much of the time, shadows and environmental factors affect the interpretation of single-band images. A histogram of such an image provides 256 brightness values ranging from 0 to 255. Only personnel familiar with an area or with high interpretation skills can categorize this type of data. Those personnel would use nine image analysis elements to do the job: tone, texture, pattern, shape, size, height, shadow, site and association (Lillesand, 2000). Fortunately, there are a number of spatial filtering techniques to alleviate this problem, including convolution, Fourier transform, principal components analysis and texture analysis. In this study, texture was applied through three statistical measures performed on pixels in different size moving windows, or kernels. The measures were skewness, variance and mean Euclidean distance. The kernels used, small and large, were 3x3, 5x5, 7x7, 13x13, and 25x25 surrounding each pixel. A total of fifteen texture layers (3 measures x 5 kernels) were produced and added to the original to create a multi-band image. Principal components analysis was then applied to reduce redundancy in the data, remove noise and retain the most useful information.

Texture

Texture analysis was applied to improve the tonal information of the image. Texture is a major property of an image. It represents important tonal information about the structural arrangements of features in the image. Texture determines the overall visual smoothness or coarseness of the image features (Lillesand and Kiefer, 2000). The role of texture is to create a brightness value for each pixel in the image. Texture is measured statistically using a moving window throughout the image. Windows of 3x3


and 5x5 are examples of this type of neighboring-pixel operation. Results of texture algorithms show the benefits of texture in supervised and unsupervised classifications (Jensen, 1996). Kushwaha et al. (1994) state that applications like terrain analysis, land cover and forest mapping, environmental monitoring, and ecological studies have successfully used texture analysis. Statistical operators including skewness, kurtosis, variance, standard deviation, maximum and mean Euclidean distance are used in texture analysis to characterize the distribution of gray levels in an image. In this thesis, three textural statistical approaches were applied as constructing techniques on the black and white historical aerial photograph: skewness, variance and mean Euclidean distance, computed with different size windows of pixels in the image. Skewness represents the asymmetry of the intensity distribution in the image. Variance measures the variability in the image. Mean Euclidean distance measures the mean distance between the values of each pixel and its neighbors. Hsu (1978) applied texture analysis to black and white aerial photography and found that texture measures were beneficial for the development of an automatic feature extractor system (Hsu, 1978). First, skewness was performed on the original image using the five different size windows. This filtering of the image generated five new images with enhanced edges. Figure 3-3 shows the skewness of pixels in a portion of the aerial photograph.


Figure 3-3. Skewness textures of the image. a) Portion from original image, b) 3x3 skewness, c) 5x5 skewness, d) 7x7 skewness, e) 13x13 skewness, f) 25x25 skewness.


Variance was then applied to the data, which also generated images with enhanced edges. Figure 3-4 shows the variance in a portion of the aerial photograph.

Figure 3-4. Variance textures of the image. a) Portion from original image, b) 3x3 variance, c) 5x5 variance, d) 7x7 variance, e) 13x13 variance, f) 25x25 variance.


The last measure applied was the mean Euclidean distance, computed through the same kernels used for the previous measures. Figure 3-5 shows the output from each kernel on a portion of the photograph.

Figure 3-5. Mean Euclidean distance textures of the image. a) Portion of the original image, b) 3x3 mean Euclidean distance, c) 5x5 mean Euclidean distance, d) 7x7 mean Euclidean distance, e) 13x13 mean Euclidean distance, f) 25x25 mean Euclidean distance.
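The three moving-window measures can be sketched with SciPy's generic window filter. This is a simplified stand-in for the ERDAS texture operators, and the single-band mean-Euclidean-distance definition here (mean absolute distance of neighbors from the center pixel) is an assumption:

```python
import numpy as np
from scipy.ndimage import generic_filter

def window_variance(values):
    return np.var(values)

def window_skewness(values):
    mu, sd = values.mean(), values.std()
    if sd == 0:
        return 0.0
    return np.mean(((values - mu) / sd) ** 3)

def window_mean_euclid(values):
    # Mean absolute (one-band Euclidean) distance of neighbors to the center.
    center = values[len(values) // 2]
    return np.mean(np.abs(values - center))

# Small random stand-in for the scanned single-band photograph.
img = np.random.default_rng(0).integers(0, 256, (32, 32)).astype(float)

layers = []
for k in (3, 5, 7, 13, 25):                    # the five kernel sizes used
    for fn in (window_skewness, window_variance, window_mean_euclid):
        layers.append(generic_filter(img, fn, size=k))

stack = np.stack([img] + layers)               # original + 15 texture layers
print(stack.shape)                             # 16-band constructed image
```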


Texture measures generated a total of fifteen layers that were added to the original to produce a multi-band image. Figure 3-6 shows a combination of three of the sixteen bands in the image, displayed as a false RGB color composite. The constructed image is useful for edge detection and image enhancement.

Figure 3-6. A 3-band combination of original and texture layers stacked.

A correlation matrix was then generated to show the relationship among bands in the new multi-band image. A correlation coefficient between two bands ranges from +1 to -1. A value of +1 represents a perfect relationship between the bands and a value of -1 represents an inverse relationship. Highly correlated bands are thus very similar in their spectral information. The correlation between bands in the data set showed very low coefficients between the original layer (Band 1) and each of the texture measures (Bands 2 through 16) (Table 3-1). If the between-band correlation involving a particular band is low, that band is yielding distinct information on its own (Jensen, 1996). Some bands carry similar information and correlate highly with each other. So, in order to work


with a non-correlated data set, principal components analysis was applied to the multi-band image.

Table 3-1. Matrix of the correlation coefficient between bands in the image.
        1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
 1   1.00 -0.19 -0.27 -0.31 -0.37 -0.43  0.15  0.16  0.16  0.16  0.18  0.15  0.16  0.17  0.20  0.21
 2  -0.19  1.00  0.69  0.51  0.27  0.17 -0.21 -0.17 -0.15 -0.10 -0.05 -0.21 -0.18 -0.17 -0.13 -0.09
 3  -0.27  0.69  1.00  0.82  0.45  0.27 -0.23 -0.21 -0.19 -0.13 -0.08 -0.22 -0.22 -0.21 -0.17 -0.12
 4  -0.31  0.51  0.82  1.00  0.64  0.36 -0.23 -0.22 -0.21 -0.16 -0.10 -0.22 -0.23 -0.22 -0.18 -0.14
 5  -0.37  0.27  0.45  0.64  1.00  0.64 -0.19 -0.19 -0.19 -0.20 -0.18 -0.17 -0.19 -0.20 -0.23 -0.19
 6  -0.43  0.17  0.27  0.36  0.64  1.00 -0.16 -0.16 -0.16 -0.17 -0.18 -0.15 -0.17 -0.18 -0.20 -0.21
 7   0.15 -0.21 -0.23 -0.23 -0.19 -0.16  1.00  0.95  0.87  0.72  0.59  0.98  0.92  0.84  0.68  0.56
 8   0.16 -0.17 -0.21 -0.22 -0.19 -0.16  0.95  1.00  0.97  0.85  0.70  0.92  0.96  0.92  0.77  0.63
 9   0.16 -0.15 -0.19 -0.21 -0.19 -0.16  0.87  0.97  1.00  0.93  0.77  0.84  0.92  0.93  0.81  0.68
10   0.16 -0.10 -0.13 -0.16 -0.20 -0.17  0.72  0.85  0.93  1.00  0.91  0.69  0.80  0.87  0.89  0.78
11   0.18 -0.05 -0.08 -0.10 -0.18 -0.18  0.59  0.70  0.77  0.91  1.00  0.57  0.68  0.76  0.87  0.89
12   0.15 -0.21 -0.22 -0.22 -0.17 -0.15  0.98  0.92  0.84  0.69  0.57  1.00  0.93  0.86  0.69  0.57
13   0.16 -0.18 -0.22 -0.23 -0.19 -0.17  0.92  0.96  0.92  0.80  0.68  0.93  1.00  0.96  0.81  0.67
14   0.17 -0.17 -0.21 -0.22 -0.20 -0.18  0.84  0.92  0.93  0.87  0.76  0.86  0.96  1.00  0.90  0.75
15   0.20 -0.13 -0.17 -0.18 -0.23 -0.20  0.68  0.77  0.81  0.89  0.87  0.69  0.81  0.90  1.00  0.89
16   0.21 -0.09 -0.12 -0.14 -0.19 -0.21  0.56  0.63  0.68  0.78  0.89  0.57  0.67  0.75  0.89  1.00

Principal Components Analysis

Principal components analysis (PCA) was performed for reasons of economy: it is a useful process for representing the same image in fewer bands (Jensen, 1996). PCA runs only on multi-band images, and it reduces data redundancy in the image.
Because of that, PCA was done on the 16-band image that contained the original layer and the fifteen texture measures. In this process, the original data are transformed onto principal components: PC1, PC2, etc. The first component contains the maximum variation of the data set and the last components contain mostly noise. Eigenvalues, the coefficients resulting from this technique, measure the variance in the data. Table 3-2 shows how much variance of the data each principal component represents and the percentage it explains of the total variance.


Components 1 through 5 account for a total of 98.10% of the variance in the entire 16-band image. This made it easier and more productive to work with fewer bands.

Table 3-2. Percentage of variance explained by eigenvalues of the components.
Principal Component   Eigenvalue   Percentage
PC1                      2799.58        78.14
PC2                       435.08        12.14
PC3                       167.02         4.66
PC4                        66.62         1.86
PC5                        45.99         1.28
PC6                        24.68         0.69
PC7                        18.40         0.51
PC8                         8.53         0.24
PC9                         6.36         0.18
PC10                        3.74         0.10
PC11                        2.68         0.07
PC12                        2.52         0.07
PC13                        0.73         0.02
PC14                        0.66         0.02
PC15                        0.15         0.00
PC16                        0.04         0.00
Total variance           3582.78

A new image was then generated from this analysis with the first five components as bands. Figure 3-7 shows a 3-band combination of the five-band image.

Figure 3-7. A 3-band combination of the first five components.
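The eigenvalue bookkeeping behind Table 3-2 can be reproduced with a few lines of NumPy. This is a generic PCA sketch on synthetic correlated bands, not the actual 16-band image:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in for the 16-band image: pixels as rows, bands as
# columns, built from a few latent signals so bands are highly correlated.
latent = rng.normal(size=(5000, 3))
mixing = rng.normal(size=(3, 16))
bands = latent @ mixing + 0.05 * rng.normal(size=(5000, 16))

cov = np.cov(bands, rowvar=False)              # 16x16 band covariance
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]              # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

percent = 100 * eigvals / eigvals.sum()        # Table 3-2 style percentages
components = (bands - bands.mean(0)) @ eigvecs # transformed data (PC1, PC2, ...)
print(percent[:5].sum())                       # first PCs carry nearly all variance
```

Keeping only the leading components, as done here with the first five, is exactly the data reduction the chapter describes.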


As a result of the image construction techniques applied in this chapter, it was decided to use the 5-band image produced by the texture analysis and principal components analysis to run the automated image classification.


CHAPTER 4
ANALYSIS

This chapter discusses how the two methods used for classifying the aerial photography were applied to the image. Traditionally, the most common method for classification of historical photography is on-screen digitizing. Its result was used as the post-processing base map for the computer-classified images. In all methods, a subclass in the classification scheme might be shown as a major class on the classified image, depending on the land cover of the study area. For example, cropland is a subclass in the classification scheme; however, it was extracted as a major class because of its relatively extensive representation in the scene.

Manual Digitizing

In this approach to classifying images, features are manually traced on top of the scanned raster image. The process is usually called on-screen digitizing. The single-band original brightness layer was used in the process. ArcMap was used to apply this technique. First, the image was added as a new layer on the display. Blank shapefiles were created using ArcCatalog to represent the classes within the image. Each shapefile was added as new data to the file in order to extract information. The process involved activating the Editor Tool utility and starting the digitizing process. Each feature was zoomed in on the screen, because the most common problem in this technique is working too far away from the scanned image, which can lead to a poor product due to misclassification of features. The accuracy of the extracted features during classification


was based on what was visually believed to be present, since most of the area had changed. Each category in the display was given a different color to be distinct. Figure 4-1 shows a total of nine categories digitized from the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban.

Figure 4-1. Image classified using on-screen digitizing method.

Automated Classification

Both unsupervised and supervised classifications were applied to the five-band image created earlier (Chapter 3).

Unsupervised Classification

An unsupervised classification was performed first, in which signatures were located automatically. This technique involved a random sampling of initially unknown cover types. It was done with ISODATA using an arbitrary number of clusters equal to


35 and a convergence threshold of 95%. The next step was to give the same colors to similar clusters using the Raster Attribute Editor in ERDAS. Clusters were aggregated into more general categories (for example, corn and hay into terrestrial vegetation). Figure 4-2 presents a total of nine major classes assigned to the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban.

Figure 4-2. Unsupervised classification results.

Supervised Classification

The second type of automated classification in this study was supervised. A supervised classification was done to locate specific areas within the image that show homogeneous categories of the land cover types. A signature file was created for the image using a combination of the AOI seed grow tool and manually creating polygon

PAGE 33

AOIs over identifiable cover types. The process involves using the seed properties utility in the software to set the neighborhood mode, the geographic constraints of the pixels, and the spectral Euclidean distance. Multiple AOIs were drawn and grouped for a single category. Those areas represented the best match of pixels. The process was repeated for every known category in the image. The supervised classification used the maximum likelihood parametric classification rule. A total of nine signature classes were created for the signature file associated with the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban. Figure 4-3 shows the results of the supervised classification.

The intent of this study was not to determine a multitude of specific land cover or vegetation classes, but to evaluate automated classification of the aerial photograph. Statistical analysis of the results was performed (see Chapter 5). The results were then compared to detect the overall change between the on-screen digitized image and the supervised classification image.
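The maximum likelihood rule assigns each pixel to the class whose training signature gives it the highest Gaussian likelihood. A minimal NumPy sketch of this idea follows; the function names and the toy two-band signatures are illustrative assumptions, not the ERDAS implementation:

```python
import numpy as np

def train_ml(training):
    """Fit a Gaussian signature (mean, inverse covariance, log-determinant)
    per class from training pixels.

    training: dict mapping class name -> (n_samples, n_bands) array.
    """
    stats = {}
    for name, X in training.items():
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        stats[name] = (mu, np.linalg.inv(cov), np.log(np.linalg.det(cov)))
    return stats

def classify_ml(pixels, stats):
    """Assign each pixel in (n_pixels, n_bands) to the class with the
    highest Gaussian log-likelihood (the maximum likelihood rule)."""
    names = list(stats)
    scores = []
    for name in names:
        mu, inv_cov, log_det = stats[name]
        d = pixels - mu
        # Mahalanobis distance of each pixel to the class mean
        mahalanobis = np.einsum('ij,jk,ik->i', d, inv_cov, d)
        scores.append(-0.5 * (log_det + mahalanobis))
    return np.array(names)[np.argmax(np.stack(scores), axis=0)]
```

With equal class priors, maximizing the Gaussian log-likelihood is equivalent to minimizing the log-determinant of the class covariance plus the Mahalanobis distance, which is what the score computes.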

PAGE 34

Figure 4-3. Supervised classification results.
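For reference, the core of the ISODATA clustering used for the unsupervised result (Figure 4-2) is an iterative assign-and-update loop that stops once the fraction of pixels keeping their cluster exceeds a convergence threshold. The sketch below shows only that core loop; full ISODATA also splits and merges clusters between passes, and the deterministic seeding used here is a simplification, not ERDAS's initialization:

```python
import numpy as np

def isodata_core(X, k, convergence=0.95, max_iter=50):
    """k-means-style core of ISODATA: reassign pixels to nearest cluster
    mean, update the means, and stop when the fraction of unchanged
    assignments reaches the convergence threshold."""
    X = np.asarray(X, dtype=float)
    centers = X[:k].copy()          # simple deterministic seeding for the sketch
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # distance of every pixel to every cluster center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        unchanged = (new_labels == labels).mean()
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
        if unchanged >= convergence:
            break
    return labels, centers
```

The `convergence=0.95` argument plays the role of the 95% convergence threshold used in this study, and `k` the role of the 35 arbitrary clusters.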

PAGE 35

CHAPTER 5
DISCUSSION AND COMPARISON

A visual evaluation of the classified images found differences and similarities in the classification. A statistical comparison between the automated method and the manual method was also done. An error matrix was created by pixel-to-pixel comparison of the supervised classification and the on-screen digitized (reference data) methods. Table 5-1 shows a square array of the areas of pixels of the supervised classification recognized in the on-screen digitizing method. Water and citrus were the classes best recognized in the classified data. Overall accuracy was calculated as the sum of the numbers along the major diagonal divided by the total number of pixel areas in the data.

Table 5-1. Classification error matrix showing areas of pixels (hectares)

                      Manual (Reference Data)
              1    2    3    4    5    6    7    8    9   Row Total
Automated
     1       53    2    7   40   14   33    2   14   23     188
     2        1   53    3    2    4    1   59   20    1     144
     3        4    5   18    1    3    1   20   18   35     104
     4        3    0    0   88    0   12    0    0    2     106
     5       14    0   12    7   32   26    0   24   11     127
     6        7    0    4   11   16   95    0    4    6     143
     7        0   25    0    2    4    0  166    3    0     200
     8       13    9    8    2   11    4   21   44    8     121
     9       31    1   16    5    7    7    1    8  159     235
Column
Total       126   95   69  157   92  179  268  135  247    1368

1. Urban, 2. Aquatic vegetation, 3. Terrestrial vegetation, 4. Barren, 5. Rangeland, 6. Cropland, 7. Water, 8. Scrubland, 9. Citrus.

Overall accuracy = (53 + 53 + 18 + 88 + 32 + 95 + 166 + 44 + 159) / 1368 = 52%

PAGE 36

Also, from the error matrix, the percentage that each category occupies in the reference data was calculated. Table 5-2 shows the percentages of those areas recognized by the manual digitizing method.

Table 5-2. Classification error matrix showing amount of accuracy (%)

                      Manual (Reference Data)
              1    2    3    4    5    6    7    8    9
Automated
     1      42%   2%  10%  26%  15%  18%   1%  10%   9%
     2       1%  56%   4%   1%   4%   0%  22%  15%   0%
     3       3%   6%  25%   1%   3%   0%   7%  14%  14%
     4       3%   0%   0%  56%   0%   7%   0%   0%   1%
     5      11%   0%  17%   4%  35%  15%   0%  18%   5%
     6       6%   0%   6%   7%  17%  53%   0%   3%   3%
     7       0%  26%   0%   1%   5%   0%  62%   2%   0%
     8      11%   9%  12%   1%  12%   2%   8%  33%   3%
     9      24%   1%  24%   3%   8%   4%   0%   6%  65%

1. Urban, 2. Aquatic vegetation, 3. Terrestrial vegetation, 4. Barren, 5. Rangeland, 6. Cropland, 7. Water, 8. Scrubland, 9. Citrus.

Another statistical measure computed from this error matrix is KAPPA (Cohen, 1960). It is another method of measuring accuracy in the classified data. Performing KAPPA yields the KHAT statistic, calculated as

    \hat{K} = \frac{N \sum_{i=1}^{r} x_{ii} \; - \; \sum_{i=1}^{r} (x_{i+} \times x_{+i})}{N^2 \; - \; \sum_{i=1}^{r} (x_{i+} \times x_{+i})}

PAGE 37

where

N = the total number of observations in the matrix
r = the number of classes in the data
x_ii = the value in row i and column i (the major diagonal)
x_+i = the column i total
x_i+ = the row i total

Sum of x_ii = 53 + 53 + 18 + 88 + 32 + 95 + 166 + 44 + 159 = 708

Sum of (x_i+ x x_+i) = (126 x 188) + (95 x 144) + (69 x 104) + (157 x 106) + (92 x 127) + (179 x 143) + (268 x 200) + (135 x 121) + (247 x 235) = 192,793

\hat{K} = (1368 x 708 - 192,793) / (1368^2 - 192,793) = 0.46

The value of K ranges from +1.0 to 0.0 and represents three levels of agreement for the accuracy assessment of the data: strong agreement (1.0 to 0.80), moderate agreement (0.80 to 0.40) and poor agreement (0.40 to 0) (Landis and Koch, 1977). The KHAT value in this study was found to be 46%. The difference between the KHAT value and the overall accuracy computed from the same error matrix (52%) is that the former incorporates the off-diagonal elements, while the latter uses only the major diagonal elements and excludes the others.

Finally, a change detection assessment was performed to test the accuracy of the supervised classification compared to on-screen manual digitizing. In this study, this test was simplified to show the uncommon/common pixels for both methods. Figure 5-1 shows the results of this comparison of pixels by using two colors: black for uncommon

PAGE 38

and green for common recognition in the two images. Statistics from this comparison showed that 58% of the supervised classification was in agreement with the on-screen digitizing.

Figure 5-1. Change detection assessment showing uncommon/common pixels.

Analysts have to decide on the amount of error accepted if automated classification is to be used to interpret features from black and white historical aerial photography. In practice, this study suggests that a combination of on-screen digitizing and automated classification of certain features is a fair compromise between speed and accuracy of detail. Dealing with historic maps makes it hard to verify the results of either method: the geometry of the image may vary, and the landscape has changed. But if adequate geometric correction can be achieved, an automatic interpretation of historical aerial photographs can also be done in a timely manner.
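The overall accuracy and KHAT computations can be reproduced directly from the cell values of Table 5-1. Note that because the published hectare values are rounded, totals recomputed from the cells differ slightly from the printed row and column totals, so this sketch yields approximately 52% and 0.45 rather than exactly the reported 52% and 0.46:

```python
import numpy as np

# Cell values from Table 5-1 (hectares); rows = automated, columns = manual.
matrix = np.array([
    [53,  2,  7, 40, 14, 33,   2, 14,  23],
    [ 1, 53,  3,  2,  4,  1,  59, 20,   1],
    [ 4,  5, 18,  1,  3,  1,  20, 18,  35],
    [ 3,  0,  0, 88,  0, 12,   0,  0,   2],
    [14,  0, 12,  7, 32, 26,   0, 24,  11],
    [ 7,  0,  4, 11, 16, 95,   0,  4,   6],
    [ 0, 25,  0,  2,  4,  0, 166,  3,   0],
    [13,  9,  8,  2, 11,  4,  21, 44,   8],
    [31,  1, 16,  5,  7,  7,   1,  8, 159],
], dtype=float)

N = matrix.sum()                       # total observations
diagonal = np.trace(matrix)            # correctly classified areas
overall_accuracy = diagonal / N

# KHAT: (N * sum(x_ii) - sum(row_i * col_i)) / (N^2 - sum(row_i * col_i))
chance = (matrix.sum(axis=1) * matrix.sum(axis=0)).sum()
khat = (N * diagonal - chance) / (N ** 2 - chance)
```

KHAT discounts the agreement expected by chance (the `chance` term built from the row and column totals), which is why it comes out lower than the raw overall accuracy.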

PAGE 39

CHAPTER 6
CONCLUSIONS

The main goal of this study was to assess the difference in the results of a land use/land cover classification of black and white aerial photography between (a) a method using on-screen digitizing and (b) a method using automatic classification.

Varying the size of the pixel windows in the texture analysis proved to be a successful image construction technique for the photograph of this study. It represented the spatial distribution of the tonal information of the pixels in the image. Many texture measures were available, but only three were tested in this study: skewness, variance and mean Euclidean distance, using windows ranging from 3x3 to 25x25. All measures contributed well and had low correlation with the original image. Principal component analysis was therefore performed on the multi-band image to reduce redundancy in the data and compress the information into fewer bands. Statistics showed that the first five principal components represented more than 98% of the data. The new 5-band image was then classified automatically.

In this study, statistics are presented and analysts have to decide on the amount of error accepted when using the automated classification approach to interpret historical aerial photography. Comparing the automated classification method to the manual digitizing method showed how much of each category was correctly identified. Some categories were identified more accurately than others; therefore, automated classification could work better for interpreting certain features in the photography. Automated classification (supervised or unsupervised) is a rapid way to thematically

PAGE 40

interpret images. However, the manual digitizing approach will not become obsolete, because many applications require a level of accuracy not afforded by automatic classification techniques.

PAGE 41

LIST OF REFERENCES

Clarke, Keith C. 1990. Analytical and Computer Cartography. Prentice Hall, Englewood Cliffs, New Jersey.

Cohen, J. 1960. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, Vol. 20, p. 37-46.

Hsu, Shin-Yi. 1978. Texture-Tone Analysis for Automated Land-Use Mapping. Photogrammetric Engineering and Remote Sensing, Vol. 44, p. 1393-1404.

Jensen, J.R. 1996. Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice Hall, Upper Saddle River, New Jersey.

Kushwaha, S.P.S., Kuntz, S., and Oesten, G. 1994. Application of Image Texture in Forest Classification. International Journal of Remote Sensing, Vol. 15, No. 11, p. 2273-2284.

Landis, J., and G. Koch. 1977. The Measurement of Observer Agreement for Categorical Data. Biometrics, Vol. 33, p. 159-174.

Lillesand, T., and Kiefer, R.W. 2000. Remote Sensing and Image Interpretation, 4th Edition. John Wiley & Sons, New York.

PAGE 42

BIOGRAPHICAL SKETCH

I am originally from Jeddah, Saudi Arabia. In 1995, I received a B.S. in cartography from Salem State College in Salem, Massachusetts, and joined the Exploration Organization of the Saudi Aramco Oil Company. In 2001, I started a master's degree in civil engineering with a specialization in geomatics at the University of Florida.


Permanent Link: http://ufdc.ufl.edu/UFE0000634/00001

Material Information

Title: Land Cover Mapping: A Comparison Between Manual Digitizing and Automated Classification of Black and White Historical Aerial Photography
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0000634:00001





Full Text











LAND COVER MAPPING:
A COMPARISON BETWEEN MANUAL DIGITIZING AND
AUTOMATED CLASSIFICATION OF BLACK AND WHITE
HISTORICAL AERIAL PHOTOGRAPHY















By

WALEED ABDULAZIZ AWWAD


A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2003




























Copyright 2003

by

Waleed Abdulaziz Awwad




























I dedicate this to my wife Nora and our children Khalid, Waad and Dania for all the
sacrifices they made in order to have my goal accomplished, for all the times of joy,
meals, and weekends I was not there with them, and for all their encouragement to
complete this work.















ACKNOWLEDGMENTS

Praise and thanks go to Allah who blessed me with his mercy and guidance and

by surrounding me with people of love and support. Thanks go to my parents and family

members for their continuous prayers, concern and support.

I extend my appreciation to individuals at the St. Johns River Water Management

District, Environmental Science Division, John Stenberg and Walter Godwin, for their

support in explaining the need for this study and providing the data.

My appreciation goes to committee members Dr. Bon Dewitt and Dr. Ilir Bejleri

for encouragement and guidance to complete this work. Special thanks go to my

chairman, Dr. Scot E. Smith, for dedicating his time to review my work and for being

there for me. He always was concerned about my research progress and academic

development.
















TABLE OF CONTENTS
page

ACKNOWLEDGMENTS ........................................................... iv

LIST OF TABLES ............................................................ vii

LIST OF FIGURES .......................................................... viii

ABSTRACT ................................................................... ix

CHAPTER

1 INTRODUCTION .............................................................. 1

   Purpose of Study ......................................................... 2
   Study Area ............................................................... 2

2 BACKGROUND ................................................................ 4

   Need for Historical Aerial Photography ................................... 4
   Scanning and Rectification ............................................... 4
   Digitizing ............................................................... 5
   Automatic Image Classification ........................................... 6
   Classification Systems ................................................... 7

3 DATA PREPARATION .......................................................... 9

   Texture ................................................................. 11
   Principal Components Analysis ........................................... 17

4 ANALYSIS ................................................................. 20

   Manual Digitizing ....................................................... 20
   Automated Classification ................................................ 21
      Unsupervised Classification .......................................... 21
      Supervised Classification ............................................ 22

5 DISCUSSION AND COMPARISON ............................................... 25

6 CONCLUSIONS .............................................................. 29

LIST OF REFERENCES ........................................................ 31

BIOGRAPHICAL SKETCH ....................................................... 32
















LIST OF TABLES

Table page

3-1 Matrix of the correlation coefficient between bands in the image. ..........................17

3-2 Percentage of variance explained by eigenvalues of the components........................ 18

5-1 Classification error matrix showing areas of pixels (hectares) ...............................25

5-2 Classification error matrix showing amount of accuracy (%) ................ .................26
















LIST OF FIGURES

Figure page

1-1 Study area selected from Lake County mosaic. ........................... 3

3-1 Procedural flow chart. .................................................. 9

3-2 Study site. a) 1941 initial aerial photograph, b) subset image. ....... 10

3-3 Skewness textures of the image. ....................................... 13

3-4 Variance textures of the image. ....................................... 14

3-5 Mean Euclidean distance textures of the image. ........................ 15

3-6 A 3-band combination of original and texture layers stacked. .......... 16

3-7 A 3-band combination of the first five components. .................... 18

4-1 Image classified using on-screen digitizing method. ................... 21

4-2 Unsupervised classification results. .................................. 22

4-3 Supervised classification results. .................................... 24

5-1 Change detection assessment showing uncommon/common pixels. ........... 28















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

LAND COVER MAPPING:
A COMPARISON BETWEEN MANUAL DIGITIZING AND
AUTOMATED CLASSIFICATION OF BLACK AND WHITE
HISTORICAL AERIAL PHOTOGRAPHY

By

Waleed Abdulaziz Awwad

May 2003

Chair: Dr. Scot E. Smith
Major Department: Civil and Coastal Engineering

This research compared the results of automated (supervised) classification and manual digitizing. The input data were a single black and white aerial photograph taken in 1941. The image was scanned and georectified. An automated classification technique was performed on the image and compared to the manual digitizing method in terms of the number of pixels recognized.

Traditional on-screen digitizing was performed. Features extracted by this method were categorized into nine classes: water, aquatic vegetation, terrestrial vegetation, barren, citrus fields, cropland, rangeland, scrubland, and urban.

Texture analysis was applied to the aerial photograph in order to reconstruct the original image as a multi-band image. Texture analysis was performed through three measures using five different-sized moving windows, or kernels. The three texture measures were skewness, variance and mean Euclidean distance. Five different









kernels were applied to each measure to create fifteen texture layers, producing a very detailed multi-band image. The study found that using different-sized moving windows, or kernels, for three texture measures was beneficial for this type of radiometric enhancement of historical photographs. Principal components analysis (PCA) was performed on that image to reduce redundancy in the data and remove noise. A new multi-band image was then created, including the original image and the most useful spectral information, for image processing and automated classification.

The automated classifications applied to the new data set included unsupervised and supervised methods. Each method divided the image into the same nine classes used for the manual method. The accuracy of the supervised classification was compared to the traditional on-screen digitizing.














CHAPTER 1
INTRODUCTION

There has always been a need for efficient, cost-effective, accurate and automated

map updating. Without a means to produce these maps, agencies that rely on maps are

subject to project delays and other problems. Remote sensing imagery (aerial

photographs, satellite imagery, etc.) is an important source for updating maps, and historical imagery is an important source of information for documenting land use/land cover change over time.

This study examined three options for using aerial photography taken in 1941 for

creation of land use/land cover maps in a rural part of central Florida. The objective of

the study was to determine whether there were suitable alternatives to screen digitizing

for land use/land cover classification. Creation of accurate land cover maps using black and white aerial photography taken 60 years ago faces a number of potential problems, including: (1) poor quality of the nitrate-based film, (2) relatively poor lens quality, (3) little or no ground-truth information and (4) limited geographic references for image geo-referencing.

Limited ground-truthing information tends to result in errors in land cover

classification and spatial positioning. Traditional map creation from aerial photography

involves manual digitizing of recognizable spectral and spatial features followed by

review and editing. More modern techniques involve automated feature recognition. This approach saves time and, therefore, money, but are the results accurate enough for the purposes of a water management district and other users?









This thesis will compare two methods of digitization in order to determine the

feasibility of using automated signature classification to shorten the time required to create, review and edit the map.

Purpose of Study

The objective of this research was to compare two approaches to land use and land cover mapping: (1) manual on-screen digitizing and (2) automated classification. A detailed comparison between the classes in each approach was done at the end of this study. The manual on-screen digitizing method was used as the reference data for the summary statistics.

Study Area

The study area was located in central Florida in portions of two of the seven lakes

(Griffin, Harris/Little Harris, Eustis, Yale, Beauclair, Dora, and Weir) in the Upper

Ocklawaha River Basin (Figure 1-1). This study focused on portions of lakes Yale and

Eustis. The photograph selected for this study visually represented the best-quality scanned photograph of the 11 1:20,000-scale photos required to cover the lake. The original photograph was a 9" x 9" black and white vertical aerial contact print taken on February 16, 1941. The photograph was scanned and geo-referenced. The area represented in the image was approximately 18 sq km. I attempted to locate the photo log for the aerial photographic mission in order to get more information about the camera, lens, film type, etc., but the company that flew the flight lines could not be located and may no longer be in business.











Figure 1-1. Study area selected from Lake County mosaic.














CHAPTER 2
BACKGROUND

Need for Historical Aerial Photography

Historical aerial photography is a major source of information about the way in

which land use and land cover has developed over time. The most obvious way of using

this imagery is to view it, either on a computer screen or as a printed copy. This is

performed with image software. Most imagery software permits processing of the image in a variety of ways, including scale change, filtering and color conversion. Further processing is required to interpret and quantify the information.

Because most historic maps are inaccurate and difficult to work with, an initiative of the

St. Johns River Water Management District (SJRWMD) resulted in a project where a

series of black and white aerial photographs taken between 1937 and 1947 were scanned

and orthorectified. The purpose was to give planners and researchers access to a digital

database.

Scanning and Rectification

The scanning was done in 8-bit TIFF grayscale at a resolution of 500 dpi. The scanning equipment was an Anatech Eagle 4080 large-format black and white scanner running I-Scan software on a Unix computer. The geometric accuracy of the scanner was 10 thousandths of an inch over a length of 40 inches. The scanned images

had a ground cell size of one meter. Geometric rectification was done to place the

images in a defined coordinate system. Then, it was possible to work with the images as

digital maps; scale, overlay, etc.









The company responsible for image rectification wrote a summary report to

SJRWMD. The report describes the process they used to rectify the aerial photos using a

created camera model, ground control points (GCPs) and digital orthophoto quarter

quadrangles (DOQQs) taken in 1995. ERDAS Imagine's OrthoBASE (version 8.4) was

used to rectify a block of photos. An index of photos for each county was assembled and

then rectified.

Composition of the photographs introduced a large initial error because of the

accumulation of misalignment. Corresponding GCPs on both of the DOQQ's and the

unrectified images were located first. A standard rectification procedure was used to find

the most accurate points for use as GCPs such as road intersections and buildings.

Change of features in the landscape between 1941 and 1995 presented a challenge in

collection of GCPs. After collecting an even distribution of GCPs (about 20 GCPs

throughout the photos), automatic tie point access followed for the entire block of photos.

Also, there was some manual tying for the points. Finally, the projection parameters used

are as follows:

Projection: UTM, Zone 17

Spheroid: GRS 1980

Datum: NAD83

Digitizing

Digitizing is the process of converting paper maps to a digital form. It developed as geographic information systems (GISs) grew and, consequently, required new techniques for computer mapping. The digitizing tablet is the electronic form of the drafting tablet (Clarke, 1990). It is used in the geocoding process to define a paper map in a real-world coordinate system.









Another digitizing technique is scanning. Scanners turn map sheets into digital

images that display on computer screens. Different types of scanners are available

according to the map size (small or large), the print type (paper or film), the output resolution (high or low), etc. These map sheets are then ready for geocoding. Today, on-screen digitizing is an

increasingly common way to get vector (point, line or polygon) information from existing

scanned images. This technique is used to acquire new data from imagery and for

comparing and detecting changes over time. By displaying images on the screen,

features are drawn to represent those on the image. ArcMap (version 8.1) is the GIS software from ESRI used in this study to digitize the historical aerial photographs.

Automatic Image Classification

There are two approaches employed in image processing for identifying features

in an image. One is "unsupervised" and the other is "supervised". Both are used to

classify the image for a thematic interpretation. According to Lillesand and Kiefer

(2000), the objective of these operations is to replace visual analysis of the image data

with quantitative techniques for automating identification of features in a scene. The idea

is to categorize all pixels of same reflectance response into one class.

In an unsupervised classification, the spectral response of pixels on the image is

located automatically. In this process pixels will be assigned to clusters that correspond

to a feature on the ground, or a class. Lillesand and Kiefer (2000) suggest comparing the classified data with reference data to learn more about the properties of the spectral classes. The

number of classes is to be determined by the analyst depending on visual and/or ground

truthing. Clusters from the classified image consist of digital number (DN) values for the

pixels.









Supervised classification, on the other hand, is semi-automated since the analyst

must collect numerical data from the image in the form of training sets. Training sets represent areas of the LULC. The software is thus "trained" to classify the image. Each pixel is categorized into a LULC class assigned by the analyst. All pixels in a category describe the spectral response of that category (Lillesand and Kiefer, 2000).

Classification Systems

To do a classification on any map or a digital image, one must develop or apply a

classification scheme. There are a large number of land use and land cover (LULC)

classification schemes developed. The United States Geological Survey (USGS), for

example, adopted a widely used system developed by Anderson et al. (1976) for remotely sensed data (Lillesand and Kiefer, 2000). The Anderson et al. (1976) classification system has a hierarchical framework for the levels it contains. The Florida Land Use, Land Cover Classification System (FLUCCS) is a hierarchical classification system based on the Anderson system. In this study, classes were defined as follows:

1. Urban or Built-Up Land

Residential
Commercial Services
Industrial
Transportation, Communication

2. Agricultural Land

Cropland and Pasture
Orchard, Groves, Vineyards, Nurseries

3. Rangeland

Herbaceous Rangeland
Shrub and Brush Rangeland
Mixed Rangeland









4. Forest Land

Deciduous Forest Land
Evergreen Forest Land
Mixed Forest Land

5. Water

Streams and Canals
Lakes

6. Wetland

Forested Wetlands
Non-forested Wetlands


7. Barren Land














CHAPTER 3
DATA PREPARATION

A flow chart was created to illustrate the steps taken in this study (Figure 3-1). It consisted of four major steps: 1) pre-processing, 2) image construction techniques,

3) image classification and 4) manual digitizing. This chapter will focus on pre-

processing and image construction techniques. The other two steps are discussed in

Chapter 4.








Most aerial photographs are indexed by a naming convention. For example, the date they were taken is in the upper left corner, and a three-letter code of the county name and a photo index number is on the right. The TIFF-format aerial photo was imported into ERDAS Imagine (version 8.5) in order to apply image-processing techniques (Figure 3-2 (a)). Before applying image processing, the photo was trimmed by selecting an "area of interest" (AOI) (Figure 3-2 (b)). A contrast stretch was then applied to the image using a histogram equalizer.
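A contrast stretch of this kind can be sketched with NumPy. The percentile cutoffs below are illustrative defaults, not the parameters used in ERDAS:

```python
import numpy as np

def linear_stretch(img, low_pct=2.0, high_pct=98.0):
    """Linearly rescale an 8-bit grayscale image so that the chosen
    percentile cutoffs map to 0 and 255 (values outside are clipped)."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    stretched = (img.astype(float) - lo) / (hi - lo)
    return (np.clip(stretched, 0.0, 1.0) * 255).round().astype(np.uint8)
```

Clipping a small percentage of extreme values before rescaling spreads the remaining brightness values over the full 0-255 range, which is what makes low-contrast historical prints easier to interpret.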

Figure 3-2. Study site. a) 1941 initial aerial photograph, b) subset image.









Older aerial photography was the first generation of remotely sensed data and,

thus, was not digital. The original films and paper prints must be scanned in order to

produce computerized data. Much of the time, shadows and environmental factors affect

the interpretation of single band images. A histogram of such an image provides 256

brightness values ranging from 0 to 255. Only personnel with familiarity with an area or

with high interpretation skills can categorize this type of data. Those personnel would

use nine image analysis elements to do the job: tone, texture, pattern, shape, size, height, shadow, site and association (Lillesand and Kiefer, 2000). Fortunately, there are a number of spatial

filtering techniques to alleviate this problem. They include: convolution, Fourier

transform, principal components analysis and texture analysis. In this study, texture was

applied through three statistical measures performed on pixels in different-sized moving windows, or kernels. The measures included skewness, variance and mean Euclidean

distance. Small and large kernels used included: 3x3, 5x5, 7x7, 13x13, and 25x25

surrounding each pixel. A total of fifteen texture layers (3 measures x 5 kernels) were

produced and added to the original to create a multi-band image. Principal components

analysis was then applied to reduce redundancy in the data, remove noise and use the

most useful information.
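The PCA step can be sketched as an eigen-decomposition of the band covariance matrix. The helper below is a hypothetical illustration, not the ERDAS routine:

```python
import numpy as np

def pca_bands(stack, n_components):
    """Project a (rows, cols, bands) layer stack onto its principal
    components; returns the component image and the fraction of total
    variance explained by each component."""
    rows, cols, bands = stack.shape
    X = stack.reshape(-1, bands).astype(float)
    X -= X.mean(axis=0)                      # center each band
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]        # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = (X @ eigvecs[:, :n_components]).reshape(rows, cols, n_components)
    return components, eigvals / eigvals.sum()
```

Keeping the first five components once their cumulative explained-variance fraction exceeds 98% mirrors the selection reported in this study.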

Texture

Texture analysis was applied to improve the tonal information of the image.

Texture is a major property of an image. It represents important tonal information about

the structural arrangements of features in the image. Texture determines the overall

visual smoothness or coarseness of the image features (Lillesand and Keifer, 2000). The

role of texture is to create a brightness value for each pixel in the image. Texture is

measured statistically using a moving window throughout the image. Windows of 3x3









and 5x5 are examples of this type of neighboring pixel values. Results of texture

algorithms developed show the benefits of texture in supervised and unsupervised

classifications (Jensen, 1996). Kushwaha et al (1994) state that applications like terrain

analysis, land cover and forest mapping, environmental monitoring, and ecological

studies have successfully used texture analysis. Statistical operators including skewness,

kurtosis, variation, standard deviation and maximum and mean Euclidean distance are

used for the texture analysis in image processing to characterize the distribution of gray

levels in an image. In this paper, three textural statistical approaches were applied as

constructing techniques on the black and white historical aerial photograph. Skewness,

variance and mean Euclidean distance computed with different size windows of pixels in

the image. Skewness represents the distribution of intensity or the asymmetry in the

image. Variance measures the distribution of variability in the image. Mean Euclidean

distance measures mean distance of xy coordinates for every pair of pixel values. Hsu

(1978) applied texture analysis on black and white aerial photography. The study found

that texture measures were beneficial for the development of an automatic feature

extractor system (Hsu, 1978).
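The moving-window measures described above can be sketched with running-moment statistics. The snippet below is an illustrative reconstruction, not the ERDAS implementation used in the study: it computes local variance and local skewness with scipy's uniform_filter (mean Euclidean distance is omitted because its exact windowed definition is software-specific) and stacks the resulting layers with the original band.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def variance_texture(img, size):
    """Local variance in a size x size moving window: E[x^2] - E[x]^2."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, size=size)
    mean_sq = uniform_filter(img * img, size=size)
    return mean_sq - mean * mean

def skewness_texture(img, size):
    """Local skewness: third central moment normalized by sigma^3."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, size=size)
    var = uniform_filter(img * img, size=size) - mean * mean
    # E[(x - mu)^3] = E[x^3] - 3*mu*E[x^2] + 2*mu^3
    m3 = (uniform_filter(img ** 3, size=size)
          - 3 * mean * uniform_filter(img * img, size=size)
          + 2 * mean ** 3)
    return m3 / np.maximum(var, 1e-12) ** 1.5

# Simulated single-band photograph standing in for the scanned image.
band = np.random.default_rng(0).integers(0, 256, (64, 64))
layers = [f(band, k) for k in (3, 5, 7, 13, 25)
          for f in (variance_texture, skewness_texture)]
stack = np.stack([band.astype(np.float64)] + layers)  # multi-band image
```

In the thesis a third measure (mean Euclidean distance) brings the stack to sixteen bands; here the stack holds the original band plus ten texture layers.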

First, skewness was computed on the original image using the five different window sizes. This filtering generated five new images with enhanced edges. Figure 3-3 shows the skewness of pixels in a portion of the aerial photograph.
























Figure 3-3. Skewness textures of the image: a) portion of the original image, b) 3x3, c) 5x5, d) 7x7, e) 13x13, and f) 25x25 skewness.









Variance was then computed from the same data, again generating images with enhanced edges. Figure 3-4 shows the variance in a portion of the aerial photograph.


Figure 3-4. Variance textures of the image: a) portion of the original image, b) 3x3, c) 5x5, d) 7x7, e) 13x13, and f) 25x25 variance.









The last measure applied was the mean Euclidean distance, computed with the same kernels as the previous measures. Figure 3-5 shows the output from each kernel on a portion of the photograph.


Figure 3-5. Mean Euclidean distance textures of the image: a) portion of the original image, b) 3x3, c) 5x5, d) 7x7, e) 13x13, and f) 25x25 mean Euclidean distance.









The texture measures generated a total of fifteen layers, which were added to the original band to produce a sixteen-band image. Figure 3-6 shows a combination of three of the sixteen bands displayed as a false RGB color composite. The constructed image is useful for edge detection and image enhancement.

















Figure 3-6. A 3-band combination of original and texture layers stacked.

A correlation matrix was then generated to show the relationships among bands in the new multi-band image. A correlation coefficient between two bands ranges from +1 to -1: a value of +1 represents a perfect direct relationship between the bands and a value of -1 a perfect inverse relationship. Highly correlated bands therefore carry very similar spectral information.

The correlations between bands in the data set showed very low coefficients between the original layer (Band 1) and each of the texture measures (Bands 2 through 16) (Table 3-1). If the between-band correlation involving a particular band is low, that band is yielding distinct information on its own (Jensen, 1996). Some bands carry similar information and correlate highly with each other. So, in order to work










with a decorrelated data set, principal components analysis was applied to the multi-band image.

Table 3-1. Matrix of the correlation coefficients between bands in the image.
        1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
 1   1.00 -0.19 -0.27 -0.31 -0.37 -0.43  0.15  0.16  0.16  0.16  0.18  0.15  0.16  0.17  0.20  0.21
 2  -0.19  1.00  0.69  0.51  0.27  0.17 -0.21 -0.17 -0.15 -0.10 -0.05 -0.21 -0.18 -0.17 -0.13 -0.09
 3  -0.27  0.69  1.00  0.82  0.45  0.27 -0.23 -0.21 -0.19 -0.13 -0.08 -0.22 -0.22 -0.21 -0.17 -0.12
 4  -0.31  0.51  0.82  1.00  0.64  0.36 -0.23 -0.22 -0.21 -0.16 -0.10 -0.22 -0.23 -0.22 -0.18 -0.14
 5  -0.37  0.27  0.45  0.64  1.00  0.64 -0.19 -0.19 -0.19 -0.20 -0.18 -0.17 -0.19 -0.20 -0.23 -0.19
 6  -0.43  0.17  0.27  0.36  0.64  1.00 -0.16 -0.16 -0.16 -0.17 -0.18 -0.15 -0.17 -0.18 -0.20 -0.21
 7   0.15 -0.21 -0.23 -0.23 -0.19 -0.16  1.00  0.95  0.87  0.72  0.59  0.98  0.92  0.84  0.68  0.56
 8   0.16 -0.17 -0.21 -0.22 -0.19 -0.16  0.95  1.00  0.97  0.85  0.70  0.92  0.96  0.92  0.77  0.63
 9   0.16 -0.15 -0.19 -0.21 -0.19 -0.16  0.87  0.97  1.00  0.93  0.77  0.84  0.92  0.93  0.81  0.68
10   0.16 -0.10 -0.13 -0.16 -0.20 -0.17  0.72  0.85  0.93  1.00  0.91  0.69  0.80  0.87  0.89  0.78
11   0.18 -0.05 -0.08 -0.10 -0.18 -0.18  0.59  0.70  0.77  0.91  1.00  0.57  0.68  0.76  0.87  0.89
12   0.15 -0.21 -0.22 -0.22 -0.17 -0.15  0.98  0.92  0.84  0.69  0.57  1.00  0.93  0.86  0.69  0.57
13   0.16 -0.18 -0.22 -0.23 -0.19 -0.17  0.92  0.96  0.92  0.80  0.68  0.93  1.00  0.96  0.81  0.67
14   0.17 -0.17 -0.21 -0.22 -0.20 -0.18  0.84  0.92  0.93  0.87  0.76  0.86  0.96  1.00  0.90  0.75
15   0.20 -0.13 -0.17 -0.18 -0.23 -0.20  0.68  0.77  0.81  0.89  0.87  0.69  0.81  0.90  1.00  0.89
16   0.21 -0.09 -0.12 -0.14 -0.19 -0.21  0.56  0.63  0.68  0.78  0.89  0.57  0.67  0.75  0.89  1.00
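A band-to-band correlation matrix of this kind can be computed directly with NumPy. The sketch below uses simulated data as a stand-in for the study's 16-band stack; np.corrcoef treats each flattened band as one variable.

```python
import numpy as np

# Hypothetical 16-band stack: band 0 stands in for the original brightness
# layer, bands 1-15 for the texture layers (simulated here).
rng = np.random.default_rng(1)
bands = rng.normal(size=(16, 128, 128))

# Flatten each band to a vector and correlate all band pairs.
flat = bands.reshape(16, -1)
corr = np.corrcoef(flat)          # 16 x 16 symmetric matrix, values in [-1, 1]

# Bands weakly correlated with the original carry distinct information.
distinct = np.where(np.abs(corr[0, 1:]) < 0.5)[0] + 1
```

With the thesis's data, rows 2 through 16 of the first column of `corr` would reproduce the low coefficients reported in Table 3-1.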

Principal Components Analysis

Principal components analysis (PCA) was performed to compress the data. PCA is a useful process for representing essentially the same image in fewer bands (Jensen, 1996). It operates on multi-band images to reduce data redundancy, and was therefore applied to the 16-band image containing the original layer and the fifteen texture measures. In this process, the original data are transformed into principal components PC1, PC2, and so on. The first component contains the maximum variation in the data set, while the last components contain mostly noise. The eigenvalues produced by this transformation measure the variance captured by each component. Table 3-2 shows how much of the variance in the data each principal component represents and the percentage it explains of the total variance.










Components 1 through 5 account for a total of 98.10% of the variance in the entire 16-band image, making it easier and more productive to work with fewer bands.

Table 3-2. Percentage of variance explained by eigenvalues of the components.
Principal Component Eigenvalue Percentage
PC1 2799.58 78.14
PC2 435.08 12.14
PC3 167.02 4.66
PC4 66.62 1.86
PC5 45.99 1.28
PC6 24.68 0.69
PC7 18.40 0.51
PC8 8.53 0.24
PC9 6.36 0.18
PC10 3.74 0.10
PC11 2.68 0.07
PC12 2.52 0.07
PC13 0.73 0.02
PC14 0.66 0.02
PC15 0.15 0.00
PC16 0.04 0.00
Total variance 3582.78
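The PCA step can be sketched as an eigen-decomposition of the band covariance matrix, retaining the leading components that together explain at least 98% of the variance, in the spirit of Table 3-2. The data below are a simulated stand-in for the 16-band image.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical 16-band image flattened to (pixels, bands); the random mixing
# matrix gives the bands correlated structure, as texture layers would have.
X = rng.normal(size=(4096, 16)) @ rng.normal(size=(16, 16))

# Eigen-decomposition of the band covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # ascending order
order = np.argsort(eigvals)[::-1]               # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Percentage of total variance explained by each component (cf. Table 3-2).
pct = 100 * eigvals / eigvals.sum()

# Keep the leading components that together explain >= 98% of the variance.
k = int(np.searchsorted(np.cumsum(pct), 98.0)) + 1
pcs = Xc @ eigvecs[:, :k]                       # reduced multi-band image
```

For the thesis's image, k would come out as 5, yielding the 5-band image used for classification.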

A new image was then generated from this analysis using the first five components as bands. Figure 3-7 shows a combination of three bands of the five-band image.


Figure 3-7. A 3-band combination of the first five components.








As a result of the image construction techniques applied in this chapter, it was decided to use the 5-band image produced from the texture analysis and principal components analysis to run the automated image classification.














CHAPTER 4
ANALYSIS

This chapter discusses how the two methods used for classifying the aerial photography were applied to the image. Traditionally, the most common method for classifying historical photography is on-screen digitizing; its result was used as the reference base map against which the computer-classified images were evaluated. In all methods, a subclass in the classification scheme might appear as a major class on the classified image, depending on the land cover of the study area. For example, cropland is a subclass in the classification scheme, but it was extracted as a major class because of its relatively large representation in the scene.

Manual Digitizing

In this approach, features are manually traced on top of the scanned raster image, a process usually called "on-screen digitizing." The single-band original brightness layer was used, and ArcMap was the software applied.

First, the image was added as a new layer on the display. Blank shapefiles were created in ArcCatalog to represent the classes within the image, and each shapefile was added to the map so that features could be extracted into it. Digitizing was then carried out with the Editor tool. Each feature was zoomed in on before tracing, because the most common problem with this technique is working too far zoomed out from the scanned image, which can lead to a poor product due to misclassification of features. The accuracy of the features extracted during classification









was judged visually, since most of the area has since changed and could not be verified in the field. Each category in the display was given a distinct color. Figure 4-1 shows a total of nine categories digitized from the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban.

























Figure 4-1. Image classified using on-screen digitizing method.

Automated Classification

Both unsupervised and supervised classifications were applied on the five-band

image created earlier (chapter 3).

Unsupervised Classification

An "unsupervised" classification was performed first, in which signatures were located automatically. This technique involves random sampling of initially unknown cover types. It was carried out with the ISODATA algorithm using an arbitrary number of clusters equal to 35 and a convergence threshold of 95%. The next step was to assign the same colors to similar clusters, using the Raster Attribute Editor in ERDAS. Clusters were then aggregated into more general categories (for example, corn and hay into terrestrial vegetation). Figure 4-2 presents a total of nine major classes assigned to the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban.























Figure 4-2. Unsupervised classification results.
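The clustering step behind the unsupervised classification can be illustrated with a plain k-means loop. This is a simplified stand-in: the ISODATA algorithm used in ERDAS additionally splits and merges clusters and iterates to a convergence threshold, which is not reproduced here. The 5-band image, the cluster count of 35, and the cluster-to-class lookup table are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.normal(size=(60, 60, 5))            # hypothetical 5-band image
pixels = image.reshape(-1, 5)

# Minimal k-means (ISODATA adds cluster splitting/merging on top of this).
k = 35
centers = pixels[rng.choice(len(pixels), k, replace=False)]
for _ in range(20):
    # Assign each pixel vector to its nearest cluster center.
    d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Recompute each center as the mean of its member pixels.
    for j in range(k):
        members = pixels[labels == j]
        if len(members):
            centers[j] = members.mean(axis=0)
clusters = labels.reshape(60, 60)

# Aggregate spectrally similar clusters into nine broader land cover classes
# (hypothetical lookup; in the study this grouping was done by inspection).
merge = {i: i % 9 for i in range(k)}
classes = np.vectorize(merge.get)(clusters)
```

In practice the analyst, not a fixed lookup, decides which clusters belong to which land cover category.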

Supervised Classification

The second automated classification type in this study was supervised classification, performed by locating specific areas within the image that show homogeneous categories of the land cover types. A signature file was created for the image using a combination of the AOI seed grow tool and manually drawn polygon AOIs over identifiable cover types. The process involves using the "seed properties" utility in the software to set the neighborhood mode, the geographic constraints of the pixels and the spectral Euclidean distance. Multiple AOIs were drawn and grouped for each category; those areas represented the best match of pixels. The process was repeated for every known category in the image. The supervised classification used the maximum likelihood parametric classification rule. A total of nine signature classes were created for the signature file associated with the image: water, aquatic vegetation, barren, citrus, terrestrial vegetation, cropland, rangeland, scrubland, and urban. Figure 4-3 shows the results of the supervised classification.
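The maximum likelihood decision rule named above can be sketched as follows: fit a multivariate Gaussian to the training pixels of each signature and assign every pixel to the class with the highest log-likelihood (equal priors assumed). The signature data and class names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical training signatures: {class_name: (n_samples, n_bands) array}.
signatures = {name: rng.normal(loc=i, size=(200, 5))
              for i, name in enumerate(["water", "citrus", "urban"])}

# Fit a multivariate Gaussian to each signature.
stats = {}
for name, samples in signatures.items():
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    stats[name] = (mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])

def ml_classify(pixels, stats):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    names = list(stats)
    scores = []
    for name in names:
        mu, inv_cov, logdet = stats[name]
        d = pixels - mu
        maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)  # Mahalanobis distance^2
        scores.append(-0.5 * (logdet + maha))           # equal priors assumed
    return np.array(names)[np.argmax(scores, axis=0)]

# Pixels drawn near the "water" signature should mostly classify as water.
labels = ml_classify(rng.normal(loc=0, size=(10, 5)), stats)
```

ERDAS's implementation additionally supports prior probabilities and thresholding, which this sketch omits.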

The intent of this study was not to derive a multitude of specific land cover or vegetation classes, but to evaluate automated classification of the aerial photograph. Statistical analysis of the results was performed (see Chapter 5). The results were then compared to detect the overall difference between the on-screen digitized image and the supervised classification image.













Figure 4-3. Supervised classification results.















CHAPTER 5
DISCUSSION AND COMPARISON

A visual evaluation of the classified images found differences and similarities in the classifications. A statistical comparison between the automated method and the manual method was also made. An error matrix was created by pixel-to-pixel comparison of the supervised classification against the on-screen digitized (reference) data. Table 5-1 shows a square array of the areas of pixels of the supervised classification recognized in the on-screen digitizing method; water and citrus were recognized best among the classified data. Overall accuracy was calculated as the
Table 5-1. Classification error matrix showing areas of pixels (hectares).

                         Manual (Reference Data)
                 1     2     3     4     5     6     7     8     9   Row Total
 Automated   1  53     2     7    40    14    33     2    14    23      188
             2   1    53     3     2     4     1    59    20     1      144
             3   4     5    18     1     3     1    20    18    35      104
             4   3     0     0    88     0    12     0     0     2      106
             5  14     0    12     7    32    26     0    24    11      127
             6   7     0     4    11    16    95     0     4     6      143
             7   0    25     0     2     4     0   166     3     0      200
             8  13     9     8     2    11     4    21    44     8      121
             9  31     1    16     5     7     7     1     8   159      235
 Column Total  126    95    69   157    92   179   268   135   247     1368
1. Urban, 2. Aquatic vegetation, 3. Terrestrial vegetation, 4. Barren, 5. Rangeland,
6. Cropland, 7. Water, 8. Scrubland, 9. Citrus.
Overall accuracy = (53 + 53 + 18 + 88 + 32 + 95 + 166 + 44 + 159)/ 1368 = 52%









sum of the numbers along the major diagonal divided by the total area of pixels in the data. The error matrix also yields the percentage of each category that was recognized relative to the reference data. Table 5-2 shows the percentages of those areas recognized by the manual digitizing method.

Table 5-2. Classification error matrix showing amount of accuracy (%).

                     Manual (Reference Data)
                 1     2     3     4     5     6     7     8     9
 Automated   1  42%    2%   10%   26%   15%   18%    1%   10%    9%
             2   1%   56%    4%    1%    4%    0%   22%   15%    0%
             3   3%    6%   25%    1%    3%    0%    7%   14%   14%
             4   3%    0%    0%   56%    0%    7%    0%    0%    1%
             5  11%    0%   17%    4%   35%   15%    0%   18%    5%
             6   6%    0%    6%    7%   17%   53%    0%    3%    3%
             7   0%   26%    0%    1%    5%    0%   62%    2%    0%
             8  11%    9%   12%    1%   12%    2%    8%   33%    3%
             9  24%    1%   24%    3%    8%    4%    0%    6%   65%
1. Urban, 2. Aquatic vegetation, 3. Terrestrial vegetation, 4. Barren, 5. Rangeland,
6. Cropland, 7. Water, 8. Scrubland, 9. Citrus.

Another statistical measure derived from the error matrix is kappa (Cohen, 1960), another method of measuring accuracy in the classified data. Its sample estimate is the KHAT statistic, calculated as

           N * Σ x_ii - Σ (x_i+ * x_+i)
    KHAT = ----------------------------      (sums taken over i = 1, ..., r)
              N^2 - Σ (x_i+ * x_+i)







where

    N    = the total number of observations in the matrix
    r    = the number of classes in the data
    x_ii = the value in row i and column i (the major diagonal)
    x_+i = the total of column i
    x_i+ = the total of row i

    Σ x_ii = 53 + 53 + 18 + 88 + 32 + 95 + 166 + 44 + 159 = 708

    Σ (x_i+ * x_+i) = (126*188) + (95*144) + (69*104) + (157*106) + (92*127)
                    + (179*143) + (268*200) + (135*121) + (247*235) = 192793.1

    KHAT = [1368(708) - 192793.1] / [(1368)^2 - 192793.1] ≈ 0.46

The value of KHAT ranges from +1.0 to -1.0 and corresponds to three levels of agreement in the accuracy assessment: strong agreement (0.80 to 1.0), moderate agreement (0.40 to 0.80) and poor agreement (0 to 0.40) (Landis and Koch, 1977). The value of KHAT in this study was found to be 46%. The difference between the KHAT value and the overall accuracy computed from the same error matrix (52%) is that the former incorporates the off-diagonal elements through the row and column totals, while the latter uses only the major diagonal elements and excludes the others.
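Both statistics can be recomputed from Table 5-1. Note that recomputing from the rounded hectare values printed in the table gives a grand total slightly different from 1368 and a KHAT near 0.45 rather than the reported 0.46; the thesis's figures were presumably computed from the unrounded areas.

```python
import numpy as np

# Error matrix from Table 5-1 (hectares); rows = automated, cols = manual.
m = np.array([
    [53,  2,  7, 40, 14, 33,  2, 14, 23],
    [ 1, 53,  3,  2,  4,  1, 59, 20,  1],
    [ 4,  5, 18,  1,  3,  1, 20, 18, 35],
    [ 3,  0,  0, 88,  0, 12,  0,  0,  2],
    [14,  0, 12,  7, 32, 26,  0, 24, 11],
    [ 7,  0,  4, 11, 16, 95,  0,  4,  6],
    [ 0, 25,  0,  2,  4,  0, 166, 3,  0],
    [13,  9,  8,  2, 11,  4, 21, 44,  8],
    [31,  1, 16,  5,  7,  7,  1,  8, 159],
], dtype=float)

N = m.sum()
diag = np.trace(m)
overall = diag / N                               # overall accuracy, ~52%

# Chance-agreement term: sum of (row total * column total) per class.
chance = (m.sum(axis=1) * m.sum(axis=0)).sum()
khat = (N * diag - chance) / (N * N - chance)    # ~0.45 with rounded areas
```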

Finally, a change detection assessment was performed to test the agreement of the supervised classification with the on-screen manual digitizing. In this study, the test was simplified to a common/uncommon comparison of the two methods. Figure 5-1 shows the results of this pixel-by-pixel comparison using two colors: black for uncommon








and green for common recognition in the two images. The statistics of this comparison showed 58% of the supervised classification in agreement with the on-screen digitizing.
























Figure 5-1. Change detection assessment showing uncommon/common pixels.
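The common/uncommon comparison behind Figure 5-1 amounts to a per-pixel equality test between the two classified rasters. The rasters below are simulated stand-ins with roughly the agreement level reported in the study.

```python
import numpy as np

rng = np.random.default_rng(5)
# Two hypothetical classified rasters with labels 0-8 (nine classes);
# the automated raster keeps the manual label ~53% of the time.
manual = rng.integers(0, 9, size=(50, 50))
automated = np.where(rng.random((50, 50)) < 0.53, manual,
                     rng.integers(0, 9, size=(50, 50)))

# Binary agreement map: True where both methods assigned the same class.
common = automated == manual
pct_agreement = 100 * common.mean()
```

Rendering `common` with two colors (black for False, green for True) reproduces the style of Figure 5-1.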

Analysts have to decide on the amount of error that is acceptable if automated classification is to be used to interpret features from black and white historical aerial photography. In practice, this study suggests that a combination of on-screen digitizing and automated classification of certain features is a fair compromise between speed and accuracy of detail. Working with historical imagery makes it hard to verify the results of either method: the geometry of the image may vary and the landscape has changed. But if adequate geometric correction can be achieved, an automatic interpretation of historical aerial photographs can also be produced in a timely manner.














CHAPTER 6
CONCLUSIONS

The main objective of this study was to assess the difference in the result of a land use/land cover classification of black and white aerial photography between (a) a method using on-screen digitizing and (b) a method using automatic classification.

Varying the size of the pixel windows in the texture analysis proved to be a successful image construction technique for the photograph in this study, as it captured the spatial distribution of the tonal information of the pixels in the image. Many texture measures are available, but only three were tested here: skewness, variance and mean Euclidean distance, using windows ranging from 3x3 to 25x25. All measures contributed well to the original band and showed low correlation with it. Principal components analysis was then performed on the multi-band image to reduce redundancy in the data and compress the information into fewer bands. Statistics showed that the first five principal components represented more than 98% of the variance in the data. The new 5-band image was then classified automatically.

In this study, statistics are presented, and analysts have to decide on the amount of error that is acceptable when using the automated classification approach to interpret historical aerial photography. Comparing the automated classification method to the manual digitizing method showed how much of each category was correctly identified; some categories were identified more reliably than others. Therefore, automated classification could work better for interpreting certain features in the photography. Automated classification (supervised or unsupervised) is a rapid way to thematically interpret images. However, the manual digitizing approach will not become obsolete, because many applications require a level of accuracy not afforded by automatic classification techniques.
















LIST OF REFERENCES

Clarke, Keith C. 1990. Analytical and Computer Cartography. Prentice Hall.
Englewood Cliffs, New Jersey.

Cohen, J. 1960. "A Coefficient of Agreement for Nominal Scales." Educational and
Psychological Measurement. Vol. 20, p.37-46

Hsu, Shin-Yi, 1978. "Texture-Tone Analysis for Automated Land-Use Mapping."
Photogrammetric Engineering and Remote Sensing. Vol.44, p.1393-1404.

Jensen, J.R., 1996. Introductory Digital Image Processing, A Remote Sensing
Perspective. Prentice Hall. Upper Saddle River, New Jersey.

Kushwaha, S.P.S., Kuntz, S., and Oesten, G. 1994. "Application of Image Texture in
Forest Classification." International Journal of Remote Sensing. Vol.15, No. 11,
p.2273-2284.

Landis, J., and G. Koch 1977. "The Measurement of Observer Agreement for
Categorical Data." Biometrics. Vol. 33, p.159-174.

Lillesand, T., and Kiefer, R.W. 2000. Remote Sensing and Image Interpretation. 4th
Edition. John Wiley & Sons. New York.















BIOGRAPHICAL SKETCH

I am originally from Jeddah, Saudi Arabia. In 1995, I received a B.S. in

cartography from Salem State College in Salem, Massachusetts, and joined the

Exploration Organization in Saudi Aramco Oil Company. In 2001, I started a master's

degree in civil engineering with a specialization in geomatics at the University of Florida.