Tuesday, April 19, 2016

Advanced Classifiers 3: Expert System and Advanced Neural Network

Introduction:

The advanced classifiers used in the previous labs were each robust, but they aren't the only methods for achieving high classification accuracy. An Artificial Neural Network (ANN) is a classifier loosely modeled on the way the human brain processes information; during training it adjusts its internal parameters on its own, independent of a remote sensing expert. An Expert System (ES) uses a decision tree and ancillary data to refine an existing classified image. When used correctly, ES can yield very accurate results (93% Overall Accuracy reported in one study).

Methods:

The first method used in this exercise was ES. The ES refined an existing classification using several ancillary rasters containing boolean values for specific classes. The decision tree was designed to refine several classes (urban/built-up, agriculture, and green vegetation), including splitting urban/built-up into residential and non-residential 'urban' areas. The non-residential urban areas were identified in one of the ancillary boolean rasters, and a rule was created so that the output image only contained the areas where the urban/built-up class of the input raster intersected the non-residential areas of the ancillary raster. The residential areas were identified by selecting only the areas classified as urban/built-up on the input raster that were not overlapped by the non-residential raster. The ancillary rasters for green vegetation and agriculture were used to refine their respective classes in the same manner.
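A minimal sketch of the two urban rules using NumPy boolean logic is shown below; the array names, class codes, and function are hypothetical stand-ins for the actual rasters and the graphical decision tree used in the lab.

import numpy as np

# Hypothetical class codes for the classification rasters (placeholders).
URBAN, RESIDENTIAL, OTHER_URBAN = 1, 11, 12

def refine_urban(classified, nonres_mask):
    """Split the urban/built-up class using a boolean ancillary raster.

    classified  : 2-D integer array of class codes from the prior classification
    nonres_mask : 2-D boolean array, True over non-residential urban areas
    """
    refined = classified.copy()
    urban = classified == URBAN

    # Rule 1: urban pixels that intersect the ancillary mask -> other urban.
    refined[urban & nonres_mask] = OTHER_URBAN
    # Rule 2: urban pixels outside the mask -> residential.
    refined[urban & ~nonres_mask] = RESIDENTIAL
    return refined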

ANN was implemented on Quickbird imagery of Northern Iowa University. Once samples for the desired classes were collected, they were fed into the classifier, which was set to run 1,000 training iterations when classifying the image.
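The lab's neural-network tool is not reproduced here; as a rough stand-in, a multilayer perceptron capped at 1,000 training iterations can be sketched with scikit-learn. The band values, labels, hidden-layer size, and image dimensions below are placeholders.

import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder training data: rows are sampled pixels, columns are image bands.
X_train = np.random.rand(500, 4)          # e.g. 4 Quickbird bands scaled to 0-1
y_train = np.random.randint(0, 5, 500)    # 5 hypothetical LULC class labels

# Multilayer perceptron limited to 1,000 training iterations, as in the lab.
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
ann.fit(X_train, y_train)

# Classify every pixel of an image reshaped to (n_pixels, n_bands).
image = np.random.rand(200, 200, 4)
labels = ann.predict(image.reshape(-1, 4)).reshape(200, 200)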

Results/Conclusions:

The Expert System classification greatly increased the accuracy of the previous image (Figure 1). It allowed for more differentiation of classes, and fixed several problem areas (Figure 2). The ANN classification was extremely easy to implement, but the tool does not report the parameters it settles on, so it isn't possible to know exactly what parameters it used to perform its final classification.

Figure 1: The Expert System Classified image.
Figure 2: The commercial area in the center/right was reclassified to 'Other Urban',
while the surrounding areas were reclassified to 'Residential'



Thursday, April 14, 2016

Object-Based Classification

Introduction:

Object-Based classification methods are among the most powerful methods currently in general use by the remote sensing community world-wide. They are substantially more robust than other methods because classification isn't based only on pixel values, but also on texture, compactness, and smoothness. In this lab, I explored two different classifiers applied to image objects: Random Forest and Support Vector Machines (SVM). Random Forest builds a specified number of classifiers (I used 300), each trained on a random subset of the training data; the overall classification of each object is the majority vote across those classifiers. Support Vector Machines determines decision boundaries that produce optimal separation between classes, via multiple iterations.

Methods:

Before either classifier could be used, segments were created from a Landsat 7 image. The objects were created with a shape/color weighting of 30%/70% and a compactness/smoothness weighting of 50%/50%. Once the segments were created, training samples were selected from the segments. After enough training samples were collected, the images were classified.
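The segmentation tool used in the lab weights shape against color and compactness against smoothness; open-source libraries parameterize segmentation differently, but a loosely comparable object layer can be sketched with scikit-image's SLIC (assuming a recent scikit-image), where compactness plays a roughly similar role. The image array and parameter values are placeholders.

import numpy as np
from skimage.segmentation import slic

# Placeholder multiband image array (rows, cols, bands), e.g. a Landsat subset.
image = np.random.rand(400, 400, 6)

# SLIC superpixels as a rough analogue of the lab's segmentation step;
# 'compactness' trades spectral similarity against segment shape regularity.
segments = slic(image, n_segments=2000, compactness=0.1,
                channel_axis=-1, start_label=1)

# Per-segment mean band values can then serve as object features for training.
n_segments = segments.max()
features = np.array([image[segments == s].mean(axis=0)
                     for s in range(1, n_segments + 1)])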

Before Random Forest classification could be performed, I created a training function to identify the classification parameters. The Random Forest classifier was created and trained with a maximum of 300 trees (individual classifiers). The classified image was then created using the parameters identified by the training function.
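The training function itself isn't shown in this write-up; a minimal scikit-learn equivalent with 300 trees might look like the following, with the object features and labels as placeholder arrays.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder object features (e.g. mean band values per segment) and labels.
X_train = np.random.rand(200, 6)
y_train = np.random.randint(0, 5, 200)
X_all = np.random.rand(2000, 6)    # features for every segment in the image

# 300 trees, each trained on a bootstrap sample of the training objects;
# the final label for each object is the majority vote across the trees.
rf = RandomForestClassifier(n_estimators=300, random_state=0)
rf.fit(X_train, y_train)
rf_predicted = rf.predict(X_all)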

Before SVM classification could be performed, I created a training function to identify the classification parameters. The Support Vector Machines classifier was created and trained to use a linear kernel. The classified image was then created using the parameters identified by the training function.
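A comparable sketch for the linear-kernel SVM, again with placeholder training data; the regularization value C is an assumed setting, not one taken from the lab.

import numpy as np
from sklearn.svm import SVC

# Placeholder object features and labels, as in the Random Forest sketch.
X_train = np.random.rand(200, 6)
y_train = np.random.randint(0, 5, 200)
X_all = np.random.rand(2000, 6)

# Linear-kernel SVM: finds the hyperplane that best separates the classes
# in feature space; C trades margin width against training errors.
svm = SVC(kernel='linear', C=1.0)
svm.fit(X_train, y_train)
svm_predicted = svm.predict(X_all)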

Results:

The Random Forest classifier created a substantially more accurate output than the Support Vector Machines classifier. The disparity between the two is most clearly visible in the heavily built-up parts of Eau Claire, which SVM classified as 'bare earth' (Figure 1). Because SVM relies on a single classifier, it is more prone to this kind of error than Random Forest's ensemble of 300 classifiers.

Figure 1: Random Forest vs. Support Vector Machines
Random forest performed much better in urban areas than SVM

Sunday, April 10, 2016

Advanced Classifiers 1

Introduction:

Unsupervised and supervised classification methods both sort pixels by the values they contain, but they operate only at the whole-pixel level, and not every feature on the ground is larger than a pixel. Advanced classifiers sort pixels into different classes while considering values at the sub-pixel level. The advanced classifiers used in this lab are Spectral Linear Unmixing and the Fuzzy Classifier.

Methods:

In order to use Spectral Linear Unmixing, the spectral reflectance of the pure pixels (end members) must be recorded before classification can be performed. First, the principal components of each band were calculated. Next, the principal components of bands one and two were plotted on a scatterplot. By circling the corners of the plotted pixels, it was possible to identify the agricultural, bare earth, and water end members. By plotting bands three and four, it was possible to identify the urban/built-up end member. Once all of the end members were collected, they were used together to create fractional images for each of the LULC classes.
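A minimal, unconstrained least-squares sketch of linear unmixing is shown below; the endmember matrix, band count, and image array are placeholders rather than the values collected in the lab.

import numpy as np

# Placeholder endmember spectra: one column per end member
# (agriculture, bare earth, water, urban), one row per band.
E = np.random.rand(6, 4)

# Placeholder image reshaped to (n_pixels, n_bands).
image = np.random.rand(300, 300, 6)
pixels = image.reshape(-1, 6)

# Unconstrained least-squares unmixing: solve E @ f ~= pixel for each pixel,
# giving one fractional-abundance image per end member.
fractions, *_ = np.linalg.lstsq(E, pixels.T, rcond=None)
fraction_images = fractions.T.reshape(300, 300, 4)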

Fuzzy classification sorts mixed-value pixels into different classes by assessing how close their values are to one class or another. The first step of fuzzy classification was to collect 48 spectral signatures of both homogeneous and mixed land cover. Next, the spectral classes were used to assign each pixel to up to 5 classes (the 5 best candidate classes for that pixel). After the pixels were classified, a fuzzy convolution was performed to reduce the layered output to a single class per pixel, constrained to the 5 specified classes.
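The fuzzy classifier and fuzzy convolution tools are not reproduced here; purely as a hedged illustration of the membership idea, per-pixel spectral distances to class mean signatures can be converted into membership grades for the five closest classes. All arrays below are placeholders.

import numpy as np

# Placeholder class mean signatures (rows: classes, cols: bands) and one pixel.
class_means = np.random.rand(12, 6)
pixel = np.random.rand(6)

# Distance of the pixel to every class mean in spectral space.
dist = np.linalg.norm(class_means - pixel, axis=1)

# Keep the 5 closest classes and convert their distances into membership
# grades that sum to 1 (a closer class receives a higher membership).
best5 = np.argsort(dist)[:5]
weights = 1.0 / (dist[best5] + 1e-12)
membership = weights / weights.sum()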

Conclusion:

The results of the fuzzy classification were worse than those of both the unsupervised and supervised classification schemes (Figure 1).

Figure 1: (From top left - clockwise) Unsupervised, Supervised,
Fuzzy classification with 48 signatures, and Fuzzy classification with 24 signatures.

It is likely that error in the fuzzy classifier was introduced during the collection of urban points, as it was difficult to find areas with homogeneous pixel values.

Digital Change Detection

Introduction:

Digital change detection is the process of identifying areas that have experienced changes in land use/land cover (LULC) between two time periods. There are two main methods of performing change detection: Write Function Memory Insertion and post-classification comparison change detection.

Methods:

The Write Function Memory Insertion method is an extremely simple, yet effective process. By stacking the NIR bands from the two dates into a single color composite, areas of LULC change appear in a different color than areas that remained constant.
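A minimal NumPy sketch of the idea: the later NIR band is written to the red channel and the earlier NIR band to green and blue, so unchanged areas render gray while changed areas take on a color (red where the later value is higher). The arrays and dimensions are placeholders.

import numpy as np

# Placeholder NIR bands from the two dates, scaled to 0-1.
nir_1991 = np.random.rand(500, 500)
nir_2011 = np.random.rand(500, 500)

# Stack 2011 NIR into the red gun and 1991 NIR into green and blue;
# the result can be displayed as an ordinary RGB composite.
composite = np.dstack([nir_2011, nir_1991, nir_1991])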

Figure 1: The output from Write Function Memory Insertion.
Areas that changed between 1991 and 2011 appear in red.

Post-classification comparison change detection is a slightly more complicated process than Write Function Memory Insertion, as it first requires both images to be classified. It is performed by comparing the number of pixels in each LULC class in the two images and calculating the percentage of change between them (Figure 2).
Figure 2: The percent change between 2001 and 2011 in the Milwaukee MSA.
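A sketch of the pixel-counting comparison, assuming both dates use the same class codes; the class codes, image sizes, and values are placeholders.

import numpy as np

# Placeholder classified images for the two dates (same class codes in each).
lulc_2001 = np.random.randint(1, 6, (500, 500))
lulc_2011 = np.random.randint(1, 6, (500, 500))

# Count pixels per class in each date and compute the percentage change.
classes = np.arange(1, 6)
count_2001 = np.array([(lulc_2001 == c).sum() for c in classes])
count_2011 = np.array([(lulc_2011 == c).sum() for c in classes])
percent_change = 100.0 * (count_2011 - count_2001) / count_2001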


It is also possible to map the changed pixels, showing not only which pixels changed but also which classes they changed from and to (Figure 3). By combining this method with the comparison graphing, it would be possible to identify how much of each LULC class's change came from each other class.
Figure 3: Milwaukee MSA LULC change.
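A from-to cross-tabulation like the one described above can be sketched as follows; the class codes and image arrays are placeholders.

import numpy as np

# Placeholder classified images for the two dates, with class codes 1..5.
lulc_2001 = np.random.randint(1, 6, (500, 500))
lulc_2011 = np.random.randint(1, 6, (500, 500))
n_classes = 5

# From-to cross-tabulation: entry [i, j] counts pixels that were class i+1
# in 2001 and class j+1 in 2011; off-diagonal cells are the mapped changes.
change_matrix = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(change_matrix, (lulc_2001.ravel() - 1, lulc_2011.ravel() - 1), 1)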

Conclusions:

LULC change is one of the largest areas of study within remote sensing, and this lab was very helpful for understanding the primary methods of analysis within this sub-field of the discipline. By performing post-classification change detection in particular, the remote sensing expert can add valuable information to satellite imagery, information that is extremely helpful for planning and environmental modeling.

Saturday, April 9, 2016

Supervised Classification

Introduction:

Land use/land cover (LULC) information can be extracted using numerous methods, one of which is "Supervised Classification". Supervised classification requires the remote sensing expert to collect training samples for the classes they wish to identify. These training samples must be evaluated, to ensure they reflect the spectral variety contained within each LULC class. Once the training samples have been vetted, the remote sensing expert uses them to classify the image.

Methods:

The first step of supervised classification is the collection of training samples. I collected a total of 50 training samples using a combination of my Landsat 7 image and Google Earth imagery. Once all of the samples were collected, I analyzed each class's samples to ensure their spectral signatures resembled the spectral signatures of their respective land cover types (Figures 1 & 2).

Figure 1: One of these water signatures doesn't match the others, so it will need to be eliminated.

Figure 2: All of these forest signatures resemble one another.

Once all of the classes' signatures were assessed, it was necessary to evaluate the spectral separability of the classes. This was done to ensure the classifier would be able to discern one class from another (Figure 3).

Figure 3: The spectral separability value of 1964 for bands 1, 3, 4, and 5 is very good.

Once the spectral separability was assessed, the image was classified using the Maximum Likelihood method.
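The lab used the software's built-in Maximum Likelihood tool; as a hedged sketch of the underlying computation, each class can be modeled as a multivariate Gaussian fit to its training samples, with every pixel assigned to the class that gives the highest log-likelihood. The training arrays, band count, and class count are placeholders.

import numpy as np
from scipy.stats import multivariate_normal

# Placeholder training samples: band values and class labels for sampled pixels.
X_train = np.random.rand(300, 6)
y_train = np.random.randint(0, 5, 300)

# Fit one multivariate Gaussian (mean vector + covariance matrix) per class.
models = []
for c in range(5):
    samples = X_train[y_train == c]
    models.append(multivariate_normal(samples.mean(axis=0),
                                      np.cov(samples, rowvar=False),
                                      allow_singular=True))

# Assign each pixel to the class whose Gaussian gives the highest log-likelihood.
pixels = np.random.rand(400 * 400, 6)
loglik = np.column_stack([m.logpdf(pixels) for m in models])
classified = loglik.argmax(axis=1).reshape(400, 400)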

Results:

The result of the Maximum Likelihood classification appears to be slightly less accurate than the image generated from the unsupervised ISODATA algorithm. This inaccuracy was likely introduced as a result of poor quality training samples.

Figure 4: The result of the Maximum Likelihood classification

Conclusion:

Supervised classification's strengths are that it allows the analyst substantially more control over what classes the data are sorted into, and doesn't require manual sorting of clusters. The weaknesses of supervised classification are that it requires training samples to be collected, and that the quality of the output classification depends on the quality of the training samples. The collection of good training samples is fairly dependent on the level of background knowledge the analyst has of the study area.

Unsupervised Classification

Introduction:

Classification of Land Use / Land Cover (LULC) information is one of the primary applications of remote sensing. There are several different methods for performing LULC classification, and in this lab I will be exploring "Unsupervised Classification" using the ISODATA method.

Methods:

ISODATA (Iterative Self-Organizing Data Analysis Technique) identifies common spectral values and organizes them into clusters. It doesn't require any in-situ data or any knowledge of the study area before it organizes the pixels. The parameters I adjusted were the maximum number of model iterations, the convergence threshold between spectral values, and the minimum and maximum number of classes into which the data would be organized.

After the pixels were organized into clusters, I recoded them into 5 land use classes: water, forest, agriculture, urban/built-up, and bare soil. I ran the ISODATA algorithm twice, sorting the data into 10 classes the first time and 20 classes the second time.
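ISODATA's cluster splitting and merging are not reproduced here; as a simplified stand-in, the basic cluster-then-recode flow can be sketched with k-means using 20 spectral clusters. The image array, iteration cap, and recode lookup table are placeholders.

import numpy as np
from sklearn.cluster import KMeans

# Placeholder image reshaped to (n_pixels, n_bands).
image = np.random.rand(400, 400, 6)
pixels = image.reshape(-1, 6)

# 20 spectral clusters, capped at a maximum number of iterations;
# ISODATA would additionally split and merge clusters between iterations.
km = KMeans(n_clusters=20, max_iter=100, n_init=10, random_state=0)
clusters = km.fit_predict(pixels).reshape(400, 400)

# Recode the spectral clusters into 5 LULC classes (placeholder lookup table).
recode = np.random.randint(0, 5, 20)   # cluster index -> LULC class code
lulc = recode[clusters]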

Results:

The resulting classified images suited the area fairly well (Figure 1). The 20-class iteration produced a better result, as the additional classes allowed it to capture a greater extent of the spectral variance within each LULC class.

Figure 1: The results of the 20-class ISODATA algorithm.


Conclusion:

The ISODATA algorithm is fairly simple, but it manages to produce good results in spite of that simplicity. The major benefit of ISODATA is that it doesn't require in-depth knowledge of the study area before implementation, unlike other methods. If better results were desired, I could increase both the number of iterations and the number of spectral classes.