Tuesday, April 19, 2016

Advanced Classifiers 3: Expert System and Artificial Neural Network

Introduction:

The advanced classifiers used in the previous labs were each robust; however, they are not the only methods for achieving high classification accuracy. An Artificial Neural Network (ANN) is a classifier that loosely simulates the learning process of the human brain, allowing it to adjust its internal parameters without direct intervention from a remote sensing expert. An Expert System (ES) uses a decision tree to refine an existing classified image using ancillary data. When used correctly, ES can yield highly accurate results (93% overall accuracy was reported in one study).

Methods:

The first method used in this exercise was ES. The ES was used to refine the classification of an image using several ancillary rasters containing Boolean values for specific classes. The decision tree was designed to refine several classes (urban/built-up, agriculture, and green vegetation), including dividing urban/built-up into residential and non-residential 'urban' areas. The non-residential urban areas were identified in one of the ancillary Boolean rasters, and a rule was created so the output image only contained the areas where the urban/built-up class of the input raster intersected the non-residential ancillary raster. The residential areas were identified by selecting only the areas classified as urban/built-up on the input raster that were not overlapped by the non-residential raster. The ancillary rasters for green vegetation and agriculture were used to refine their respective classes in the same manner; the rule logic is sketched below.
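The rules amount to a simple Boolean overlay. Here is a minimal sketch, assuming the input classification and the ancillary layer are loaded as equally sized numpy arrays (the class codes are hypothetical):

import numpy as np

# Hypothetical class codes for the classification raster.
URBAN, RESIDENTIAL, OTHER_URBAN = 1, 6, 7

def refine_urban(classified, non_residential_mask):
    # classified: 2D integer array of class codes.
    # non_residential_mask: 2D Boolean array flagging non-residential areas.
    out = classified.copy()
    urban = classified == URBAN
    # Rule 1: urban pixels intersecting the ancillary mask -> other urban.
    out[urban & non_residential_mask] = OTHER_URBAN
    # Rule 2: urban pixels outside the ancillary mask -> residential.
    out[urban & ~non_residential_mask] = RESIDENTIAL
    return out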

ANN was implemented on Quickbird imagery of the University of Northern Iowa. Once samples for the desired classes were collected, they were fed into the classifier, which was set to use 1,000 training iterations; a rough scikit-learn equivalent is sketched below.
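The software hides the network's internals, but the workflow can be approximated with scikit-learn's multilayer perceptron. This is a sketch, not the lab's actual tool; the hidden layer size and file names are assumptions:

import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical training data: pixel spectra and matching class labels.
X = np.load("training_spectra.npy")     # (n_samples, n_bands)
y = np.load("training_labels.npy")      # (n_samples,)

# One small hidden layer; 1000 training iterations, as used in the lab.
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000)
ann.fit(X, y)

# Classify the full image: (rows, cols, bands) -> (pixels, bands).
image = np.load("quickbird_subset.npy")
rows, cols, bands = image.shape
classes = ann.predict(image.reshape(-1, bands)).reshape(rows, cols)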

Results/Conclusions:

The Expert System classification greatly increased the accuracy of the previous image (Figure 1). It allowed for more differentiation between classes and fixed several problem areas (Figure 2). The ANN classification was extremely easy to implement, but it exposed none of its internal parameters, so it is not possible to know exactly how it arrived at its final classification.

Figure 1: The Expert System Classified image.
Figure 2: The commercial area in the center/right was reclassified to 'Other Urban',
while the surrounding areas were reclassified to 'Residential'.



Thursday, April 14, 2016

Object-Based Classification

Introduction:

Object-based classification methods are among the most powerful methods currently in general use by the remote sensing community. They are substantially more robust than per-pixel methods because classification is based not only on pixel values but also on texture, compactness, and smoothness. In this lab, I explored two different object-based classification methods: Random Forest and Support Vector Machines (SVM). Random Forest builds a specified number of decision trees (I used 300), each trained on a random subset of the training data, and the overall classification of each object is decided by a majority vote among the trees. Support Vector Machines determine the decision boundaries that produce optimal separation between classes via multiple iterations.

Methods:

Before either classifier could be used, segments were created from a Landsat 7 image. The objects were created with a shape/color weighting of 30%/70% and a compactness/smoothness weighting of 50%/50%. Once the segments were created, training samples were selected from them. After enough training samples were collected, the images were classified. A loose analogue of the segmentation step is sketched below.
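The lab used the segmentation dialog built into the software, but a rough stand-in can be sketched with scikit-image's SLIC superpixels, whose compactness parameter plays a role loosely similar to the shape/compactness weights (the parameter values and image are placeholders, not the lab's):

import numpy as np
from skimage.segmentation import slic

# Placeholder for the Landsat subset, scaled to [0, 1].
image = np.random.rand(512, 512, 3)

# Higher compactness favors smoother, more regular segment shapes.
segments = slic(image, n_segments=500, compactness=10.0, channel_axis=-1)
print(segments.max() + 1, "segments created")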

Before Random Forest classification could be performed, I created a training function to identify the classification parameters. The Random Forest classifier was then created and trained with a maximum of 300 trees (individual classifiers). The classified image was created using the parameters identified by the training function; a rough scikit-learn equivalent is sketched below.
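A minimal sketch of the training and prediction steps, assuming per-segment feature vectors (mean band values, texture, and shape metrics) rather than the lab's actual inputs:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder per-segment features and labels.
X_train = np.random.rand(200, 8)              # 200 training segments
y_train = np.random.randint(0, 5, size=200)   # 5 hypothetical classes

# 300 trees, as in the lab; each tree sees a bootstrap sample of the
# training data, and the final label is decided by majority vote.
rf = RandomForestClassifier(n_estimators=300)
rf.fit(X_train, y_train)

X_all = np.random.rand(5000, 8)               # features for every segment
segment_classes = rf.predict(X_all)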

Before SVM classification could be performed, I created another training function to identify the classification parameters. The Support Vector Machines classifier was created and trained using a linear kernel. The classified image was created using the parameters identified by the training function; the equivalent scikit-learn step is sketched below.
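The same kind of sketch for the SVM step, again with placeholder inputs:

import numpy as np
from sklearn.svm import SVC

X_train = np.random.rand(200, 8)              # placeholder segment features
y_train = np.random.randint(0, 5, size=200)

# Linear kernel, as configured in the lab; training iterates until it
# finds the separating hyperplane with the maximum margin.
svm = SVC(kernel="linear")
svm.fit(X_train, y_train)

segment_classes = svm.predict(np.random.rand(5000, 8))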

Results:

The Random Forest classifier created a substantially more accurate output than the Support Vector Machines classifier. The disparity between the two is most clearly visible in the heavily built-up parts of Eau Claire, which SVM classified as 'bare earth' (Figure 1). Because SVM relies on a single classifier, it was more prone to error than Random Forest's ensemble of 300 trees.

Figure 1: Random Forest vs. Support Vector Machines
Random Forest performed much better in urban areas than SVM.

Sunday, April 10, 2016

Advanced Classifiers 1

Introduction:

Unsupervised and supervised classification methods both sort pixels by the values they contain, but they perform this analysis on whole pixels, and not all features are larger than a pixel! Advanced classifiers sort pixels into different classes while considering values at the sub-pixel level. The advanced classifiers used in this lab are Spectral Linear Unmixing and the Fuzzy Classifier.

Methods:

In order to use Spectral Linear Unmixing (SLU), the spectral reflectance of the pure pixel values (endmembers) must be recorded before classification can be performed. First, the principal components of each band were calculated. Next, the principal components of bands one and two were plotted on a scatterplot. By circling the extreme corners of the plotted point cloud, it was possible to identify the agriculture, bare earth, and water endmembers. By plotting bands three and four, it was possible to identify the urban/built-up endmember. Once all of the endmembers were collected, they were selected together, and fractional images were created for each of the LULC classes; the underlying unmixing math is sketched below.
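Once the endmember spectra are known, each pixel's class fractions come from solving a least-squares problem. Here is a bare-bones, unconstrained sketch with placeholder spectra (real tools additionally force the fractions to be non-negative and sum to one):

import numpy as np

# E: (n_bands, n_endmembers) matrix whose columns are the endmember
# spectra for water, agriculture, bare earth, and urban (placeholders).
E = np.array([[0.05, 0.25, 0.20, 0.30],
              [0.04, 0.30, 0.25, 0.35],
              [0.03, 0.45, 0.30, 0.25],
              [0.02, 0.50, 0.35, 0.40]])

def unmix(pixel):
    # Fractions minimizing || E @ fractions - pixel ||.
    fractions, *_ = np.linalg.lstsq(E, pixel, rcond=None)
    return fractions

pixel = np.array([0.12, 0.15, 0.22, 0.25])
print(unmix(pixel))   # one fraction per endmember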

Fuzzy classification sorts mixed-value pixels into different classes by assessing how close their values are to one class or another. The first step of fuzzy classification was to collect 48 spectral signatures covering both homogeneous and mixed land cover. Next, the spectral signatures were used to assign each pixel its 5 best candidate classes. After the pixels were classified, a fuzzy convolution was performed to collapse the candidate classes into a single output classification. The core membership idea is sketched below.
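The membership step can be sketched in a few lines: compute each pixel's spectral distance to every candidate class's mean signature, and convert the inverse distances into membership grades. This is a simplification with made-up signatures, not the software's exact algorithm:

import numpy as np

def fuzzy_memberships(pixel, class_means, eps=1e-9):
    # Membership grade of one pixel in each class; grades sum to 1.
    dists = np.linalg.norm(class_means - pixel, axis=1) + eps
    inv = 1.0 / dists
    return inv / inv.sum()

class_means = np.random.rand(12, 6)   # placeholder: 12 classes x 6 bands
pixel = np.random.rand(6)
m = fuzzy_memberships(pixel, class_means)
best5 = np.argsort(m)[::-1][:5]       # the 5 best candidate classes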

Conclusion:

The results of the fuzzy classification were worse than those of both the unsupervised and supervised classification schemes (Figure 1).

Figure 1: (From top left - clockwise) Unsupervised, Supervised,
Fuzzy classification with 48 signatures, and Fuzzy classification with 24 signatures.

It is likely that error in the fuzzy classifier was introduced during the collection of urban points, as it was difficult to find areas with homogeneous pixel values.

Digital Change Detection

Introduction:

Digital change detection is the process of identifying areas that have experienced changes in land use/land cover (LULC) between two time periods. There are two main methods of performing change detection: Write Function Memory Insertion and post-classification comparison change detection.

Methods:

The Write Function Memory Insertion method is an extremely simple yet effective process. By stacking the near-infrared (NIR) bands of two image dates into the display's color guns, areas whose LULC changed appear in a different color than areas that remained constant; the stacking is sketched below.
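A minimal sketch of the stacking, assuming the NIR bands of the two dates are already co-registered arrays:

import numpy as np

nir_1991 = np.random.rand(512, 512)    # placeholder co-registered NIR bands
nir_2011 = np.random.rand(512, 512)

# Newer NIR in the red gun, older NIR in green and blue: pixels whose
# reflectance changed between dates render in red or cyan tones.
rgb = np.dstack([nir_2011, nir_1991, nir_1991])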

Figure 1: The output from Write Function Memory Insertion.
Areas that changed between 1991 and 2011 appear in red.

Post-classification comparison change detection is a slightly more involved process than Write Function Memory Insertion, as it first requires both images to be classified. The comparison is done by counting the pixels in each LULC class in each image and calculating the percentage of change between the two dates (Figure 2); a sketch of the calculation follows the figure caption below.

Figure 2: The percent change between 2001 and 2011 in the Milwaukee MSA.
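Assuming the two classified images are integer arrays with matching class codes 0 through n_classes - 1, the per-class comparison is:

import numpy as np

def percent_change(lulc_t1, lulc_t2, n_classes):
    # Percent change in pixel count per class between the two dates.
    counts1 = np.bincount(lulc_t1.ravel(), minlength=n_classes)
    counts2 = np.bincount(lulc_t2.ravel(), minlength=n_classes)
    return 100.0 * (counts2 - counts1) / counts1   # assumes no empty class

# e.g. percent_change(lulc_2001, lulc_2011, n_classes=5)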


It is also possible to map the changed pixels, so the output shows not only which pixels changed but which classes they changed to and from (Figure 3). By combining this method with the comparison graphing, it would be possible to identify how much of each LULC class's change came from each of the other classes; this from-to cross-tabulation is sketched after the figure caption below.
Figure 3: Milwaukee MSA LULC change.
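The from-to map boils down to giving every pixel a unique code for its (old class, new class) pair and tabulating the codes:

import numpy as np

def change_matrix(lulc_t1, lulc_t2, n_classes):
    # Rows: class at time 1; columns: class at time 2.
    codes = lulc_t1.ravel() * n_classes + lulc_t2.ravel()
    counts = np.bincount(codes, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes)

# The same codes, kept as a raster, produce the from-to change map:
# change_map = lulc_2001 * n_classes + lulc_2011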

Conclusions:

LULC change is one of the biggest areas of study within remote sensing, and this lab was extremely helpful for understanding the primary methods of analysis within this sub-field of the discipline. By performing post-classification change detection in particular, the remote sensing expert can add valuable information to satellite imagery that is directly useful for planning and environmental modeling.

Saturday, April 9, 2016

Supervised Classification

Introduction:

Land use/land cover (LULC) information can be extracted using numerous methods, one of which is "Supervised Classification". Supervised classification requires the remote sensing expert to collect training samples for the classes they wish to identify. These training samples must be evaluated to ensure they reflect the spectral variety contained within each LULC class. Once the training samples have been vetted, the remote sensing expert uses them to classify the image.

Methods:

The first step of supervised classification is the collection of training samples. I collected a total of 50 training samples using a combination of my Landsat 7 image and Google Earth imagery. Once all of the samples were collected, I analyzed each class's samples to ensure their spectral signatures resembled the spectral signatures of their respective land cover types (Figures 1 and 2).

Figure 1: One of these water signatures doesn't match the others, so it will need to be eliminated.

Figure 2: All of these forest signatures resemble one another.

Once all of the classes' signatures were assessed, it was necessary to evaluate the spectral separability of the classes. This was done to ensure the classifier would be able to discern one class from another (Figure 3).

Figure 3: The spectral separability value of 1964 for bands 1, 3, 4, and 5 is very good.

Once the spectral separability was assessed, the image was classified using the Maximum Likelihood method; a sketch of the underlying decision rule appears below.
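Maximum likelihood assigns each pixel to the class whose multivariate normal model, with mean vector and covariance matrix estimated from that class's training samples, gives it the highest probability. A compact sketch using scipy:

import numpy as np
from scipy.stats import multivariate_normal

def max_likelihood(pixels, class_means, class_covs):
    # pixels: (n, bands); returns the most likely class index per pixel.
    scores = np.column_stack([
        multivariate_normal(mean=m, cov=c).logpdf(pixels)
        for m, c in zip(class_means, class_covs)
    ])
    return scores.argmax(axis=1)

# For each class i, from its training samples:
# class_means[i] = samples.mean(axis=0)
# class_covs[i]  = np.cov(samples, rowvar=False)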

Results:

The result of the Maximum Likelihood classification appears to be slightly less accurate than the image generated by the unsupervised ISODATA algorithm. This inaccuracy was likely introduced by poor-quality training samples.

Figure 4: The result of the Maximum Likelihood classification

Conclusion:

Supervised classification's strengths are that it allows the analyst substantially more control over what classes the data are sorted into, and doesn't require manual sorting of clusters. The weaknesses of supervised classification are that it requires training samples to be collected, and that the quality of the output classification depends on the quality of the training samples. The collection of good training samples is fairly dependent on the level of background knowledge the analyst has of the study area.

Unsupervised Classification

Introduction:

Classification of Land Use / Land Cover (LULC) information is one of the primary applications of remote sensing. There are several different methods for performing LULC classification, and in this lab I will be exploring "Unsupervised Classification" using the ISODATA method.

Methods:

ISODATA (Iterative Self-Organizing Data Analysis) identifies common spectral values and organizes them into clusters. It doesn't require any in situ data or any knowledge of the study area before it organizes the pixels. The parameters I changed were the maximum number of model iterations, the convergence threshold between spectral values, and the minimum and maximum number of classes into which the data would be organized.

After the pixels were organized into clusters, I recoded them into 5 land use classes: water, forest, agriculture, urban/built-up, and bare soil. I ran the ISODATA algorithm twice, sorting the information into 10 classes the first time and 20 classes the second time. A rough approximation of the clustering and recoding steps is sketched below.
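ISODATA is essentially k-means clustering with extra rules for splitting and merging clusters between iterations, so the workflow can be approximated with scikit-learn (the image and recode table are placeholders, not the lab's):

import numpy as np
from sklearn.cluster import KMeans

image = np.random.rand(512, 512, 6)             # placeholder Landsat bands
pixels = image.reshape(-1, image.shape[-1])

# 20 spectral clusters, as in the second run of the lab.
clusters = KMeans(n_clusters=20, max_iter=100).fit_predict(pixels)

# Manually recode the 20 spectral clusters into 5 LULC classes
# (the assignments below stand in for the analyst's judgment).
lookup = np.random.randint(0, 5, size=20)
lulc = lookup[clusters].reshape(image.shape[:2])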

Results:

The resulting classified images suited the area fairly well (Figure 1). The 20-class iteration produced a better result, as the additional classes allowed it to capture a greater extent of the spectral variance within each LULC class.

Figure 1: The results of the 20-class ISODATA algorithm.


Conclusion:

The ISODATA algorithm is fairly simple, but it manages to produce good results in spite of its simplicity. The major benefit of ISODATA is that, unlike other methods, it doesn't require in-depth knowledge of the study area before implementation. If better results were desired, I could increase both the number of iterations and the number of spectral classes.

Thursday, February 25, 2016

Radiometric and Atmospheric Correction

Introduction:

Before analysis can be performed on satellite imagery, it needs to be atmospherically corrected. In this lab we will be using three different methods: Empirical Line Calibration, Dark Object Subtraction, and Multidate Image Normalization.

Methods:

Empirical Line Calibration (ELC) was performed by developing regression equations between in situ reflectance measurements and those recorded by the sensor for the same targets. In order to perform ELC, I selected 5 areas within the image, then found in situ reflectance information for them in the spectral libraries within Erdas Imagine. The per-band regression is sketched below.
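For each band, ELC fits a line relating the image values at the calibration targets to their library reflectance, then applies that line to the whole band. A sketch with placeholder target values:

import numpy as np

# Image values at the 5 calibration targets and the matching in situ
# (spectral library) reflectance for the same targets (placeholders).
image_vals = np.array([23.0, 48.0, 95.0, 140.0, 200.0])
field_refl = np.array([0.04, 0.09, 0.21, 0.33, 0.47])

gain, offset = np.polyfit(image_vals, field_refl, deg=1)

band = np.random.rand(512, 512) * 255           # placeholder band
corrected = gain * band + offset                # reflectance estimate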



The next method used was Dark Object Subtraction (DOS). The satellite image is first converted to at-satellite spectral radiance, then to true surface reflectance. The radiance conversion is performed by reading the original and re-scaled pixel value ranges from the image metadata. The conversion to reflectance calibrates the radiance image using the distance between the Earth and the Sun, the atmospheric transmittance between the ground and the sensor, the sun zenith angle, the atmospheric transmittance from the sun to the ground, and the mean exoatmospheric spectral irradiance. Both conversions are sketched below.
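A sketch of the two conversions for one band, using the standard dark-object reflectance equation. All of the constants are placeholders that would be read from the metadata and calibration tables:

import numpy as np

LMIN, LMAX = -1.5, 193.0           # radiance range from the metadata
QCALMIN, QCALMAX = 1.0, 255.0      # calibrated pixel value range
ESUN = 1796.0                      # mean exoatmospheric spectral irradiance
d = 1.01                           # Earth-Sun distance, astronomical units
theta_s = np.radians(35.0)         # sun zenith angle
T_v, T_z = 1.0, 1.0                # ground-sensor / sun-ground transmittance

dn = np.random.randint(1, 256, size=(512, 512)).astype(float)

# DN -> at-satellite radiance.
radiance = (LMAX - LMIN) / (QCALMAX - QCALMIN) * (dn - QCALMIN) + LMIN

# Subtract the dark-object (haze) radiance, then solve for reflectance.
L_haze = radiance.min()
reflectance = (np.pi * (radiance - L_haze) * d ** 2) / \
              (ESUN * np.cos(theta_s) * T_v * T_z)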
Image 1: Comparing the ground control points.

The final correction method used was Multidate Image Normalization (MIN). The first step was to collect radiometric ground control points from the base image and the subsequent image (Image 1); the sample points were taken from static, non-vegetated areas throughout the scene. The points were then used to build regression equations in Excel, one per band, which normalize the subsequent image to the base image. The regression step is sketched below.
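A per-band sketch of the normalization, with placeholder values standing in for the paired samples at the pseudo-invariant ground control points:

import numpy as np

# Paired pixel values at the ground control points (placeholders).
base_vals = np.array([12.0, 40.0, 77.0, 110.0, 160.0, 201.0])
subject_vals = np.array([18.0, 49.0, 90.0, 126.0, 178.0, 221.0])

# Regress the base image on the subject image, then apply the equation
# to the whole subject band to normalize it to the base date.
gain, offset = np.polyfit(subject_vals, base_vals, deg=1)
subject_band = np.random.rand(512, 512) * 255
normalized = gain * subject_band + offset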

Results:


ELC did not appear to change pixel values by any major amount (Image 2).

Image 2: The original image (left) and ELC-corrected image (right).

DOS increased the visible contrast between features and eliminated atmospheric haze (Image 3).

Image 3: The original image (left) and DOS-corrected image (right), viewed using a 7,5,3 band combination.

The MIN method greatly reduced the visible haze in the imagery and produced more vivid values overall (Image 4).

Image 4: The Chicago 2000 image (left) and the MIN-normalized Chicago 2009 image (right), viewed using a 7,5,3 band combination.

In the future, I would be more likely to use Dark Object Subtraction than the other methods, as it produced the most accurate result. However, Multidate Image Normalization may also be useful when performing analysis over multiple time periods.

Sunday, February 14, 2016

Lab 2: Surface Temperature Extraction from Thermal Remote Sensing Data

Introduction:

Thermal imagery holds incredibly useful information, but requires a very different workflow than reflective imagery. In this lab, I first performed visual image interpretation on thermal imagery, then progressed into constructing models in order to quantitatively estimate land surface temperature.

Methods:

In order to become comfortable with thermal imagery, I first examined thermal imagery captured over Eau Claire, Wisconsin. I used my knowledge of existing features to interpret the images' brightness values in order to further develop my understanding of the properties of thermal images and the features they capture.

After gaining some understanding of the imagery, I created a model to convert ETM+ imagery's values from digital numbers back to the satellite's original radiance values. The model used the following equation: "spectral radiance = Grescale * digital number + Brescale". In terms of the slope equation (y = mx + b), Grescale is the gain (slope) that rescales a digital number back to radiance, and Brescale is the offset (intercept), which is approximately the lowest radiance value the satellite recorded. In order to calculate Grescale, I used the following equation: "Grescale = (LMAX - LMIN)/(QCALMAX - QCALMIN)". The LMAX and LMIN values are the highest and lowest radiance values originally recorded by the satellite, respectively. QCALMAX and QCALMIN are the highest and lowest calibrated pixel values, respectively. As the thermal imagery is recorded in 8 bits, QCALMAX is 255 and QCALMIN is 1. After calculating Grescale, I was able to use Grescale and Brescale to calculate the at-satellite radiance values (Figure 1); the conversion is also sketched after the figure caption below.

Figure 1: The model for calculating at-satellite radiance.
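The same model in a few lines of numpy. The LMAX/LMIN values are placeholders that would be read from the image metadata; Brescale is derived so that a digital number of QCALMIN maps back to LMIN:

import numpy as np

LMIN, LMAX = 3.2, 12.65            # thermal-band radiance range (placeholder)
QCALMIN, QCALMAX = 1.0, 255.0      # 8-bit calibrated pixel range

grescale = (LMAX - LMIN) / (QCALMAX - QCALMIN)
brescale = LMIN - grescale * QCALMIN

dn = np.random.randint(1, 256, size=(512, 512)).astype(float)
radiance = grescale * dn + brescale    # at-satellite spectral radiance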


Unfortunately, radiance is only the satellite's recorded value, not the actual surface temperature. In order to determine the actual surface temperature, I needed to implement an additional equation: "Temperature = K2 / ln((K1/radiance)+1)". The values K1 and K2 are calibration constants for the satellite that were recorded before it was launched into orbit, so implementing the equation only required me to identify the values of K1 and K2. I created a new model, with the output radiance raster as the input value for the new equation. Running the new model generated a temperature surface raster showing temperature in degrees Kelvin; the conversion is sketched below.
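A sketch of the temperature conversion, using the published ETM+ band 6 constants (K1 = 666.09, K2 = 1282.71), with the Kelvin-to-Fahrenheit step from the Landsat 8 portion of the lab appended:

import numpy as np

K1, K2 = 666.09, 1282.71           # prelaunch constants for ETM+ band 6

radiance = np.random.uniform(3.2, 12.65, size=(512, 512))   # placeholder

kelvin = K2 / np.log(K1 / radiance + 1.0)
fahrenheit = (kelvin - 273.15) * 9.0 / 5.0 + 32.0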

Next, I performed the same series of equations to calculate radiance and surface temperature, this time using Landsat TM imagery instead of ETM+ imagery. Calculating surface temperature required different values for K1 and K2, as Landsat TM was calibrated slightly differently than Landsat ETM+.

The final step of this lab was to calculate land surface temperature for Chippewa and Eau Claire counties from a Landsat 8 thermal image. First, I performed an image subset using an area of interest file of the counties' boundaries. Next, I followed procedures similar to the previous two temperature calculations, using different K1 and K2 values. After generating a surface temperature raster with degrees recorded in Kelvin, I added an additional operation to convert the temperature into degrees Fahrenheit.

Results:

Each of the calculations produced a useful output, but converting the temperature from Kelvin into Fahrenheit greatly increased the usability of the Landsat 8 output image (Figure 2).

Figure 2: Fahrenheit surface temperature extracted from Landsat 8's thermal band.

Sources:

Landsat satellite image is from Earth Resources Observation and Science Center, United States Geological Survey. Area of interest (AOI) file is derived from ESRI counties vector features

Thursday, February 4, 2016

Lab 1: Image quality assessment and statistical analysis

Introduction:

In order to properly use multispectral data for land use/land cover classification, the data must first be preprocessed to identify and remove sources of redundancy. I used two different methods to identify problematic band combinations: feature space imaging and correlation analysis.

Methods:

Feature space imaging illustrates the pixel values of two bands by plotting them against each other on a scatterplot. Band combinations with high association appear as a cohesive line across the graph (Figure 1), while band combinations with low association display points widely spread across the surface of the plot (Figure 2). I used feature space imaging to assess the covariance of bands in Landsat imagery of Eau Claire, WI. A minimal sketch of the plot is shown below.
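A feature space plot is essentially a two-band 2D histogram; a minimal sketch with matplotlib and placeholder bands:

import numpy as np
import matplotlib.pyplot as plt

band2 = np.random.rand(512, 512)                        # placeholder bands
band3 = 0.9 * band2 + 0.1 * np.random.rand(512, 512)    # correlated band

plt.hist2d(band2.ravel(), band3.ravel(), bins=256)
plt.xlabel("Band 2")
plt.ylabel("Band 3")     # a tight diagonal indicates high association
plt.show()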

Figure 1: High covariance between bands 2 and 3
Figure 2: Low covariance between bands 4 and 5


Correlation analysis compares bands with one another and assesses the extent of association between each band combination. The output of correlation analysis is typically displayed as a correlation matrix. The cell values indicate the extent of interrelationship, with values near -1 or +1 indicating extremely high association and values near 0 indicating low association. If two bands are highly correlated, one of them should be removed to preserve the integrity of the analysis. I used correlation analysis to assess the covariance of bands in Landsat imagery of Eau Claire, WI, and Quickbird imagery of the Florida Keys and the Sundarbans, Bangladesh. The matrix computation is sketched below.
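The matrix itself is nearly a one-liner in numpy, assuming the image is stored as a (bands, rows, cols) array:

import numpy as np

image = np.random.rand(6, 512, 512)       # placeholder 6-band image

# Flatten each band to a row vector, then correlate the rows pairwise.
corr = np.corrcoef(image.reshape(image.shape[0], -1))
print(np.round(corr, 2))                  # bands x bands matrix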

Results:

After creating the feature space plots, I deemed the removal of bands 2 and 7 necessary in order for proper analysis to be performed. Band 2 had high covariance with both band 1 and band 3 (Figures 1 and 3). Band 7 had high covariance with band 5, indicating one of them needed to be removed. Band 5 and band 4 had the lowest covariance of any band combination, which convinced me to eliminate band 7 rather than band 5.

Figure 3: Bands 1 and 2 have high covariance.



The Eau Claire correlation matrix verified my band removal assessment from the feature space plots, as it indicated that bands 2 and 7 have the highest correlation values (Table 1). The correlation matrix for the Florida Keys revealed high correlation between bands 1 and 2 …, suggesting the removal of band 2 (Table 2). The correlation matrix for the Sundarbans, Bangladesh revealed a similar pattern, with bands 1 and 2 having high correlations (Table 3).

Table 1: Eau Claire Correlation Matrix
Table 2: Florida Keys Correlation Matrix

Table 3: Sundarbans Correlation Matrix


Sources
Landsat satellite image is from Earth Resources Observation and Science Center, United States Geological Survey. Quickbird high resolution images are from Global Land Cover Facility at www.landcover.org