SUPER-RESOLUTION OF HYPERSPECTRAL
SATELLITE IMAGES WITH THE HELP OF
HOPFIELD NEURAL NETWORK
Project Report
Submitted as part of the
course project requirement of
ECE/CS/ME 539
Introduction to Artificial Neural Network and Fuzzy Systems
Submitted by:
Shikha Gaur
(Campus ID No. 9076475756)
DEPARTMENT OF COMPUTER SCIENCES
UNIVERSITY OF WISCONSIN - MADISON
December 2017
Contents
Introduction
Problem Formulation
Data preparation
Algorithm
Results
Discussion
Conclusion
Appendix
References
Introduction
Remote sensing has facilitated the monitoring of large areas and, with the help of satellites, can
provide repetitive global coverage. Hyperspectral remote sensing involves imaging a surface
through a sensor having a large number of narrow and contiguous bands. The implication of
narrow and contiguous bands is that we can obtain a continuous spectral response of the target
material being imaged. Thus, there are numerous applications of hyperspectral remote sensing
like resource management, urban monitoring, environmental monitoring, mineral exploration,
agriculture planning, military and defence. These applications include some form of matching
the spectral similarity between the observed signature and the known standard signatures of
the various targets. However, for such an analysis to be effective and reliable, it is important
that the signatures obtained through hyperspectral remote sensing are uncontaminated, which
is rarely the case. The cost of increased spectral resolution is reduced spatial resolution: for a
hyperspectral image it varies from a few metres to tens of metres. The main reason why good
spatial resolution is needed is that more than one type of material may be present in the
instantaneous field of view (IFOV) of the sensor. Thus, the spectral response that we get at
a pixel can be a mixed response of the various materials present in the sensor's IFOV. This leads
to the problem of mixed pixels, i.e. pixels occupied by two or more pure elements. In order
to benefit from the material characterization property of hyperspectral imagery, some solution
to the mixed pixel problem is needed, which has been attempted in a variety of ways.
Improving spatial resolution through improvements in sensor design is an expensive and
challenging task, so some post-processing technique needs to be applied to improve the
spatial resolution of a hyperspectral image. Super-resolution techniques are those
algorithmic techniques that produce an improved-resolution image from a low-resolution image.
Approaches to achieve super-resolution can be broadly classified into two categories: super-
resolution reconstruction and super-resolution mapping. In super-resolution reconstruction, we
use “multiple” low-resolution images that have sub-pixel shifts. These images can be used for
construction of a higher-resolution image. Super-resolution mapping refers to the generation of
land-cover maps at a resolution finer than that of the given hyperspectral image, by post-processing
of a "single" soft-classified image (Ling et al. 2010). Super-resolution not only provides a cost-
effective way of increasing the resolution of camera images, but also allows synthetic zooming
of a region of interest, with applications in surveillance, forensics, science, medicine, satellite
imaging, etc. (Park et al. 2003).
Tatem et al. (2001) proposed an approach using Hopfield neural network (HNN), in which the
output of fuzzy classifier was used to obtain prior information of pixel composition, which was
then used to constrain the HNN. An energy function was coded into HNN and the network was
used for solving the problem through minimization of this energy function. In this case the
energy minimum represented the spatial arrangement of class components within a pixel, which
gave the super-resolved class map of the input image.
Arora and Tiwari (2013) proposed an inverse Euclidean based super-resolution mapping
method. They used the concept of spatial contiguity of all earth surface features.
In my current work, I have applied the approaches of Arora and Tiwari (2013) along with
Tatem et al. (2001) to achieve super-resolution for hyperspectral images in the presence of 4
classes.
Problem Formulation
Hopfield neural networks (HNNs), proposed by John Hopfield in 1982, are fully connected
recurrent networks with feedback links. An HNN works as a content-addressable memory and is
also used in pattern recognition and for solving optimization problems.
Each neuron of the network is connected to all other neurons except itself. The conditions of
no self-connection and symmetric weights guarantee convergence to a stable state. However,
the final state of the network depends upon the input applied, i.e. the initial state of the
network, so the state finally reached is not unique. A scalar quantity called the energy of the
network is associated with each network state. When the neurons update their states and the
network finally converges, an energy minimum is obtained. Thus, the network seeks lower
energy while changing states.
Two energy functions were described by Hopfield for continuous and discrete network
dynamics. The expression of the energy E for the continuous case is given by (Classification
Methods for Remotely Sensed Data: Brandt Tso, Paul M. Mather)

E = -\frac{1}{2}\sum_i \sum_j w_{ij} v_i v_j - \sum_i I_i v_i + \sum_i \frac{1}{\lambda_i} \int_0^{v_i} f^{-1}(v)\, dv

For high values of λ, the last term in the equation becomes negligible.
Whereas the expression for the energy in the discrete case is given by

E = -\frac{1}{2}\sum_i \sum_j w_{ij} v_i v_j - \sum_i I_i v_i
Here the input to the ith neuron is denoted by u_i, the external input by I_i and the output
by v_i; τ is a time constant of the neuron. The output v_i is obtained from the input through a
nonlinear activation function, v_i = f(u_i), and λ_i is a parameter called the gain, which scales
the input to the function f(u). The network evolves according to

\frac{du_i}{dt} = -\frac{u_i}{\tau} + \sum_j w_{ij} v_j + I_i
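To make the dynamics concrete, the following is a minimal sketch of discrete Hopfield dynamics (an illustration only, not the project's MATLAB implementation): with symmetric weights, no self-connections and asynchronous threshold updates, the discrete energy never increases.

```python
import numpy as np

# Minimal sketch of discrete Hopfield dynamics (illustration, not the project code):
# symmetric weights, zero diagonal, asynchronous threshold updates.
# Under this rule the energy E = -1/2 v^T W v - I^T v never increases.

def energy(W, I, v):
    return -0.5 * v @ W @ v - I @ v

def run_hopfield(W, I, v, sweeps=10):
    v = v.copy()
    for _ in range(sweeps):
        for i in range(len(v)):          # asynchronous: one neuron at a time
            u = W[i] @ v + I[i]          # net input to neuron i
            v[i] = 1.0 if u >= 0 else 0.0
    return v

rng = np.random.default_rng(0)
n = 16
A = rng.normal(size=(n, n))
W = (A + A.T) / 2                        # symmetric weights
np.fill_diagonal(W, 0.0)                 # no self-connection
I = rng.normal(size=n)
v0 = rng.integers(0, 2, size=n).astype(float)

v = run_hopfield(W, I, v0)
print(energy(W, I, v) <= energy(W, I, v0))  # True: the final state has lower (or equal) energy
```

Each single-neuron flip changes the energy by -(net input) x (change in output), which is never positive, so convergence to an energy minimum is guaranteed.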
The energy function formulated to describe the problem is given by:

E = \sum_i \sum_j \left( k_1\, G1_{ij} + k_2\, G2_{ij} + k_3\, P_{ij} \right)

Here k_1, k_2 and k_3 are constant weights;
The network neurons are updated as per the following equations:

u_{ij}(t + dt) = u_{ij}(t) + \frac{du_{ij}}{dt}\, dt

\frac{du_{ij}}{dt} = -\frac{dE}{dv_{ij}}

\frac{dE}{dv_{ij}} = k_1 \frac{dG1_{ij}}{dv_{ij}} + k_2 \frac{dG2_{ij}}{dv_{ij}} + k_3 \frac{dP_{ij}}{dv_{ij}}
Here i and j refer to the location of the neuron. As location is important in the present case,
the neurons are considered to be arranged as a two-dimensional matrix and are indexed (i, j).
Two goal functions and one constraint function are defined to obtain the desired super-
resolution. The goal functions are given below:
\frac{dG1_{ij}}{dv_{ij}} = \frac{1}{2}\left(1 + \tanh\left(\lambda\left(\frac{1}{8}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1} v_{kl} - 0.5\right)\right)\right)\left(v_{ij} - 1\right), \quad (k, l) \neq (i, j)

\frac{dG2_{ij}}{dv_{ij}} = \frac{1}{2}\left(1 + \tanh\left(\lambda\left(0.5 - \frac{1}{8}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1} v_{kl}\right)\right)\right) v_{ij}, \quad (k, l) \neq (i, j)
Both the goal functions support the underlying idea used in the method, i.e. the idea of spatial
contiguity. Thus, they tend to modify the value of a neuron depending upon the values of its
surrounding neurons. Here the neighbourhood considered is a 3 x 3 neighbourhood.
The value of the parameter λ controls the steepness of the tanh function. In the project work I
have chosen the value of λ equal to 100.
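As an illustration of how the two goal functions respond to a 3 x 3 neighbourhood, the sketch below computes both gradients for the centre of a homogeneous block (a hypothetical reconstruction of the gradient formulas, not the project's MATLAB code):

```python
import numpy as np

# Sketch of the two goal-function gradients (hypothetical reconstruction):
# each uses the mean of the 8 neighbours of sub-pixel (i, j).
LAM = 100.0  # steepness of tanh, as chosen in the project

def neighbour_mean(v, i, j):
    """Mean of the 8 neighbours of (i, j)."""
    patch = v[i-1:i+2, j-1:j+2]
    return (patch.sum() - v[i, j]) / 8.0

def dG1(v, i, j):
    # pushes v_ij towards 1 when the neighbourhood mean exceeds 0.5
    m = neighbour_mean(v, i, j)
    return 0.5 * (1 + np.tanh(LAM * (m - 0.5))) * (v[i, j] - 1)

def dG2(v, i, j):
    # pushes v_ij towards 0 when the neighbourhood mean is below 0.5
    m = neighbour_mean(v, i, j)
    return 0.5 * (1 + np.tanh(LAM * (0.5 - m))) * v[i, j]

v = np.zeros((5, 5)); v[1:4, 1:4] = 1.0   # a 3x3 block of "on" sub-pixels
print(dG1(v, 2, 2))  # 0.0: centre already equals 1, nothing to change
print(dG2(v, 2, 2))  # 0.0: neighbourhood mean is 1, no pull towards 0
```

With λ = 100 the tanh terms behave almost like hard switches, so a sub-pixel surrounded by "on" neighbours is left untouched, supporting the spatial-contiguity idea.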
The constraint used in this project is the proportion constraint obtained from soft classification
of the low-resolution image.
\frac{dP_{ij}}{dv_{ij}} = \frac{1}{z^2}\sum_{k=zx}^{zx+z-1}\sum_{l=zy}^{zy+z-1} \frac{1}{2}\left(1 + \tanh\left(\lambda\left(v_{kl} - 0.55\right)\right)\right) - a_{xy}
Here z denotes the zoom factor and a_{xy} denotes the class proportion for the low-resolution
pixel (x, y).
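The constraint gradient compares the fraction of sub-pixels currently "on" within a low-resolution pixel against its target class proportion. A sketch (again a hypothetical reconstruction, not the project code):

```python
import numpy as np

LAM = 100.0  # tanh steepness, as in the goal functions

def dP(v, x, y, z, a_xy):
    """Hypothetical reconstruction of the proportion-constraint gradient:
    fraction of the z x z sub-pixels of low-resolution pixel (x, y) that are
    "on" (soft-thresholded at 0.55), minus the target proportion a_xy."""
    block = v[z*x:z*(x+1), z*y:z*(y+1)]
    frac_on = np.mean(0.5 * (1 + np.tanh(LAM * (block - 0.55))))
    return frac_on - a_xy

v = np.full((6, 6), 0.45)              # zoom factor z = 3, everything "off"
v[0, 0] = 0.9; v[0, 1] = 0.9           # two sub-pixels of pixel (0, 0) well above threshold
print(abs(dP(v, 0, 0, 3, a_xy=2/9)) < 1e-6)  # True: current fraction matches the target
```

When the "on" fraction exceeds a_xy the gradient is positive, so the update du/dt = -dE/dv pushes sub-pixel values down, and vice versa, keeping class areas consistent with the soft classification.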
Data preparation
The study area selected is near Ghatol town in the Banswara district of Rajasthan, India. The
source of the hyperspectral image is the Hyperion sensor onboard NASA's Earth Observing-1
(EO-1) satellite. Hyperion provides 220 spectral bands covering the 0.4 to 2.5 µm wavelength
range at 30 m spatial resolution. The images are freely available in the form of various data
products. The Level 1 product, which is radiometrically corrected, was downloaded from
USGS's EarthExplorer. The Level 1 product had the dimensions 256 x 3413 x 242. Although
there is a total of 242 bands in the dataset, only 196 are calibrated unique bands. The image
was acquired on March 8, 2013.
Figure 1(a): The study area, (b): Hyperion image of 1 band of the study area (adapted from
http://earthexplorer.usgs.gov/)
Figure 2: Reference Land use/land cover image obtained by applying FCM to Hyperion image
(30 m resolution)
A single-band grayscale image of one of the Hyperion bands acquired over the study area is
shown in Figure 1(b). Some bands were found to be noisy, and a useful spectral subset
containing 120 samples, 180 lines and 159 bands was created. A spatial subset was created
from the image to reduce the computation time. Some pre-processing in the form of
atmospheric correction was also performed on the data.
Figure 2 shows a classification map obtained by applying fuzzy C-means clustering to the
image. The land-use/land-cover classes present in the scene are water, vegetation, barren soil
and rocks.
Algorithm
Figure 3: Flow chart of the proposed algorithm
Initialization of Hopfield neural network:
The final state of a Hopfield network depends on the initial condition applied. Instead of a
random initial allotment that merely preserves the fuzzy output, I have tried to take the
neighbourhood into account.
The network is initialized with the values 0.55 and 0.45: 0.55 represents a sub-pixel with value
1 and 0.45 a sub-pixel with value 0. These values make the change of neuron states faster
than when a neuron/sub-pixel is allotted a value of 1 and has to change its state to a value
of 0.
The initialization proceeds over the sub-pixels belonging to each pixel. First, the abundance
fraction/class proportion in the particular pixel, together with the zoom factor, determines
how many sub-pixels will belong to that class. For example, if the class proportion at a
particular pixel is 20% and the zoom factor is 5, then 20% of the 25 sub-pixels generated at
that pixel, i.e. 5 sub-pixels, will belong to that class. Which 5 of these 25 sub-pixels are
allotted the value of 0.55 (corresponding to our 1) is determined by the abundance
fraction/class proportion of the neighbouring pixels. The neighbour having the maximum
abundance attracts most of the sub-pixels towards it; that is, most of the sub-pixels tend to
lie close to the neighbouring pixel having the largest abundance. Thus, I initialized the 5
sub-pixels (in the present example of 20% abundance of the central pixel) closest to such a
pixel with the value of 0.55. Further refinement of this initial allotment is done by my
super-resolution algorithm.
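The allotment step described above can be sketched as follows (a simplified illustration, not the project's MATLAB code; the way the neighbour direction is encoded is an assumption, since the report does not spell out the exact distance rule):

```python
import numpy as np

def init_subpixels(proportion, z, neighbour_dir):
    """Allot round(proportion * z*z) sub-pixels the "on" value 0.55, choosing
    those closest to the neighbouring pixel with the largest abundance.
    neighbour_dir is a unit offset such as (0, 1) for the right-hand
    neighbour (a hypothetical encoding for this sketch)."""
    n_on = int(round(proportion * z * z))
    # centre of the neighbouring pixel, in sub-pixel coordinates
    target = np.array([z / 2 - 0.5, z / 2 - 0.5]) + np.array(neighbour_dir) * z
    coords = [(r, c) for r in range(z) for c in range(z)]
    coords.sort(key=lambda rc: np.hypot(rc[0] - target[0], rc[1] - target[1]))
    grid = np.full((z, z), 0.45)              # everything starts "off"
    for r, c in coords[:n_on]:
        grid[r, c] = 0.55                     # "on" sub-pixels hug the chosen neighbour
    return grid

g = init_subpixels(0.20, 5, (0, 1))  # 20% abundance, zoom 5, neighbour to the right
print((g == 0.55).sum())             # 5 of the 25 sub-pixels are "on"
```

In this example the five "on" sub-pixels all fall in the column nearest the right-hand neighbour, which is exactly the spatial-contiguity bias the initialization is designed to provide.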
Binary classification is done per class, i.e. the image is divided into multiple planes
corresponding to the number of classes. The super-resolution technique is applied separately
to these planes, giving rise to binary classified image planes. These planes are later merged
to give a single hard-classified image at the desired resolution. During the merger, if there
is a clash between multiple classes, the class having the maximum abundance in that pixel is
given priority.
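The merging rule can be sketched as follows (a hypothetical helper for illustration, assuming one binary plane and one abundance map per class on the same grid):

```python
import numpy as np

def merge_planes(binary_planes, abundances):
    """binary_planes: (C, H, W) array of 0/1 super-resolved planes, one per class.
    abundances: (C, H, W) class proportions on the same grid.
    Where exactly one plane is 1, that class wins; on a clash (or when no
    class is 1), the class with the largest abundance is given priority."""
    n_on = binary_planes.sum(axis=0)
    winner = binary_planes.argmax(axis=0)      # the single "1" when n_on == 1
    fallback = abundances.argmax(axis=0)       # abundance priority otherwise
    return np.where(n_on == 1, winner, fallback)

planes = np.zeros((2, 1, 3)); planes[0, 0, 0] = 1       # class 0 only at (0, 0)
planes[0, 0, 1] = planes[1, 0, 1] = 1                   # clash at (0, 1)
abund = np.stack([np.full((1, 3), 0.3), np.full((1, 3), 0.7)])
print(merge_planes(planes, abund))                      # [[0 1 1]]
```

The unambiguous pixel keeps its class, while both the clash and the empty pixel fall back to the more abundant class 1.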
I implemented the network on a digital computer (Intel i7 processor, 8 GB RAM). The
programming was done using MATLAB 2016b.
Results
1. Zoom factor = 3
Figure 4: (a) Low-resolution input image, (b) Super-resolved output image
Classification results and the overall execution time are summarized in Table 1.
Table 1: Confusion matrix for zoom factor = 3 (rows: actual class; columns: predicted class)

              class 1   class 2   class 3   class 4
class 1           661         4         7       142
class 2            19      4113       204       499
class 3            24       217      8202       651
class 4           187       813      1059      4798

Computation time = 19.973205 sec
Overall accuracy = 82.287%
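The overall accuracy is the trace of the confusion matrix divided by its grand total (21600 sub-pixels here), as the following check confirms:

```python
import numpy as np

# Overall accuracy = correctly classified sub-pixels / total sub-pixels,
# i.e. the trace of the confusion matrix over its grand total.
cm = np.array([[661,    4,    7,  142],
               [ 19, 4113,  204,  499],
               [ 24,  217, 8202,  651],
               [187,  813, 1059, 4798]])
overall_acc = np.trace(cm) / cm.sum()
print(f"{100 * overall_acc:.3f}%")   # 82.287%
```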
2. Zoom factor = 5
Figure 5: (a) Low-resolution input image, (b) Super-resolved output image
Classification results and the overall execution time are summarized in Table 2.
Table 2: Confusion matrix for zoom factor = 5 (rows: actual class; columns: predicted class)

              class 1   class 2   class 3   class 4
class 1          7840       890        30       334
class 2          1486      4057       209      1105
class 3            20       186       594        14
class 4           306       650        46      3833

Overall accuracy = 75.574%
Computation time = 33.044258 sec
Discussion
• A zoom factor of 10 was also tested; it resulted in 66.7% accuracy with a program execution
time of 86 seconds.
• The classification accuracy and execution time turned out to be better than expected.
• The accuracy can be improved by refining the convergence criteria.
• For the final classification results, in case of conflict between multiple classes, I chose the
class with the highest membership value. This can be improved by also taking the current
output of the HNN into account (using a variant of a softmax layer). In the current setup,
softmax led to image noise and lower classification accuracy.
I completed my Master's thesis on a similar problem. For the course project I implemented many
changes, which led to an overall improvement in system performance and accuracy. The changes
made specifically in the course project, relative to my Master's thesis work, are summarized
below:
• Synchronous weight update is done in the current implementation instead of asynchronous
update of the neural network weights.
• A convergence criterion has been added: if the overall energy of the system does not vary
beyond a certain limit, convergence is assumed to have been reached. Because of the robust
initialization incorporated, convergence to a local minimum is not a big concern in this
implementation.
• When producing the final classification result, I combined the binary classifications of the
4 classes. Previously the Hopfield network output was used directly, without any activation
layer that would allow finding the maximum. This has been improved by using a step function as
the activation layer and using the result of the Hopfield neural network only when exactly one
class has a value of 1. In case of a conflict, or if none of the classes has a value of 1, the
initial fuzzy classification output is used to select a class (the class having the largest
contribution in that pixel is selected). This led to reduced salt-and-pepper noise in the image.
The overall result of these improvements was an increase in classification accuracy of
approximately 4% for a zoom factor of 3, and a much lower execution time (19 seconds on a PC
with an i7 processor and Windows 10, as compared to 1 hour 45 minutes on a PC with an i5
processor and Windows 8).
Conclusion
For the course project I selected the Hopfield neural network method because, for the single-image
super-resolution case, it is one of the most accurate and popular methods. Additionally, it does not
require a training data set, and multiple constraints can be added to the energy function, which can
be modified to fit different requirements and images. One of the main disadvantages of the technique
is its time consumption, which makes it difficult to use in real-time applications, especially with
higher zoom factors.
The work of Tatem et al. (2001) has been carried forward in the present work. The technique could be
further improved through mineral mapping at a finer scale. Additional datasets can be included to add
more constraints and give an improved super-resolution mapping.
The accuracies obtained for zoom factors of 3, 5 and 10 are higher than the targeted accuracy (78%
was expected for a zoom factor of 3, whereas an accuracy higher than 82% was obtained). Among
several possible causes of classification error, one important cause may be the omission of some of
the bands from the dataset: due to a preponderance of noise, the first 20 bands were excluded from
the hyperspectral data used in the present study.
The execution time was within desirable limits. The algorithm produces very reproducible and
consistent results because of robust initialization algorithm used.
Appendix
The source code of the project implementation has been included in a separate zip file.
References
ARORA, M.K. and TIWARI, K.C., 2013, Subpixel target enhancement in hyperspectral images,
Defence Science Journal, 63:1, pp. 63-68.
LING, F., DU, Y., XIAO, F., XUE, H. and WU, S., 2010, Super-resolution land-cover mapping using
multiple sub-pixel shifted remotely sensed images, International Journal of Remote Sensing, 31: 19, pp.
5023-5040.
GAUR, S., 2014, Sub-pixel mapping with hyperspectral images using super-resolution, Master's
thesis, Indian Institute of Technology Bombay.
PARK, S.C., PARK, M.K., and KANG, M.G., 2003, Super-resolution image reconstruction: a technical
overview, Signal Processing Magazine, IEEE, 20:3, pp. 21-36.
TATEM, A.J., LEWIS, H.G., ATKINSON, P.M. and NIXON, M.S., 2001, Superresolution target
identification from remotely sensed images using a Hopfield neural network, IEEE Transactions on
Geoscience and Remote Sensing, 39, pp. 781-796.
TSO, B. and MATHER, P.M., 2009, Classification Methods for Remotely Sensed Data (CRC Press).