SUPER-RESOLUTION OF HYPERSPECTRAL
SATELLITE IMAGES WITH THE HELP OF
HOPFIELD NEURAL NETWORK
Project Report
Submitted as part of the
course project requirement of
ECE/CS/ME 539
Introduction to Artificial Neural Network and Fuzzy Systems
Submitted by:
Shikha Gaur
(Campus ID No. 9076475756)
DEPARTMENT OF COMPUTER SCIENCES
UNIVERSITY OF WISCONSIN - MADISON
December 2017
Contents
Introduction
Problem Formulation
Data preparation
Algorithm
Results
Discussion
Conclusion
Appendix
References
Introduction
Remote sensing has facilitated the monitoring of large areas and, with the help of satellites, can
provide repetitive global coverage. Hyperspectral remote sensing involves imaging a surface
through a sensor having a large number of narrow and contiguous bands. The implication of
narrow and contiguous bands is that we can obtain a continuous spectral response of the target
material being imaged. Thus, there are numerous applications of hyperspectral remote sensing
like resource management, urban monitoring, environmental monitoring, mineral exploration,
agriculture planning, military and defence. These applications include some form of matching
the spectral similarity between the observed signature and the known standard signatures of
the various targets. However, for such an analysis to be effective and reliable, it is important
that the signatures obtained through hyperspectral remote sensing are uncontaminated, which
is rarely the case. The cost of increased spectral resolution is reduced spatial resolution: for a
hyperspectral image it varies from a few metres to tens of metres. The main reason why good
spatial resolution is needed is that more than one type of material may be present in the
instantaneous field of view (IFOV) of the sensor. Thus, the spectral response that we get at
a pixel can be a mixed response of the various materials present in the sensor's IFOV. This leads
to the problem of mixed pixels, i.e. pixels occupied by two or more pure elements. In order
to benefit from the material characterization property of hyperspectral imagery, some solution
to the mixed pixel problem is needed, which has been attempted in a variety of ways.
Improving spatial resolution through improvements in sensor design is an expensive and
challenging task, so some post-processing technique needs to be applied to improve the
spatial resolution of a hyperspectral image. Super-resolution techniques are those
algorithmic techniques that produce an improved-resolution image from a low-resolution image.
Approaches to achieve super-resolution can be broadly classified into two categories: super-
resolution reconstruction and super-resolution mapping. In super-resolution reconstruction, we
use “multiple” low-resolution images that have sub-pixel shifts. These images can be used for
construction of a higher-resolution image. Super-resolution mapping refers to the generation of
land-cover maps at a resolution finer than that of the given hyperspectral image, by post-processing
of a "single" soft-classified image (Ling et al. 2010). Super-resolution not only provides a cost-
effective way of increasing the resolution of camera images, but also allows synthetic zooming
of a region of interest, with applications in surveillance, forensics, science, medicine, satellite
imaging, etc. (Park et al. 2003).
Tatem et al. (2001) proposed an approach using Hopfield neural network (HNN), in which the
output of fuzzy classifier was used to obtain prior information of pixel composition, which was
then used to constrain the HNN. An energy function was coded into HNN and the network was
used for solving the problem through minimization of this energy function. In this case the
energy minimum represented the spatial arrangement of class components within a pixel, which
gave the super-resolved class map of the input image.
Arora and Tiwari (2013) proposed an inverse Euclidean based super-resolution mapping
method. They used the concept of spatial contiguity of all earth surface features.
In my current work, I have applied the approaches of Arora and Tiwari (2013) along with
Tatem et al. (2001) to achieve super-resolution for hyperspectral images in the presence of 4
classes.
Problem Formulation
Hopfield neural networks (HNNs), proposed by John Hopfield in 1982, are fully connected
recurrent networks with feedback links. An HNN works as a content-addressable memory and is
also used in pattern recognition and for solving optimization problems.
Each neuron of the network is connected to all other neurons except itself. The conditions of
no self-connection and symmetric weights guarantee convergence to a stable state. However,
the final state of the network depends upon the input applied, i.e. the initial state of the
network, so the state finally reached is not unique. A scalar quantity called the energy of the
network is associated with each network state. When the neurons update their states and the
network finally converges, an energy minimum is obtained. Thus, the network seeks lower
energy while changing states.
Two energy functions were described by Hopfield for continuous and discrete network
dynamics. The expression of the energy E for the continuous case is given by (Classification
Methods for Remotely Sensed Data: Brandt Tso, Paul M. Mather)

E = -\frac{1}{2}\sum_i \sum_j w_{ij} v_i v_j - \sum_i I_i v_i + \sum_i \frac{1}{\lambda_i} \int_0^{v_i} f^{-1}(v)\, dv

For high values of λ, the last term in the equation becomes negligible.
Whereas the expression for the energy in the discrete case is given by

E = -\frac{1}{2}\sum_i \sum_j w_{ij} v_i v_j - \sum_i I_i v_i
Here the input to the ith neuron is denoted by u_i, the external input by I_i and the output
by v_i; τ is a time constant of the neuron. The output v_i is obtained from the input through a
nonlinear activation function, v_i = f(u_i), and λ_i is a parameter called the gain, which scales
the input to the function f(u). The network evolves according to

\frac{du_i}{dt} = -\frac{u_i}{\tau} + \sum_j w_{ij} v_j + I_i
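To make the dynamics concrete, the following is a minimal sketch of discrete Hopfield dynamics (an illustration only, not the project's MATLAB implementation): with symmetric weights, no self-connections and asynchronous threshold updates, the discrete energy never increases.

```python
import numpy as np

# Minimal sketch of discrete Hopfield dynamics (illustration, not the project code):
# symmetric weights, zero diagonal, asynchronous threshold updates.
# Under this rule the energy E = -1/2 v^T W v - I^T v never increases.

def energy(W, I, v):
    return -0.5 * v @ W @ v - I @ v

def run_hopfield(W, I, v, sweeps=10):
    v = v.copy()
    for _ in range(sweeps):
        for i in range(len(v)):          # asynchronous: one neuron at a time
            u = W[i] @ v + I[i]          # net input to neuron i
            v[i] = 1.0 if u >= 0 else 0.0
    return v

rng = np.random.default_rng(0)
n = 16
A = rng.normal(size=(n, n))
W = (A + A.T) / 2                        # symmetric weights
np.fill_diagonal(W, 0.0)                 # no self-connection
I = rng.normal(size=n)
v0 = rng.integers(0, 2, size=n).astype(float)

v = run_hopfield(W, I, v0)
print(energy(W, I, v) <= energy(W, I, v0))  # True: the final state has lower (or equal) energy
```

Each single-neuron flip changes the energy by -(net input) x (change in output), which is never positive, so convergence to an energy minimum is guaranteed.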
The energy function formulated to describe the problem is given by:

E = \sum_i \sum_j \left( k_1\, G1_{ij} + k_2\, G2_{ij} + k_3\, P_{ij} \right)

Here k_1, k_2 and k_3 are constant weights;
The network neurons are updated as per the following equations:

u_{ij}(t + dt) = u_{ij}(t) + \frac{du_{ij}}{dt}\, dt

\frac{du_{ij}}{dt} = -\frac{dE}{dv_{ij}}

\frac{dE}{dv_{ij}} = k_1 \frac{dG1_{ij}}{dv_{ij}} + k_2 \frac{dG2_{ij}}{dv_{ij}} + k_3 \frac{dP_{ij}}{dv_{ij}}
Here i and j refer to the location of the neuron. As location is important in the present case,
the neurons are considered to be arranged as a two-dimensional matrix and are indexed (i, j).
Two goal functions and one constraint function are defined to obtain the desired super-
resolution. The goal functions are given below:
\frac{dG1_{ij}}{dv_{ij}} = \frac{1}{2}\left(1 + \tanh\left(\lambda\left(\frac{1}{8}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1} v_{kl} - 0.5\right)\right)\right)\left(v_{ij} - 1\right), \quad (k, l) \neq (i, j)

\frac{dG2_{ij}}{dv_{ij}} = \frac{1}{2}\left(1 + \tanh\left(\lambda\left(0.5 - \frac{1}{8}\sum_{k=i-1}^{i+1}\sum_{l=j-1}^{j+1} v_{kl}\right)\right)\right) v_{ij}, \quad (k, l) \neq (i, j)
Both the goal functions support the underlying idea used in the method, i.e. the idea of spatial
contiguity. Thus, they tend to modify the value of a neuron depending upon the values of its
surrounding neurons. Here the neighbourhood considered is a 3 x 3 neighbourhood.
The value of the parameter λ controls the steepness of the tanh function. In the project work I
have chosen the value of λ equal to 100.
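As an illustration of how the two goal functions respond to a 3 x 3 neighbourhood, the sketch below computes both gradients for the centre of a homogeneous block (a hypothetical reconstruction of the gradient formulas, not the project's MATLAB code):

```python
import numpy as np

# Sketch of the two goal-function gradients (hypothetical reconstruction):
# each uses the mean of the 8 neighbours of sub-pixel (i, j).
LAM = 100.0  # steepness of tanh, as chosen in the project

def neighbour_mean(v, i, j):
    """Mean of the 8 neighbours of (i, j)."""
    patch = v[i-1:i+2, j-1:j+2]
    return (patch.sum() - v[i, j]) / 8.0

def dG1(v, i, j):
    # pushes v_ij towards 1 when the neighbourhood mean exceeds 0.5
    m = neighbour_mean(v, i, j)
    return 0.5 * (1 + np.tanh(LAM * (m - 0.5))) * (v[i, j] - 1)

def dG2(v, i, j):
    # pushes v_ij towards 0 when the neighbourhood mean is below 0.5
    m = neighbour_mean(v, i, j)
    return 0.5 * (1 + np.tanh(LAM * (0.5 - m))) * v[i, j]

v = np.zeros((5, 5)); v[1:4, 1:4] = 1.0   # a 3x3 block of "on" sub-pixels
print(dG1(v, 2, 2))  # 0.0: centre already equals 1, nothing to change
print(dG2(v, 2, 2))  # 0.0: neighbourhood mean is 1, no pull towards 0
```

With λ = 100 the tanh terms behave almost like hard switches, so a sub-pixel surrounded by "on" neighbours is left untouched, supporting the spatial-contiguity idea.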
The constraint used in this project is the proportion constraint obtained from soft classification
of the low-resolution image.
\frac{dP_{ij}}{dv_{ij}} = \frac{1}{z^2}\sum_{k=zx}^{zx+z-1}\sum_{l=zy}^{zy+z-1} \frac{1}{2}\left(1 + \tanh\left(\lambda\left(v_{kl} - 0.55\right)\right)\right) - a_{xy}
Here z denotes the zoom factor and a_{xy} denotes the class proportion for the low-resolution
pixel (x, y).
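The constraint gradient compares the fraction of sub-pixels currently "on" within a low-resolution pixel against its target class proportion. A sketch (again a hypothetical reconstruction, not the project code):

```python
import numpy as np

LAM = 100.0  # tanh steepness, as in the goal functions

def dP(v, x, y, z, a_xy):
    """Hypothetical reconstruction of the proportion-constraint gradient:
    fraction of the z x z sub-pixels of low-resolution pixel (x, y) that are
    "on" (soft-thresholded at 0.55), minus the target proportion a_xy."""
    block = v[z*x:z*(x+1), z*y:z*(y+1)]
    frac_on = np.mean(0.5 * (1 + np.tanh(LAM * (block - 0.55))))
    return frac_on - a_xy

v = np.full((6, 6), 0.45)              # zoom factor z = 3, everything "off"
v[0, 0] = 0.9; v[0, 1] = 0.9           # two sub-pixels of pixel (0, 0) well above threshold
print(abs(dP(v, 0, 0, 3, a_xy=2/9)) < 1e-6)  # True: current fraction matches the target
```

When the "on" fraction exceeds a_xy the gradient is positive, so the update du/dt = -dE/dv pushes sub-pixel values down, and vice versa, keeping class areas consistent with the soft classification.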
Data preparation
The study area selected is near Ghatol town in the Banswara district of Rajasthan, India. The
source of the hyperspectral image is the Hyperion sensor onboard NASA's Earth Observing-1
(EO-1) satellite. Hyperion provides 220 spectral bands covering the 0.4 to 2.5 µm wavelength
range at 30 m spatial resolution. The images are freely available in the form of various data
products. The Level 1 product, which is radiometrically corrected, was downloaded from
USGS's EarthExplorer. The Level 1 product had the dimensions 256 x 3413 x 242. Although
there is a total of 242 bands in the dataset, only 196 are calibrated unique bands. The image
was acquired on March 8, 2013.
Figure 1(a): The study area, (b): Hyperion image of 1 band of the study area (adapted from
http://earthexplorer.usgs.gov/)
Figure 2: Reference Land use/land cover image obtained by applying FCM to Hyperion image
(30 m resolution)
A single-band grayscale image of one of the Hyperion bands acquired over the study area is
shown in Figure 1(b). Some bands were found to be noisy, and a useful spectral subset
containing 120 samples, 180 lines and 159 bands was created. A spatial subset was created
from the image to reduce the computation time. Some pre-processing in the form of
atmospheric correction was also performed on the data.
Figure 2 shows a classification map obtained by applying fuzzy C-means clustering to the
image. The land-use/land-cover classes present in the scene are water, vegetation, barren soil
and rocks.
Algorithm
Figure 3: Flow chart of the proposed algorithm
Initialization of Hopfield neural network:
The final state of a Hopfield network depends on the initial condition applied. Instead of a
random initial allotment that merely preserves the fuzzy output, I have tried to take the
neighbourhood into account.
The network is initialized with the values 0.55 and 0.45: 0.55 represents a sub-pixel with value
1 and 0.45 a sub-pixel with value 0. These values make the change of neuron states faster
than when a neuron/sub-pixel is allotted a value of 1 and has to change its state to a value
of 0.
The initialization proceeds over the sub-pixels belonging to each pixel. First, the abundance
fraction/class proportion in the particular pixel, together with the zoom factor, determines
how many sub-pixels will belong to that class. For example, if the class proportion at a
particular pixel is 20% and the zoom factor is 5, then 20% of the 25 sub-pixels generated at
that pixel, i.e. 5 sub-pixels, will belong to that class. Which 5 of these 25 sub-pixels are
allotted the value of 0.55 (corresponding to our 1) is determined by the abundance
fraction/class proportion of the neighbouring pixels. The neighbour having the maximum
abundance attracts most of the sub-pixels towards it; that is, most of the sub-pixels tend to
lie close to the neighbouring pixel having the largest abundance. Thus, I initialized the 5
sub-pixels (in the present example of 20% abundance of the central pixel) closest to such a
pixel with the value of 0.55. Further refinement of this initial allotment is done by my
super-resolution algorithm.
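The allotment step described above can be sketched as follows (a simplified illustration, not the project's MATLAB code; the way the neighbour direction is encoded is an assumption, since the report does not spell out the exact distance rule):

```python
import numpy as np

def init_subpixels(proportion, z, neighbour_dir):
    """Allot round(proportion * z*z) sub-pixels the "on" value 0.55, choosing
    those closest to the neighbouring pixel with the largest abundance.
    neighbour_dir is a unit offset such as (0, 1) for the right-hand
    neighbour (a hypothetical encoding for this sketch)."""
    n_on = int(round(proportion * z * z))
    # centre of the neighbouring pixel, in sub-pixel coordinates
    target = np.array([z / 2 - 0.5, z / 2 - 0.5]) + np.array(neighbour_dir) * z
    coords = [(r, c) for r in range(z) for c in range(z)]
    coords.sort(key=lambda rc: np.hypot(rc[0] - target[0], rc[1] - target[1]))
    grid = np.full((z, z), 0.45)              # everything starts "off"
    for r, c in coords[:n_on]:
        grid[r, c] = 0.55                     # "on" sub-pixels hug the chosen neighbour
    return grid

g = init_subpixels(0.20, 5, (0, 1))  # 20% abundance, zoom 5, neighbour to the right
print((g == 0.55).sum())             # 5 of the 25 sub-pixels are "on"
```

In this example the five "on" sub-pixels all fall in the column nearest the right-hand neighbour, which is exactly the spatial-contiguity bias the initialization is designed to provide.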
Binary classification is done per class, i.e. the image is divided into multiple planes
corresponding to the number of classes. The super-resolution technique is applied separately
to these planes, giving rise to binary classified image planes. These planes are later merged
to give a single hard-classified image at the desired resolution. During the merger, if there
is a clash between multiple classes, the class having the maximum abundance in that pixel is
given priority.
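The merging rule can be sketched as follows (a hypothetical helper for illustration, assuming one binary plane and one abundance map per class on the same grid):

```python
import numpy as np

def merge_planes(binary_planes, abundances):
    """binary_planes: (C, H, W) array of 0/1 super-resolved planes, one per class.
    abundances: (C, H, W) class proportions on the same grid.
    Where exactly one plane is 1, that class wins; on a clash (or when no
    class is 1), the class with the largest abundance is given priority."""
    n_on = binary_planes.sum(axis=0)
    winner = binary_planes.argmax(axis=0)      # the single "1" when n_on == 1
    fallback = abundances.argmax(axis=0)       # abundance priority otherwise
    return np.where(n_on == 1, winner, fallback)

planes = np.zeros((2, 1, 3)); planes[0, 0, 0] = 1       # class 0 only at (0, 0)
planes[0, 0, 1] = planes[1, 0, 1] = 1                   # clash at (0, 1)
abund = np.stack([np.full((1, 3), 0.3), np.full((1, 3), 0.7)])
print(merge_planes(planes, abund))                      # [[0 1 1]]
```

The unambiguous pixel keeps its class, while both the clash and the empty pixel fall back to the more abundant class 1.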
I implemented the network on a digital computer (Intel i7 processor, 8 GB RAM). The
programming was done using MATLAB 2016b.
Results
1. Zoom factor = 3
Figure 4: (a) Low-resolution input image, (b) Super-resolved output image
Classification results and the overall execution time are summarized in Table 1.
Table 1: Confusion matrix for zoom factor = 3 (rows: actual class; columns: predicted class)

              class 1   class 2   class 3   class 4
class 1           661         4         7       142
class 2            19      4113       204       499
class 3            24       217      8202       651
class 4           187       813      1059      4798

Computation time = 19.973205 sec
Overall accuracy = 82.287%
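The overall accuracy is the trace of the confusion matrix divided by its grand total (21600 sub-pixels here), as the following check confirms:

```python
import numpy as np

# Overall accuracy = correctly classified sub-pixels / total sub-pixels,
# i.e. the trace of the confusion matrix over its grand total.
cm = np.array([[661,    4,    7,  142],
               [ 19, 4113,  204,  499],
               [ 24,  217, 8202,  651],
               [187,  813, 1059, 4798]])
overall_acc = np.trace(cm) / cm.sum()
print(f"{100 * overall_acc:.3f}%")   # 82.287%
```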
2. Zoom factor = 5
Figure 5: (a) Low-resolution input image, (b) Super-resolved output image
Classification results and the overall execution time are summarized in Table 2.
Table 2: Confusion matrix for zoom factor = 5 (rows: actual class; columns: predicted class)

              class 1   class 2   class 3   class 4
class 1          7840       890        30       334
class 2          1486      4057       209      1105
class 3            20       186       594        14
class 4           306       650        46      3833

Overall accuracy = 75.574%
Computation time = 33.044258 sec
Discussion
• A zoom factor of 10 was also tested; it resulted in 66.7% accuracy with a program execution
time of 86 seconds.
• The classification accuracy and execution time turned out to be better than expected.
• The accuracy can be improved by refining the convergence criteria.
• For the final classification results, in case of conflict between multiple classes, I chose the
class with the highest membership value. This can be improved by also taking the current
output of the HNN into account (using a variant of a softmax layer). In the current setup,
softmax led to image noise and lower classification accuracy.
I completed my Master's thesis on a similar problem. For the course project I implemented many
changes, which led to an overall improvement in system performance and accuracy. The changes
made specifically in the course project, relative to my Master's thesis work, are summarized
below:
• Synchronous weight update is done in the current implementation instead of asynchronous
update of the neural network weights.
• A convergence criterion has been added: if the overall energy of the system does not vary
beyond a certain limit, convergence is assumed to have been reached. Because of the robust
initialization incorporated, convergence to a local minimum is not a big concern in this
implementation.
• When producing the final classification result, I combined the binary classifications of the
4 classes. Previously the Hopfield network output was used directly, without any activation
layer that would allow finding the maximum. This has been improved by using a step function as
the activation layer and using the result of the Hopfield neural network only when exactly one
class has a value of 1. In case of a conflict, or if none of the classes has a value of 1, the
initial fuzzy classification output is used to select a class (the class having the largest
contribution in that pixel is selected). This led to reduced salt-and-pepper noise in the image.
The overall result of these improvements was an increase in classification accuracy of
approximately 4% for a zoom factor of 3, and a much lower execution time (19 seconds on a PC
with an i7 processor and Windows 10, as compared to 1 hour 45 minutes on a PC with an i5
processor and Windows 8).
Conclusion
For the course project I selected the Hopfield neural network method because, for the single-image
super-resolution case, it is one of the most accurate and popular methods. Additionally, it does not
require a training data set, and multiple constraints can be added to the energy function, which can
be modified to fit different requirements and images. One of the main disadvantages of the technique
is its time consumption, which makes it difficult to use in real-time applications, especially with
higher zoom factors.
The work of Tatem et al. (2001) has been carried forward in the present work. The technique could be
further improved through mineral mapping at a finer scale. Additional datasets can be included to add
more constraints and give an improved super-resolution mapping.
The accuracies obtained for zoom factors of 3, 5 and 10 are higher than the targeted accuracy (78%
was expected for a zoom factor of 3, whereas an accuracy higher than 82% was obtained). Among
several possible causes of classification error, one important cause may be the omission of some of
the bands from the dataset: due to a preponderance of noise, the first 20 bands were excluded from
the hyperspectral data used in the present study.
The execution time was within desirable limits. The algorithm produces very reproducible and
consistent results because of robust initialization algorithm used.
Appendix
The source code of the project implementation has been included in a separate zip file.
References
ARORA, M.K. and TIWARI, K.C., 2013, Subpixel target enhancement in hyperspectral images,
Defence Science Journal, 63:1, pp. 63-68.
LING, F., DU, Y., XIAO, F., XUE, H. and WU, S., 2010, Super-resolution land-cover mapping using
multiple sub-pixel shifted remotely sensed images, International Journal of Remote Sensing, 31: 19, pp.
5023-5040.
GAUR, S., 2014, Sub-pixel mapping with hyperspectral images using super-resolution, Master's
thesis, Indian Institute of Technology Bombay.
PARK, S.C., PARK, M.K., and KANG, M.G., 2003, Super-resolution image reconstruction: a technical
overview, Signal Processing Magazine, IEEE, 20:3, pp. 21-36.
TATEM, A.J., LEWIS, H.G., ATKINSON, P.M. and NIXON, M.S., 2001, Superresolution target
identification from remotely sensed images using a Hopfield neural network, IEEE Transactions on
Geoscience and Remote Sensing, 39, pp. 781-796.
TSO, B. and MATHER, P.M., 2009, Classification Methods for Remotely Sensed Data (CRC Press).