Embedded Feature Selection
of Hyperspectral Bands with
Boosted Decision Trees
Sildomar Monteiro and Richard Murphy
The University of Sydney
Rio Tinto Centre for Mine Automation
• Totally Autonomous Mine in 10 years:
– Brings together all elements of systems, perception, machine learning, data fusion and more
– A grand challenge for Field Robotics
• Driven by safety, predictability and efficiency
Dr Sildomar Monteiro, IGARSS 2011
Goal: Mine Picture Compilation
• Provide a complete and accurate model of the mine
– Mine planning and better prediction outcomes
• Maintain and update a multi-scale probabilistic
representation
– Geology
– Geometry
– Equipment
– And other properties of interest for the mining process
Today
• Floor mapping using ripped trench sections
• Geology feedback to batch
• Cone logging
Geology (ground-truth)
Mine Face Scanning
[Nieto, Viejo and Monteiro, 2010]
Hyperspectral sensing for mining
• Geology classification (material identification) still has many challenges
• Environmental conditions
– Illumination, temperature, dust
• Timely data acquisition and processing are needed
– Algorithms and calibration
• High spectral similarity between (ore-bearing) rock types
– Few, if any, distinctive spectral features
Outline
• Hyperspectral classification using Boosting
• Embedded band selection
• Experiments using iron ore data
Hyperspectral Sensors
• VisNIR: 400–970 nm
• SWIR: 970–2500 nm
[Figure: multispectral vs hyperspectral band sampling, bands 1 to n]
Example of Classification and Spectra
[Figure: classification example with four reflectance spectra panels (a–d); reflectance vs wavelength over 500–2250 nm]
Hyperspectral Band Selection
• Feature selection (vs dimensionality reduction)
– Removes correlated inputs
– Preserves physical interpretation (band wavelengths)
• Faster data processing
• Potentially faster data acquisition
• Can be tailored to the application
• Can indicate suitable multispectral bands
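As a hypothetical sketch (not code from the presentation), once per-band importance scores are available, reducing a hyperspectral data set to its k most informative bands is a few lines; the shapes below mirror the 429-band data used later, but the values are made up:

```python
import numpy as np

# Made-up data for illustration: 100 spectra with 429 bands, plus a
# hypothetical per-band importance score (any scoring method works).
rng = np.random.default_rng(0)
X = rng.random((100, 429))
importances = rng.random(429)

k = 20
top_bands = np.argsort(importances)[::-1][:k]  # indices of the k highest-scoring bands
X_reduced = X[:, np.sort(top_bands)]           # keep selected bands in wavelength order

print(X_reduced.shape)  # (100, 20)
```

Sorting the selected indices keeps the bands in wavelength order, which preserves the physical interpretation mentioned above.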
Boosting
• Sound theoretical foundation
– Additive Logistic Regression [Friedman, 2000]
• Empirical studies show that boosting
– Yields small classification error rates
– Is very resilient to overfitting
• State-of-the-art results in many applications, e.g. face
recognition in computer vision
• The idea of boosting is to train many “weak” learners on varying distributions (i.e., weightings) of the input data, and then combine the resulting classifiers into a single “committee”
Decision Trees
• Advantages:
– Robustness and interpretability
• Disadvantages:
– Low accuracy and high variance
• Binary decision trees
• Boosted trees
– Accurate, robust and interpretable
• Weak learner: a single-band decision stump,
f(x; \theta, a, b) = a \cdot 1(x > \theta) + b
• Boosted committee of M trees:
G(x) = \mathrm{sign}\left( \sum_{m=1}^{M} \alpha_m f_m(x) \right)
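A minimal NumPy sketch of this scheme, assuming binary labels in {−1, +1} and AdaBoost-style reweighting with single-band stumps (the presentation does not specify the exact boosting variant, so take this as illustrative only):

```python
import numpy as np

def fit_stump(X, y, w):
    # Weighted decision stump f(x; j, theta, a, b) = a if x[j] > theta else b,
    # found by exhaustive search over bands and thresholds.
    best = None
    n, d = X.shape
    for j in range(d):
        for theta in np.unique(X[:, j]):
            mask = X[:, j] > theta
            for a in (-1, 1):
                pred = np.where(mask, a, -a)
                err = np.sum(w * (pred != y))
                if best is None or err < best[0]:
                    best = (err, j, theta, a, -a)
    return best

def adaboost(X, y, M=10):
    # Train M stumps on successively reweighted versions of the data.
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(M):
        err, j, theta, a, b = fit_stump(X, y, w)
        err = max(err, 1e-10)                      # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] > theta, a, b)
        w *= np.exp(-alpha * y * pred)             # up-weight misclassified samples
        w /= w.sum()
        stumps.append((j, theta, a, b))
        alphas.append(alpha)
    return stumps, alphas

def predict(stumps, alphas, X):
    # Committee vote: G(x) = sign(sum_m alpha_m * f_m(x))
    F = np.zeros(X.shape[0])
    for (j, theta, a, b), alpha in zip(stumps, alphas):
        F += alpha * np.where(X[:, j] > theta, a, b)
    return np.sign(F)
```

Note that `fit_stump` scans every band and every candidate threshold, so it is quadratic-ish per round; practical implementations sort each feature once and sweep thresholds incrementally.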
Embedded Feature Selection
• Relative importance of input variables:
I_j = \left( E_x\left[ \frac{\partial \hat{F}(x)}{\partial x_j} \right]^2 \operatorname{var}_x[x_j] \right)^{1/2}
• Approximation for decision trees (heuristic) [Friedman, 1999]:
\hat{I}_j^2(T) = \sum_{t=1}^{J-1} \hat{i}_t^2 \, 1(v(t) = j)
• Least-squares improvement criterion:
i^2(R_l, R_r) = \frac{w_l w_r}{w_l + w_r} \left( \bar{y}_l - \bar{y}_r \right)^2
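The two tree-level formulas above translate directly into code; this is a sketch, and the list-of-splits format is a made-up illustration rather than the authors' data structure:

```python
import numpy as np

def split_improvement(w_l, w_r, ybar_l, ybar_r):
    # i^2(R_l, R_r) = w_l * w_r / (w_l + w_r) * (ybar_l - ybar_r)^2
    return w_l * w_r / (w_l + w_r) * (ybar_l - ybar_r) ** 2

def tree_importance(splits, n_features):
    # I_j^2(T): sum the squared improvements of the internal nodes that
    # split on variable j. `splits` is a list of
    # (feature_index, squared_improvement) pairs, one per internal node.
    imp = np.zeros(n_features)
    for j, i2 in splits:
        imp[j] += i2
    return imp

# Balanced split (weights 2 and 2) with child means 0 and 1:
print(split_improvement(2.0, 2.0, 0.0, 1.0))  # 1.0
```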
Embedded Feature Selection (cont.)
• Boosted decision trees: average over the M trees,
\hat{I}_j^2 = \frac{1}{M} \sum_{m=1}^{M} \hat{I}_j^2(T_m)
• The multi-class case: average over the K classes,
\hat{I}_j = \frac{1}{K} \sum_{k=1}^{K} \hat{I}_{jk}
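Both averages are a single NumPy call; the importance values here are made up for illustration:

```python
import numpy as np

# Squared importances from M = 2 trees over 3 features (hypothetical values).
tree_imps = np.array([[4.0, 0.0, 1.0],
                      [2.0, 2.0, 1.0]])
I2 = tree_imps.mean(axis=0)        # (1/M) * sum_m I_j^2(T_m) -> [3., 1., 1.]

# Multi-class case: per-class importances (K = 3 rows), averaged over classes.
class_imps = np.array([[3.0, 1.0, 0.0],
                       [1.0, 1.0, 1.0],
                       [2.0, 4.0, 2.0]])
I_multi = class_imps.mean(axis=0)  # (1/K) * sum_k I_jk -> [2., 2., 1.]
```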
Experiments
• Hyperspectral data acquired using a field
spectrometer (ASD)
– 429 bands (same as hyperspectral camera)
– Wavelengths from 350 nm to 2500 nm
• Samples of ore-bearing rocks
– Martite, goethite, kaolinite, etc. (9 classes in total)
– Different illumination and physical conditions (direct sunlight, shadow and viewing angles)
• Experimental methodology
– Metrics: accuracy, precision, recall, F-score, Kappa, AUC
– 4-fold cross-validation
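A minimal 4-fold cross-validation loop, sketched in plain NumPy (the presentation does not say which toolkit was used); `fit` and `predict_fn` are placeholder callables standing in for any classifier:

```python
import numpy as np

def kfold_indices(n, k=4, seed=0):
    # Shuffle the sample indices and split them into k roughly equal folds.
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def cross_validate(X, y, fit, predict_fn, k=4):
    # Each fold serves once as the held-out test set; the rest trains.
    accs = []
    for fold in kfold_indices(len(y), k):
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        model = fit(X[train], y[train])
        accs.append(np.mean(predict_fn(model, X[fold]) == y[fold]))
    return float(np.mean(accs))
```

Precision, recall, F-score, Kappa and AUC would be computed per fold in the same way, from each fold's held-out predictions.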
Hyperspectral data set
[Figure: reflectance spectra of two samples (samples_644-17_1_00000.asd.ref and samples_644-17_1_00035.asd.ref); reflectance 0.0–0.8 vs wavelength 500–2500 nm, spanning the VisNIR and SWIR regions]
Information in spectra
Experimental Results: 9 rock types
• Relative importance of features
• Normalized count of features
[Figure: two panels vs wavelength (400–2400 nm): relative importance (%) of features, and normalized feature count (%)]
Experimental Results: 9 rock types
• Classification performance of selected features
[Figure: bar chart of accuracy, F-score, Kappa and AUC (roughly 0.1–0.9) for bands selected by relative importance vs normalized count]
Experimental Results
• All 9 classes
• Martite
Summary
• Boosting increases the performance of decision trees
while keeping model interpretability
• We presented two approaches to perform feature
selection using boosted decision trees
• Calculating the relative importance of features was more efficient than counting feature occurrences
• The reduced feature set predicts the classes accurately, and more efficiently than using all features
Conclusions
• The standard learning procedure of boosted decision
trees can perform feature selection automatically
• The feature selection is embedded in the internal structure of the model, with no need for extra parameters or a separate selection algorithm
• Instability of the models can be an issue
• Future work: how to determine the optimal number of
features (using statistical tests)
When Things Don’t Work...