Embedded Feature Selection of Hyperspectral Bands with Boosted Decision Trees
Sildomar Monteiro and Richard Murphy, The University of Sydney


Page 1: ST.Monteiro-EmbeddedFeatureSelection.pdf

Embedded Feature Selection of Hyperspectral Bands with Boosted Decision Trees

Sildomar Monteiro and Richard Murphy

The University of Sydney

Page 2

Rio Tinto Centre for Mine Automation

• Totally autonomous mine in 10 years:
  – Brings together all elements of systems, perception, machine learning, data fusion and more
  – A grand challenge for Field Robotics

• Driven by safety, predictability and efficiency

Dr Sildomar Monteiro, IGARSS 2011

Page 3

Goal: Mine Picture Compilation

• Provide a complete and accurate model of the mine

– Mine planning and better prediction outcomes

• Maintain and update a multi-scale probabilistic representation

– Geology

– Geometry

– Equipment

– And other properties of interest for the mining process

Page 4

Today

[Diagram: floor mapping using ripped trench sections; geology feedback to batch; cone logging]

Page 5

Geology (ground-truth)

Page 6

Mine Face Scanning


[Nieto, Viejo and Monteiro, 2010]

Page 7

Hyperspectral sensing for mining

• Geology classification (material identification) still has many challenges

• Environmental conditions

– Illumination, temperature, dust

• Timely data acquisition and processing is needed

– Algorithms and calibration

• High spectral similarity between (ore-bearing) rock types

– Few, if any, distinctive spectral features

Page 8

Outline

• Hyperspectral classification using Boosting

• Embedded band selection

• Experiments using iron ore data

Page 9

Hyperspectral Sensors

• VisNIR: 400–970 nm
• SWIR: 970–2500 nm

[Diagram: multispectral (a few broad bands) vs hyperspectral (contiguous narrow bands 1 … n)]

Page 10

Example of Classification and Spectra

[Figure: four panels (a–d) of reflectance vs wavelength (500–2250 nm), showing classified spectra]

Page 11

Hyperspectral Band Selection

• Feature Selection (vs Dimensionality Reduction)

– Remove correlated inputs

– Physical interpretation (band wavelengths)

• Faster data processing

• Possible faster data acquisition

• Can be tailored to application

• Can indicate suitable multispectral bands

Page 12

Boosting

• Sound theoretical foundation

– Additive Logistic Regression [Friedman, 2000]

• Empirical studies show that boosting

– Yields small classification error rates

– Is very resilient to overfitting

• State-of-the-art results in many applications, e.g. face recognition in computer vision

• The idea of boosting is to train many “weak” learners on various distributions (or sets of weights) of the input data and then combine the resulting classifiers into a single “committee”
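The reweight-and-combine recipe can be sketched from scratch with decision stumps as the weak learners. This is an illustrative AdaBoost implementation (all function names are my own, not the exact procedure used in the slides):

```python
import numpy as np

def fit_stump(X, y, w):
    """Pick the (feature, threshold, sign) decision stump with the
    lowest weighted classification error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best_err, best = err, (j, thr, s)
    return best

def predict_stump(stump, X):
    j, thr, s = stump
    return s * np.where(X[:, j] <= thr, 1, -1)

def adaboost_fit(X, y, n_rounds=20):
    """Train stumps on successively reweighted data (AdaBoost)."""
    w = np.full(len(y), 1.0 / len(y))          # uniform initial distribution
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        pred = predict_stump(stump, X)
        err = np.clip(np.sum(w[pred != y]), 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)  # committee weight of this learner
        w *= np.exp(-alpha * y * pred)         # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    """Sign of the weighted committee vote."""
    return np.sign(sum(a * predict_stump(s, X) for a, s in ensemble))
```

Each round re-focuses the next weak learner on the examples the current committee gets wrong, which is what drives the small error rates noted above.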

Page 13

Decision Trees

• Advantages:

– Robustness and interpretability

• Disadvantages

– Low accuracy and high variance

• Binary decision trees

• Boosted trees

– Accurate, robust and interpretable

Each weak learner is a stump that thresholds a single input variable,

$f(x;\, j, \theta, a, b) = a\,\mathbf{1}[x_j \le \theta] + b$

and the committee prediction is the sign of the weighted vote,

$G(x) = \operatorname{sign}\!\Big(\sum_{m=1}^{M} \alpha_m f_m(x)\Big)$

Page 14

Embedded Feature Selection

• Relative Importance of input variables

• Approximation for decision trees (heuristic)

[Friedman, 1999]

• Least-squares improvement criterion

The relative importance of input variable $x_j$ for the fitted model $\hat F$ is

$\hat I_j^2 = E_x\!\left[\left(\frac{\partial \hat F(x)}{\partial x_j}\right)^{2}\right] \cdot \operatorname{var}_x[x_j]$

For a single decision tree $T$ with $J-1$ internal nodes, it is approximated by summing the squared improvements over the nodes that split on variable $j$:

$\hat I_j^{2}(T) = \sum_{t=1}^{J-1} \hat\imath_t^{2}\, \mathbf{1}[v(t)=j]$

where the least-squares improvement of splitting a region into left and right children $R_l, R_r$, with weights $w_l, w_r$ and response means $\bar y_l, \bar y_r$, is

$\hat\imath^{2}(R_l, R_r) = \frac{w_l w_r}{w_l + w_r}\,(\bar y_l - \bar y_r)^{2}$
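The improvement criterion is straightforward to compute directly. As an illustration (with unit observation weights, and a one-split "stump" per feature rather than the full per-tree sum; the helper names are mine), each feature can be scored by the best single split it offers:

```python
import numpy as np

def split_improvement(y_left, y_right):
    """Friedman's least-squares improvement for a binary split:
    i^2 = w_l * w_r / (w_l + w_r) * (ybar_l - ybar_r)^2,
    with unit observation weights."""
    w_l, w_r = len(y_left), len(y_right)
    return w_l * w_r / (w_l + w_r) * (y_left.mean() - y_right.mean()) ** 2

def stump_importance(X, y):
    """Score each feature by the best single split it offers
    (one stump per feature), scaled so the best feature = 100%."""
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        ys = y[np.argsort(X[:, j])]      # responses ordered by feature j
        imp[j] = max(split_improvement(ys[:t], ys[t:])
                     for t in range(1, len(ys)))
    return 100.0 * imp / imp.max()
```

In the full method the same improvement statistic is accumulated at every internal node of every tree, which is what makes the selection "embedded": it falls out of training with no extra pass over the data.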

Page 15

Embedded Feature Selection (cont.)

• Boosted Decision Trees

• The Multi-class case

For a boosted ensemble of $M$ trees, average the importance over the trees:

$\hat I_j^{2} = \frac{1}{M} \sum_{m=1}^{M} \hat I_j^{2}(T_m)$

For the multi-class case with $K$ classes, average the per-class importances:

$\hat I_j = \frac{1}{K} \sum_{k=1}^{K} \hat I_{jk}$
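In code both averages are one-liners over arrays of per-tree and per-class importances (the helper names are illustrative, not from the slides):

```python
import numpy as np

def ensemble_importance(per_tree_imp):
    """I_j^2 = (1/M) * sum_m I_j^2(T_m): average the squared
    importances over the M trees of the boosted ensemble.
    Input shape: (M, n_features)."""
    return np.asarray(per_tree_imp).mean(axis=0)

def multiclass_importance(per_class_imp):
    """I_j = (1/K) * sum_k I_jk: average the per-class importances
    over the K class problems. Input shape: (K, n_features)."""
    return np.asarray(per_class_imp).mean(axis=0)
```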

Page 16

Experiments

• Hyperspectral data acquired using a field spectrometer (ASD)

– 429 bands (same as hyperspectral camera)

– Wavelengths from 350 nm to 2500 nm

• Samples of ore-bearing rocks
  – Martite, goethite, kaolinite, etc. (total of 9 classes)
  – Different illumination and physical conditions (direct sunlight, shadow and viewing angles)

• Methodology of experiments

– Metrics: accuracy, precision, recall, F-score, Kappa, AUC

– 4-fold cross-validation
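The 4-fold protocol can be sketched as follows (the shuffling and fixed seed are assumptions on my part, not stated on the slide):

```python
import numpy as np

def kfold_indices(n_samples, k=4, seed=0):
    """Shuffle sample indices, then split them into k disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, k)

def kfold_splits(n_samples, k=4, seed=0):
    """Yield (train_idx, test_idx) pairs: each fold serves once as
    the held-out test set, the rest as training data."""
    folds = kfold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, test
```

Each metric above would then be computed per fold and averaged over the 4 held-out sets.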

Page 17


Hyperspectral data set

Page 18

Information in spectra

[Figure: reflectance (0.0–0.8) vs wavelength (500–2500 nm) for two ASD sample spectra, with the VisNIR and SWIR regions marked]

Page 19

Experimental Results: 9 rock types

• Relative importance of features

• Normalized count of features

[Figure: two panels vs wavelength (400–2400 nm): relative importance (%) and normalized feature count (%)]

Page 20

Experimental Results: 9 rock types

• Classification performance of selected features

[Bar chart: accuracy, F-score, Kappa and AUC for features selected by relative importance vs normalized count]

Page 21

Experimental Results

• All 9 classes

• Martite

Page 22

Summary

• Boosting increases the performance of decision trees while keeping model interpretability
• We presented two approaches to perform feature selection using boosted decision trees
• Calculating the relative importance of features was more efficient than counting feature occurrences
• The reduced feature set predicts the classes accurately, and more efficiently than using all features

Page 23

Conclusions

• The standard learning procedure of boosted decision trees can perform feature selection automatically
• The feature selection is embedded in the internal structure of the model; no extra parameters or separate selection algorithms are needed

• Instability of the models can be an issue

• Future work: how to determine the optimal number of features (using statistical tests)

Page 24

When Things Don’t Work...
