Feature Selection for Image Retrieval
By Karina Zapién Arreola
January 21st, 2005
Introduction
Variable and feature selection have become the focus of much research in application areas where datasets with many variables are available:
Text processing
Gene expression
Combinatorial chemistry
Motivation
The objective of feature selection is three-fold:
Improving the prediction performance of the predictors
Providing faster and more cost-effective predictors
Providing a better understanding of the underlying process that generated the data
Why use feature selection in CBIR?
Different users may need different features for image retrieval
For each selected sample, a specific feature set can be chosen
Boosting
A method for improving the accuracy of any learning algorithm
Uses "weak learners" that encode single rules
Weights the weak learners
Combines the weak rules into a strong learning algorithm
Adaboost Algorithm
Adaboost is an iterative boosting algorithm. Notation:
Samples (x_1, y_1), …, (x_n, y_n), where y_i ∈ {−1, 1}; among them there are m positive and l negative samples
Weak classifiers h_i
For iteration t, the weighted error is defined as:
ε_t = min_i (1/2) Σ_j ω_j |h_i(x_j) − y_j|
where ω_j is the weight of sample x_j.
Adaboost Algorithm
Given samples (x_1, y_1), …, (x_n, y_n), where y_i ∈ {−1, 1}
Initialize ω_{1,i} = 1/(2m) for positive samples and ω_{1,i} = 1/(2l) for negative samples
For t = 1, …, T:
Normalize the weights: ω_{t,i} ← ω_{t,i} / Σ_j ω_{t,j}
Train one base learner h_i per feature using the distribution ω_t
Choose the h_t that minimizes ε_t; let e_i = 0 if x_i is classified correctly and e_i = 1 otherwise
Set β_t = ε_t / (1 − ε_t) and α_t = log(1/β_t)
Update the weights: ω_{t+1,i} = ω_{t,i} · β_t^(1−e_i)
Output the final classifier H(x) = sign( Σ_t α_t h_t(x) )
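As an illustration, here is a minimal NumPy sketch of this loop, assuming one mean-based weak classifier per feature (the weak classifier described later in this talk); all names are illustrative and not from the original implementation.

```python
import numpy as np

def weak_predict(x, pos_mean, neg_mean):
    # classify +1 if the feature value is closer to the positive-class mean
    return np.where(np.abs(x - pos_mean) < np.abs(x - neg_mean), 1, -1)

def adaboost(X, y, T=30):
    """X: (n, d) feature matrix; y: labels in {-1, +1}."""
    n, d = X.shape
    m, l = np.sum(y == 1), np.sum(y == -1)
    w = np.where(y == 1, 1.0 / (2 * m), 1.0 / (2 * l))  # initial weights
    model = []                                  # (alpha_t, feature, pos_mean, neg_mean)
    for t in range(T):
        w = w / w.sum()                         # normalize the weights
        best = None
        for j in range(d):                      # one weak classifier per feature
            pm = np.average(X[y == 1, j], weights=w[y == 1])   # weighted class means
            nm = np.average(X[y == -1, j], weights=w[y == -1])
            pred = weak_predict(X[:, j], pm, nm)
            eps = 0.5 * np.sum(w * np.abs(pred - y))   # weighted error, as defined above
            if best is None or eps < best[0]:
                best = (eps, j, pm, nm, pred)
        eps, j, pm, nm, pred = best
        eps = np.clip(eps, 1e-12, 1 - 1e-12)    # guard the degenerate cases
        beta = eps / (1 - eps)
        alpha = np.log(1 / beta)
        w = w * beta ** (pred == y)             # beta^(1-e_i): e_i = 0 when correct
        model.append((alpha, j, pm, nm))
    return model

def predict(model, X):
    score = sum(a * weak_predict(X[:, j], pm, nm) for a, j, pm, nm in model)
    return np.sign(score)
```

Selecting a single feature in every round is what turns boosting into a feature selector: the features picked across the T rounds form the selected subset.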
Adaboost Application
Searching similar groups:
A particular image class is chosen
A positive sample is drawn at random from this class
A negative sample is drawn at random from the rest of the images
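A possible way to draw such a training set; a minimal sketch in which the array names, the seed, and the default sample sizes (the 15/100 chosen later in the talk) are assumptions:

```python
import numpy as np

def sample_training_set(labels, target_class, n_pos=15, n_neg=100, seed=0):
    """Draw a random positive sample from the chosen class and a random
    negative sample from all remaining images."""
    rng = np.random.default_rng(seed)
    pos = rng.choice(np.flatnonzero(labels == target_class), n_pos, replace=False)
    neg = rng.choice(np.flatnonzero(labels != target_class), n_neg, replace=False)
    idx = np.concatenate([pos, neg])
    y = np.where(labels[idx] == target_class, 1, -1)
    return idx, y
```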
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison
Stable solution
Domain knowledge
Features used:
colordb_sum
RGB_entropy_d1
col_gpd_hsv
col_gpd_lab
col_gpd_rgb
col_hu_hsv2
col_hu_lab2
col_hu_lab
col_hu_rgb2
col_hu_rgb
col_hu_seg2_hsv
col_hu_seg2_lab
col_hu_seg2_rgb
col_hu_seg_hsv
col_hu_seg_lab
col_hu_seg_rgb
col_hu_yiq
col_ngcm_rgb
col_sm_hsv
col_sm_lab
col_sm_rgb
col_sm_yiq
text_gabor
text_tamura
edgeDB
waveletDB
hist_phc_hsv
hist_phc_rgb
Hist_Grad_RGB
haar_RGB
haar_HSV
haar_rgb
haar_hmmd
Feature Selection check list
Domain knowledge
Commensurate features
Normalize features to an appropriate range
Adaboost treats each feature independently, so it is not necessary to normalize them
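For predictors that do need commensurate features, a minimal min-max rescaling sketch (the function name and the [0, 1] range are assumptions, not from the talk):

```python
import numpy as np

def minmax_normalize(X):
    """Rescale every feature column to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid dividing by zero for constant features
    return (X - lo) / span
```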
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison
Stable solution
Feature construction and space dimensionality reduction
Clustering
Correlation coefficient
Supervised feature selection
Filters
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Features with the same value for all samples (variance = 0) were eliminated
From the 4912 linear features, 3583 were selected
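That pruning step might look like the following sketch; the exact-zero variance threshold follows the slide, and the names are illustrative:

```python
import numpy as np

def prune_constant_features(X):
    """Drop features that take the same value for every sample (variance == 0)."""
    keep = X.var(axis=0) > 0
    return X[:, keep], np.flatnonzero(keep)   # pruned matrix and surviving feature indices
```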
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
When no assessment method is available, use a variable ranking method; in Adaboost this is not necessary.
Variable Ranking
A preprocessing step
Independent of the choice of the predictor
Correlation criteria can only detect linear dependencies
Single variable classifiers
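A minimal sketch of one common ranking criterion, correlation with the labels; it assumes constant features were already pruned, as in the earlier step:

```python
import numpy as np

def rank_by_correlation(X, y):
    """Rank features by |Pearson correlation| with the labels;
    this detects only linear dependencies."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.argsort(-np.abs(r))   # indices of best-ranked features first
```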
Variable Ranking
Noise reduction and better classification may be obtained by adding variables that are presumably redundant
Perfectly correlated variables are truly redundant in the sense that no additional information is gained by adding them; this does not imply an absence of variable complementarity
Two variables that are useless by themselves can be useful together
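The classic XOR case illustrates the last point; a small numeric check (the data are illustrative):

```python
import numpy as np

# two binary variables: each alone is uncorrelated with the label ...
x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
y = np.where((x1 ^ x2) == 1, 1, -1)   # ... but their XOR determines it exactly
print(np.corrcoef(x1, y)[0, 1])       # -> 0.0
print(np.corrcoef(x2, y)[0, 1])       # -> 0.0
print(np.all(np.where((x1 ^ x2) == 1, 1, -1) == y))   # -> True
```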
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison
Stable solution
Adaboost Algorithm (recap)
The boosting loop given above is applied unchanged; what remains to be specified is the weak classifier, defined next.
Weak classifier
Each weak classifier h_i is defined by:
h_i.pos_mean – the mean feature value over the positive samples
h_i.neg_mean – the mean feature value over the negative samples
A sample is classified as:
+1 if it is closer to h_i.pos_mean
−1 if it is closer to h_i.neg_mean
Weak classifier
h_i.pos_mean – the mean feature value over the positive samples
h_i.neg_mean – the mean feature value over the negative samples
A linear classifier was used
[Figure: one feature axis with the decision boundary between h_i.neg_mean and h_i.pos_mean]
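A toy version of this mean-based weak classifier on a single feature; the unweighted class means and the example data are assumptions for illustration:

```python
import numpy as np

def fit_mean_classifier(x, y):
    # pos_mean / neg_mean exactly as sketched on the slide
    return x[y == 1].mean(), x[y == -1].mean()

def classify(x, pos_mean, neg_mean):
    # +1 if the value is closer to pos_mean, -1 if closer to neg_mean
    return np.where(np.abs(x - pos_mean) < np.abs(x - neg_mean), 1, -1)

x = np.array([0.9, 1.1, 1.0, 0.1, 0.2, 0.0])
y = np.array([1, 1, 1, -1, -1, -1])
pm, nm = fit_mean_classifier(x, y)
print(classify(x, pm, nm))   # -> [ 1  1  1 -1 -1 -1]
```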
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison
Stable solution
Adaboost experiments and results
[Retrieval result images comparing 4 positive samples and 10 positive samples]
Few positive samples
[Result images: use of 4 positive samples]
More positive samples
[Result images: use of 10 positive samples; a false positive is marked]
[Result images on training data and test data, using 10 positive samples; a false negative is marked on the test data]
Changing the number of training iterations
The number of iterations used ranged from 5 to 50
Iterations = 30 was chosen
Changing sample size
[Result images for 5, 10, 15, 20, 25, 30, and 35 positive samples]
Few negative samples
[Result images: use of 15 negative samples]
More negative samples
[Result images: use of 75 negative samples]
Feature Selection check list
Domain knowledge
Commensurate features
Interdependence of features
Pruning of input variables
Assess features individually
Dirty data
Predictor – linear predictor
Comparison (ideas, time, computational resources, examples)
Stable solution
Stable solution
For Adaboost it is important to have a representative sample
Chosen parameters:
Positive samples: 15
Negative samples: 100
Iterations: 30
Stable solution with more samples and iterations
[Result images for the ten image classes: Beaches, Dinosaurs, Mountains, Elephants, Buildings, Humans, Roses, Buses, Horses, Food]
Stable solution for Dinosaurs: 15 positive samples, 100 negative samples, 30 iterations
Stable solution for Roses: 15 positive samples, 100 negative samples, 30 iterations
Stable solution for Buses: 15 positive samples, 100 negative samples, 30 iterations
Stable solution for Beaches: 15 positive samples, 100 negative samples, 30 iterations
Stable solution for Food: 15 positive samples, 100 negative samples, 30 iterations
[Retrieval result images for each class]
Unstable Solution
Unstable solution for Roses: 5 positive samples, 10 negative samples, 30 iterations
Best features for classification
[Best-feature results per class: Humans, Beaches, Buildings, Buses, Dinosaurs, Elephants, Roses, Horses, Mountains, Food]
And the winner is…
Feature frequency
[Bar chart "Feature's Frequency": y axis "Appearance times" from 0 to 0.2, x axis "Feature"; features in chart order: Haar_rgb_norm, haar_hmmd, hist_Grad_RGB, haar_RGB, haar_HSV, hist_phc_hsv, hist_phc_rgb, 03-col_gpd_hsv.mat, 03-col_sm_yiq.mat, 03-col_hu_yiq.mat, 03-col_hu_seg_hsv.mat, 03-col_sm_lab.mat, 04-text_tamura.mat, 05-edgeDB, 03-col_sm_rgb.mat, 03-col_gpd_rgb.mat, 03-col_gpd_lab.mat, 05-waveletDB, 03-col_hu_lab.mat, 03-col_hu_seg_rgb.mat, 03-col_ngcm_rgb.mat, 03-col_hu_seg_lab.mat, 04-text_gabor.mat]
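A frequency table like this can be rebuilt from the boosting runs; a minimal sketch, assuming one result per image class in the (alpha, feature, pos_mean, neg_mean) format of the adaboost() sketch above:

```python
from collections import Counter

def feature_frequency(runs):
    """runs: a list of boosting results, one per image class; returns the
    relative frequency with which every feature was selected."""
    counts = Counter(feat for run in runs for (_, feat, _, _) in run)
    total = sum(counts.values())
    return {feat: c / total for feat, c in counts.items()}
```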
Extensions
Searching similar images:
Pairs of images are built
The difference for each feature is calculated
Each difference is classified as +1 if both images belong to the same class, and −1 if they belong to different classes (see the sketch after this list)
Multiclass adaboost
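A minimal sketch of the pair construction described above; taking the absolute difference is my assumption (the slide only says "difference"), and all names are illustrative:

```python
import numpy as np

def build_pair_dataset(X, labels, n_pairs, seed=0):
    """Build training vectors for the 'similar images' extension: each sample
    is the per-feature difference of a random image pair, labelled +1 if the
    two images come from the same class and -1 otherwise."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(labels), size=(n_pairs, 2))
    X_pairs = np.abs(X[idx[:, 0]] - X[idx[:, 1]])   # difference for each feature
    y_pairs = np.where(labels[idx[:, 0]] == labels[idx[:, 1]], 1, -1)
    return X_pairs, y_pairs
```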
Extensions
Use another weak classifier:
Design weak classifiers that use multiple features → classifier fusion
Use a different weak classifier such as an SVM, NN, a threshold function, etc. (see the sketch after this list)
Use a different feature selection method, e.g. SVM-based
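As one concrete option, the threshold function mentioned above could be a decision stump over a single feature; a minimal sketch (names are illustrative):

```python
import numpy as np

def fit_threshold_stump(x, y, w):
    """Pick the (threshold, polarity) pair with the lowest weighted error
    for one feature; a drop-in alternative to the mean-based classifier."""
    best_err, best_theta, best_p = np.inf, None, None
    for theta in np.unique(x):
        for p in (1, -1):
            pred = np.where(p * x < p * theta, 1, -1)
            err = np.sum(w[pred != y])           # weighted misclassification
            if err < best_err:
                best_err, best_theta, best_p = err, theta, p
    return best_theta, best_p, best_err
```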
Discussion
It is important to add feature selection to image retrieval
A good methodology for selecting features should be used
Adaboost is a learning algorithm → it is data dependent
It is important to have representative samples
Adaboost can help to improve the classification potential of simple algorithms
Thank you!