Recognition of Handwritten English Characters & Numerals
CS771 Machine Learning: Tools, Techniques & Application
Gaurav Krishna (Y9227224)
Harshit Maheshwari (10290)
Pulkit Jain (10543)
Sayantan Marik (13111057)
Outline
Preprocessing
Feature Extraction
Classification Techniques
Brief Discussion of Results
◦ Parameter Selection
◦ Comparative Results for Different Techniques
Final Results
Preprocessing
We have 55 examples for each of the 62 classes.
Problems handled and preprocessing steps:
◦ Varied character size: resized each character image to 32x32 pixels.
◦ Poorly centered characters: centered the images.
◦ Varied stroke thickness: applied thinning (done only for the 13 Point Feature).
These steps were done in MATLAB.
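The size and centering steps can be illustrated with a short sketch. The slides only say the work was done in MATLAB, so the Python below is not the authors' code. It assumes a binary image stored as a list of rows with at least one foreground (non-zero) pixel, and it uses one possible approach: crop to the glyph's bounding box, then stretch the crop to 32x32 with nearest-neighbour sampling, so the character fills the frame and is centered as a side effect. All function names are hypothetical.

```python
def bounding_box(img):
    """Return (top, bottom, left, right) of the non-zero pixels.

    Assumes img contains at least one foreground pixel.
    """
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_to_glyph(img):
    """Drop the empty margins around the character."""
    top, bottom, left, right = bounding_box(img)
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def resize_nearest(img, size=32):
    """Nearest-neighbour resize to size x size (aspect ratio is not preserved)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def preprocess(img, size=32):
    """Crop to the glyph, then scale it to a fixed square frame."""
    return resize_nearest(crop_to_glyph(img), size)
```

Stretching to a square distorts narrow characters such as "l" or "1"; padding the crop to a square before resizing would avoid that at the cost of a smaller glyph.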
Feature Extraction
The following features are considered:
SET 1*
◦ Haralick Texture Features
◦ Zoning Feature
◦ Eccentricity
◦ Raw Moment
◦ Covariance
SET 2
◦ Contour Feature
◦ Histogram Feature
◦ 13 Point Feature
◦ Holes Feature
We used Java (the jFeatureLib library) to extract these features.
* In the plots, "all features" means Set 1.
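To make three of the Set 1 features concrete, the sketch below computes raw moments, the covariance of the pixel distribution, and an eccentricity value from them. It follows the textbook image-moment definitions and is not the jFeatureLib implementation used in the project, whose exact formulas the slides do not state. The function names are hypothetical, and the image is assumed to be a list of rows with non-negative pixel values and a non-zero total.

```python
import math


def raw_moment(img, p, q):
    """Raw moment M_pq = sum over pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def covariance_and_eccentricity(img):
    """Return ((var_x, cov_xy, var_y), eccentricity) of the pixel mass."""
    m00 = raw_moment(img, 0, 0)
    xbar = raw_moment(img, 1, 0) / m00
    ybar = raw_moment(img, 0, 1) / m00
    # Second-order central moments, normalised by the total mass.
    var_x = raw_moment(img, 2, 0) / m00 - xbar ** 2
    var_y = raw_moment(img, 0, 2) / m00 - ybar ** 2
    cov_xy = raw_moment(img, 1, 1) / m00 - xbar * ybar
    # Eigenvalues of the 2x2 covariance matrix.
    half_sum = (var_x + var_y) / 2
    half_diff = math.sqrt(4 * cov_xy ** 2 + (var_x - var_y) ** 2) / 2
    lam1, lam2 = half_sum + half_diff, half_sum - half_diff
    if lam1 <= 0:
        return (var_x, cov_xy, var_y), 0.0
    # Clamp to [0, 1] to absorb floating-point rounding.
    ratio = min(1.0, max(0.0, lam2 / lam1))
    return (var_x, cov_xy, var_y), math.sqrt(1 - ratio)
```

Under this definition a straight stroke has eccentricity 1 and a filled square has eccentricity 0, which is why the value helps separate elongated characters from compact ones.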
Random Forest Classifier Neural Network
◦ Single Hidden Layer◦ Double Hidden Layer
SVM Classifier ( Using SMO Algorithm) K-Nearest Neighbour
Classification Techniques Used
BRIEF DISCUSSION OF RESULTS
Effect of Scaling the Images
The features used are the image pixel values.
[Plot: two series (Series1, Series2); x axis from 1 to 16, y axis from 0 to 0.7. Data values are not recoverable.]
Determining Zoning Parameter
Zoning, Haralick Features and Eccentricity without Thinning
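The zoning parameter is the number of zones the image is divided into. As a minimal sketch, assuming a square binary image stored as a list of rows and a grid that divides it evenly, the function below returns the fraction of foreground pixels in each zone. With a 32x32 image and 8 zones per side it yields 64 features, each computed from a 4x4 cell. The function name and the use of pixel density as the per-zone statistic are assumptions, not details taken from the slides.

```python
def zoning_features(img, zones_per_side=8):
    """Return the foreground-pixel density of each zone, row by row."""
    size = len(img)
    if size % zones_per_side:
        raise ValueError("image size must be divisible by zones_per_side")
    cell = size // zones_per_side
    features = []
    for zr in range(zones_per_side):
        for zc in range(zones_per_side):
            on = sum(1
                     for r in range(zr * cell, (zr + 1) * cell)
                     for c in range(zc * cell, (zc + 1) * cell)
                     if img[r][c])
            features.append(on / (cell * cell))
    return features
```

Fewer zones give a shorter, coarser feature vector; more zones keep finer stroke detail but are more sensitive to small shifts in the character.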
Deciding SVM Complexity Parameter
[Plot: Effect of SVM Complexity Parameter; x axis from 2 to 7, y axis from 77 to 79]
Data labels, in the order they appear on the plot: 78.2991, 78.8856, 78.8856, 78.2991, 77.7126, 77.7126
SVM Classification on Individual Features
[Bar chart: plotted value per feature; y axis from 0 to 12]
◦ Haralick: 9.6774
◦ Histogram: 1.7595
◦ Raw moments + Covariance: 7.0381
◦ Eccentricity: 5.8651
◦ Holes: 2.346
Using Different Feature Sets on SVM
[Bar chart: accuracy per feature set; y axis from 5 to 85]
◦ Holes + Zoning: 76.8328
◦ Holes + Zoning + Haralick: 78.0059
◦ Contour: 72.7273
◦ Contour + Haralick: 74.1935
◦ Contour + Zoning: 78.0059
◦ Histogram: 34.8974
◦ Histogram + Zoning + Thinning: 69.2082
◦ 13 Point Feature + Thinning: 60.4106
◦ Zoning: 75.5681
◦ Haralick + Zoning: 78.0059
◦ Haralick + Zoning + Raw moment + Covariance: 77.1261
Using all the extracted features with SVM
[Bar chart: accuracy per configuration; y axis from 5 to 85]
◦ Pixel values + SVM, without thinning: 74.4868
◦ Using 64 zones: 76.8328
◦ 64 zones + image in 32x32 pixels: 62.7566
◦ SVM with 64 zones: 72.7273
◦ Neural Network with 64 zones: 61.8768
Features: Haralick, Eccentricity, Zoning
Nearest Neighbors in K-NN
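The number of nearest neighbours is the main parameter of K-NN. A minimal sketch of the classifier is given below, assuming Euclidean distance and a simple majority vote; the slides do not state the distance metric or tie-breaking rule the project used, so both are assumptions, as is the function name. It requires Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label query by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```

A small k follows the training data closely and is sensitive to noisy samples, while a large k smooths the decision but can blur classes that have only 55 examples each.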
Using Neural Network
Feature Set 1: {Zoning, Haralick, Eccentricity, Raw Moments}
[Bar chart: accuracy per configuration; y axis from 5 to 75]
◦ Single Hidden Layer: 71.85
◦ Double Hidden Layer + Feature Set 1: 57.478
◦ Contour + Zoning + Single Hidden Layer: 70.3812
Using Random Forest Classifier
[Bar chart: Random Forest Classifiers; y axis from 62 to 80]
◦ Zoning: 68.0352
◦ Contour feature + Zoning: 77.7126
Final Results
We divided the data into a 90% training set and a 10% test set.
We obtained an accuracy of 78.8% using SVM on Haralick, Zoning (8x8), and Eccentricity features.
Ten-fold cross-validation gives an accuracy of 77.39%.
A training error of 5% was obtained.
◦ Training and testing on the whole dataset gave 95% accuracy.
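The two evaluation protocols can be sketched as index generators. With 62 classes and 55 examples each there are 3410 samples, so a 10% test set holds 341 of them; the 78.8856 value that appears in the SVM plots is consistent with 269 of 341 test samples classified correctly. The code below is an illustration only: the shuffling, the seed, and the function names are assumptions, and the splits are not stratified by class.

```python
import random

N_SAMPLES = 62 * 55  # 62 classes x 55 examples = 3410 samples


def train_test_split_indices(n, test_fraction=0.1, seed=0):
    """Shuffle 0..n-1 and return (train_indices, test_indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

In ten-fold cross-validation every sample is used for testing exactly once, which makes the 77.39% figure less dependent on one particular split than the single 90/10 result.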
References
THE MNIST DATABASE of handwritten digits. http://yann.lecun.com/exdb/mnist/
The Chars74K dataset: Character Recognition in Natural Images. http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
Handwritten Character Recognition using Neural Networks. http://home.iitk.ac.in/ sunithb/NN.pdf
Thanks
END