25th June 2002 IEMCT CDAC Pune 1
Non-linear Normalization to Improve Telugu OCR
Atul Negi, Chakravarthy Bhagvati, V.V. Suresh Kumar
Department of Computer and Information Sciences,
University of Hyderabad
Acknowledgements
Ministry of Information Technology, New Delhi, under the project
Resource Center for Indian Language Technology Solutions (Telugu)
Organization of Presentation
• Introduction
• Telugu Script
• Classification By Template Matching
• Complete OCR Algorithm
• Nonlinear Normalization
• Results
• Concluding Remarks
• Bibliography
• Contact Information
Introduction
• OCR Research in Indian Scripts
  – Initial era pioneers: R.M.K. Sinha, Deekshatulu, ISI Kolkata
  – Maturity: mid-nineties, complete systems
    • Bangla
    • Devanagari
• Recent Status of OCR in Indian Scripts
  – ICDAR 1999, Bangalore
  – ILOCR Workshop, 2002, UoH
  – Sadhana, Indian Acad. Sci., Feb. '02, Special Issue
Introduction: Progress of Telugu OCR
• Structural approach (ref. 4)
  – Moments and size of the character used
• Neural networks (ref. 1)
  – Connected components, training and recognition
• Template matching (ref. 5)
  – Connected components, templates and linear size normalization
• Wavelet multi-resolution analysis (ref. 6)
Telugu Script
• Features of Telugu
  – Basic vowel sounds (Acchulu): 16 symbols
  – Simple consonants (Hallulu): 36 symbols
  – Vowel sounds (Matraas): 16 symbols
  – Half consonants (Voththus): 30 symbols
• Complexity of Character Recognition
  – Composition of characters and syllables from the above symbols: 5000 or so in common use
• Reducing Complexity
  – Identification of glyphs used in composition: about 400
Classification By Template Matching
• Why Template Matching?
  – Feature extraction effectiveness
  – Dimensionality (size 32x32)
• Fringe Distances (ref. 10)
  – No need for blurring
  – Distances pre-computed and stored
  – Ease of matching
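The fringe-distance idea can be illustrated with a small sketch. It is only one plausible reading of ref. 10 and not the authors' implementation: the fringe map is assumed to hold, for every pixel, the city-block distance to the nearest black pixel of a template, and the distance of an input glyph is assumed to be the sum of the map values under the glyph's black pixels. The function names `fringe_map` and `fringe_distance` are hypothetical.

```python
import numpy as np

def fringe_map(template):
    """Distance from every pixel to the nearest black (non-zero) template pixel.

    Assumption: city-block distance, computed by brute force, which is
    affordable for small templates such as 32x32.
    """
    template = np.asarray(template)
    ys, xs = np.nonzero(template)
    if ys.size == 0:
        raise ValueError("template has no black pixels")
    rows, cols = np.indices(template.shape)
    # Shape (height, width, number of black pixels), then minimum over the last axis.
    d = np.abs(rows[..., None] - ys) + np.abs(cols[..., None] - xs)
    return d.min(axis=-1)

def fringe_distance(image, template_map):
    """Sum of the pre-computed map values under the black pixels of `image`."""
    mask = np.asarray(image, dtype=bool)
    return int(template_map[mask].sum())
```

Because each template's map can be computed once and stored, matching a glyph then costs only a masked sum per template, which is presumably what "pre-computed and stored" and "ease of matching" refer to.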
The Complete OCR Algorithm
• Read an input binary image
• Segment the image into words
• Extract the connected components from each word
• For each component
  – (a) Normalize size to match stored templates
  – (b) Compute fringe distance map
  – (c) Compute fringe distance from all templates
  – (d) Output template with smallest fringe distance
  – (e) Convert template code to ISCII
• Store ISCII output in a file
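The component extraction, linear size normalization, and nearest-template steps listed above might be sketched as follows. This is a minimal illustration under stated assumptions, not the system described in the talk: 8-connectivity, nearest-neighbour resampling, and the names `connected_components`, `normalize_linear` and `classify` are all hypothetical, and word segmentation and the ISCII conversion are left out. The `distance` argument stands in for whatever distance measure is used (the fringe distance in the talk).

```python
from collections import deque
import numpy as np

def connected_components(binary):
    """Return bounding-box crops of the 8-connected black regions, in scan order."""
    binary = np.asarray(binary, dtype=bool)
    h, w = binary.shape
    seen = np.zeros_like(binary)
    crops = []
    for y0, x0 in zip(*np.nonzero(binary)):
        if seen[y0, x0]:
            continue
        seen[y0, x0] = True
        queue, pixels = deque([(y0, x0)]), []
        while queue:  # breadth-first flood fill of one component
            y, x = queue.popleft()
            pixels.append((y, x))
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
        ys, xs = (np.array(v) for v in zip(*pixels))
        crop = np.zeros((ys.max() - ys.min() + 1, xs.max() - xs.min() + 1), np.uint8)
        crop[ys - ys.min(), xs - xs.min()] = 1
        crops.append(crop)
    return crops

def normalize_linear(glyph, size=32):
    """Step (a): linear size normalization by nearest-neighbour sampling."""
    h, w = glyph.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return glyph[np.ix_(rows, cols)]

def classify(glyph, templates, distance):
    """Steps (c) and (d): code of the template with the smallest distance."""
    return min(templates, key=lambda code: distance(glyph, templates[code]))
```

A caller would loop over `connected_components(word_image)`, pass each crop through `normalize_linear`, and hand the result to `classify` together with a dictionary of stored templates keyed by template code.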
Nonlinear Normalization
• Need for Normalization
  – Preprocessing step to equalize size, position, inclination, etc. to ease recognition
  – Necessary when recognition is by template matching
• Non-Linear Normalization
  – Not all parts of the character image are treated equally
  – Hypothesis: differences between characters will be increased, giving improved discrimination
Nonlinear Normalization Technique
• Line density equalization, analogous to histogram density equalization (ref. 13)
• Generalization: Feature Density Equalization (ref. 14)
  – Projection of feature density onto horizontal and vertical axes
  – Feature projection functions H(i) and V(j) on the input, i = 1, …, I and j = 1, …, J
  – New position (m, n) in the normalized output image of size (M, N), computed for each point (i, j) in the input image of size (I, J)
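Nonlinear Normalization Technique
• Feature Density Equalization
  – Feature projection functions H(k) and V(l) on the input, k = 1, …, I and l = 1, …, J
  – New position (m, n) in the output of size (M, N), for each point (i, j) in the input image of size (I, J):
    $m = \dfrac{\sum_{k=1}^{i} H(k)}{\sum_{k=1}^{I} H(k)} \times M$
    $n = \dfrac{\sum_{l=1}^{j} V(l)}{\sum_{l=1}^{J} V(l)} \times N$
  – Each projection sums the feature density f(i, j) along one axis and adds a constant:
    $H(i) = \sum_{j=1}^{J} f(i, j) + \alpha_H$
    $V(j) = \sum_{i=1}^{I} f(i, j) + \alpha_V$
  – Taking f(i, j) as the dot density gives NSN (nonlinear shape normalization) by dot density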
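The mapping above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code, with several assumptions: f(i, j) is the pixel value itself (dot density), the first index i runs over rows, the constant offsets added to the projections are exposed as the hypothetical parameters `alpha_h` and `alpha_v`, and fractional positions are rounded up so that m lies in 1..M and n in 1..N (the slides do not state a rounding rule).

```python
import numpy as np

def nonlinear_normalize(image, M=32, N=32, alpha_h=1.0, alpha_v=1.0):
    """Forward-map each black pixel (i, j) to (m, n) by feature density equalization."""
    f = np.asarray(image, dtype=float)
    H = f.sum(axis=1) + alpha_h          # H(i): density summed over j, plus offset
    V = f.sum(axis=0) + alpha_v          # V(j): density summed over i, plus offset
    # Cumulative projection scaled to the output size; rounding to 9 decimals
    # guards the ceiling against floating-point noise (an implementation choice).
    m = np.ceil(np.round(np.cumsum(H) * M / H.sum(), 9)).astype(int)
    n = np.ceil(np.round(np.cumsum(V) * N / V.sum(), 9)).astype(int)
    m = np.clip(m, 1, M)
    n = np.clip(n, 1, N)
    out = np.zeros((M, N), dtype=np.uint8)
    ys, xs = np.nonzero(f)
    out[m[ys] - 1, n[xs] - 1] = 1        # convert 1-based (m, n) to array indices
    return out
```

Dense rows and columns receive a larger share of the output grid than sparse ones. Because this sketch maps input pixels forward, stretched regions can be left with unfilled output pixels; a backward mapping or a fill step would be needed to avoid such holes.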
Results
[Fig. 5. Graphical representation of the comparison of Linear and Non-linear Normalization. Bar chart for Image 1 to Image 5, vertical axis 0 to 500; series: Number of Glyphs, Linear Normalization, Non-Linear Normalization]
Image 1
• Misclassifications: 1 (NSN), 7 (linear normalization)
• Total glyphs: 145 (accuracy: 99% with NSN, 95.2% with linear normalization)
Image 5
• Misclassifications: 105 (NSN), 136 (linear normalization)
• Total glyphs: 354 (accuracy: 70.3% with NSN, 61.6% with linear normalization)
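The percentages quoted for Image 1 and Image 5 follow from the glyph and misclassification counts, taking accuracy as the share of glyphs that were not misclassified. The short check below uses only the counts given on the slides; the helper name `accuracy` is hypothetical.

```python
def accuracy(total, misclassified):
    """Percentage of glyphs that were classified correctly."""
    return 100.0 * (total - misclassified) / total

# (total glyphs, misclassified with NSN, misclassified with linear normalization)
results = {"Image 1": (145, 1, 7), "Image 5": (354, 105, 136)}

for name, (total, nsn, linear) in results.items():
    print(f"{name}: NSN {accuracy(total, nsn):.1f}%, "
          f"linear {accuracy(total, linear):.1f}%")
# Image 1: NSN 99.3%, linear 95.2%
# Image 5: NSN 70.3%, linear 61.6%
```

The Image 1 figure for NSN comes out at 99.3%, which the slide reports rounded to 99%.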
Discussion
• Why should nonlinear normalization succeed despite shape distortions?
• Is this the best that we can do?
• Why not use this always?
Concluding Remarks
• Non-linear normalization appears to improve OCR accuracy (based on 1300 glyphs examined)
• More experimentation with the features is required to overcome problems like gaps
• Further testing on a variety of fonts and sizes is required before the recognition improvement can be stated with more confidence
Bibliography
1. M.B. Sukhswami, P. Seetharamulu, and Arun K. Pujari, “Recognition of Telugu characters using neural networks”, Int. J. of Neural Systems, 6(3):317, (1995).
2. R. Kasturi and S.N. Srihari (Eds.), Proc. Fifth International Conf. Document Analysis and Recognition, Bangalore, India, IEEE Computer Society Press, Los Alamitos, CA, (1999).
3. B.B. Chaudhuri, U. Garain, and M. Mitra, “On OCR of the most popular two Indian language scripts: Devanagari and Bangla”, in Visual Text Recognition and Document Processing, Ed. N. Murshed, World Scientific Press, (2000).
4. SNS Rajasekharan and B.L. Deekshatulu, “Generation and Recognition of Printed Telugu characters”, Computer Graphics and Image Processing, 6:335-360, (1977).
5. Atul Negi, Chakravarthy Bhagvati, and B. Krishna, “An OCR system for Telugu”, Proc. Sixth International Conf. Document Analysis and Recognition, Seattle, USA, IEEE Computer Society Press, Los Alamitos, CA, (2001).
6. A.K. Pujari, C.D. Naidu, and B.C. Jinaga, “An adaptive and intelligent character recognizer for Telugu scripts using multiresolution analysis and associative memory”, Proc. Canadian Conf. on AI, Calgary, Canada, May 2002, LNCS, Springer Verlag, (2002).
7. B. Krishna, “Design and implementation of a Telugu script recognition system”, Technical report, Dept. of Computer and Information Sciences, University of Hyderabad, Hyderabad, India, (2000).
8. R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison-Wesley, (1993).
9. O.D. Trier, A.K. Jain, and R. Taxt, “Feature extraction methods for character recognition-a survey”, Pattern Recognition, 29(4):641-662, (1996).
10. R.L. Brown, “The fringe distance measure: an easily calculated image distance measure with recognition results comparable to Gaussian blurring”, IEEE Trans. System Man and Cybernetics, 24(1):111-116, (1994).
11. K. Wong, R. Casey, and F. Wahl, “Document analysis system”, IBM J. Research and Development, 26(6), (1982).
12. G. Nagy, S. Seth, and M. Vishwanathan, “A prototype document image analysis system for technical journals”, Computer, 25(7), (1992).
13. H. Yamada, K. Yamamoto, and T. Saito, “A nonlinear normalization method for handprinted Kanji character recognition-line density equalization”, Pattern Recognition, 23(9):1023-1029, (1990).
14. S-W. Lee and J-S. Park, “Nonlinear shape normalization methods for the recognition of large set handwritten characters”, Pattern Recognition, 27(7):895-902, (1994).
15. V.V. Suresh Kumar, “Non-linear Normalization Techniques to Improve OCR”, Technical report, Dept. of Computer and Information Sciences, University of Hyderabad, Hyderabad, India, (2002).
Contact Information
Atul Negi, Chakravarthy Bhagvati
Department of Computer and Information Sciences,
University of Hyderabad
Hyderabad 500 046, AP INDIA
Email: [email protected]
Visit http://www.uohyd.ernet.in
and http://www.Languagetechnologies.ac.in