Recognition of Cursive Roman Handwriting – Past, Present and Future
H. Bunke
Department of Computer Science, University of Bern
Neubruckstrasse 10, CH-3012 Bern, Switzerland
Acknowledgments:
- S. Gunter, T. Varga, M. Zimmermann
- Swiss National Science Foundation (20-5287.97 and IM2)
Recognition of Cursive Roman Handwriting – Past, Present and Future – p.1/61
Introduction
optical character recognition (OCR), subdivided step by step:
• Oriental script / Roman script
• machine printed text / handwritten text
• on-line / off-line
• isolated / cursive
Introduction
(why) is it difficult?
• large variation in personal handwriting style
• different writing instruments
• segmentation problem
• large vocabulary (possibly open)
Introduction
hundert
Introduction
is there any future need for automatic handwriting recognition?
• applications with commercial potential: address, form and check reading
• digital libraries, transcription of historical archives
• "non-death" of paper and new devices for handwriting acquisition
Contents
1. Introduction
2. State of the Art
3. Current Developments
4. Future Trends
5. Conclusion
Document Image Preprocessing
standard operations include
• noise filtering
• binarization
• thinning
• skew correction
• slant correction
• estimation of baseline and main writing zones
• horizontal and vertical scaling
• additional problem dependent methods to separate handwriting from background
Document Image Preprocessing
[Figure: preprocessing steps. Panels: original image, binarized image, thinned image, estimation of slant, deslanted image, estimation of writing zones, deslanted and deskewed image, final result]
Isolated Character Recognition
• usually cast as a classification problem
• consists of preprocessing, feature extraction, and classification
features for isolated character recognition:
• raw pixels
• derived from series expansion, moments, etc.
• projection based features, contour based features
• structural features: end points, forks, junctions, etc.
Isolated Character Recognition
classifiers for isolated character recognition:
• nearest-neighbor
• Bayes classifier
• neural nets
• SVM, etc.
which classifier is best?
• depends on many factors, for example, available training set, number of free parameters, time & memory constraints, etc.
Cursive Word Recognition
• major problem: segmentation
• Sayre’s paradox
• three approaches
− holistic
− segmentation-based (oversegment and merge)
− segmentation-free (Hidden Markov Models, HMM)
Hidden Markov Models (HMMs)
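[Figure: a sliding window moves over the word image from left to right; at each window position t = 0, 1, 2, ... it yields a feature vector (x_t1, ..., x_tn), and the resulting vector sequence is fed to a linear HMM with states S1, S2, S3, ..., self-transitions P11, P22, P33, forward transitions P12, P23, and an output distribution P(X) attached to each state]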
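The sliding-window step can be sketched as follows. This is a minimal illustration, not the feature set used in the talk: the window is assumed to be one pixel column wide, the image is assumed to be binarized (1 = ink), and the four features (`ink_fraction`, `center`, `top`, `bottom`) as well as the function names are hypothetical choices made for this example.

```python
from typing import List, Sequence


def column_features(column: Sequence[int]) -> List[float]:
    """Features of one window position (a single pixel column, 1 = ink)."""
    height = len(column)
    ink_rows = [r for r, value in enumerate(column) if value]
    if not ink_rows:
        # Empty column: no ink, so all features are set to zero.
        return [0.0, 0.0, 0.0, 0.0]
    ink_fraction = len(ink_rows) / height            # share of ink pixels
    center = sum(ink_rows) / len(ink_rows) / height  # normalized center of gravity
    top = ink_rows[0] / height                       # normalized upper contour
    bottom = ink_rows[-1] / height                   # normalized lower contour
    return [ink_fraction, center, top, bottom]


def sliding_window(image: Sequence[Sequence[int]]) -> List[List[float]]:
    """Move a one-column window left to right; one feature vector per position."""
    width = len(image[0])
    return [column_features([row[x] for row in image]) for x in range(width)]


# Toy 4 x 3 binary image (rows are listed top to bottom).
image = [
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]
sequence = sliding_window(image)
for t, vector in enumerate(sequence):
    print(t, vector)
```

The list `sequence` plays the role of the observation sequence that an HMM would consume; training and decoding of the HMM itself are outside the scope of this sketch.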
General Text Recognition
• segmentation-based: segment line of text into individual words, then use cursive word recognizer
• segmentation-free: segmentation and recognition are integrated
− concatenate word HMMs to word sequence (or sentence) models
− use constraints to narrow down the search-space, for example, soft-constraints derived from n-gram language models
Segmentation-free Word Sequence Recognition
• concatenation of HMMs
[Diagram: the word models w1, w2, ..., wn are repeated at each word position, and the positions are connected in sequence]
• probabilities attached to successive word positions (superscript = position, subscript = word): $p(w_i^1)$, $p(w_i^2 \mid w_j^1)$, $p(w_i^3 \mid w_j^2)$
• bi-gram language model

word    next word    probability
to      the          0.009333
to      be           0.002239
to      a            0.000138
to      have         0.000105
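As a rough illustration of how a bi-gram model can be estimated and applied, the sketch below counts word pairs in a toy corpus and scores a word sequence. Everything here is an assumption made for the example: the corpus is invented, the estimate is a plain conditional relative frequency without smoothing, and the `floor` value for unseen pairs is arbitrary. It does not reproduce the probabilities listed on the slide.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Bigrams = Dict[Tuple[str, str], float]


def bigram_probabilities(sentences: List[List[str]]) -> Bigrams:
    """Relative-frequency estimate of p(next word | word), no smoothing."""
    pair_counts: Counter = Counter()
    word_counts: Counter = Counter()
    for words in sentences:
        for word, nxt in zip(words, words[1:]):
            pair_counts[(word, nxt)] += 1
            word_counts[word] += 1  # counts 'word' only where it has a successor
    return {pair: n / word_counts[pair[0]] for pair, n in pair_counts.items()}


def sentence_log_prob(words: List[str], bigrams: Bigrams, floor: float = 1e-6) -> float:
    """Sum of log bi-gram probabilities; unseen pairs get the (assumed) floor value."""
    return sum(math.log(bigrams.get(pair, floor)) for pair in zip(words, words[1:]))


# Invented toy corpus, for illustration only.
corpus = [
    ["to", "the", "shop"],
    ["to", "be", "here"],
    ["to", "the", "end"],
]
bigrams = bigram_probabilities(corpus)
print(bigrams[("to", "the")])
print(sentence_log_prob(["to", "the", "end"], bigrams))
```

In a recognizer, a score of this kind would be combined with the HMM score of each word hypothesis to favor likely word sequences.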
Recognition Experiment
[Plot: word recognition rate [%] (axis from 40 to 80) versus vocabulary size [n] (axis from 0 to 8000) for three models: simple sentence model, unigram model, bigram model]
Some Recent Trends
• databases for development and performance evaluation
• multiple classifier systems
• synthetic training data
Databases
• isolated characters and words:
− CEDAR
− NIST
− CENPARMI
− ELT9
− IRESTE
− ...
• cursively handwritten text
− Senior/Robinson, PAMI 1998
− Elliman/Sherkat, ICDAR 2001
− IAM, collection in progress (since about 1997)
Some Details of the IAM Database
• more than 1,500 scanned pages of handwritten text
• material from over 600 individual writers
− 95,000 correctly segmented words
− over 13,000 lines of text
− over 5,000 complete sentences
• covering a vocabulary of over 12,000 words
• ground truth and lexical tags available (LOB corpus)
Multiple Classifier Systems
• motivation: use a group of experts rather than a single expert
• many approaches to handwriting recognition have been proposed using MCS’s
• often the basic classifiers are constructed ’by hand’
• recently so-called ensemble methods have been proposed:
− they require only a single classifier to be constructed by hand
− the classifier ensemble is generated automatically
Multiple Classifier Systems (2)
"classical" approach
[Diagram: input → classifiers c1 ... cn → combiner → result]
Multiple Classifier Systems (3)
ensemble method
[Diagram: the classifiers c1 ... cn are generated automatically from a single base classifier c; input → c1 ... cn → combiner → result]
Issues in MCS’s
• ensemble generation
− bagging
− feature subspace
− boosting
− others
• combination
− voting
− rank sum
− weighted voting
− trainable classifier
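Three of the combination rules listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the combiners evaluated in the talk: the function names, the tie-breaking rule (first label seen wins) and the penalty rank for labels missing from a ranking are all choices made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def weighted_vote(top_choices: Sequence[str],
                  weights: Optional[Sequence[float]] = None) -> str:
    """Plurality voting when weights is None, weighted voting otherwise."""
    if weights is None:
        weights = [1.0] * len(top_choices)
    scores: Dict[str, float] = defaultdict(float)
    for label, weight in zip(top_choices, weights):
        scores[label] += weight
    # max() returns the first maximal key, so ties go to the label seen first.
    return max(scores, key=scores.get)


def rank_sum(rankings: Sequence[Sequence[str]]) -> str:
    """Each classifier supplies a ranked list (best first); lowest rank sum wins."""
    labels: List[str] = []
    for ranking in rankings:
        for label in ranking:
            if label not in labels:
                labels.append(label)
    totals = {
        # A label missing from a ranking gets the rank just past its end (assumption).
        label: sum(r.index(label) if label in r else len(r) for r in rankings)
        for label in labels
    }
    return min(totals, key=totals.get)


voted = weighted_vote(["hundred", "hundert", "hundert"])
weighted = weighted_vote(["hundred", "hundert", "hundert"], [0.9, 0.3, 0.3])
ranked = rank_sum([
    ["hand", "band", "hard"],
    ["band", "hand", "hard"],
    ["hard", "hand", "band"],
])
print(voted, weighted, ranked)
```

The weights would typically reflect how reliable each classifier was on validation data; a trainable combiner replaces these fixed rules with a classifier learned on the outputs of the ensemble members.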
Some Results
recognition rates achieved by various ensemble generation methods
algorithm              recognition rate
Bagging                68.11%
AdaBoost               68.67%
random subspace        67.35%
feature selection      71.58%
original classifier    66.23%
Synthetic Generation of Training Data
• all recognizers need to be trained
• the larger the training set, the better the performance ("you never have enough training data")
• but collection of training data is expensive
• previous work on generation of synthetic training data:
− machine printed OCR [Baird et al.]
− Arabic and Chinese OCR
− isolated characters
− (synthetic handwriting for other purposes [Guyon, Plamondon])
Synthetic Generation of Training Data
• no work on synthetic training data generation for cursive Roman handwriting recognition
• two approaches:
− using templates
− applying geometric distortions to existing handwritten text
Synthetic Handwriting from Templates
• templates extracted from forms
• templates extracted from running text, using HMM in forced alignment mode
Synthetic Handwriting from Templates (3)
• disadvantages:
− all instances of a character are identical
− no ligatures
Synthetic Handwriting from N-Grams
• compile a list of frequent 3- and 2-tuples from an electronic corpus
• extract templates of these tuples from a handwritten text, using forced alignment
• split the given text into available tuples and generate the synthetic handwriting
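The splitting step could look like the sketch below. The slides do not say how the text is split, so the greedy longest-match strategy, the fallback to single characters, the function name `split_into_tuples`, and the template inventory are all assumptions made for this illustration.

```python
from typing import List, Set


def split_into_tuples(word: str, templates: Set[str], max_len: int = 3) -> List[str]:
    """Greedy left-to-right split: prefer 3-tuples, then 2-tuples, then single characters."""
    pieces: List[str] = []
    i = 0
    while i < len(word):
        for n in range(max_len, 1, -1):
            chunk = word[i:i + n]
            if len(chunk) == n and chunk in templates:
                pieces.append(chunk)
                i += n
                break
        else:
            # No multi-character template starts here; fall back to one character.
            pieces.append(word[i])
            i += 1
    return pieces


# Hypothetical inventory of tuples for which handwritten templates exist.
templates = {"the", "he", "er", "nd", "hun", "un"}
pieces = split_into_tuples("hundert", templates)
print(pieces)  # ['hun', 'd', 'er', 't']
```

Each piece would then be replaced by the corresponding handwritten template image, and the images placed side by side to form the synthetic word. A greedy split is not guaranteed to use the fewest pieces; a dynamic-programming split would be an alternative.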
Some Results
[Plot: recognition rate [%] (axis from 60 to 74) versus training set (axis from 0 to 5)]
• 1193 word instances; 16 writers; 357 word vocabulary
• 80% training; 20% testing; 5-fold cross validation
• training sets:
− 1 = natural training data
− 2 = synthetic training data
− 3 = synthetic training data
− 4 = synthetic training data
• test data: always natural
• except for the training data (natural/synthetic) identical conditions for all experiments (same training/test words; same size of training/test set etc.)
Future Perspectives
• some random comments:
− MCS’s
− synthetic training data
− enhanced HMMs (for example, 2D)
− enhanced language models
− etc.
Future Perspectives
• to reach a new quality of recognition we need to go from text transcription to text understanding:
− include syntactic and semantic text analysis
− include task specific knowledge (in addition to statistical parameter estimation)
Who can read this?
When I was in high school, my physics teacher - whose name was Mr. Bader - called me down one day after physics class and said, "You look bored; I want to tell you something interesting." Then he told me something which I found fascinating, and have, since then, always found fascinating.... The subject # is this - the principle of least action.
Richard P. Feynman: The Feynman Lectures, Volume II.
Who can read this?
Középiskolás koromban, egy nap a fizikatanárom - Bader úrnak hívták - magához hívott fizikaóra után és azt mondta: "Unottnak látszol; szeretnék mondani neked valami érdekeset." Majd elmondott valamit, amit elbűvölőnek találtam, és azóta is mindig elbűvölőnek találom ... A legkisebb hatás elvéről van szó.
Richard P. Feynman: The Feynman Lectures, Volume II.
Integration of Grammatical Knowledge
• prerequisites:
− a word sequence recognizer that produces an n-best list (see before)
− a stochastic context free grammar
− a parser to compute the probability of a sentence or the most probable parse tree
• procedure:
− reorder the n-best list from the recognizer taking parse probabilities into account
final score = recognition score + γ f(parse probability)
where γ is a normalization factor and f(.) is a normalization function
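The reordering step can be sketched as follows, using the five candidates from the example slide of this talk as input. The slides do not define f(.) or give a value for γ, so the natural logarithm and γ = 1.0 used here are assumptions chosen only to make the sketch concrete; higher scores are assumed to be better.

```python
import math
from typing import List, Tuple

# (candidate sentence, recognition score, parse probability)
Candidate = Tuple[str, float, float]


def final_score(recognition_score: float, parse_probability: float,
                gamma: float = 1.0) -> float:
    # Assumption: f(.) is the natural logarithm; the slides leave f unspecified.
    return recognition_score + gamma * math.log(parse_probability)


def rerank(candidates: List[Candidate], gamma: float = 1.0) -> List[Candidate]:
    """Sort the n-best list by final score, best candidate first."""
    return sorted(candidates, key=lambda c: final_score(c[1], c[2], gamma),
                  reverse=True)


# Candidates in the order of their recognition score.
n_best: List[Candidate] = [
    ("She has put up the value other money .", 23923.6, 7.69052e-23),
    ("She has put up the value of her money .", 23921.8, 4.62861e-20),
    ("She had put up the value other money .", 23890.3, 2.63105e-22),
    ("She had put up the value of her money .", 23888.4, 1.58352e-19),
    ("She has put up the value at her money .", 23854.3, 1.12458e-21),
]
reranked = rerank(n_best)
for sentence, rec, prob in reranked:
    print(f"{final_score(rec, prob):10.2f}  {sentence}")
```

With these assumed settings the candidate "She has put up the value of her money ." moves from second place to the top of the list. A different choice of γ or f(.) can produce a different order, so γ would normally be tuned on held-out data.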
Example of Grammatical Knowledge Integration
Rank    Recognition Score    Candidate Sentence
1       23923.6              She has put up the value other money .
2       23921.8              She has put up the value of her money .
3       23890.3              She had put up the value other money .
4       23888.4              She had put up the value of her money .
5       23854.3              She has put up the value at her money .

Rank    Parse Prob.          Candidate Sentence
1       1.58352e-19          She had put up the value of her money .
2       4.62861e-20          She has put up the value of her money .
3       1.12458e-21          She has put up the value at her money .
4       2.63105e-22          She had put up the value other money .
5       7.69052e-23          She has put up the value other money .
Some Experimental Results
[Plot: sentence recognition rate [%] (axis from 6 to 34) versus rank [n] (axis from 0 to 100) for two systems: reordered 100-best list and baseline system]
Future Challenge
• to deal with human factors (i.e. errors and abnormalities introduced by humans)
− statistical modeling has proven very useful
− however we also need to incorporate task specific knowledge provided by human experts
Conclusions
• the recognition of cursive Roman handwriting has been a subject of research for several decades
• for specific tasks some level of maturity has been reached and commercial systems have become available
• some other tasks, particularly the recognition of unconstrained general text, need much more research
• these tasks are interesting for practical applications
• there do exist promising directions to further develop the field