
Gradient-Based Learning Applied to Document Recognition

Douglas Hohensee, COS 598b

Pattern Recognition, in General

• Feature extraction
  – Represent the input with low-dimensional vectors
  – Tends to be hand-crafted
  – Success of the algorithm depends on the chosen class of features

Gradient-Based Learning

• Classifiers compute some function

  Y^p = F(Z^p, W)

  Z^p = the p-th input pattern
  W   = adjustable model parameters
  Y^p = class label

• Loss function:

  E^p = D(D^p, F(Z^p, W))

  Quantifies the discrepancy between D^p (the desired output for pattern p) and Y^p

Gradient-Based Learning•

Theoretical performance limits ([3],[4],[5])]

As # training examples increases,

P = # of training samplesh = “effective capacity”

([6],[7])0.5 <= α

<= 1.0k = constant

Ex: Structural Risk Minimization: find min of )(WkHEtrain +

Gradient-Based Learning

• Gradient-based minimization procedure: repeatedly adjust W in the direction of the negative gradient of the loss,

  W <- W - ε (∂E/∂W)

• In practice, local minima seem not to be a problem

  "somewhat of a theoretical mystery"
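A minimal sketch of this loop for a toy linear model with squared-error loss (the model, data, and learning rate here are illustrative, not from the paper):

```python
import numpy as np

# Toy setup: F(Z, W) = Z @ W, loss D = squared error (illustrative choices).
rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 5))           # 100 input patterns, 5 features each
D = rng.normal(size=(100,))             # desired outputs
W = np.zeros(5)                         # adjustable model parameters
eps = 0.01                              # learning rate

for step in range(200):
    Y = Z @ W                           # Y^p = F(Z^p, W) for all p at once
    E = np.mean((D - Y) ** 2)           # average loss over the training set
    grad = -2 * Z.T @ (D - Y) / len(Z)  # dE/dW
    W -= eps * grad                     # gradient descent update
```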

Handwriting Recognition

• Problem 1: recognize individual characters
• Problem 2: separate characters from their neighbors

• Heuristic Over-Segmentation
  – Generate a lot of potential cuts, and select the best combination of cuts based on scores for each candidate character
  – But: is half of a '4' a '1'? Is half of an '8' a '3'?

Handwriting Recognition

• Again, most systems = multiple modules
  – Field locator (finds regions of interest)
  – Field segmenter (cuts the image into images of candidate characters)
  – Recognizer (classifies & scores candidates)
  – Contextual post-processor (finds the best grammatically correct answer)

• Usually, each module is trained separately!

Globally Trainable System

• Start with a set of modules:

  X_n = F_n(W_n, X_{n-1})

  X_n = output of the n-th module
  W_n = parameters of the n-th module

Globally Trainable System

• If ∂E/∂X_n is known, then the chain rule gives

  ∂E/∂W_n     = (∂F_n/∂W)(W_n, X_{n-1}) · ∂E/∂X_n
  ∂E/∂X_{n-1} = (∂F_n/∂X)(W_n, X_{n-1}) · ∂E/∂X_n

  so gradients can be propagated backward through the whole chain of modules

Back-Propagation

• Calculate the error in each output neuron.
• For each neuron, calculate the local error.
• Adjust the weights of each neuron to lessen its local error.
• Assign "blame" for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.
• Repeat the steps above on the neurons at the previous level, using each one's "blame" as its error.

(wikipedia)
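A minimal sketch of these steps for a two-layer network with tanh hidden units and squared error (sizes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(5, 8))   # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(8, 3))   # hidden -> output weights
x = rng.normal(size=5)                    # one input pattern
d = np.array([1.0, 0.0, 0.0])             # desired output
eps = 0.1

# Forward pass
h = np.tanh(x @ W1)                       # hidden activations
y = h @ W2                                # output neurons (linear)

# 1. Error in each output neuron
delta_out = y - d
# "Blame" for the previous level: flows back through the weights,
# scaled by the derivative of the squashing function
delta_hid = (W2 @ delta_out) * (1 - h ** 2)
# Adjust weights to lessen the local errors
W2 -= eps * np.outer(h, delta_out)
W1 -= eps * np.outer(x, delta_hid)
```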

LeNet-5

1. Local receptive fields
2. Shared weights

   Squashing function: f(a) = A tanh(S·a)

3. Sub-sampling

Presentation notes: The convolution/sub-sampling combination is inspired by Hubel and Wiesel's notions of "simple" and "complex" cells, as in Fukushima's Neocognitron. OUTPUT layer: Euclidean Radial Basis Function (RBF) units. The output of each RBF unit is the squared Euclidean distance between its input vector and a prototype vector.
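A sketch of the LeNet-5 layer structure in PyTorch, as a modern approximation rather than the authors' implementation: the trainable sub-sampling layers are replaced by plain average pooling, and the RBF output layer by a linear layer.

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """LeNet-5-style convnet: local receptive fields, shared weights,
    squashing non-linearity f(a) = A*tanh(S*a), and sub-sampling."""
    def __init__(self, A=1.7159, S=2.0 / 3.0):
        super().__init__()
        self.A, self.S = A, S
        self.c1 = nn.Conv2d(1, 6, kernel_size=5)     # C1: 6 maps, 28x28
        self.s2 = nn.AvgPool2d(2)                    # S2: sub-sample to 14x14
        self.c3 = nn.Conv2d(6, 16, kernel_size=5)    # C3: 16 maps, 10x10
        self.s4 = nn.AvgPool2d(2)                    # S4: sub-sample to 5x5
        self.c5 = nn.Conv2d(16, 120, kernel_size=5)  # C5: 120 maps, 1x1
        self.f6 = nn.Linear(120, 84)                 # F6
        self.out = nn.Linear(84, 10)                 # stand-in for RBF layer

    def squash(self, a):
        return self.A * torch.tanh(self.S * a)

    def forward(self, x):                            # x: (batch, 1, 32, 32)
        x = self.s2(self.squash(self.c1(x)))
        x = self.s4(self.squash(self.c3(x)))
        x = self.squash(self.c5(x)).flatten(1)
        x = self.squash(self.f6(x))
        return self.out(x)

scores = LeNet5()(torch.zeros(1, 1, 32, 32))         # -> shape (1, 10)
```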

LeNet-5

• RBFs can adapt (their prototype vectors are trainable parameters)
• But then E(W) has a trivial solution, with all RBF prototypes identical and the F6 output constant

LeNet-5 (figure-only slides)

Limitations of convolutional networks

• State information passed between modules is fixed-size
• Can't deal with variable-sized input, e.g.
  – Continuous speech recognition
  – Handwritten word recognition

Multi-module Systems

Presentation notes: Object-oriented approach — each module implements a forward-propagation and a backward-propagation method.

Graph Transformer Networks (GTN)

• Example: acoustic vectors -> phoneme lattice -> word lattice -> single sequence of words (result)

GTN: Word Segmentation

• Candidate cuts at
  – Minima in the vertical projection profile
  – Minima of the distance between upper & lower contours

Presentation notes: Generates a DAG (directed acyclic graph) of candidate segmentations.
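A rough sketch of the first cut heuristic on a binary word image (the function name and the strict/non-strict minimum test are illustrative choices):

```python
import numpy as np

def candidate_cuts(img: np.ndarray) -> list[int]:
    """Return column indices at local minima of the vertical
    projection profile of a binary (H, W) word image."""
    profile = img.sum(axis=0)            # ink pixels per column
    cuts = []
    for x in range(1, len(profile) - 1):
        # local minimum: no more ink than the left neighbor,
        # strictly less than the right neighbor
        if profile[x] <= profile[x - 1] and profile[x] < profile[x + 1]:
            cuts.append(x)
    return cuts

# Example: two blobs separated by a thin gap -> cut at the gap
img = np.zeros((10, 12), dtype=int)
img[:, 1:5] = 1
img[:, 7:11] = 1
print(candidate_cuts(img))               # -> [6]
```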

GTN: Recognition Transformer

GTN: Viterbi algorithm

• Find the lowest-penalty path through the segmentation graph:

  v_n = min over arcs i in U_n of (c_i + v_{s_i}),   v_start = 0

  v_n = Viterbi penalty of node n
  c_i = penalty associated with arc i
  s_i = source node of arc i
  U_n = {set of arcs with destination n}
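A compact sketch of this recurrence over a DAG whose arcs are (source, destination, penalty) triples listed in topological order (the example graph is made up):

```python
import math

def viterbi_penalty(arcs, start, end):
    """arcs: list of (source, dest, penalty), with every arc appearing
    after all arcs into its source node (topological order)."""
    v = {start: 0.0}                     # v_start = 0
    for src, dst, c in arcs:
        if src in v:
            # v_n = min over incoming arcs of (c_i + v_{s_i})
            v[dst] = min(v.get(dst, math.inf), c + v[src])
    return v[end]

# Tiny segmentation graph: two competing paths from node 0 to node 3
arcs = [(0, 1, 2.0), (0, 2, 1.0), (1, 3, 1.0), (2, 3, 3.5)]
print(viterbi_penalty(arcs, start=0, end=3))   # -> 3.0 (path 0-1-3)
```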

GTN: Viterbi training

• Minimize: the penalty of the path corresponding to the correct sequence

Space Displacement Neural Network (SDNN)

• Alternative to Heuristic Over-Segmentation: sweep the recognizer across every position of the input instead of cutting first (see the sketch below)

Presentation notes: The SDNN produces a sequence of vectors which encode the likelihoods of finding characters at particular locations.
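A sketch of the idea in PyTorch: a recognizer built only from convolution and pooling layers can be applied to a whole word image, producing one score vector per horizontal position (layer sizes are illustrative, loosely following LeNet-5):

```python
import torch
import torch.nn as nn

# Fully-convolutional stand-in for a character recognizer: with no
# flatten/linear layers, a wider input simply yields more output columns.
sdnn = nn.Sequential(
    nn.Conv2d(1, 6, 5), nn.Tanh(), nn.AvgPool2d(2),
    nn.Conv2d(6, 16, 5), nn.Tanh(), nn.AvgPool2d(2),
    nn.Conv2d(16, 120, 5), nn.Tanh(),
    nn.Conv2d(120, 10, 1),                    # per-location character scores
)

single = sdnn(torch.zeros(1, 1, 32, 32))      # one character window
word   = sdnn(torch.zeros(1, 1, 32, 100))     # a whole word image
print(single.shape, word.shape)
# (1, 10, 1, 1) vs (1, 10, 1, 18): one score vector per location
```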

SDNN (figure-only slides)

AMAP

• On-line handwriting recognition: the pen trajectory is rendered into an "annotated image" (AMAP)
• Each pixel is a 5-element feature vector
  – 4 features associated with 4 orientations of the pen trajectory
  – 1 feature associated with local curvature in the area near the pixel
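A rough numpy sketch of rasterizing a pen trajectory into such a map; the 4-bin orientation coding and the curvature estimate are illustrative guesses, not the authors' exact recipe:

```python
import numpy as np

def amap(traj, H=32, W=32):
    """traj: (N, 2) array of pen positions in [0, 1).
    Returns an (H, W, 5) feature image: planes 0-3 mark the local
    stroke orientation (4 bins over 180 degrees), plane 4 curvature."""
    out = np.zeros((H, W, 5))
    for k in range(1, len(traj) - 1):
        (px, py), (x, y), (nx, ny) = traj[k - 1], traj[k], traj[k + 1]
        r, c = int(y * H), int(x * W)
        angle = np.arctan2(ny - py, nx - px) % np.pi   # undirected direction
        out[r, c, int(angle / np.pi * 4) % 4] = 1.0    # orientation plane
        # curvature: change in heading between the two segments
        turn = np.arctan2(y - py, x - px) - np.arctan2(ny - y, nx - x)
        out[r, c, 4] = abs((turn + np.pi) % (2 * np.pi) - np.pi)
    return out

stroke = np.stack([np.linspace(0.1, 0.9, 20),
                   np.linspace(0.2, 0.8, 20)], axis=1)
print(amap(stroke).shape)                              # -> (32, 32, 5)
```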

AMAP (figure-only slides)

GTN with Bank Checks

• Multi-module system
• Performance criterion: "50% correct / 49% reject / 1% error"
• Reject low-confidence outputs
• Results: 82% correct / 17% reject / 1% error

Conclusion: global training = good

• Segmentation and recognition modules shouldn't learn independently
• It's easier to make datasets for globally trained systems
• Globally trained systems perform better
