Real-time Computer Vision with Scanning N-Tuple Grids Simon Lucas Computer Science Dept

Outline

Background: N-Tuple Classifiers

The scanning n-tuple grid

Isolated Character Recognition

Isolated Face Recognition

Convolutional Mode OCR

Real-time vision demo

Conclusions

N-Tuple Classifiers

Work by randomly sampling input space

First applied to binary images

Very fast; reasonable accuracy

Scanning N-Tuple classifier (Lucas, 1995)

    Applied to sequence recognition

    Fast and accurate

Current work: SNT Grid

    Specially developed for convolutional (sliding window) applications

    Recognise patterns independent of location
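
To make the random-sampling idea concrete, here is a minimal sketch of a basic (non-scanning) n-tuple classifier for binary images. It is not the author's implementation: the class name `NTupleClassifier`, the count tables, and the `log(1 + count)` scoring are all assumptions chosen for brevity. Each tuple samples n random pixel positions, and the sampled bits form an address into a per-class table.

```java
import java.util.Random;

/** Illustrative n-tuple classifier sketch; names and scoring rule are assumptions. */
class NTupleClassifier {
    private final int[][] positions; // positions[t][i] = pixel index sampled by tuple t
    private final int[][][] counts;  // counts[label][t][address] = training hits

    public NTupleClassifier(int numPixels, int numTuples, int n, int numClasses, long seed) {
        Random rng = new Random(seed);
        positions = new int[numTuples][n];
        for (int[] tuple : positions) {
            for (int i = 0; i < n; i++) {
                tuple[i] = rng.nextInt(numPixels); // random sample of the input space
            }
        }
        counts = new int[numClasses][numTuples][1 << n];
    }

    /** Reads the n pixels sampled by tuple t as an n-bit binary number. */
    private int address(boolean[] image, int t) {
        int a = 0;
        for (int p : positions[t]) {
            a = (a << 1) | (image[p] ? 1 : 0);
        }
        return a;
    }

    /** Training is one table increment per tuple, which is why it is fast. */
    public void train(boolean[] image, int label) {
        for (int t = 0; t < positions.length; t++) {
            counts[label][t][address(image, t)]++;
        }
    }

    /** Returns the class whose tables give the highest summed score (assumed rule). */
    public int classify(boolean[] image) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < counts.length; c++) {
            double score = 0;
            for (int t = 0; t < positions.length; t++) {
                score += Math.log(1.0 + counts[c][t][address(image, t)]);
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }
}
```

Both training and classification cost one table lookup per tuple, independent of the number of training samples.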

SNT-Grid System Architecture

[Architecture diagram. Stages: Binarise (e.g. Niblack); Scanning Index (SNT-Grid); Likelihood Image; Integrated Likelihoods; Further Processing (e.g. Dictionary or Language Model)]
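
The architecture names Niblack's method as one possible binarisation step. As a rough sketch, Niblack thresholding compares each pixel with a local threshold of mean + k * standard deviation computed over a surrounding window. The class name, the brute-force window loop, and the convention that dark pixels become foreground are assumptions for illustration; the slides do not give the window size or the value of k.

```java
/** Illustrative Niblack binarisation sketch; parameters are assumptions. */
class NiblackBinariser {

    /**
     * Marks a pixel as foreground (true) when it is darker than the local
     * threshold mean + k * stddev, taken over a (2 * radius + 1) square window
     * clipped at the image border. A negative k (e.g. -0.2) is a common choice
     * for dark text on a light background.
     */
    public static boolean[][] binarise(int[][] grey, int radius, double k) {
        int h = grey.length;
        int w = grey[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double sum = 0;
                double sumSq = 0;
                int count = 0;
                for (int yy = Math.max(0, y - radius); yy <= Math.min(h - 1, y + radius); yy++) {
                    for (int xx = Math.max(0, x - radius); xx <= Math.min(w - 1, x + radius); xx++) {
                        double v = grey[yy][xx];
                        sum += v;
                        sumSq += v * v;
                        count++;
                    }
                }
                double mean = sum / count;
                double variance = Math.max(0, sumSq / count - mean * mean);
                double threshold = mean + k * Math.sqrt(variance);
                out[y][x] = grey[y][x] < threshold;
            }
        }
        return out;
    }
}
```

This version recomputes the window statistics at every pixel for clarity; a faster variant could take the sums from integral images.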

[Figure panels: Original, Binarised, SNT Indexed]

Simple Operation

Slide grid over image

Interpret each position as binary number
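
The operation above can be sketched directly: place an m x m grid at every position of the binarised image and read the bits under it, row by row, as one binary number. This is the straightforward version that visits all m x m pixels at every position. The class name, the row-major bit order, and the use of a `long` (which limits m to 7) are assumptions, not details from the slides.

```java
/** Illustrative brute-force grid indexing sketch; bit order is an assumption. */
class NaiveGridIndexer {

    /**
     * Returns index[y][x] = the m*m bits of the window whose top-left corner
     * is (x, y), read row by row with the top-left pixel as the most
     * significant bit.
     */
    public static long[][] index(boolean[][] img, int m) {
        int h = img.length;
        int w = img[0].length;
        if (m < 1 || m * m > 63 || m > h || m > w) {
            throw new IllegalArgumentException("window must fit the image and a 63-bit code");
        }
        long[][] out = new long[h - m + 1][w - m + 1];
        for (int y = 0; y + m <= h; y++) {
            for (int x = 0; x + m <= w; x++) {
                long code = 0;
                for (int dy = 0; dy < m; dy++) {
                    for (int dx = 0; dx < m; dx++) {
                        code = (code << 1) | (img[y + dy][x + dx] ? 1 : 0);
                    }
                }
                out[y][x] = code;
            }
        }
        return out;
    }
}
```

For example, a 2 x 2 grid over the rows `1 0` and `0 1` reads as binary 1001, giving index 9.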

Efficient Implementation

Very simple idea

Decompose one 2-d scan into two 1-d scans!

Reduces time complexity

    Suppose image is n x n

    Window is m x m

    Reduce from O(n^2 m^2) to O(n^2)

Well worth the effort!
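
One way the decomposition might be realised is sketched below; it is an interpretation of the slide, not the author's Java code. A horizontal pass keeps a rolling m-bit code per row (shift in one new pixel, mask off the oldest), then a vertical pass stacks m consecutive row codes by shifting m bits at a time. Each pixel is touched a constant number of times per pass, so the total work is O(n^2) rather than O(n^2 m^2). The class name, the bit order, and the `long` code (m at most 7) are assumptions.

```java
/** Illustrative two-pass grid indexing sketch; one reading of the slide's idea. */
class SeparableGridIndexer {

    /** Returns index[y][x] for the m x m window with top-left corner (x, y). */
    public static long[][] index(boolean[][] img, int m) {
        int h = img.length;
        int w = img[0].length;
        if (m < 1 || m * m > 63 || m > h || m > w) {
            throw new IllegalArgumentException("window must fit the image and a 63-bit code");
        }
        int outH = h - m + 1;
        int outW = w - m + 1;
        long rowMask = (1L << m) - 1;
        long gridMask = (1L << (m * m)) - 1;

        // Pass 1 (horizontal): rowCode[y][x] = m bits of row y starting at column x.
        long[][] rowCode = new long[h][outW];
        for (int y = 0; y < h; y++) {
            long code = 0;
            for (int x = 0; x < w; x++) {
                code = ((code << 1) | (img[y][x] ? 1 : 0)) & rowMask;
                if (x >= m - 1) {
                    rowCode[y][x - m + 1] = code;
                }
            }
        }

        // Pass 2 (vertical): stack m consecutive row codes into one m*m-bit index.
        long[][] out = new long[outH][outW];
        for (int x = 0; x < outW; x++) {
            long code = 0;
            for (int y = 0; y < h; y++) {
                code = ((code << m) | rowCode[y][x]) & gridMask;
                if (y >= m - 1) {
                    out[y - m + 1][x] = code;
                }
            }
        }
        return out;
    }
}
```

The result is the same row-major m*m-bit index that a brute-force scan over every window would produce, with the top-left pixel as the most significant bit.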

Worked Example

SNT Indexing: Java Code

OCR Results: MNIST Digits

SNTGrid Speed on MNIST

Java Implementation

Chars are 28 x 28 grey level images

Training (60,000 chars): 8s (> 7,000 cps)

Testing (10,000 chars): 3.8s (> 2,600 cps)

ORL Face Data

40 subjects

10 images from each

Using 5 for training, 5 for testing

Average around 97.5% accuracy

Competitive with other methods

Much faster!

Museum Archive Cards

Hard to read with conventional OCR

‘2’ Detector: Raw Outputs

‘2’ Detector: Integrated Output (uses integral array of Viola + Jones)
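
The integral array mentioned above lets the sum of likelihoods inside any rectangle be read off with four lookups, whatever the rectangle's size. The following is a minimal sketch of that data structure; the class name `IntegralImage`, the `double` likelihood values, and the half-open box convention are assumptions for illustration and are not taken from the slides.

```java
/** Illustrative integral image (summed-area table) sketch. */
class IntegralImage {
    // ii[y][x] = sum of all values in rows < y and columns < x
    private final double[][] ii;

    public IntegralImage(double[][] values) {
        int h = values.length;
        int w = values[0].length;
        ii = new double[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                ii[y + 1][x + 1] = values[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
            }
        }
    }

    /** Sum over columns x0..x1-1 and rows y0..y1-1 (half-open box), in O(1). */
    public double boxSum(int x0, int y0, int x1, int y1) {
        return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0];
    }
}
```

Building the table is a single pass over the likelihood image, after which each integrated output costs constant time.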

Real-time Demo

Very efficient

Can use it for real-time expression recognition

Or a ‘video’ joystick!

Bit like EyeToy, but potentially more sophisticated

Real-time Demo: Sample tests

Conclusions

Basis of simple and efficient computer vision

Trick is the scan decomposition

Also use of integral image to accumulate likelihoods

Currently being applied to reading text in natural scenes

Many other applications also

Further reading: ICDAR 2005 Paper (on my web page)