Presentation on OCR of noisy images using MATLAB

  • 7/28/2019 Presentation on OCR of noisy images using MATLAB

    Objective

      • Develop a prototype OCR system
      • Apply template matching for recognition

    Present scope

      • Machine reading of characters
      • Sources of characters: typewritten, handwritten, photographs
      • Efficiently reads upper- and lower-case letters, numerals 0-9, multiple lines, and noisy images

    What is OCR?

      • OCR: Optical Character Recognition
      • Digitizes printed text

    Purpose

      • Supports full editing and searching
      • Compact storage
      • E-storage saves paper and is environmentally friendly
      • Enables online display of information
      • Speeds up all related processes
      • and more

    Ways to implement OCR

      • Transformation and series expansion
      • Template matching
      • Structural analysis
      • Artificial neural networks
      • etc.

    What are its applications?

      • Mass surveillance methods
      • Face, iris, and similar recognition systems
      • Banking: processing cheques without human intervention
      • Offices: processing paperwork
      • Reading machines for the blind
      • etc.

    General Algorithm

      1. Input image from camera, scanner, or snapshot
      2. Convert image to binary
      3. Remove noise
      4. Segmentation
      5. Character identification
      6. Save to file in a text format
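The binarization and noise-removal steps of the general algorithm can be sketched as follows. This is an illustrative sketch in plain Python, not the presentation's MATLAB code: the fixed threshold of 128, the 3x3 majority (median) filter, and the function names `binarize` and `remove_noise` are all assumptions, since the slides do not say which methods were used.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map a grayscale image (0-255) to binary: 1 = ink (dark), 0 = background.

    The threshold value is an assumption chosen for illustration.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_noise(binary: Image) -> Image:
    """3x3 majority (median) filter, assumed here as the denoising step.

    A pixel becomes ink only if at least 5 of the 9 pixels in its
    neighbourhood are ink. Pixels outside the image count as background.
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            ink = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        ink += binary[rr][cc]
            out[r][c] = 1 if ink >= 5 else 0
    return out
```

With this filter an isolated dark speck disappears, while the interior of a solid stroke is preserved. Thin one-pixel strokes would also be eroded, so a real system would need to tune the filter to the scan resolution.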

    Flow of Control for Character Recognition

      1. Input image: typewritten or photograph
      2. Extract one line at a time
      3. From the detected string, select a character image
      4. Rescale the image to the size of the template
      5. Match the image against each template
      6. Store the highest match found; if there is no match, repeat the previous step
      7. The best match is stored as the recognized character
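The flow of control above can be sketched in plain Python as below. Several details are assumptions, because the slides do not specify them: lines and characters are separated by blank rows and columns, rescaling is nearest-neighbour, and the match score is the fraction of agreeing pixels (a MATLAB version might use a 2-D correlation coefficient instead). All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def ink_runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of maximal runs of True values (end exclusive)."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def extract_lines(binary: Image) -> List[Image]:
    """Split the page into text lines at rows that contain no ink."""
    return [binary[a:b] for a, b in ink_runs([any(row) for row in binary])]


def extract_chars(line: Image) -> List[Image]:
    """Split one line into character images at columns that contain no ink."""
    cols = [any(row[c] for row in line) for c in range(len(line[0]))]
    return [[row[a:b] for row in line] for a, b in ink_runs(cols)]


def rescale(img: Image, height: int, width: int) -> Image:
    """Nearest-neighbour resize to the template size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // height][c * w // width] for c in range(width)]
            for r in range(height)]


def match_score(a: Image, b: Image) -> float:
    """Fraction of pixels on which two equally sized binary images agree."""
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / (len(a) * len(a[0]))


def recognize(char_img: Image, templates: Dict[str, Image]) -> str:
    """Rescale the character to the template size and keep the best match."""
    first = next(iter(templates.values()))
    scaled = rescale(char_img, len(first), len(first[0]))
    return max(templates, key=lambda label: match_score(scaled, templates[label]))


def read_text(binary: Image, templates: Dict[str, Image]) -> str:
    """Recognize every character, line by line, and join the result as text."""
    return "\n".join(
        "".join(recognize(ch, templates) for ch in extract_chars(line))
        for line in extract_lines(binary)
    )
```

Calling `read_text(page, templates)` with a cleaned binary page and a dictionary mapping each label to its template image returns the recognized text, one output line per detected text line. The sketch always returns the best-scoring label; it does not implement the "no match" retry mentioned in the steps, which would need a minimum-score threshold.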

    Database

      • Stores, for each character, a reference image that is considered ideal
      • Fonts: Times New Roman / Calibri
      • Template size: 42 × 24
      • 62 elements in total
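The count of 62 elements is consistent with the stated scope: 26 upper-case letters, 26 lower-case letters, and 10 digits. A minimal Python sketch of such a database layout follows. It assumes 42 × 24 means 42 rows by 24 columns of pixels, and the names `LABELS` and `empty_database` are hypothetical; real templates would be filled in from the reference font images.

```python
import string

# Assumed orientation: 42 rows by 24 columns of pixels.
TEMPLATE_ROWS, TEMPLATE_COLS = 42, 24

# 26 upper-case + 26 lower-case + 10 digits = 62 labels.
LABELS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def empty_database():
    """Return one blank 42x24 binary template per label (placeholder data)."""
    return {
        label: [[0] * TEMPLATE_COLS for _ in range(TEMPLATE_ROWS)]
        for label in LABELS
    }
```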

    Testing the system

    Real-world images were given as input to the system, and satisfactory results were obtained.

    Conclusions drawn from results

    Ambiguous characters (e.g. B and 8) have a lower recognition rate.

    Possible reasons:

      a) Similarity between characters (e.g. B and 8, S and 5, 1 and l)
      b) Image quality and the font of the characters
      c) Limitations of the technique

    Accuracy rate for noisy images, calculated from the test results: (18 - 4)/18 × 100 = 77.78%
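The accuracy figure can be reproduced with a short calculation. The sketch below is in Python and assumes that 18 is the number of test items and 4 the number recognized incorrectly; the slides give only the numbers, not what was counted. The function name is hypothetical.

```python
def accuracy_percent(total: int, errors: int) -> float:
    """Percentage of test items recognized correctly."""
    return (total - errors) / total * 100


print(f"{accuracy_percent(18, 4):.2f} %")  # prints "77.78 %"
```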

    Text with a font size below 14 results in more errors.

    Most document formatting is lost during scanning, so recognition depends on how well the document is scanned.

    The output will always require spellchecking and proofreading, as well as reformatting, to get the desired final layout.

    Although these models perform well and are widely applied, they are nowhere near the performance of the human brain!

    Rather, they may never get there!

    Future Scope

      • Use a more adaptive technique, such as an artificial neural network (ANN), to implement OCR.
      • Improve on the results we've obtained for noisy images.
      • Our ultimate goal is to reach 99% accuracy!

    Thank You
