Hand Written Digit Recognition using Convolutional …academicscience.co.in/admin/resources/project/paper/f...Signature Verification, Postal-Address Interpretation, Bank-Check Processing,

260 Nimisha Jain, Kumar Rahul, Ipshita Khamaru. Anish Kumar Jha, Anupam Ghosh

International Journal of Innovations & Advancement in Computer Science

IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

Hand Written Digit Recognition using Convolutional Neural

Network (CNN)

Nimisha Jain1 , Kumar Rahul

1 , Ipshita Khamaru

1 . Anish Kumar Jha

1 , Anupam Ghosh

1*

1 Department of Computer Sc. & Engineering, Netaji Subhash Engineering College, Kolkata

A B S T R A C T

The field of machine learning is a rapidly developing

one. Recent developments in the field of image

processing and machine learning has led to efficient

extraction of features from images of peoples’ faces.

Recognizing handwritten digits from images isn’t easy. It

involves the difficulty of visual pattern recognition which

becomes very apparent when an attempt is made to write

a computer program to recognize digits. The goal of our

work will be to create a model that will be able to

recognize and determine the handwritten digits from its

image. We aim to complete this by using the concepts of

Convolution Neural Network. The aim of our study is to

open the way to digitalization. Though the goal is to just

to create a model which can recognize the digits but it

can be extended to letters and then a person’s

handwriting. Through this work, we aim to learn and

practically apply the concepts of Machine Learning and

Neural Networks. Moreover, digit recognition is an

excellent prototype problem for learning about neural

networks and it gives a great way to develop more

advanced techniques like deep learning.

1. Introduction

Feature Recognition extracts features and their

parameters from solid models.This is a pioneering,

state-of-the-art technology from Geometric with

more than 50 man-years of research and

development. Feature Recognition extracts features

and their parameters from solid models. It finds

applications in modeling, design, finite element

analysis, machining, process planning and cost

estimation.Feature recognition is the ability to

automatically or interactively identify and group

topological entities, such as faces in a boundary

representation (B-rep) solid model, into functionally

significant features such as holes, slots, pockets,

fillets, ribs, etc. The features that are not recognized

automatically can be recognized using the

interactive feature recognition facilities [1]. In short,

Feature Recognition makes smart solids out of

dumb solids! This information is then used in

downstream applications.

Handwriting Recognition has an active community

of academics studying it. The biggest conferences

for handwriting recognition are the International

Conference on Frontiers in Handwriting

Recognition (ICFHR), held in even-numbered

years, and the International Conference on

Document Analysis and Recognition (ICDAR), held

in odd-numbered years. Both of these conferences

are endorsed by the IEEE. Active areas of research

include: Online Recognition, Offline Recognition,

Signature Verification, Postal-Address

Interpretation, Bank-Check Processing, Writer

Recognition [2,3].

A lot of the important work on neural networks

happened in the 80's and in the 90's but back then

computers were slow and datasets very tiny. The

research didn't really find many applications in the

real world. As a result, in the first decade of the 21st

century neural networks have completely

disappeared from the world of machine learning [4].

Working on neural networks was definitely fringe.

It's only in the last few years, first seeing speech

recognition around 2009, and then in computer

vision around 2012, that neural networks made a

big comeback. What changed? Lots of data and

cheap and fast DBU's [5,6]. Today, neural networks

are everywhere. So if anyone is doing anything with

data analytics or prediction, they're definitely

something that we want to get familiar with.

The objective of the work is to develop a model that

will classify a picture of a handwritten digit

according to pre-trained labels, for this we found

relevant image database for training the given

model Engineer features that will be used for

learning lay down the feature extraction

mechanism Choose several supervised classification



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

algorithm such as KNN classifier, MLP classifier,

CNN classifier. This also involves fine tuning hyper

parameters.

There is a lot happening at the moment in terms of

A.I. development and machine learning innovation.

For instance major auto mobile manufacturing

companies are investing in technology such as self-

driving cars and intelligent parking systems. Tesla,

Google, Ford to name a few. At the same time

Google deep mind ended up developing a general

purpose A.I [7,8]. that learned by itself to play one

of the hardest playing board games Go. Go in

particular is more challenging for a computer to

learn as it requires extreme level of pattern

recognition then logical decision making as the

game can assume more states then there are in the

perceivable universe [9] The AlphaGo project not

only mastered the game but also beat the world

champion 4-1 in a tournament. This feat piqued our

interest in the field of artificial intelligence in

general. With availability of learning resources such

as online learning websites like edx, coursera and

abundance of books written in this domain gave us

a better insight to understanding this field of science

[10]. Also the application of it in the field of image

recognition helped us narrow our topic to something

related to computer vision.

Classification of images and patterns has been one

of the major implementation of Machine Learning

and Artificial Intelligence. People are continuously

trying to make computers intelligent so that they

can do almost all the work done by humans [11].

This continuous effort of making computer

intelligent is to reduce the workload on humans.

Nowadays, machines are able to understand the

conversations, handwriting or any sort of actions by

humans.

The motivation behind this study is to take a step

towards the technology. Due to the increasing

demands of intelligent agents acting like humans,

the understanding of Machine Learning has become

very important in the field of Computer Science.

The work has been taken by keeping this in mind.

Handwriting recognition system is the most basic

and an important step towards this huge and

interesting area of Computer Vision.

2. Methodology

Deep Learning has emerged as a central tool for

self-perception problems like understanding images,

voice from humans, robots exploring the world. The

project aims to implement the concept of

Convolution Neural Network which is one of the

important architecture of deep learning.

Understanding CNN and applying it to the

handwritten recognition system, is the major target

of the proposed system [12-15].

There is a reason behind using CNN for handwritten

digit recognition. Let us consider a multi-layer

feedforward neural network to be applied on

MNIST dataset which contains images of size

28×28 pixels (roughly 784 pixels). So if a hidden

layer has about 100 units, then the first layer

weights comes up to about 78k parameters, which is

large but manageable. However, in the natural

world the size of the image is much larger [16-18].

If we consider the size of the typical image which is

around 256×256 pixels (roughly about 56,000

pixels), then the first layer weights will have about

560k parameters! So that becomes too many

parameters and hence make it unscalable for real

images. Hence, it will be so large that it will

become very difficult to generalize the new data fed

into the network.

Convolution Neural Network extracts the feature

maps from the 2D images by applying filters and

hence making the task of feature extraction from the

images easier [19-22]. Basically, convolution neural

network considers the mapping of image pixels with

the neighborhood space rather than having a fully

connected layer of neurons [23-26].

Convolution Neural Networks has been proved to

be a very important and powerful tool in signal and

image processing. Even in the fields of computer

vision such has handwriting recognition, natural

object classification and segmentation, CNN has

been a much better tool compared to all other

previously implemented tools [27-35].

The broader aim in mind was to develop a M.L.

model that could recognize people's handwriting.

However, as we began developing the model we

realized that the topic in hand was too tough and

would require tremendous data to learn. Example to

accurately classify a cursive handwriting will be



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

very tough. Thus we settled on classifying a given

handwritten digit image as the required digit using

three different algorithms and consequently testing

its accuracy.

Architecture of the Proposed System

Figure. 1–Handwritten Recognition system.

3. Explanation of the architecture:

• The first layer of the architecture is the User layer.

User layer will comprise of the people who interacts

with the app and for the required results.

• The next three layers is the frontend architecture

of the application. The application will be

developed using Bootstrap which is the open source

platform for HTML, CSS and JavaScript. The

application is deployed in the localhost which is

shown on the browser. Through the app, the user

will be able to upload pictures of the handwritten

digits and convert it into the digitalized form.

• The one in between the database and view layer is

the business layer which is the logical calculations

on the basis of the request from the client side. It

also has the service interface.

• The backend layer consists of two datasets:

Training Data and Test Data. The MNIST database

has been used for that which is already divided into

training set of 60,000 examples and test of 10,000

examples.

• The training algorithm used is Convolution Neural

Network. This will prepare the trained model which

will be used to classify the digits present in the test

data. Thus, we can classify the digits present in the

images as: Class 0,1,2,3,4,5,6,7,8,9.

Algorithm:

Forward Propagation:

Convolutional layer:

Suppose that we have some N×NN×N square

neuron layer which is followed by our

convolutional layer. If we use an m×m filter ω, our

convolutional layer output will be of

size (N−m+1)×(N−m+1)(N−m+1)×(N−m+1). In

order to compute the pre-nonlinearity input to some

unit xij𝑙 in our layer, we need to sum up the

contributions (weighted by the filter components)

from the previous layer cells:

xij𝑙= 𝛚𝐚𝐛y(i+a)(j+b)𝑙−1𝑚−1

𝑏=0𝑚−1𝑎=0

Then, the convolutional layer applies its

nonlinearity:

yij𝑙=σ (xij𝑙).

Architecture:

Below shown is a small workflow of how CNN

module will extract the features and classify the

image based on it. The architecture shows the input

layer, hidden layers and output layer of the network.

There are many layers involved in the feature

extraction phase of the network which involves

convolution and subsampling .

Figure 2: Architecture of CNN.



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

How it works:

Neural Networks receive an input, and

transform it through a series of hidden layers.

Each hidden layer is made up of a set of

neurons, where each neuron is fully connected

to all neurons in the previous layer.

Neurons in a single layer function completely

independently.

The last fully-connected layer is called the

"output layer“.

Convolution Layer:

The Convolutional layer is the core building block

of a CNN. The layer's parameters consist of a set of

learnable filters (or kernels), which have a small

receptive field, but extend through the full depth of

the input volume. During the forward pass, each

filter is convolved across the width and height of

the input volume, computing the dot product

between the entries of the filter and the input and

producing a 2-dimensional activation map of that

filter. As a result, the network learns filters that

activate when they see some specific type of feature

at some spatial position in the input..

Feature Extraction:All neurons in a feature share the

same weights .In this way all neurons detect the

same feature at different positions in the input

image. Reduce the number of free parameters.

Subsampling Layer: Subsampling, or down-

sampling, refers to reducing the overall size of a

signal .The subsampling layers reduce the spatial

resolution of each feature map. Reduce the effect of

noises and shift or distortion invariance is achieved.

Max Pooling: Only the locations on the image that

showed the strongest correlation to each feature (the

maximum value) are preserved, and those

maximum values combine to form a lower-

dimensional output. It provides a form Of

translation invariance. Imagine cascading a max-

pooling layer with a convolutional layer.

Figure 3: Max Pooling

4. Results:

As with any work or project taken up in the field of

machine learning and image processing we are not

considering our results to be perfect. Machine

learning is a constantly evolving field and there is

always room for improvement in your

methodology; there is always going to be another

new approach that gives better results for the same

problem.

The application has been tested using three models:

Multi-Layer Perceptron (MLP), Convolution Neural

Network (CNN). With each model we get a

different accuracy of the classifier which shows

which one is better. The results of training the

network is stored in .npz format so that whenever a

user tries to recognize the digit, the application does

not go into the training loop again. For

classification, we have used logistic classifier,

softmax function, one hot encoding, cross entropy

and loss minimization using mini batch gradient

descent. These are some of the basics of Neural

Network which are required to process the output

from the network and display in the form the user

can understand.

4.1.Dataset used:

The dataset used is the MNIST database of

handwritten digits.It consists of a training set of

60,000 examples, and a test set of 10,000

examples.The digits have been size-normalized and

centered in a fixed-size image.The images are of

size 28*28 pixels.It is a good database for people

who want to try learning techniques and pattern

recognition methods on real-world data while

spending minimal efforts on preprocessing and

formatting.



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

The dataset is taken from-

http://yann.lecun.com/exdb/mnist/

4.2 Analysis of the results:

In business, System Analysis and Design refers to

the process of examining a business situation with

the intent of improving it through better procedures

and methods. System analysis and design relates to

shaping organizations, improving performance and

achieving objectives for profitability and growth.

The emphasis is on systems in action, the

relationships among subsystems and their

contribution to meeting a common goal.

Looking at a system and determining how

adequately it functions, the changes to be made and

the quality of the output are parts of system

analysis. Organizations are complex systems that

consist of interrelated and interlocking subsystems.

Changes in one part of the system have both

anticipated and unanticipated consequences in other

parts of the system. The systems approval is a way

of thinking about the analysis and design of

computer based applications. It provides a

framework for visualizing the organizational and

environmental factors that operate on a system.

Proposed Application Module:

The proposed application has been implemented

using Python on Django Framework. The user is

given two options in the home image: Simple

Upload, Model Form Upload. Simple Upload will

allow the user to upload the image and predict it

then and there. After navigating away from that

page, the link to the uploaded image is lost. The

Model Form Upload will allow the user to upload

the image with description. With this link, the user

will be able to store the image and see its link on the

home page itself. By clicking on the link, the user

will be able to get the result from the CNN

classifier.

4.3 Validation of the Results

The application shows the results of the processing

of an image, and feeding it to the network. The

screenshots for the application showing results are

given in the next section. There has been hardly any

cases where the CNN model gives wrong prediction

for a particular image.

5. Conclusion

Handwritten digit recognition is the first step to the

vast field of Artificial Intelligence and Computer

Vision. With the advance of technology, every day

there are new algorithms coming up which can

make computers do wonders. Machines can

recognize images and understand handwriting of

different people which humans themselves have

failed to understand or comprehend from the image.

As seen from the results of the experiment, CNN

proves to be far better than other classifiers. The

results can be made more accurate with more

convolution layers and more number of hidden

neurons.

The target of the proposed work was to open the

way to digitalization. Though the goal is to just

95.50%

96.00%

96.50%

97.00%

97.50%

98.00%

98.50%

99.00%

99.50%

K-NN MLP CNN

Accuracy

Accuracy

Models Accuracy

K-Nearest Neighbour(K-NN) 96.8%

Multi-Layer Perceptron (MLP) 97.3%

Convolution Neural Network

(CNN)

99.1%

http://yann.lecun.com/exdb/mnist/



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

recognize the digits but it can be extended to letters

and then a person’s handwriting. This can

completely abolish the need of typing. People can

straightaway write down in their own handwriting

and then convert it in digital form by just clicking

pictures of it. Digit recognition is an excellent

prototype problem for learning about neural

networks and it gives a great way to develop more

advanced techniques of deep learning.

The broader aim in mind was to develop a M.L.

model that could recognize people's handwriting.

However, as we began developing the model we

realized that the topic in hand was too tough and

would require tremendous data to learn. Example to

accurately classify a cursive handwriting will be

very tough. Thus we settled on classifying a given

handwritten digit image as the required digit using

three different algorithms and consequently testing

its accuracy. In future we are planning to further

explore the topic to recognize people’s handwriting.

REFERENCES [1]DR.KUSUMGUPTA2 ,"A COMPREHENSIVE REVIEW ON

HANDWRITTEN DIGIT RECOGNITION USING VARIOUS

NEURAL NETWORK APPROACHES", INTERNATIONAL

JOURNAL OF ENHANCED RESEARCH IN

MANAGEMENT & COMPUTER APPLICATIONS, VOL. 5,

NO. 5, PP. 22-25, 2016.

[2] Ishani Patel, ViragJagtap and Ompriya Kale."A

Survey on Feature Extraction Methods for

Handwritten Digits Recognition", International

Journal of Computer Applications, vol. 107, no. 12,

pp. 11-17, 2014.

[3] Y LeCun,"COMPARISON OF LEARNING

ALGORITHMS FOR HANDWRITTEN DIGIT

RECOGNISATION",”. In: International conference

on Artificial Neural networks, France, pp. 53–60.

1995.

[4]Faisal Tehseen Shah, Kamran Yousaf,"Handwritten

Digit Recognition Using Image Processing and

Neural Networks", Proceedings of the World

Congress on Engineering, vol., 2007.

[5]Viragkumar N. Jagtap , Shailendra K. Mishra,"Fast

Efficient Artificial Neural Network for Handwritten

Digit Recognition", International Journal of

Computer Science and Information Technologies,

vol. 52, no. 0975-9646, pp. 2302-2306, 2014.

[6]Saeed AL-Mansoori,"Intelligent Handwritten Digit

Recognition using Artificial Neural Network", Int.

Journal of Engineering Research and Applications,

vol. 5, no. 5, pp. 46-51, 2015.

[7]Gil Levi and Tal Hassner,"OFFLINE

HANDWRITTEN DIGIT RECOGNITION USING

NEURAL NETWORK", International Journal of

Advanced Research in Electrical, Electronics and

Instrumentation Engineering, vol. 2, no. 9, pp. 4373-

4377, 2013.

[8]Haider A. Alwzwazy1 , Hayder M. Albehadili2 ,

Younes S. Alwan3 , Naz E. Islam4 ,"Handwritten

Digit Recognition Using Convolutional Neural

Networks", International Journal of Innovative

Research in Computer and Communication

Engineering, vol. 4, no. 2, pp. 1101-1106, 2016.

[9]XuanYang , Jing Pu ,"MDig: Multi-digit Recognition

using Convolutional Neural Network on Mobile",

Stanford University.

[10]Yuhao Zhang,"Deep Convolutional Network for

Handwritten Chinese Character Recognition",


[11]Jung Kim,Xiaohui Xie,"Handwritten Hangul

recognition using deep convolutional neural

networks", University of California.

[12]Nada Basit , Yutong Zhang , Hao Wu , Haoran

Liu , Jieming Bin , Yijun He , Abdeltawab M.

Hendawi,"MapReduce-based Deep Learning with

Handwritten Digit Recognition Case Study",

University of Virginia, VA, USA.

[13]"Handwritten Digit Recognition using various Neural

Network Approaches", International Journal of

Advanced Research in Computer and

Communication Engineering, vol. 4, no. 2, pp. 78-

80, 2015.

[14]"Handwritten Digit Recognition Using Image

Processing and Neural Networks", Proceedings of

the World Congress on Engineering, vol. 1, 2007.

[15]"Recognizing Handwritten Digits and Characters",


[16]”Hand Written Digit Recognition using Kalman

Filter”, International Journal of Electronics and

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Nada%20Basit.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Yutong%20Zhang.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Hao%20Wu.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Haoran%20Liu.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Haoran%20Liu.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Jieming%20Bin.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Yijun%20He.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Abdeltawab%20M.%20Hendawi.QT.&newsearch=true

http://ieeexplore.ieee.org/search/searchresult.jsp?searchWithin=%22Authors%22:.QT.Abdeltawab%20M.%20Hendawi.QT.&newsearch=true



IJIACS

ISSN 2347 – 8616

Volume 6, Issue 5

May 2017

Communication Engineering, Volume 5, Number 4,

pp. 425-434, 2012.

[17]”Detection and Classification of Brain Tumor Using

SVM and K-NN Based Clustering”, SSRG

International Journal of Electronics and

Communication Engineering, March 2017.\

[18]Hayder M. Albeahdili, Haider A. Alwzwazy, Naz E.

Islam.” Robust Convolutional Neural Networks for

Image Recognition”. (IJACSA) International Journal

of Advanced Computer Science and Applications,

Vol. 6, No. 11, 2015.

[19] Fabien Lauer, Ching Y. Suen, and G´erard Bloch “A

trainable feature extractor for handwritten digit

recognition‖”, Journal Pattern Recognition,Elsevier,

40 (6), pp.1816-1824, 2007.

[20] Chen-Yu Lee, SainingXie, Patrick Gallagher,

Zhengyou Zhang, ZhuowenTu, “ Deeply-Supervised

Nets “ NIPS 2014.

[21] M. Fischler and R. Elschlager, “The representation

and matching of pictorial structures”, IEEE

Transactions on Computer, vol. 22, no. 1, 1973.

[22] Kevin Jarrett, KorayKavukcuoglu,

Marc’AurelioRanzato and YannLeCun “What is the

Best Multi-Stage Architecture for Object

Recognition” CCV’09, IEEE, 2009.

[23] Kaiming, He and Xiangyu, Zhang and Shaoqing,

Ren and Jian Sun “ Spatial pyramid pooling in deep

convolutional networks for visual recognition‖

European”, Conference on Computer Vision,

arXiv:1406.4729v4 [cs.CV] 23 Apr 2015.

[24] X. Wang, M. Yang, S. Zhu, and Y. Lin. “Regionlets

for generic object detection”. In ICCV, 2013.

[25] Krizhevsky, Alex, Sutskever, Ilya, and Hinton,

Geoffrey. “ImageNet classification with deep

convolutional neural networks”.In Advances in

Neural Information Processing Systems 25

(NIPS’2012). 2012.

[26] C. Couprie, C. Farabet, L. Najman, and Y. LeCun.“

Indoor semantic segmentation using depth

information”. International Conference on Learning

Representation, 2013.

[27] R. Girshick, J. Donahue, T. Darrell, and J. Malik.

“Rich feature hierarchies for accurate object

detection and semantic segmentation”. CoRR,

abs/1311.2524, 2013.

[28] O. Russakovsky, J. Deng, H. Su, J. Krause, S.

Satheesh, S. Ma, S. Huang, A. Karpathy, A.

Khosla,M. Bernstein, A.C. Berg, and F.F. Li. “

Imagenet large scale visual recognition challenge”.

International Journal of Computer Vision”

December 2015, Volume 115, Issue 3, pp 211- 252.

[29] RaiaHadsell, Sumit Chopra, YannLeCun,

“Dimensionality Reduction by Learning an Invariant

Mapping” CVPR , vol. 2, pp. 1735- 1742, 2006. 13.

Omkar M. Parkhi, Andrea Vedaldi,and Andrew

Zisserman “Deep Face Recognition” British

Machine Vision Conference, 2015

[30] Dan ClaudiuCiresan, Ueli Meier, Luca Maria

Gambardella, JuergenSchmidhuber, “Deep Big

Simple Neural Nets Excel on Handwritten Digit

Recognition” Neural Computation, Volume 22,

Number 12, December 2010

[31] Sherif Abdel Azeem, Maha El Meseery,

HanyAhmed ,”Online Arabic Handwritten Digits

Recognition “,Frontiers in Handwriting Recognition

(ICFHR), 2012

[32] Liu, C.L. et al.,.” Handwritten digit recognition:

Benchmarking of state-of-the-art techniques”.

Pattern Recognition 36, 2271–2285. 2003

[33] Xiao-Xiao Niun ,Ching Y. Suen “A novel hybrid

CNN–SVM classifier for recognizing handwritten

digits” Pattern Recognition 45 (2012) 1318–1325,

2011.

[34] M. D. Zeiler and R. Fergus, “Visualizing and

understanding convolutional neural networks,”

arXiv:1311.2901, 2013.

[35] Gil Levi and Tal Hassner,” Age and Gender

Classification using Convolutional Neural

Networks”, Computer Vision and Pattern

Recognition Workshops (CVPRW) IEEE, 2015

Documents

Hand Written Digit Recognition using Convolutional …academicscience.co.in/admin/resources/project/paper/f...Signature Verification, Postal-Address Interpretation, Bank-Check Processing,