Khmer Character Recognition using Artificial Neural Network

Hann Meng* and Daniel Morariu†

*Faculty of Engineering, Lucian Blaga University of Sibiu, Sibiu, Romania

E-mail: [email protected] Tel: +855 92 86

†Advisor, Department of Computer Science and Electrical Engineering, Lucian Blaga University of Sibiu, Sibiu, Romania

E-mail: [email protected]

Abstract— Character recognition has become an interesting and challenging research topic in the field of pattern recognition in recent decades. It has numerous applications, including bank cheque processing, address sorting, and the conversion of handwritten or printed characters into machine-readable form. Artificial neural networks with learning ability, including the self-organization map and the multilayer perceptron network, can offer a solution to the character recognition problem. This paper presents a Khmer Character Recognition (KCR) system implemented in the Matlab environment using artificial neural networks. The KCR system integrates a self-organization map (SOM) network and a multilayer perceptron (MLP) network with the backpropagation learning algorithm to address the Khmer character recognition problem.

I. INTRODUCTION

Character recognition, or optical character recognition (OCR), is often used in machine translation, text-to-speech, and text mining. It has been used to bring books, magazines, and printed documents into digital libraries by digitizing them. OCR is a classic example of pattern recognition. To recognize scanned Khmer text documents or handwritten documents, the first important step is to recognize individual Khmer characters. In this paper, we develop a Khmer character recognition system using artificial neural networks. The system is implemented with a self-organization map (SOM) network and a multilayer feed-forward neural network trained with the backpropagation learning algorithm, based on the scheme presented in Fig. 6.

II. KHMER LANGUAGE CHARACTERISTICS

The Khmer script (Âksâr Khmêr) is used to write the Khmer language, the official language of Cambodia. Cambodia, officially known as the Kingdom of Cambodia, is located in Southeast Asia. According to ref. [12], Khmer differs from neighboring languages such as Thai, Lao, Burmese, and Vietnamese. The Khmer language has been influenced by Sanskrit and Pali.

A. Consonants

Fig. 1 lists the 33 Khmer consonants. Some characters are quite similar, with only small differences.

Fig. 1. List of Khmer consonants

B. Numerals

Khmer numerals, shown in Fig. 2, are used more often in literature than in handwriting. In handwriting and everyday use, people write Arabic numerals, which are widely known.

Fig. 2. Khmer numerals

III. BACKGROUND TECHNOLOGY

This section describes the concept of the artificial neural network (ANN). Artificial neural networks have been used in many fields of study, such as image processing, document clustering, segmentation, biological systems, geography, chemistry, decision-making, data mining, optical character recognition (OCR), pattern recognition, and many other interesting fields of research. Researchers have developed many artificial learning algorithms that enable a machine to perform and learn like the human brain. An ANN is trained and learns by example, as a human being does, and processes information like the human brain.

In ref. [4], the architecture of the hierarchical neural network (HNN) is based on preclassification using a SOM (self-organization map). The preclassification is followed by recognition modules, in which PCA (principal component analysis) is cascaded with an MLP (multilayer perceptron). The set of winners from the SOM network triggers the recognition modules that classify the input data. A voting mechanism selects the winner when multiple winners occur. Given the great potential of hierarchical neural networks, which integrate multiple artificial neural networks into a character recognition system, the self-organization map and the multilayer perceptron neural network were chosen to implement our simple system. This section is organized as follows: first the self-organizing map (SOM), a clustering algorithm that maps input data into groups; then the neural network used as the character classifier, with a brief introduction to the Matlab tools for artificial neural networks.

978-616-361-823-8 © 2014 APSIPA APSIPA 2014

A. Self-Organization Map (SOM)

According to refs. [3, 9], the self-organizing map (SOM) network was developed by Kohonen (1995) and is also called the Kohonen neural network. According to ref. [6], the SOM is an effective network for analyzing multidimensional data. The SOM is one type of artificial neural network (ANN) that is suitable for data clustering. The SOM network consists of two layers of neurons: an input layer, where the inputs to the ANN are applied, and an output layer of neurons called the competitive layer, where the groupings of the inputs are formed. The SOM is an unsupervised learning algorithm: there is no desired output (no teacher), and the network has to discover the features, patterns, clusters, and regularities in the input data automatically.

The idea of this algorithm is to seek the “winner” unit for each input vector; the synaptic coefficients are then modified for the winner unit and also for all units in the neighborhood of the winner unit. The network acts like a map of the input dataset, as shown in Fig. 3.

Algorithm

1) Initialize all weights with small random values (or initialize the weights using the input data).

2) Set the initial learning rate and the neighborhood topology.

3) Take an input sample (vector), present it to the network, and calculate the distance between the sample and every neuron in the network. The Best Matching Unit (BMU) is the neuron at minimum distance:

$\lVert x - w_{i^*,j^*} \rVert = \min_{i,j} \lVert x - w_{ij} \rVert$ (1)

4) The neuron closest to the current sample is considered the “winner” neuron, and its weights are updated:

$w_i(t+1) = \begin{cases} w_i(t) + \alpha(t)\,[x - w_i(t)] & \text{for } i \in \text{neighborhood} \\ w_i(t) & \text{otherwise} \end{cases}$ (2)

5) All neurons in the neighborhood of the winner neuron also update their weights according to the Kohonen equation (2).

6) Repeat steps 3-5 for all input vectors in the data set.

7) Decrease the learning rate (α) and shrink the neighborhood (σ).

8) Repeat steps 3 through 7 until the learning rate falls below a pre-specified threshold or a maximum number of iterations is reached.

Fig. 3. The basic structure of self-organization map, ref. [6]

The proposed formula for computing the neighborhood is

$\sigma(t) = \sigma_0 \, e^{-t/N}$

and for the learning rate

$\alpha(t) = \alpha_0 \, e^{-t/N}$

where t is the current age (stage) and N is the total number of iterations.

Note: The neighborhood coefficient ($\sigma_0$) should be large enough, because we want to start with a large neighborhood (it is recommended that the initial neighborhood cover about 60-70% of all neurons). The learning rate parameter ($\alpha_0$) was set to a small value, 0.1, so that the learning rate decays quickly toward zero.
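As an illustration only, the training loop above can be sketched in plain Python (the paper's implementation is in Matlab and is not reproduced here). The sketch assumes a rectangular grid, a hard-cutoff neighborhood of radius σ measured in grid units, and an initial radius `sigma0 = 2.0`. These choices, and the function names `train_som` and `best_matching_unit`, are assumptions of this sketch and are not taken from the paper.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x (Eq. 1)."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(samples, rows=3, cols=3, sigma0=2.0, alpha0=0.1, epochs=50, seed=0):
    """Train a small SOM; weights[r][c] is the weight vector of grid unit (r, c)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 1: small random initial weights.
    weights = [[[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                for _ in range(cols)] for _ in range(rows)]
    for t in range(epochs):
        # Step 7: exponential decay of the neighborhood radius and learning rate.
        sigma = sigma0 * math.exp(-t / epochs)
        alpha = alpha0 * math.exp(-t / epochs)
        for x in samples:
            # Steps 3-4: find the winner for this sample.
            br, bc = best_matching_unit(weights, x)
            # Step 5: update the winner and every unit within radius sigma (Eq. 2).
            for r in range(rows):
                for c in range(cols):
                    if math.hypot(r - br, c - bc) <= sigma:
                        w = weights[r][c]
                        for k in range(dim):
                            w[k] += alpha * (x[k] - w[k])
    return weights
```

After training, `best_matching_unit(weights, x)` gives the grid cell, and therefore the cluster, to which a sample `x` is assigned.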

B. Neural Network

According to ref. [2], Warren McCulloch and Walter Pitts introduced the binary threshold unit as a computational model for neural networks in 1943. Paul Werbos developed a learning algorithm called backpropagation of error in 1974, which is now used in many areas of study and research. According to ref. [5], the neural network is the most popular choice for developing character recognition systems, because it learns well and provides high accuracy and speed for character identification. Neural networks have been used to solve many character recognition problems, such as Chinese character recognition.

A neural network contains a number of nodes (units or neurons) connected by edges. Each edge has a weight associated with it. The weights act as the memory of the network, and the learning process computes these weights so that the network gives the best results on the training data. A neural network is adaptive because it can adapt to changes in the data and learn the characteristics of the input signal.

Feed-forward networks trained using the backpropagation algorithm with momentum and an adaptive learning rate have shown noticeably better results, providing a better network classifier and higher accuracy in recognizing characters.

Fig. 4. Common model of artificial neural network

Fig. 5. Multilayer Perceptrons, ref. [1]

Therefore, a feed-forward neural network using backpropagation with adaptive learning is chosen for the Khmer character recognition system and is used as the character classification network.

Computational Neural Model

Fig. 4 shows the common model of an artificial neural network.

Multilayer Perceptron Network and Backpropagation

The multilayer perceptron is a feed-forward neural network. It consists of an input layer, one or more hidden layers, and an output layer. The common multilayer network with a single hidden layer is called a two-layer network, because only the hidden and output layers are counted and the input layer is not included. A multilayer perceptron network has more than one layer of perceptrons with an activation function in each layer and can learn any Boolean function (Fig. 5).

The outputs of the hidden and output layers are computed as follows:

Hidden layer: $h_j = f(H_j)$, where $H_j = \sum_i w_{ij} x_i + \theta_j$ (3)

Output layer: $o_k = f(O_k)$, where $O_k = \sum_j w_{jk} h_j + \theta_k$ (4)

Fig. 6. The structure of KCR System

• 𝑥 represents the input signals

• 𝑤 represents the weights

• 𝜃 represents the bias values

• 𝑓 is the activation function of the network, here the sigmoid transfer function.

The output error function measures the difference between the actual output of the network and the desired output. The error function is also known as the network’s energy function.

$E = \frac{1}{2} \sum_k (d_k - o_k)^2$ (5)

where $d_k$ is the desired output for the k-th output neuron.
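As a minimal sketch, Eqs. (3)-(5) can be written in Python for a network with one hidden layer. The function names and the weight layout (`weights[j][i]` holds the weight from input i to neuron j) are assumptions made for this illustration.

```python
import math


def sigmoid(z):
    """Sigmoid transfer function f."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(inputs, weights, biases):
    """Weighted sum plus bias, passed through the sigmoid, for every neuron.

    weights[j][i] is the weight from input i to neuron j (layout assumed here).
    """
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return (hidden activations, output activations) for input vector x."""
    h = layer_output(x, w_hidden, b_hidden)  # Eq. (3)
    o = layer_output(h, w_out, b_out)        # Eq. (4)
    return h, o


def half_squared_error(desired, actual):
    """Network error E of Eq. (5)."""
    return 0.5 * sum((d - o) ** 2 for d, o in zip(desired, actual))
```

With all weights and biases set to zero, every neuron outputs `sigmoid(0) = 0.5`, which is a convenient sanity check before training.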

IV. SYSTEM DESIGN

A. Application Flow Control

Fig. 6 shows the overall architecture of the implemented Khmer character recognition system.

The proposed Khmer character recognition (KCR) system scheme, depicted in Fig. 6, is adopted from ref. [4]. The KCR system consists of two well-known artificial neural network methods that are integrated to solve the Khmer character recognition problem. The self-organization map is used to group the input data, and the multilayer neural network is used as the classifier. The network is trained with a set of characters represented as 20 by 20 black-and-white pixel images, so each input image is a 20 by 20 matrix. The binary character image matrix is first transformed into a vector of 400 scalar elements and then fed to the network. The SOM network therefore has 400 neurons in the input layer, one neuron per pixel.

B. Data Representation

Input Representation

The input samples to the artificial neural network are mostly in image format. The input image can come from several sources, such as a scanned text image, a camera capture, or a handwritten or handprinted document. A text image usually contains many characters. To obtain isolated character images from a text image, many methods that rely on image preprocessing and segmentation have been presented in the literature [8, 10].

Fig. 7. List of 5 different fonts on a single character

In our work, we do not adopt the segmentation and feature extraction methods above. The input is a predefined, isolated character image. Each input character image is in JPEG format and only needs to be converted to a binary character image. Each character image is represented in 20x20 pixels, that is, 400 pixels. This is the feature selection for the network. The system is trained with 5 sets of different fonts, where each font has 33 characters and 10 numerals. In total, there are 43 x 5 = 215 input samples.

Fig. 7 shows 5 different fonts for a single character; each font has a different style, size, and shape and is used for different purposes. Some fonts render very thin characters, some are heavy or bold, and some resemble handwriting.

Fig. 8 describes the full process of input data transformation from the character image to the final data representation for the networks. First, the input character image is converted to a 20x20 binary matrix; the binary matrix is then transformed into a vector of 400 scalar values and added to the dataset that represents all character matrices. This process is repeated for every input character in the dataset. The transformation from character image to matrix is described in the preprocessing section.

Target Vector Encoding

The multilayer neural network is a supervised learning algorithm: the training process needs to know the desired output. Supervised learning is also called learning with a teacher. The closer the actual output of the network is to the desired output, the smaller the error. By the nature of the neural network, the output neurons usually take the value 0 or 1. There are several ways in which the target vector can be represented in neural networks.

In the case of the Khmer language, there are 33 consonants, 25 dependent vowels, 14 independent vowels, and 10 numerals. The total is 33 + 25 + 14 + 10 = 82. Defining a target vector with 82 dimensions, one output neuron per character, is therefore not a good choice: it is hard to work with and is not a good solution to the problem. We therefore propose a target vector encoding technique to deal with this difficulty (see Fig. 9). Target vector encoding:

$2^n$ = number of possible targets, where n is the number of digits.

Fig. 8. Input data transformation

Fig. 9. Target Vector Encoding Sample
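A minimal Python sketch of this kind of binary target encoding is given below. It assumes 0-based character indices and most-significant-bit-first ordering. Neither convention is stated in the paper, and the function names are hypothetical.

```python
def bits_needed(num_targets):
    """Smallest n such that 2**n >= num_targets."""
    return max(1, (num_targets - 1).bit_length())


def encode_target(index, n_bits):
    """Encode a character index as a list of n_bits binary digits (MSB first)."""
    if not 0 <= index < 2 ** n_bits:
        raise ValueError("index does not fit in n_bits digits")
    return [(index >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


def decode_target(outputs, threshold=0.5):
    """Threshold the network outputs and read them back as a character index."""
    index = 0
    for value in outputs:
        index = (index << 1) | (1 if value >= threshold else 0)
    return index
```

For example, `bits_needed(43)` returns 6, so six output neurons are enough for the 43 consonants and numerals, while one output neuron per class would need 43.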

The idea behind the target vector encoding technique is to convert the decimal index that represents a character into binary form with a specific number of digits. The number of digits n is required because it determines how many characters can be represented. For example, using only 5 digits gives $2^n = 2^5 = 32$ possible targets, enough to represent all 26 letters of the English alphabet. For the Khmer language, 6 digits give at most $2^6 = 64$ target vectors, fewer than the full set of 82 symbols. Only 6 digits are used because the KCR system uses only 43 inputs (consonants and numerals), which are clustered into 9 groups by the SOM network based on their similarity. Six digits can therefore represent all input characters in each group. For details, see Table 3, which shows the first 5 characters with their 6-digit target vector encoding. This technique is flexible and scalable, with less complexity for the neural network.

C. Image Preprocessing

Preprocessing is the stage that prepares the data representation to be fed to the network. This section describes the gray scale transformation of an image and the conversion of the input character image to a binary character matrix.

Fig. 10. Bitmap Image, a) isolate character, b) binary character image, c) character image matrix

Gray Scale Transformation

Gray scale transformation is a part of image processing. It converts a color image into gray scale form. According to ref. [11], images from a camera or scanned documents are usually true color images, while most image processing techniques work with 256-level gray scale images. A gray scale image is also known as a black-and-white image, and its values vary from weak intensity (black) to strong intensity (white).

Image Digitalizing

Documents usually take the form of printed documents, handwritten characters, typewritten text, and images that a human can read visually. In contrast, a computer works on the machine encoding behind the image in a specific format. In an artificial neural network, the input characters are mostly images, which need to be converted into a binary character matrix that the network is able to learn from. Image digitalizing is the process of transforming gray scale or color images into binary form.

In Fig. 10, the character has been digitized into 20x20 = 400 digital cells, and each cell has only a single color, either black or white. Each black pixel is assigned the value 0 and each white pixel is assigned the value 1, as shown in Fig. 10 (c). This step converts the information in the image into a matrix form that is meaningful to a computer.
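The preprocessing chain described in this section (color image to gray scale, gray scale to binary matrix, matrix to vector) can be sketched in Python as follows. The luminance weights are the widely used 0.299/0.587/0.114 combination. The threshold of 128 and the function names are assumptions made for this illustration. The paper does not state which threshold was used.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) to rows of gray levels (0-255)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Map dark pixels to 0 (black) and bright pixels to 1 (white)."""
    return [[1 if level >= threshold else 0 for level in row]
            for row in gray_image]


def flatten(matrix):
    """Concatenate the rows of a matrix into one vector (20x20 -> 400 values)."""
    return [value for row in matrix for value in row]


def image_to_vector(rgb_image, threshold=128):
    """Full chain: color image -> gray scale -> binary matrix -> input vector."""
    return flatten(binarize(to_grayscale(rgb_image), threshold))
```

A 20x20 image passed through `image_to_vector` yields the 400-element vector that is fed to the 400 input neurons.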

D. Additional System Features

The additional features are used to enhance system performance, giving better recognition accuracy and speed.

Data Normalization

Normalization is the process of transforming the input data for the network in a way that can greatly improve system performance. In artificial neural networks, the input data is usually normalized to the interval [0 1] or [-1 1]. Normalization can be applied when a new network is created, using a min-max or decimal scaling function. For the SOM network, it is better to map the input value 0 to -1 and keep the value 1 as 1, since this normalization makes training faster and gives better data groups. For the MLP neural network, the input value 0 is mapped to 0.1 and the value 1 to 0.9, which gives faster convergence and better precision because the sigmoid function is used in the network. These transformations of the input data are based on experience, since there is no specific rule or algorithm for transforming input data for a specific network.

Fig. 11. a) The original binary image, b) matrix representing the character, and c) image with noise
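The two mappings can be sketched in Python as a single linear rescaling of a binary vector. The function name `rescale` is hypothetical.

```python
def rescale(vector, low, high):
    """Map the value 0 to `low` and the value 1 to `high` (linear in between)."""
    return [low * (1 - v) + high * v for v in vector]


bits = [0, 1, 1, 0]
som_input = rescale(bits, -1.0, 1.0)  # 0 -> -1, 1 -> 1 (SOM network)
mlp_input = rescale(bits, 0.1, 0.9)   # 0 -> 0.1, 1 -> 0.9 (MLP network)
```

Keeping the MLP inputs and targets away from exactly 0 and 1 is a common choice with sigmoid units, whose outputs only approach those values asymptotically.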

Adding Noise to the Data

In reality, noise occurs unpredictably, for example in scanned or handwritten documents. Noise is undesirable because it can limit the accuracy of the network. Our network is trained with ideal binary character images. For testing, however, it is better to add some noise to the binary character images in order to find out whether the network has really learned. Adding noise allows the system to explore and learn to approximate the best result. Thus, adding noise to the binary character images is required for testing the network. Noise is added randomly to the matrix I at a given noise rate in the following steps:

1) Calculate the number of random points based on the specified noise rate.

2) Take a random row index i in the range [1 20].

3) Take a random column index j in the range [1 20].

4) Set the matrix entry I(i, j) to a random value, either 0 or 1.

5) Repeat steps 2 to 4 until the number of random points is reached.

A sample result of adding noise to the original character image matrix is shown in Fig. 11 (c).

E. Network Design

The KCR system begins by training the self-organization map network, after which the data are clustered. Each cluster is then used to train its corresponding MLP neural network module. The SOM network is trained using an adaptive learning algorithm that is fast for data clustering, and the MLP network is trained using the backpropagation learning algorithm, which is good at distinguishing character identity.

Self-Organization Map Configuration

The SOM network consists of an input layer and a competitive layer, with no bias. In ref. [7], the authors used different SOM network sizes (30x30 and 50x50) to test and find the better recognition accuracy. Since the training dataset in our system is very small, the SOM network size is only 3x3, which groups the input dataset into 9 classes. Each class is then passed to its own MLP network, so the SOM acts as a preclassifier. The network training parameters are as follows:

• Input neurons: 400 (one sample)

• Output neurons: 9 clusters ([3 3])

• Training algorithm: Random Order Weight/Bias Learning Rules

• Training epochs: 2000

Classifier of Neural Network

The neural network is a powerful method for data modeling that can capture and represent complex relationships between input and output. The neural network architecture resembles the human brain in how it performs tasks: the network acquires knowledge through learning (training) and stores it in its interconnected neurons as connection weights. The neural network is a fast and reliable method for data classification that can achieve a high recognition rate.

The neural network with the backpropagation algorithm consists of an input layer, two hidden layers, and an output layer. The outcome of each layer is activated with the sigmoid transfer function. The output layer is the competitive layer, where the output character is identified. The initial weights, biases, and momentum are initialized by the neural network system. The network training parameters are as follows:

• Input neurons: 400 (one sample)

• Hidden layer neurons: 100 for hidden layer 1, 50 for hidden layer 2

• Output neurons: 6 (binary encoding)

• Training algorithm: gradient descent backpropagation with momentum and adaptive learning rate

• Performance function: Sum Squared Error

• Activation function: sigmoid transfer function

• Training goal achieved: 10e-29

• Training epochs: 5000
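To illustrate what "gradient descent with momentum and adaptive learning rate" means, the following Python sketch applies one common form of the rule to an arbitrary differentiable loss. It is a generic illustration, not the Matlab training routine used for the KCR system. The constants (`lr_inc`, `lr_dec`, `max_increase`) and the function name `train_gdx` are assumptions of this sketch.

```python
def train_gdx(grad_fn, loss_fn, w, lr=0.01, momentum=0.9,
              lr_inc=1.05, lr_dec=0.7, max_increase=1.04, epochs=200):
    """Gradient descent with a momentum term and a self-adjusting learning rate.

    grad_fn(w) returns the gradient of the loss at w; loss_fn(w) returns the loss.
    """
    velocity = [0.0] * len(w)
    loss = loss_fn(w)
    for _ in range(epochs):
        grad = grad_fn(w)
        # Momentum: blend the previous step with the current gradient step.
        new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        candidate = [wi + vi for wi, vi in zip(w, new_velocity)]
        new_loss = loss_fn(candidate)
        if new_loss > loss * max_increase:
            # The error grew too much: reject the step and slow down.
            lr *= lr_dec
            velocity = [0.0] * len(w)
            continue
        if new_loss < loss:
            # The error decreased: accept the step and speed up.
            lr *= lr_inc
        w, velocity, loss = candidate, new_velocity, new_loss
    return w, loss
```

In an MLP, `w` would hold all weights and biases, `loss_fn` would be the sum squared error, and `grad_fn` would be computed by backpropagation.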

V. APPLICATION

A. Application Interface

The application interface of the Khmer character recognition (KCR) system is implemented using Matlab software and is depicted in Fig. 12.

The GUI (graphical user interface) shows the four processes from the input character to the predicted output character and includes several buttons with which the user can interact to test the system and configure the testing settings. The user can change the noise rate and the number of random tests used for recognition, and a text box displays the recognition report.

Fig. 12. KCR Application Interface

Training Interface

Fig. 13 shows the training interface of the Khmer character recognition system.

The system is trained with a predefined dataset that consists of 5 sets of fonts, where each font contains 33 characters and 10 numerals. The system has two training phases: first, training with the SOM network, and second, training with the MLP network. With this training interface, the user can set the training parameters of the SOM, which consist of the learning rate and the number of training iterations, and the training parameters of the MLP network, which consist of the training goal, learning rate, training epochs, and number of hidden neurons in layers 1 and 2.

The input dataset is first trained with the self-organization map (SOM), which groups the data into subsets, and the trained SOM network is saved after training is finished. Each subset produced by the SOM is then fed to its corresponding multilayer perceptron (MLP) module for training, and each trained MLP module is saved as a recognition module.
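The resulting two-stage recognition can be sketched in Python as follows. Here `prototypes` stands for the 9 trained SOM weight vectors and `modules` for the saved MLP recognition modules. Both names, and the use of plain callables in place of trained networks, are assumptions made for this illustration.

```python
def nearest_cluster(x, prototypes):
    """Index of the SOM prototype (weight vector) closest to x."""
    return min(range(len(prototypes)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, prototypes[k])))


def recognize(x, prototypes, modules):
    """Route x to the classifier module attached to its SOM cluster.

    modules[k] is any callable that classifies inputs belonging to cluster k.
    Returns (cluster index, module output).
    """
    cluster = nearest_cluster(x, prototypes)
    return cluster, modules[cluster](x)
```

Only one module is evaluated per input, which is why preclassification reduces the work done at recognition time.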

Fig. 13. Training Interface

Fig. 14. Testing Application

Grouping the dataset into subsets is important because it speeds up MLP training and leads to better classification and higher recognition accuracy. It speeds up not only training but also the testing process. Training with the multilayer perceptron (MLP) network is the second phase, in which the network classifier is trained. After the training of each module is finished, the MLP network module is saved as a network classifier to be used in the testing phase.

B. Using Application

The Khmer Character Recognition (KCR) system is implemented in Matlab; the testing application is shown in Fig. 14.

Clicking the recognition button starts the character classification process, which finds the best match for the input character at the given noise rate and displays it on the predicted panel. The recognition module runs a number of random tests on an input character at the same noise rate. The system report text display shows each predicted character index and the frequency with which it was predicted over the random tests. The displayed predicted character is the one whose index occurred most frequently. Each character has different forms, depending on the font it belongs to, but the results are displayed in only one standard font: the predicted result is displayed using the testing outcome index that matches the standard font index. Simulations with training goals of 10e-20 and 10e-29 gave the same recognition rate when no noise was added; when noise was added, the training goal of 10e-29 gave a higher recognition rate for all the different fonts in the test set. Therefore, a training goal of 10e-29 is used to train our system, since the average recognition rate increases up to 94.13% at the same noise rate of 10%. The summary of approximate simulation results for the overall system covers tests with trained and untrained datasets from different fonts, with the network trained using gradient descent backpropagation with momentum and adaptive learning rate and a training goal of 10e-29.
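The testing procedure (the noise injection steps of Section IV.D, repeated several times and followed by a frequency vote) can be sketched in Python as follows. The sketch uses 0-based indices instead of the range [1 20]. `classify` is a hypothetical stand-in for the trained recognizer, and the function names are assumptions.

```python
import random
from collections import Counter


def add_noise(matrix, noise_rate, rng):
    """Return a copy of a binary matrix with random cells overwritten by 0 or 1."""
    rows, cols = len(matrix), len(matrix[0])
    noisy = [row[:] for row in matrix]
    n_points = round(noise_rate * rows * cols)   # step 1
    for _ in range(n_points):                    # step 5
        i = rng.randrange(rows)                  # step 2 (0-based here)
        j = rng.randrange(cols)                  # step 3
        noisy[i][j] = rng.randint(0, 1)          # step 4
    return noisy


def vote(classify, matrix, noise_rate, trials, seed=0):
    """Classify several noisy copies and return the most frequent prediction.

    classify is any callable mapping a binary matrix to a character index.
    Returns (winning index, Counter of all predicted indices).
    """
    rng = random.Random(seed)
    counts = Counter(classify(add_noise(matrix, noise_rate, rng))
                     for _ in range(trials))
    return counts.most_common(1)[0][0], counts
```

Because step 4 writes a random value rather than flipping the pixel, some of the selected cells keep their original value, so the effective fraction of changed pixels is typically lower than the nominal noise rate.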

VI. CONCLUSIONS AND FUTURE WORK

This paper has described the Khmer Character Recognition (KCR) system, which integrates two artificial neural network techniques, the self-organization map (SOM) and the multilayer perceptron with the backpropagation algorithm. Preprocessing and preclassification (SOM), working together with the character classifier modules, can enhance system performance, speed, and recognition accuracy. Preclassification reduces complexity and allows faster character recognition, since its result triggers only one specific multilayer perceptron classifier module. That is the important feature of the system. In our experiments, backpropagation with a momentum term in the gradient descent learning rule trained more quickly than standard backpropagation with plain gradient descent. The training goal is set to the strictest value (10e-29), since the average recognition rate increases from 72% (training goal 10e-10) to 94% at the same noise rate. The average recognition of the trained dataset was 65% correct predictions, and that of the untrained dataset was approximately only 30% correct predictions, with noise added.

The detailed discussion in this paper provides a conceptual understanding of the self-organization map and the neural network, as well as of the use and implementation of the integration of these two learning algorithms. This material could serve as a guide for readers working in the character recognition area.

A lot of research is still needed to improve the performance of the current KCR system with new features. The KCR system is expected to work with offline handwritten Khmer characters, with a dataset collected from different writers. The scanned image dataset would need image preprocessing (smoothing, noise removal, binarization), feature extraction, and segmentation methods to provide a better recognition system. Principal component analysis (PCA), a statistical method that projects a dataset from a higher dimension to a lower dimension, could also make the system much faster. Therefore, combining these methods with the existing integration of the self-organization map network and the multilayer perceptron neural network may provide a better solution for offline Khmer character recognition.

REFERENCES

[1] Brian D. Ripley, N. L. Hjort, “Pattern Recognition and Neural Networks (1st ed.),” Cambridge University Press, New York, NY, USA, 1995

[2] Gheorghita S., Munteanu, R., Enache M., Study of neural networks to improve performance for character recognition, AQTR 2012 - IEEE International Conference on Automation Quality and Testing Robotics, pp. 323-326, 24-27 May 2012

[3] Kohonen T., The self-organizing map, Proceedings of the IEEE, Vol. 78, No. 9, pp. 1464-1480, Sep 1990

[4] U. Halici, A. Erol, G. Ongun, Industrial Applications of Hierarchical Neural Networks: Character Recognition and Fingerprint Classification, CRC Press, Inc., Boca Raton, FL, USA, 1999

[5] Liangbin Zheng, Ruqi Chen, Xiaojin Cheng, Research on Offline Handwritten Chinese Character Recognition Based on BP Neural Networks, IPCSIT Vol. 51, IACSIT Press, Singapore, 2012

[6] M.H. Ghaseminezhad, A. Karami, A novel self-organizing map (SOM) neural network for discrete groups of data clustering, Applied Soft Computing, Vol 11, Issue 4, June 2011, Pages 3771-3778, ISSN 1568-4946, 10.1016/j.asoc.2011.02.009

[7] Marinai S., Miotti B., Soda G., Bag of Characters and SOM Clustering for Script Recognition and Writer Identification, ICPR 2010 - 20th International Conference on Pattern Recognition, pp. 2182-2185, 23-26 Aug. 2010

[8] Ranpreet Kaur, Baljit Singh, A Hybrid Neural Approach For Character Recognition System, International Journal of Computer Science and Information Technologies (IJCSIT), Vol. 2, 721-726, 2011

[9] Simon Haykin, Neural Networks: A Comprehensive Foundation (2nd ed.), Prentice Hall PTR, Upper Saddle River, NJ, USA, 1998

[10] Vijay Laxmi Sahu, Babita Kubde, Offline Handwritten Character Recognition Techniques using Neural Network, International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064

[11] Zhigang Zhang, Cong Wang, The Research of Vehicle Plate Recognition Technical Based on BP Neural Network, AASRI Procedia, Vol. 1, 2012, Pages 74-81, ISSN 2212-6716, 10.1016/j.aasri.2012.06.013

[12] Khmer alphabet, Available from: http://en.wikipedia.org/wiki/Khmer_alphabet, Accessed: 1 May 2013
