
Deep Learning Based Real-time Object Recognition System with Image Web Crawler

Myung-jae Lee¹, Hyeok-june Jeong¹, Young-guk Ha² (corresponding author)

¹ Department of Computer Science & Engineering, Konkuk University
Neungdong-ro, Gwangjin-gu, Seoul 143-701, Korea
{dualespresso,amitajung}@naver.com

² Department of Computer Science & Engineering, Konkuk University
Neungdong-ro, Gwangjin-gu, Seoul 143-701, Korea
[email protected]

Abstract. Deep learning has recently become an effective solution in many
fields. The Convolutional Neural Network (CNN), a kind of neural network, is
known to be well suited to image processing. This paper proposes a real-time
object recognition system based on a CNN. Since deep learning needs many
training images, the system includes an image web crawler that collects images
automatically. The implemented system achieves high accuracy in object
recognition.

Keywords: deep learning, object recognition, CNN, crawler

1 Introduction

A great deal of data is available on the Internet. These data become useful
with suitable processing such as big data analysis, which is the process of
collecting, organizing and analyzing large amounts of data to discover patterns
and useful information. There are many types of data, and one of them is the
image. Images carry a lot of information and are used in various systems such
as road speed cameras, license plate recognition systems and Google image
search.

Object recognition in images is an active research topic because it means that
a system can understand a scene in a way similar to human thinking. In other
words, object recognition is closely related to the field of Artificial
Intelligence. Furthermore, the growth of deep learning algorithms has
accelerated progress in object recognition systems. Deep learning, which is a
part of machine learning, is used in much research and industry to help solve
big data problems. It has various architectures, such as the Convolutional
Neural Network (CNN). A CNN, which is inspired by the organization of the
animal visual cortex, is a feed-forward network. It can be used in computer
vision systems such as object recognition systems. A deep learning algorithm
with a CNN can help analyze image data: it is trained on many categorized
images and then helps recognition. In other words, deep learning needs a lot of
images. These images can be collected from the Internet.

Advanced Science and Technology Letters Vol.142 (SIT 2016), pp.103-110
http://dx.doi.org/10.14257/astl.2016.142.19
ISSN: 2287-1233 ASTL Copyright © 2016 SERSC

However, categorizing data is the most important work before using it, because
data that are usable and unusable for a given purpose are mixed on the
Internet. For this reason, research related to data collection is increasing.
Web crawling is one such collection method. A web crawler collects data
according to predefined categories and helps manage the data.

This paper proposes a real-time object recognition system with a web crawler.
The system collects images automatically and trains on the collected images.
With the trained model, the system recognizes objects in real time from a
camera. The system focuses on objects that appear on the road, such as cars,
traffic signs and police.

2 Related Work

There has been significant research aimed at achieving object recognition in
images. Several papers have proposed using the Scale Invariant Feature
Transform (SIFT) algorithm [1,2]. Paper [1] suggested an image recognition
system that uses a SIFT algorithm adapted with a pyramidal descriptor, and
paper [2] proposed an image recognition system for colorectal polyp histology
with SIFT. Both are well-designed systems; however, their processing is simple,
so their results can be incorrect. Research [3] suggested a persimmon growth
monitoring system based on image analysis, and paper [4] proposed an image
recognition system with a three-dimensional Speeded Up Robust Features (SURF)
algorithm. However, both may produce erroneous results.

For this reason, many studies have aimed to achieve object recognition with
deep learning. Paper [5] proposed a deep learning based visual tracking system,
and research [6] suggested a multiple instance analysis system with deep
learning. Both focused on the use of deep learning in object recognition rather
than on performance.

To improve performance, various designs have been suggested. Related work [7]
suggested very deep CNNs for large-scale image recognition. Paper [8] showed
hierarchical feature extraction to improve image recognition performance.
Research [9] proposed a simple learning network for fast image recognition.
These studies showed great performance, but they have neither a system for
managing the training images nor a real-time system.

This paper suggests a real-time image recognition system with an image crawler.
The system tries to recognize objects that are detected on the road, such as
cars, ambulances and pedestrians. This paper also proposes a way to collect and
manage images and to recognize objects in real time.


3 System Design

Fig. 1. Overall architecture of this system.

Figure 1 shows the overall architecture of the system. The Web Image Crawler
designs an ontology and collects images according to the designed ontology. It
crawls images automatically and saves them in the Hadoop Distributed File
System (HDFS). The Image Trainer brings images from HDFS and learns from them.
After learning, the Image Trainer makes a DNN model profile, which is the
fundamental source for the Image Recognizer. The Image Recognizer builds a DNN
model from the downloaded DNN model profile and detects objects in images
captured from the camera in real time.

3.1 Image Web Crawler

Fig. 2. System Flow of Image Web Crawler

Figure 2 shows the automatic image web crawling system. The Ontology Manager
generates the ontology and instance files. An ontology turns experiences in the
real world into modeled concepts for a computer. The ontology managed by the
Ontology Manager consists of various objects on the road. The Web Page Searcher
searches for keywords taken from the instances of the ontology and retrieves
the web page source. This study used the Selenium Google Chrome driver for page
searching. The Image Crawler extracts the URLs of images from the parsed web
source. The File Handler saves images to HDFS. Before saving, it checks for
duplicate URLs and converts URLs to images. This study constructed HDFS on a
cluster server with 60 virtual nodes. HDFS is a suitable system for storing big
data.
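The Image Crawler and the duplicate check of the File Handler can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in this study; the names ImageUrlParser and crawl_image_urls are hypothetical, and the page source is assumed to have already been retrieved by the Web Page Searcher.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageUrlParser(HTMLParser):
    """Collects the src attribute of every <img> tag in a page source."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the URL of the page.
                self.urls.append(urljoin(self.base_url, src))


def crawl_image_urls(page_source, base_url, seen):
    """Return the image URLs in page_source that are not yet in `seen`.

    `seen` is a set shared across pages; it plays the role of the
    duplicate check performed before images are saved (an assumption
    about how that check could be organized).
    """
    parser = ImageUrlParser(base_url)
    parser.feed(page_source)
    fresh = []
    for url in parser.urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    source = '<img src="/a.jpg"><img src="/a.jpg"><img src="b.png">'
    print(crawl_image_urls(source, "http://example.test/dir/page", set()))
```

Keeping the set of seen URLs outside the function lets the same check run across many search result pages, so an image linked from several pages is downloaded only once.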



3.2 Image Trainer

Fig. 3. System Flow of Image Trainer

As shown in Figure 3, the Image Trainer has a three-layered process. The Image
Downloader brings big data images from HDFS. Images saved in HDFS are
classified by category, and the Image Downloader brings them as they are. The
Image Learner trains on the images with deep learning. As the deep learning
network, this study used a Convolutional Neural Network (CNN) to recognize
objects in images.

Fig. 4. Graph and Example of Overfitting

A CNN uses multiple filters, each of which focuses on a small area and produces
one number. By focusing on small areas repeatedly, the features of an image are
found. However, this useful network was not widely used until a few years ago.
Because it focuses on small areas repeatedly, its result becomes very detailed.
As a result, the trained system recognizes only the trained images but not test
images that are in the same category but were not used in training. This is
called overfitting. For example, a system trained on one Police Car cannot
recognize a Police Car that it was not trained on, as shown in Figure 4.
However, since the dropout concept was proposed [10], the overfitting problem
has been largely mitigated. Dropout reduces overfitting and increases accuracy.
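The idea of dropout can be illustrated with a short sketch. It shows the commonly used "inverted" variant, in which surviving activations are rescaled during training so that nothing has to change at test time; this variant, the function name dropout and the parameter drop_prob are assumptions made for illustration and are not taken from the implementation in this study.

```python
import random


def dropout(activations, drop_prob, training, rng=random):
    """Inverted dropout on a list of activations.

    During training, each activation is zeroed with probability
    drop_prob, and the survivors are scaled by 1 / (1 - drop_prob) so
    that the expected value of each unit is unchanged. At test time the
    activations pass through untouched.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    hidden = [0.5, 1.0, 1.5, 2.0]
    print(dropout(hidden, 0.5, training=True, rng=rng))
    print(dropout(hidden, 0.5, training=False, rng=rng))
```

Because a different random subset of units is silenced at every training step, the network cannot rely on any single unit memorizing a particular training image, which is why the technique counteracts overfitting.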

Deep learning with CNNs can be implemented with various libraries. This paper
used the Caffe framework, which is considered fast and is modular, with C++,
Python and MATLAB interfaces. The DNN Model Profile Manager makes a DNN model
profile from the result of image training and sends it to the Image Recognizer.

3.3 Image Recognizer

Fig. 5. System Flow of Image Recognizer

As shown in Figure 5, the Image Recognizer has two inputs: the captured image
and the DNN Model Profile. The DNN Model Regenerator generates a DNN model from
the received DNN Model Profile. Because each Image Recognizer regenerates the
DNN model itself, multiple Image Recognizers can be used in the system. The
Image Receiver captures images from the camera. The Object Recognizer detects
objects in the received images with the DNN model regenerated by the DNN Model
Regenerator. The Recognition Result Logger saves the results of object
recognition. This log can be used as feedback for the system.
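One pass through the Object Recognizer and the Recognition Result Logger might look like the sketch below. The function recognize_and_log, the log entry fields and the stand-in model are hypothetical; in the actual system the model would be the DNN model regenerated from the DNN Model Profile, which is assumed here to return one probability per class.

```python
import time


def recognize_and_log(frame, model, labels, log):
    """Classify one frame and append the result to `log`.

    `model` is any callable that maps a frame to a list of class
    probabilities (a stand-in for a forward pass of the DNN model).
    """
    start = time.perf_counter()
    probs = model(frame)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    entry = {"label": labels[best], "prob": probs[best], "ms": elapsed_ms}
    log.append(entry)
    return entry


if __name__ == "__main__":
    labels = ["Car", "Ambulance", "Pedestrian"]

    def fake_model(frame):
        # Fixed output used only to exercise the logging path.
        return [0.1, 0.7, 0.2]

    log = []
    print(recognize_and_log(None, fake_model, labels, log)["label"])
```

Recording the elapsed time with each result makes it possible to check the real-time behavior of the recognizer from the log alone.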

4 Implementation

The proposed Image Trainer is implemented on Ubuntu 14.04. To increase learning
performance, we used four GPGPUs and a high-performance CPU. The Caffe library
was used for deep learning, and CUDA was used to drive the GPGPUs. The Image
Recognizer is also implemented on Ubuntu 14.04 and uses a GPGPU for image
recognition.

Table 1. Implementation Environment

            Image Trainer            Image Recognizer
CPU         Intel Xeon E5 2.40 GHz   Intel i7 3.60 GHz
RAM         128 GB                   16 GB
HDD         1 TB SSD                 256 GB SSD
GPGPU       GeForce GTX 1080 x 4     GeForce GTX 1080
OS          Ubuntu 14.04 LTS         Ubuntu 14.04 LTS
Libraries   CUDA, OpenCV, Caffe      CUDA, OpenCV, Caffe

This implementation was trained on 65,000 images collected by the Image Web
Crawler, with common resolutions ranging from 640x480 to 1920x1080. To increase
accuracy, the network was trained for 100,000 iterations with 25 network
layers. The recognizer experiment uses images captured from the camera in real
time.

Fig. 6. A Part of the Designed Ontology

The ontology was designed as shown in Figure 6. It comprises various objects
that are detected on the road. There are various instances at the bottom of the
ontology tree and sub-instances as children of each instance. For example, two
sub-instances, Kia K5 and Hyundai Sonata, are located as children of Mid-size
Car. These sub-instances are used as keywords to search for images.
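How search keywords could be derived from such an ontology tree is sketched below. The nested-dictionary representation, the function leaf_keywords and the top-level "Car" node are assumptions made for illustration; only Mid-size Car, Kia K5 and Hyundai Sonata are taken from the example above.

```python
def leaf_keywords(node, name=None):
    """Return the names of all leaves below `node`, depth first.

    The ontology is modeled as nested dicts; an empty dict marks a
    sub-instance with no children, which is used as a search keyword.
    """
    if not node:
        return [name] if name is not None else []
    keywords = []
    for child_name, child in node.items():
        keywords.extend(leaf_keywords(child, child_name))
    return keywords


# Hypothetical fragment of the ontology tree.
ontology = {
    "Car": {
        "Mid-size Car": {"Kia K5": {}, "Hyundai Sonata": {}},
    },
}

if __name__ == "__main__":
    print(leaf_keywords(ontology))
```

Adding a new object to the crawler then only requires adding a node to the tree, which reflects the flexibility that the ontology is meant to provide.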

Fig. 7. Designed Convolutional Neural Network

As shown in Figure 7, the convolutional neural network is designed with 24
layers. All of the focused small image regions pass through this network. To
address the vanishing gradient problem, a ReLU layer is used as the activation
function in every convolutional layer. If the system used the sigmoid function
as the activation function, the gradient would shrink toward zero as it is
propagated back through many layers. Using the ReLU function avoids this
problem with low calculation time. However, this function makes the input size
too big, which can be a critical problem for the learning algorithm because
calculation time increases and memory space runs short. The pooling layer is
the solution to this problem: with a pooling layer, the input size can be
reduced. Local Response Normalization (LRN) and Dropout layers help prevent
overfitting. The Fully Connected (FC) layers, which are placed after the
convolutional layers, classify the images. In the output layer, a Softmax layer
transforms the result values into probabilities.
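The roles of the individual layer types can be made concrete with a few lines of plain Python. This is a didactic sketch on small lists, not the Caffe network of Figure 7; the function names and the 2x2 pooling window are assumptions chosen for illustration.

```python
import math


def relu(x):
    """ReLU activation: its gradient is 1 for every positive input."""
    return x if x > 0.0 else 0.0


def sigmoid_grad(x):
    """Derivative of the sigmoid; it never exceeds 0.25 (moderate x assumed)."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def max_pool_2x2(image):
    """Halve the height and width of a 2D list (even dimensions assumed)."""
    return [
        [max(image[r][c], image[r][c + 1],
             image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]), 2)]
        for r in range(0, len(image), 2)
    ]


def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Upper bound on the gradient factor contributed by ten sigmoid layers.
    print(sigmoid_grad(0.0) ** 10)
    print(max_pool_2x2([[1, 2, 3, 4], [5, 6, 7, 8],
                        [9, 10, 11, 12], [13, 14, 15, 16]]))
    print(softmax([1.0, 2.0, 3.0]))
```

Since the sigmoid derivative is at most 0.25, a product of such factors over many layers becomes very small, whereas the ReLU derivative stays at 1 for active units. The pooling function shows how a 4x4 input is reduced to 2x2, which is the size reduction described above.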

Fig. 8. Result of implementation

The experiment was trained with 65,000 images for 100,000 iterations. Training
took 48 hours. The recognition system classified images into 17 classes. The
object recognition accuracy was 99%, and the average calculation time was below
50 ms.

5 Conclusion

This paper proposed a real-time object recognition system with an image web
crawler. The proposed crawling system was designed for flexibility, so that it
can be modified easily through the ontology. This paper designed a CNN to
achieve high performance, and the deep learning system achieved high accuracy.
The recognizer was designed to run on its own once the DNN model profile is
provided. In other words, the recognizer does not need to exchange images or
recognition data with the deep learning system; it only needs to download the
DNN model profile once.

The proposed system will be extended in the near future. The recognition system
will be applied to the crawling process: a recognizer embedded in the crawling
process will check whether each image is proper or not. This will improve the
accuracy of the images used for training.



Acknowledgments. This work was supported by Institute for Information &
communications Technology Promotion (IITP) grant funded by the Korea government
(MSIP) (R7118-16-1002, Development of Driving Computing System Supporting
Real-time Sensor Fusion Processing for Self-Driving Car).

References

1. Seidenari, L.: Local pyramidal descriptors for image recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 36.5 (2014): 1033-1040.

2. Kominami, Y.: Computer-aided diagnosis of colorectal polyp histology by using a real-time image recognition system and narrow-band imaging magnifying colonoscopy. Gastrointestinal Endoscopy 83.3 (2016): 643-649.

3. Chang, K.-C.: Design of persimmon growing stage monitoring system using image recognition technique. Consumer Electronics-Taiwan (ICCE-TW), 2016 IEEE International Conference on. IEEE, 2016.

4. Redondo-Cabrera, C.: Surfing the point clouds: Selective 3d spatial pyramids for category-level object recognition. Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.

5. Wang, N., Yeung, D.-Y.: Learning a deep compact image representation for visual tracking. Advances in Neural Information Processing Systems. 2013.

6. Xu, Y.: Deep learning of feature representation with multiple instance learning for medical image analysis. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.

7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).

8. Li, H.: Hierarchical feature extraction with local neural response for image recognition. IEEE Transactions on Cybernetics 43.2 (2013): 412-424.

9. Chan, T.-H.: PCANet: A simple deep learning baseline for image classification? IEEE Transactions on Image Processing 24.12 (2015): 5017-5032.

10. Srivastava, N.: Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15.1 (2014): 1929-1958.


