Mining Best Utility Pattern from RFID Data



including marketing [10], manufacturing, process control, and fraud detection [9], bioinformatics, information retrieval, adaptive hypermedia, electronic commerce and network management [4]. Descriptive mining and predictive mining are the two types of data mining tasks [5]. Descriptive mining portrays the fundamental characteristics or common properties of the data in the database. Predictive mining extracts patterns from the data so that predictions can be made. Predictive mining methods include tasks such as classification, regression and deviation detection.

Many recent and emerging applications arise from mining information from huge databases. One field that incorporates sequential pattern mining is Radio Frequency Identification (RFID). RFID is a high-speed, real-time, precise information gathering and processing technology that uniquely identifies objects by means of radio-frequency signals [6]. RFID technology helps a wide variety of organizations and individuals, for instance hospitals and patients, retailers and customers, and manufacturers and distributors throughout the supply chain, to achieve substantial productivity gains and efficiencies [11]. Motivated by long sequences in text data, biological data, software engineering, and sensor networks, mining repetitive gapped subsequences has been studied to capture the occurrences of sequential patterns that repeat within each sequence of a large database and then use them as features for classification or prediction. RFID tags differ considerably from printed barcodes in their capacity to hold data, the range at which they can be read, and the absence of line-of-sight constraints [12].

The goal of sequential pattern mining is to find all frequent sequential patterns that satisfy a user-specified minimum support. Sequential pattern mining approaches are usually generate-and-test (also known as Apriori), pattern growth (also known as divide-and-conquer), or vertical format methods [13]. Most of the many approaches [15] proposed for sequential pattern mining focus on two issues: (1) improving the efficiency of the mining process and (2) extending the mining of sequential patterns to other types of time-related patterns [16]. The problem of discovering sequential patterns was motivated by the retailing industry. However, the results apply to numerous scientific and business domains, such as stock and market basket analysis, natural disasters (e.g. earthquakes), DNA sequence analysis, gene structure analysis, web log click stream analysis, and so on [18]. Time is the most important factor for this task, particularly when results are needed within a limited period [17].
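The notion of a user-specified minimum support can be illustrated with a minimal sketch. It assumes that each sequence is a plain list of single items (for example, the locations at which a tag was read) and that a pattern is supported by a sequence when its items occur in the same order, with gaps allowed. The function names and the sample data are hypothetical and are not taken from the mining system described in this paper.

```python
def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    hits = sum(1 for seq in database if is_subsequence(pattern, seq))
    return hits / len(database)


def frequent_patterns(candidates, database, min_support):
    """Keep only the candidate patterns whose support reaches `min_support`."""
    return [p for p in candidates if support(p, database) >= min_support]


# Hypothetical tag-movement sequences (illustrative data only).
database = [
    ["dock", "shelf", "checkout"],
    ["dock", "checkout"],
    ["shelf", "dock", "checkout"],
    ["dock", "shelf"],
]
candidates = [["dock", "checkout"], ["shelf", "checkout"], ["checkout", "dock"]]
print(frequent_patterns(candidates, database, min_support=0.6))
```

This brute-force check only tests given candidates; Apriori-style or pattern-growth miners differ in how they generate and prune the candidates, not in how support is defined.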

In many cases, sequential pattern mining still faces hard challenges in both effectiveness and efficiency, even though the efficiency of mining the complete set of sequential patterns has improved considerably. On the one hand, a large database can contain a huge number of sequential patterns, of which only a small subset usually interests a user. Presenting the complete set of sequential patterns would make the mining result hard to understand and hard to use [22]. A Genetic Algorithm (GA) is employed to optimize the cost of the interesting sequential patterns. GA optimizers are robust and work well with discontinuous and non-differentiable functions where conventional local optimizers fail. These optimization techniques use processes such as genetic recombination, mutation, and natural selection in a design based on the concepts of evolution.

Even with the efficient algorithms that have been proposed, mining a large number of sequential patterns from huge databases remains a computationally expensive task. In this work, an effective data mining system that generates the optimum sequential pattern is proposed. The main aim of the work is to develop a utility-based RFID data mining technique that discovers an optimum sequential pattern according to its utility. The rest of the paper is organized as follows: Section 2 describes some recent related works. Section 3 gives a brief account of GA, and Section 4 details the proposed method, optimization of sequential patterns using GA. Experimental results and analysis of the proposed methodology are discussed in Section 5. Finally, concluding remarks are provided in Section 6.

VIPS VIVEKANANDA JOURNAL OF RESEARCH(2)

RELATED WORKS

Researchers have proposed numerous approaches for effective data mining. This section briefly reviews some important contributions from the existing literature.

J. Hu and A. Mojsilovic [18] presented an algorithm for frequent item set mining that identifies high-utility item combinations. In contrast to traditional association rule and frequent item mining methods, the objective of the algorithm was to find segments of data, defined through combinations of some items (rules), which satisfy certain conditions as a group and maximize a predefined objective function. They formulated the task as an optimization problem, presented an efficient approximation to solve it using specialized partition trees called High-Yield Partition Trees, and examined the performance of different splitting strategies. The algorithm was tested on real-world datasets and achieved very good results.

Jian Pei et al. [19] argued that constraints are essential for many sequential pattern mining applications, yet no systematic study of constraint-based sequential pattern mining was available. Their paper investigated the issue and pointed out that the framework developed for constrained frequent-pattern mining did not fit this task well. They developed an extended framework based on a sequential pattern growth methodology. Their study showed that, under the new framework, constraints can be effectively and efficiently pushed deep into sequential pattern mining. Furthermore, the framework can be extended to constraint-based structured pattern mining.

Themis P. Exarchos et al. [21] presented a two-stage methodology for sequence classification that uses sequential pattern mining and optimization. In the first stage, a sequence classification model based on a set of sequential patterns was defined, and two sets of weights were introduced, one for the patterns and one for the classes. In the second stage, an optimization technique was used to estimate the weight values that achieve the best classification accuracy. The methodology was evaluated extensively by varying the number of sequences, the number of patterns and the number of classes, and it was compared with similar sequence classification approaches.

S. Shankar et al. [22] observed that it is well accepted that the data mining process produces numerous patterns from the given data. Discovering frequent item sets and association rules is among the most important tasks in data mining, and several efficient algorithms for it are available in the literature. In recent years, incorporating utility considerations into data mining tasks has been gaining popularity. Certain association rules improve business value, and the data mining community has long recognized such rules as being of interest. The discovery of frequent item sets and association rules from transaction databases benefits numerous business applications. Their paper presented a comprehensive survey and comparative study of existing techniques for frequent item set mining and association rule mining with utility considerations.

Mourad Ykhlef and Hebah ElGibreen [23] described mining sequential patterns in large databases as a vital data mining task with broad applications, one that describes potential sequenced relationships among items in a database. Numerous algorithms have been introduced for this task. Conventional algorithms find the exact optimal sequential pattern rule, but they take a long time, particularly when applied to large databases. More recently, evolutionary algorithms such as Particle Swarm Optimization and Genetic Algorithm have been proposed and applied to this problem. To speed up the convergence of evolutionary algorithms, their paper introduced a new hybrid evolutionary algorithm, referred to as SP-GAPSO, that combines Genetic Algorithm (GA) with Particle Swarm Optimization (PSO) to mine sequential patterns.

GENETIC ALGORITHM (GA)

A genetic algorithm (GA) is a search and optimization technique inspired by nature's evolutionary processes. A population of candidates iterates through multiple generations of selection, crossover, and mutation until an optimized solution survives, much in the manner of survival of the fittest. GAs are computer-based optimization techniques that use Darwinian evolution as a model [24]. They gained wide popularity through the work of Holland (1975). They are usually applied to problems that have an immense and complex search space with a large number of local optima [27]. The strength of GAs lies in the fact that the search space is traversed in parallel by randomly generating solutions, which are continually evaluated with a fitness function [25]. In general, a GA has three search phases: (1) creating an initial population; (2) evaluating the population with a fitness function; (3) producing a new population [21]. In GA, the solutions are termed individuals or chromosomes [27]. The genetic search starts with a randomly generated population in which a fitness function evaluates every individual.

The individuals of the current and subsequent generations are duplicated or eliminated on the basis of their fitness values. Further generations are produced by applying the GA operators [21], i.e. reproduction, crossover and mutation, which are applied sequentially to each individual with certain probabilities [23], [22]. The first operator, the reproduction operator (elitism), produces one or more copies of any individual that possesses a high fitness value; otherwise, the individual is removed from the solution pool [29]. The crossover operator takes two randomly chosen parent individuals as input and combines them to generate two children. The combination is performed by choosing two crossover points in the strings of the parents and then exchanging the genes between these two points [26]. The next step in each generation is the mutation of individuals through the alteration of parts of their genes [30]. Mutation introduces variation into the population of the succeeding generation by altering a gene of a chromosome. Its main goal is to ensure that the search algorithm does not become trapped in a local optimum [22]. It also ensures that all possible alleles can enter the population and hence preserves population diversity [21]. As a variation operator that produces diversity, it is a very important component of GAs [28].

AN EFFICIENT DATA MINING SYSTEM BASED ON GA

In previous work, RFID data was effectively warehoused by means of a novel data cleaning, transformation and loading technique proposed specifically for RFID data. That work showed that the proposed RFID data mining system efficiently mined the required knowledge from the warehoused RFID data. The present work aims to discover an optimum sequential pattern on the basis of its cost, termed the assigned utility. A GA-based technique is employed to identify the optimal sequential pattern. After the fuzzy rules are created from the sequential patterns, the GA-based method identifies the optimal sequential patterns according to their assigned utility. The fitness function of the GA discovers the sequential pattern with maximum profit. For ease of understanding, the optimal sequential pattern of RFID data is outlined in the following sub-section before the proposed mining system is detailed.
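The GA operators described in this paper (elitist reproduction, two-point crossover and mutation) together with a utility-based fitness function can be sketched roughly as follows. This is only an illustration under stated assumptions: the chromosome encoding (a fixed-length list of item labels), the selection of parents from the fitter half of the population, the `PROFIT` table and all function names are hypothetical and are not the encoding or fitness function of the proposed system.

```python
import random


def two_point_crossover(p1, p2, rng):
    """Exchange the genes lying between two randomly chosen cut points."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def mutate(chrom, alphabet, rate, rng):
    """Replace each gene by a random allele with probability `rate`."""
    return [rng.choice(alphabet) if rng.random() < rate else g for g in chrom]


def run_ga(fitness, alphabet, length, pop_size=20, generations=30,
           elite=2, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over fixed-length patterns; returns (best, history)."""
    rng = random.Random(seed)
    # Phase 1: create a random initial population.
    pop = [[rng.choice(alphabet) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Phase 2: evaluate the population with the fitness function.
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        # Phase 3: produce a new population.
        next_pop = [list(c) for c in pop[:elite]]      # elitism: copy the fittest unchanged
        parents = pop[:pop_size // 2]                  # assumption: breed from the fitter half
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            for child in two_point_crossover(a, b, rng):
                if len(next_pop) < pop_size:
                    next_pop.append(mutate(child, alphabet, mutation_rate, rng))
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Hypothetical per-item profit table standing in for the assigned utility.
PROFIT = {"A": 5, "B": 1, "C": 8, "D": 3, "E": 2}


def utility(pattern):
    """Toy utility: total profit of the distinct items in the pattern."""
    return sum(PROFIT[item] for item in set(pattern))


best, history = run_ga(utility, list(PROFIT), length=3)
print(best, utility(best))
```

Because the elite individuals are copied unchanged into each new generation, the best fitness recorded in `history` never decreases; mutation is what allows items absent from the initial population to enter later generations.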



CONCLUSION

In this paper, we have presented a data mining system for mining information about the types of movement of the tags attached to warehouse goods. The proposed system mined knowledge from the warehoused data by generating the I-dataset, mining sequential patterns and then generating fuzzy rules from the sequential patterns. After that, the sequential patterns were optimized using GA on the basis of their assigned utility. The outcome of the system, optimized fuzzy rules with their corresponding profit, describes the type of tag movement with a fuzzy score. Given a part of the movement of a tag (which indirectly refers to a product), the fuzzy rules capture the remaining path of the tag (product). In this manner, tag combinations of different lengths have been considered and their movement has been understood. The movements are considered only for some important tags and combinations, not for all tags and their combinations. From the implementation results and comparative analysis, we observed that the proposed system efficiently identifies the optimum sequential pattern. So, with the help of the presented optimized data mining system, tracking of goods in large warehouses can be carried out efficiently. Because we concentrated only on the optimized sequential patterns, the cost of mining the sequential patterns is minimized. The extracted information would be helpful for warehouse management.

REFERENCES

1. Bin Li and Dennis Shasha, "Free Parallel Data Mining", ACM SIGMOD Record, Vol. 27, No. 2, pp. 541-543, June 1998.

2. Anand, Bell and Hughes, "EDM: A general framework for data mining based on evidence theory", Data and Knowledge Engineering, Vol. 18, No. 3, pp. 189-223, 1996.

3. Agrawal, Imielinski and Swami, "Database Mining: A Performance Perspective", IEEE Transactions on Knowledge and Data Engineering, Vol. 5, No. 6, pp. 914-925, 1993.

4. Chen and Liu, "Data mining from 1994 to 2004: an application-oriented review", International Journal of Business Intelligence and Data Mining, Vol. 1, No. 1, pp. 4-11, 2005.

5. Yashpal Singh and Alok Singh Chauhan, "Neural Networks In Data Mining", Journal of Theoretical and Applied Information Technology, Vol. 5, No. 6, pp. 36-42, 2009.

6. C.M. Roberts, "Radio frequency identification (RFID)", Computers & Security, Vol. 25, pp. 18-26, 2006.

7. Hatim A. Aboalsamh, "A novel Boolean algebraic framework for association and pattern mining", WSEAS Transactions on Computers, Vol. 7, No. 8, pp. 1352-1361, August 2008.

8. Sathiyamoorthi and Murali Bhaskaran, "Data Mining for Intelligent Enterprise Resource Planning System", International Journal of Recent Trends in Engineering, Vol. 2, No. 3, pp. 1-5, November 2009.

9. Jayanthi Ranjan and Vishal Bhatnagar, "A Review of Data Mining Tools In Customer Relationship Management", Journal of Knowledge Management Practice, Vol. 9, No. 1, March 2008.

10. Michael J. Shaw, Chandrasekar Subramaniam, Gek Woo Tan and Michael E. Welge, "Knowledge management and data mining for marketing", Decision Support Systems, Vol. 31, No. 1, pp. 127-137, May 2001.

11. Asghar Sabbaghi and Ganesh Vaidyanathan, "Effectiveness and Efficiency of RFID technology in Supply Chain Management: Strategic values and Challenges", Journal of Theoretical and Applied Electronic Commerce Research, Vol. 3, No. 2, pp. 71-81, 2008, ISSN 0718-1876.

12. Asif, Z., Mandviwalla, M., "Integrating the supply chain with RFID: a technical and business analysis", Communications of the Association for Information Systems, Vol. 15, No. 24, pp. 393-427, 2005.

13. Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Jianyong Wang, Helen Pinto, Qiming Chen, Umeshwar Dayal and Mei-Chun Hsu, "Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach", IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 10, pp. 1-17, October 2004.

14. M.S. Chen, J. Han, P.S. Yu, "Data mining: an overview from a database perspective", IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 6, pp. 866-883, 1996.

15. Yen-Liang Chen and Ya-Han Hu, "Constraint-based sequential pattern mining: The consideration of recency and compactness", Decision Support Systems, Vol. 42, pp. 1203-1215, 2006.

16. Kuen-Fang Jea, Ke-Chung Lin and I-En Liao, "Mining hybrid sequential patterns by hierarchical mining technique", International Journal of Innovative Computing, Information and Control, Vol. 5, No. 8, August 2009.

17. Dhany Saputra, Dayang R.A. Rambli and Oi Mean Foong, "Mining Sequential Patterns Using I-PrefixSpan", International Journal of Computer Science and Engineering, Vol. 2, No. 2, pp. 49-554, 2008.

18. J. Hu and A. Mojsilovic, "High-utility pattern mining: A method for discovery of high-utility item sets", Pattern Recognition, Vol. 40, pp. 3317-3324, 2007.

19. Jian Pei, Jiawei Han and Wei Wang, "Constraint-based sequential pattern mining: The pattern-growth methods", Journal of Intelligent Information Systems, Vol. 28, No. 2, pp. 133-160, April 2007.

20. Shigeaki Sakurai, Youichi Kitahara and Ryohei Orihara, "A Sequential Pattern Mining Method based on Sequential Interestingness", International Journal of Computational Intelligence, Vol. 4, No. 4, pp. 252-260, 2008.

21. Themis P. Exarchos, Markos G. Tsipouras, Costas Papaloukas and Dimitrios I. Fotiadis, "A two-stage methodology for sequence classification based on sequential pattern mining and optimization", Data & Knowledge Engineering, Vol. 66, pp. 467-487, 2008.

22. Shankar and Purusothaman, "Utility Sentient Frequent Itemset Mining and Association Rule Mining: A Literature Survey and Comparative Study", International Journal of Soft Computing Applications, Vol. 10, No. 4, pp. 81-95, 2009.

23. Mourad Ykhlef and Hebah ElGibreen, "Mining Sequential Patterns Using Hybrid Evolutionary Algorithm", World Academy of Science, Engineering and Technology, Vol. 60, pp. 863-870, 2009.

24. Jyothi Pillai and O.P. Vyas, "Overview of Itemset Utility Mining and its Applications", International Journal of Computer Applications (0975-8887), Vol. 5, No. 11, pp. 9-13, August 2010.

25. M. Sedighizadeh and A. Rezazadeh, "Using Genetic Algorithm for Distributed Generation Allocation to Reduce Losses and Improve Voltage Profile", World Academy of Science, Engineering and Technology, Vol. 37, 2008.

26. P. Radhakrishnan, V.M. Prasad and M.R. Gopalan, "Optimizing Inventory Using Genetic Algorithm for Efficient Supply Chain Management", Journal of Computer Science, Vol. 5, No. 3, pp. 233-241, 2009.

27. Basheer M. Al-Maqaleh and Kamal K. Bharadwaj, "Genetic Programming Approach to Hierarchical Production Rule Discovery", World Academy of Science, Engineering and Technology, Vol. 11, pp. 43-46, 2005.

28. Timo Mantere, "A Min-Max Genetic Algorithm with Alternating Multiple Sorting for Solving Constrained Problems", in Proceedings of the Ninth Scandinavian Conference on Artificial Intelligence, 2006.

29. Pedro A. Diaz-Gomez and Dean F. Hougen, "Improved Off-Line Intrusion Detection Using A Genetic Algorithm", Proceedings of the Seventh International Conference on Enterprise Information Systems, pp. 66-73, May 25-28, Miami, USA, 2005.

30. S. Ramanarayana Reddy, "Selection of RTOS for an Efficient Design of Embedded Systems", International Journal of Computer Science and Network Security, Vol. 6, No. 6, pp. 29-37, June 2006.

31. Stefan Schenk and Klaus Hanke, "Combining Genetic Algorithms With Imperfect And Subdivided Features For The Automatic Registration Of Point Clouds (GAREG-ISF)", Proceedings of the 3rd ISPRS International Workshop, Vol. 38.

32. Imtiaz Korejo, Shengxiang Yang and Changhe Li, "A Comparative Study of Adaptive Mutation Operators for Genetic Algorithms", in Proceedings of the 8th Metaheuristic International Conference, July 13-16, 2009.

33. Mike Sewell, Jagath Samarabandu, Ranga Rodrigo, and Kenneth McIsaac, "The Rank-scaled Mutation Rate for Genetic Algorithms", International Journal of Information Technology, Vol. 3, No. 1, 2006.

34. Zorana Bankovic, José M. Moya, Alfaro Araujo, Slobodan Bojanic and Octavio Nieto-Taladriz, "A Genetic Algorithm-based Solution for Intrusion Detection", Journal of Information Assurance and Security, Vol. 4, pp. 192-199, 2009.

