[IEEE 2014 Recent Advances in Engineering and Computational Sciences (RAECS) - Chandigarh, India (2014.03.6-2014.03.8)] 2014 Recent Advances in Engineering and Computational Sciences

Proceedings of 2014 RAECS UIET Panjab University Chandigarh, 06 - 08 March, 2014

978-1-4799-2291-8/14/$31.00 ©2014 IEEE

Genetic Binary Decision Tree based Packet Handling Schema for WBAN System

Madhumita Kathuria Department of Computer Engineering

YMCA University of Science & Technology Faridabad, Haryana, India

[email protected]

Sapna Gambhir Department of Computer Engineering

YMCA University of Science & Technology Faridabad, Haryana, India

[email protected]

Abstract: New-generation applications for Wireless Body Area Networks (WBANs) gather and manage heterogeneous data for both real-time and non-real-time traffic. The proposed algorithm handles heterogeneous packets by considering their characteristics in terms of required bandwidth, required buffer, transmission delay, packet loss, and reliability. The main purpose of the proposed schema is to provide a Genetic Binary Decision Tree based Packet Handling (GBDTPH) protocol, which classifies heterogeneous traffic flows according to rule sets. Decision tree based data classification and prioritization helps improve the quality of service required by different kinds of flows. The newly designed Prioritized Earliest Deadline Scheduling algorithm provides fairness to low-priority packets and helps overcome the starvation problem. The proposed packet drop module drops deadline-exceeded and least frequently used low-priority packets.

Keywords: WBAN; Genetic Algorithm; Binary Decision Tree; Packet Handling; Classification; Scheduling.

I. INTRODUCTION

A WBAN is a human-centric network that operates on data sensed by sensors attached to the human body. It serves various applications with advanced technology. The WBAN system shown in Fig. 1 is intended to assure QoS in terms of reliable, managed, and congestion-free transmission of heterogeneous packets over the Internet. Guaranteed data delivery with short delay is challenging for these networks because they deal with real-time and non-real-time heterogeneous traffic flows with variable rate, delay, and loss tolerances. Different sensed data have different importance, which raises different service-quality requirements. Sensors such as ECG may generate very time-critical data packets, which must be delivered with guaranteed reliability, while streaming of ECG signals may tolerate a certain percentage of packet loss but only negligible delay. Hence quality of service must be given the foremost importance by traffic control mechanisms that seek to handle packets properly.

Fig. 1. A typical WBAN System.

In this paper we propose an enhanced data classification and queuing method and a prioritized scheduling method to ensure good QoS. Our classification module uses a hybrid genetic binary decision tree algorithm and forwards packets based on the content carried in their headers and on the available bandwidth. This paper is organized as follows: Section I gives a brief introduction to the subject matter. Section II details the concept of the binary decision tree. Section III explains the basic idea of the genetic algorithm. Section IV describes the proposed packet classification, buffering, scheduling, and dropping concepts. Section V provides the conclusion along with future work.

[Fig. 1 diagram labels: sensors S1 to S7 (Si = Sensor), Sink1, Sink2, Base Station, Internet, Server, User, WBAN-1, WBAN-2.]

II. BINARY DECISION TREE

A binary decision tree classifies a set of cases into classes based on defined rules. Each internal node of the decision tree contains an attribute name, edges are labeled with possible values of that attribute, and leaf nodes represent the classes. An entity is classified by following the path from the root down to a leaf node, taking at each node the edge that matches its attribute value. Every internal node has at most two child nodes, i.e., a left child and a right child. Any traversal of the tree from root to leaf, in which one attribute and one value are selected at each node, is a rule of the form: “If (attribute1 = value1) and (attribute2 = value2) and (attribute3 = value3) ... then classification = class1”.
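The root-to-leaf reading of a tree as a set of rules can be illustrated with a small Python sketch. The `Node` and `Leaf` classes, the function names, and the example attribute and class names are assumptions made for illustration only; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class assigned when a case reaches this leaf


@dataclass
class Node:
    attribute: str             # attribute tested at this internal node
    value: str                 # value the attribute is compared with
    yes: Union["Node", Leaf]   # left child, taken when the test holds
    no: Union["Node", Leaf]    # right child, taken when the test fails


def extract_rules(tree, path=()):
    """Return one (conditions, class) pair per root-to-leaf path."""
    if isinstance(tree, Leaf):
        return [(list(path), tree.label)]
    holds = (tree.attribute, "=", tree.value)
    fails = (tree.attribute, "!=", tree.value)
    return (extract_rules(tree.yes, path + (holds,))
            + extract_rules(tree.no, path + (fails,)))


def classify(tree, case):
    """Follow the matching edges from the root until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if case.get(tree.attribute) == tree.value else tree.no
    return tree.label


# Hypothetical two-level tree, used only to show the rule form.
example_tree = Node(
    "attribute1", "value1",
    Node("attribute2", "value2", Leaf("class1"), Leaf("class2")),
    Leaf("class3"),
)
```

Calling `extract_rules(example_tree)` yields three rules, one per leaf; the first corresponds to "If (attribute1 = value1) and (attribute2 = value2) then classification = class1".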

III. GENETIC ALGORITHM

A genetic algorithm is an optimization method modeled on natural genetics that searches for approximate solutions. Each entity is coded as a finite-length string, called a chromosome. The chromosomes in the population compete with each other to survive and reproduce through three operators: reproduction, crossover, and mutation. The chance that a chromosome survives and reproduces is decided by a fitness function. Fitness is usually measured in terms of how well the chromosome solves some problem. In each successive generation, a new population is generated by selecting members of the current generation to mate based on fitness. To generate new children, the crossover process takes a pair of individuals chosen by the selection process as parents. This continues until the new population reaches the desired size. The mutation process selects two random points on a chromosome and flips them. The optimization is carried out in cycles called generations. The algorithm terminates after a finite number of generations or when the population becomes stable.
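The generational cycle described above can be sketched in Python for bit-string chromosomes. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `evolve`, tournament selection of size 3, single-point crossover, keeping the two fittest individuals unchanged, and all default parameter values are choices made here. Only the fixed-generation stopping rule is shown; the "population becomes stable" criterion is omitted.

```python
import random


def evolve(fitness, length=16, pop_size=20, generations=50, seed=0):
    """Evolve bit-string chromosomes and return the fittest one found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Reproduction: the two fittest chromosomes survive unchanged (assumption).
        new_pop = sorted(pop, key=fitness, reverse=True)[:2]
        while len(new_pop) < pop_size:
            # Selection: each parent is the best of 3 random individuals (assumption).
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Crossover: single cut point, head of p1 joined to tail of p2.
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            # Mutation: pick two random points on the chromosome and flip them.
            for i in rng.sample(range(length), 2):
                child[i] ^= 1
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)


# Toy problem: maximize the number of 1 bits.
best = evolve(sum)
```

Because the fittest chromosomes are carried over unchanged, the best fitness in the population never decreases from one generation to the next in this sketch.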

IV. PROPOSED PROTOCOL

Packet Handling Module: It is responsible for handling traffic flows at different levels so that both real-time and non-real-time traffic benefit in terms of QoS. Packet classification, buffering (queuing), scheduling, and dropping have a direct impact on QoS.

A. Packet Classification Module: Packet classification is the toughest and most vital challenge for the sink node. Packets are classified into appropriate classes by examining specific fields in the packet header, namely <Source_address, Destination_address, Source_port, Destination_port, Packet type, Packet size, Alert/Critical level>. The workflow diagram of the packet classification module is given in Fig. 2. It classifies packets into different flows and assigns them different priorities. This module uses a genetic computing based binary decision tree algorithm, which checks the rule-based database with Packet type, Packet size, and Alert/Critical level as the main attributes and priority values as the action. To make packet header fields suitable for classification, we consider each field as a choice and label all choices into four classes. Each class is assigned a particular priority value <0, 1, 2, 3>. The overall flow of the proposed schema, shown in Fig. 3, consists of four steps.

I. Data partition: It partitions the data into a training data set and a test data set.

II. Generate rules: A binary decision tree is applied to the training data set to deduce rules.

i) Structure of rule:

If (Packet_Type = Real time), then
    If (Packet_Size <= Available Bandwidth), then
        Set Priority := Null and Forward packet for service
    Else
        Set Priority := 1
Else
    If (Critical level = high), then Set Priority := 0
    Else If (Critical level = medium), then Set Priority := 2
    Else If (Critical level = low), then Set Priority := 3

ii) Representation of binary decision trees:

N: Nodes of the tree. Internal nodes are denoted by IN and leaf nodes by LN.
E: Edges of the tree (i.e., left or right edge, denoted by an arrow).
α: It defines a one-to-one mapping from the non-leaf nodes and their Yes/No outcomes to edges, i.e., α: (N − LN) × {Yes, No} → E. For n ∈ (N − LN), α(n, Yes) is the left edge and α(n, No) is the right edge of node n.
β: It labels the elements of N. Leaf nodes are drawn as square boxes and internal nodes as circular boxes.

III. Optimize the rule applying genetic algorithm: To optimize the rules generated by the decision tree, a genetic algorithm is applied with the following steps.

1. Generate Initial Decision Tree: The population, in the form of binary decision trees, is created from the features given in the header fields of packets. This step scans the samples and randomly selects records from the training data set whose class attributes match. Each chromosome can be represented by a rule of the If-Else form. The left side of the rule holds the characteristic attributes (characteristic genes) and the right side holds the class attribute (class gene). During gene evolution, the characteristic genes participate in the evolution. The initial fitness is set to zero.

2. Evolution: This process calculates the gain rate of each attribute, encodes the data, and generates the initial population.

3. Fitness: The fitness function is used to calculate the classification accuracy and indicates the number of cases correctly classified. Accuracy and fitness are given in equations (1) and (2).

\[ \text{Accuracy} = \frac{\text{True\_Fit} + \text{False\_Fit}}{TS + FS} \tag{1} \]

\[ \text{Fitness rate} = \gamma \cdot \text{Accuracy} \tag{2} \]

where 0 < γ < 1, TS = total number of true samples, FS = total number of false samples, True_Fit = number of samples that the rule predicts as true and that are actually true, and False_Fit = number of samples that the rule predicts as false and that are actually false. Accuracy estimates the probability that the rule is correct when applied to the training data. The higher the accuracy, the more samples are correctly classified. If the fitness rate is less than the maximum threshold, go to step 4; otherwise go to step 6.

4. Perform Operations: The selection operation picks two populations (parent trees), as shown in Fig. 4(a) and 4(b). Crossover and mutation operations are performed on these populations to generate offspring. Crossover selects a subtree at a corresponding node in each of the two parent trees. A subtree is randomly chosen at a particular node of one parent and swapped with a subtree of the other parent, as shown in Fig. 4(a) and 4(b). Crossover does not change the population if both selected nodes are leaf nodes, both are root nodes, or the two subtrees are the same. Mutation randomly selects some portion of a branch as genes and flips them. For example, an intermediate node can be randomly swapped with another intermediate node, or a leaf node can be randomly exchanged with another leaf node, as shown in Fig. 4(b).
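Equations (1) and (2) can be read as the following Python sketch. It assumes that each sample has a boolean actual class and that the rule produces a boolean prediction, so that TS and FS count the samples whose actual class is true and false respectively; this reading, the function names, and the default value of `gamma` are assumptions made here.

```python
def accuracy(predicted, actual):
    """Equation (1): (True_Fit + False_Fit) / (TS + FS)."""
    pairs = list(zip(predicted, actual))
    true_fit = sum(1 for p, a in pairs if p and a)            # predicted true, actually true
    false_fit = sum(1 for p, a in pairs if not p and not a)   # predicted false, actually false
    ts = sum(1 for a in actual if a)                          # samples that are actually true
    fs = len(actual) - ts                                     # samples that are actually false
    return (true_fit + false_fit) / (ts + fs)


def fitness_rate(predicted, actual, gamma=0.9):
    """Equation (2): gamma * Accuracy, with 0 < gamma < 1 (0.9 is an arbitrary choice)."""
    if not 0 < gamma < 1:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return gamma * accuracy(predicted, actual)
```

For example, with predictions `[True, True, False, False, True]` against actual classes `[True, False, False, True, True]`, True_Fit is 2 and False_Fit is 1 out of 5 samples, so the accuracy is 0.6.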

Fig. 2. Workflow Diagram for Packet Classification Module.

[Fig. 2 flowchart labels: Incoming Packet; Check Packet Type in packet header (Real Time Packet / Non-Real Time Packet); Check Bandwidth is available or not? (Yes: Forward packets into Service Queue (SQ) of Scheduler Module; No: Assign priority: 1); Check Critical level in packet header (High: Critical packets, Assign priority: 0; Medium: On-demand packets, Assign priority: 2; Low: Periodic packets, Assign priority: 3); Send prioritized packet into Buffering Module.]

Fig. 3. Genetic Binary Decision Tree based Packet Classification Schema.

[Fig. 3 flowchart labels: Select features from packet header; Make rule set considering these features and create database; Training dataset; Test dataset; Binary Decision Tree; Generate Rule; Optimize these rules applying Genetic Algorithm (Initial Population, Selection, Crossover, Mutation, Evaluate new chromosome from fitness function); Generate and Test Rules; Match? (Yes / No); Modify and Generate new Rules; Optimized Rule; Assign priority.]
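The rule structure of the classification module (Fig. 2) can be written as a short Python function. This is only a sketch of the stated rule: the function name, the string encodings of packet type and critical level, and the returned action labels are hypothetical.

```python
def classify_packet(packet_type, packet_size, critical_level, available_bandwidth):
    """Return (priority, action) following the rule structure of the classifier.

    priority is None when a real-time packet is forwarded directly for service.
    """
    if packet_type == "real_time":
        if packet_size <= available_bandwidth:
            # Enough bandwidth: no priority is set, the packet goes straight to service.
            return None, "forward_to_service_queue"
        return 1, "send_to_buffering_module"
    # Non-real-time packets are prioritized by their critical level.
    priorities = {"high": 0, "medium": 2, "low": 3}
    if critical_level not in priorities:
        raise ValueError(f"unknown critical level: {critical_level!r}")
    return priorities[critical_level], "send_to_buffering_module"
```

For instance, `classify_packet("real_time", 120, "low", 100)` returns priority 1 because the packet does not fit the available bandwidth, while a non-real-time packet with a high critical level receives priority 0.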

Fig. 4(a). Population 1: Binary Decision Tree acts as Parent-1 (P1).

[Fig. 4(a) tree labels: attribute nodes Packet_Type = Non-Real Time, Packet_Size <= Bandwidth, Critical level = High, Critical level = Medium, and Critical level = Low, each with Yes/No edges; leaf labels include Set Priority: 0, Set Priority: Null, and Forward packet for Service. Crossover annotation: the subtree of parent P1 from the marked relative node will be exchanged with the subtree of parent P2 at the given relative node in Fig. 4(b).]

Fig. 4(b). Population 2: Binary Decision Tree acts as Parent-2 (P2).

[Fig. 4(b) tree labels: attribute nodes Packet_Type = Real Time, Packet_Size <= Bandwidth, and Critical level (Medium, High, Low), each with Yes/No edges; leaf labels include Set Priority: Null and Forward packet for Service. Crossover annotation: the subtree of parent P2 from the marked relative node will be exchanged with the subtree of parent P1 at the given relative node in Fig. 4(a). Mutation annotation: a leaf node can be randomly exchanged with another leaf node of the same parent.]
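The subtree exchange between two parent trees and the leaf-swap mutation annotated in Fig. 4 can be sketched in Python as follows. The tree encoding (nested dictionaries with `test`, `yes`, and `no` keys, and plain strings as leaves), the helper names, and the two example parents are assumptions made for illustration; the example trees are not transcriptions of Fig. 4. Both parents are assumed to have an internal node at the root.

```python
import copy
import random


def paths(tree, prefix=()):
    """Paths (tuples of 'yes'/'no') leading to every subtree below the root."""
    if not isinstance(tree, dict):
        return []
    found = []
    for branch in ("yes", "no"):
        path = prefix + (branch,)
        found.append(path)
        found.extend(paths(tree[branch], path))
    return found


def get(tree, path):
    for branch in path:
        tree = tree[branch]
    return tree


def put(tree, path, subtree):
    get(tree, path[:-1])[path[-1]] = subtree


def leaves(tree):
    if not isinstance(tree, dict):
        return [tree]
    return leaves(tree["yes"]) + leaves(tree["no"])


def crossover(p1, p2, rng):
    """Swap one randomly chosen subtree of each parent; the parents stay untouched."""
    c1, c2 = copy.deepcopy(p1), copy.deepcopy(p2)
    path1, path2 = rng.choice(paths(c1)), rng.choice(paths(c2))
    sub1, sub2 = get(c1, path1), get(c2, path2)
    put(c1, path1, sub2)
    put(c2, path2, sub1)
    return c1, c2


def mutate_leaves(tree, rng):
    """Exchange two randomly chosen leaf nodes of the same tree."""
    child = copy.deepcopy(tree)
    leaf_paths = [p for p in paths(child) if not isinstance(get(child, p), dict)]
    if len(leaf_paths) >= 2:
        a, b = rng.sample(leaf_paths, 2)
        leaf_a, leaf_b = get(child, a), get(child, b)
        put(child, a, leaf_b)
        put(child, b, leaf_a)
    return child


# Hypothetical parents, loosely modeled on the attributes used by the classifier.
parent1 = {"test": "Packet_Type = Real Time",
           "yes": {"test": "Packet_Size <= Bandwidth", "yes": "Forward", "no": "Priority 1"},
           "no": {"test": "Critical level = High", "yes": "Priority 0", "no": "Priority 2"}}
parent2 = {"test": "Packet_Type = Non-Real Time",
           "yes": {"test": "Critical level = Low", "yes": "Priority 3", "no": "Priority 0"},
           "no": "Forward"}
```

Swapping two leaves that carry the same label, or two identical subtrees, leaves the offspring equal to the parents, which matches the remark that crossover does not change the population in those cases.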

5. Produce new generation: The new generation is produced by replacing low-fitness parents with high-fitness offspring. The fitness of each individual in the new generation is then recalculated. Individuals whose fitness value is less than the minimum threshold are removed.

6. Iteration: The algorithm runs for a number of generations until the termination criterion is satisfied. It stops after a fixed number of generations specified in advance, or when a gene reaches a particular fitness level.

IV. Analysis of optimized rule: The test data set is used to check the accuracy of the optimized rule.

B. Packet Buffering Module: WBAN applications transmit four kinds of traffic and hence would need four kinds of queues. Using a large number of buffers is not efficient, so we consider only two buffers. One stores high-priority packets and the other keeps low-priority packets. If the required resources are available, an incoming packet is stored in one of the buffers according to its priority; otherwise it is dropped. The algorithm for the packet buffering module, given in Fig. 5, protects high-priority packets because they have a negligible chance of being dropped or delayed.

C. Scheduling Module: In the scheduling algorithm, real-time packets are served as soon as they arrive at the receiver if the required bandwidth is available; otherwise they are kept in the high-priority queue. Priority-based scheduling always serves higher-priority packets before lower-priority packets. As a result, low-priority packets can starve for long periods and sometimes never get a chance of service. Our proposed scheduling algorithm tries to reduce the starvation of low-priority packets by calculating their deadlines. A packet whose deadline is about to expire is fetched from the queue and given a chance to be served before its lifetime has elapsed. The deadline of a packet is the ratio of its stay time in the queue to its expected wait time. The earliest deadline is the maximum value among all deadlines. The algorithm also finds packets whose waiting time has been exceeded and drops them. Fig. 6 explains the working of the scheduling module.
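The deadline calculation used by the scheduler can be illustrated with a Python sketch. The deadline ratio and the earliest deadline follow the definitions above (stay time divided by expected wait time, and the maximum over all queued packets). The rest is an assumption made here: the packet fields, the threshold value of 1.0, and the way the three comparison outcomes against the threshold are mapped to actions are one plausible reading of Fig. 6, not a confirmed transcription.

```python
import math


def deadline_ratio(pkt, now):
    """D_i = (current time - arrival time) / expected wait time."""
    return (now - pkt["arrival"]) / pkt["expected_wait"]


def schedule_step(aq, rq, now, threshold=1.0):
    """Return (action, packet) for one scheduling decision.

    aq is the high-priority (alert) queue, rq the low-priority (rest) queue;
    a numerically smaller priority value means a more important packet.
    """
    if aq:
        pkt = min(aq, key=lambda p: p["priority"])
        aq.remove(pkt)
        return "serve", pkt
    if not rq:
        return "wait", None
    # Earliest deadline: the packet with the largest ratio is closest to expiry.
    urgent = max(rq, key=lambda p: deadline_ratio(p, now))
    ed = deadline_ratio(urgent, now)
    if math.isclose(ed, threshold):
        rq.remove(urgent)          # about to expire: serve it now (assumed mapping)
        return "serve", urgent
    if ed > threshold:
        rq.remove(urgent)          # deadline exceeded: drop it
        return "drop", urgent
    pkt = min(rq, key=lambda p: p["priority"])
    rq.remove(pkt)                 # nothing is urgent: serve by priority (assumed mapping)
    return "serve", pkt
```

With this mapping a low-priority packet overtakes more important packets in the rest queue only when its ratio reaches the threshold, which is how the sketch limits starvation.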

Fig. 5. Workflow Diagram for Packet Buffering Module.

[Fig. 5 flowchart labels: Incoming packet from classification module; Check Priority (0 or 1 / 2 or 3); Is incoming packet present in Alert Queue (AQ)? / Is incoming packet present in Rest Queue (RQ)? (Yes / No); Compare its sequence no with incoming packet (<, ==, >); Duplicate packet; Is AQ prone to full? / Is RQ prone to full? (Yes / No); Find low priority packet from AQ / Find low priority packet from RQ; Compare its priority with incoming packet (<, >); Store incoming packet into AQ; Store this incoming packet into RQ; 1. Delete low priority packet from AQ and store into RQ, 2. Store incoming packet into AQ; Delete low priority packet from RQ and drop it; Drop incoming packet.]
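A two-queue buffer along the lines of Fig. 5 might look like the Python sketch below. Several details are assumptions made here because the flowchart is only partially recoverable: a packet counts as a duplicate when a queued packet has the same sequence number, each queue has a fixed `capacity`, and a packet demoted from the Alert Queue is dropped when the Rest Queue has no room. The function name and the returned status strings are hypothetical.

```python
def buffer_packet(pkt, aq, rq, capacity=4):
    """Place pkt into the Alert Queue (aq) or Rest Queue (rq) and report the outcome.

    Priorities 0 and 1 go to aq, priorities 2 and 3 go to rq; a numerically
    smaller priority value means a more important packet.
    """
    high = pkt["priority"] in (0, 1)
    queue = aq if high else rq
    if any(q["seq"] == pkt["seq"] for q in queue):
        return "duplicate_dropped"
    if len(queue) < capacity:
        queue.append(pkt)
        return "stored_aq" if high else "stored_rq"
    # Queue is full: compare the incoming packet with the least important queued one.
    lowest = max(queue, key=lambda q: q["priority"])
    if pkt["priority"] >= lowest["priority"]:
        if high and len(rq) < capacity:
            rq.append(pkt)         # no room in aq, fall back to rq
            return "stored_rq"
        return "incoming_dropped"
    queue.remove(lowest)
    queue.append(pkt)
    if high and len(rq) < capacity:
        rq.append(lowest)          # demote the displaced packet from aq to rq
        return "demoted_lowest_to_rq"
    return "lowest_dropped"
```

The design keeps high-priority packets from being dropped as long as either queue has room, at the cost of displacing the least important queued packet.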

Fig. 6. Workflow Diagram for Packet Scheduling Module.

D. Packet Dropping Module: All newly arriving packets are either stored in a buffer or dropped. Random dropping is dangerous and needs to be detected. Instead of dropping packets at random, our proposed schema drops low-priority and less utilized packets with some probability when congestion occurs. It may discard a burst flow before the queue overflows, so that space remains for other flows. To reduce the retransmission ratio, duplicate packets are dropped and a notification of this is advertised to each sender. At scheduling time, low-priority packets whose deadline has been exceeded are dropped.

V. CONCLUSION

The proposed protocol provides genetic rule based packet classification with deadline-based prioritized data handling to address packet handling issues. Binary decision tree based packet classification is effective since it reduces space and time requirements. The packet queuing and dropping modules take care of high-priority or critical packets at the time of congestion. Future work can focus on designing protocols with reduced resource complexity and improved performance.


[Fig. 6 flowchart labels: Incoming packet from Buffering Module; Is Alert Queue (AQ) empty? (Yes / No); Find and delete the high priority packet from AQ and forward it into Service Queue (SQ) of scheduler module; Is Rest Queue (RQ) empty? (Yes / No); Wait; 1. Calculate deadline of each packet Pi: Di = (Current time of Pi - Arrival time of Pi) / expected wait time; 2. Find Earliest Deadline: ED = max {Di}, i = 1 to n; Compare ED with Threshold value (Th), with branches ED > Th, ED = Th, and ED < Th; Find and delete this deadline exceeded packet from RQ and drop it; Find and delete this low priority packet from RQ and forward this packet into SQ; Find and delete the high priority packet from RQ and forward this packet into SQ.]
