A Reinforcement Learning based Routing Protocol with QoS Support for Biomedical Sensor Networks
Authors:
Xuedong Liang
Ilangko Balasingham
Sang-Seon Byun
The Interventional Center, Rikshospitalet University Hospital, Oslo, Norway N-0027
Dept. of Informatics, University of Oslo, Oslo, Norway N-0316
Dept. of Electronics and Telecommunications, Norwegian University of Science and Technology, Trondheim, Norway N-7491
Presented by: Iffat Anjum (Roll: 16), Nazia Alam (Roll: 28), 15th Batch.
Date: 26th April, 2012
Green Networking Research Group, Dept. of Computer Science and Engineering, University of Dhaka
Contents
• Contribution
• Problem Definition
  • Related works
  • Biomedical Sensor Networks
  • Reinforcement Learning
  • Q-learning
• Design of RL-QRP
  • Local Information Exchange
  • Q-learning Implementation
  • Learning-Based Routing Algorithm
• Performance Evaluation
• Limitation
Contributions
• In RL-QRP, optimal routing policies are found through experience and rewards, without the need to maintain precise network state information.
• Because it takes into account the impact of network traffic load and sensor node mobility on network performance, RL-QRP fits well in dynamic environments.
• RL-QRP performs well on a number of QoS metrics and on energy efficiency in various medical scenarios.
Problem Definition
The main function of a biomedical sensor network is to ensure that data packets are sensed and delivered to the medical server reliably and efficiently.
Related works
A number of routing protocols with QoS support have recently been proposed for wireless sensor networks:
• INSIGNIA, designed for mobile ad hoc networks, is a framework based on in-band signaling and soft-state resource management. It is not suitable for biomedical sensor networks because its resource reservation scheme is inflexible.
• CEDAR is a core-extraction distributed ad hoc routing algorithm for QoS routing in ad hoc network environments. However, the core can become the bottleneck of the network, and selecting and maintaining the core consumes extra network resources.
• AdaR adaptively learns an optimal strategy to achieve multiple optimization goals. However, it does not define how to map diverse QoS requirements into concrete Q-values.
Most of the previous routing protocols with QoS support suffer from:
• Heavy communication overhead.
• The computational burden of complicated algorithms.
Biomedical Sensor Networks
A biomedical sensor network is deployed in a certain area. Sensor nodes are implanted in or attached to the patient's body, and sink nodes are deployed at fixed positions.
Biomedical sensor networks have the following features:
• Dynamic network topology: a sensor node may leave, join, or die (run out of battery).
• Time-varying wireless channel with serious electrical interference.
• Each sensor node has its own QoS requirements, duty cycle, packet arrival rate, and forwarding willingness.
Biomedical Sensor Networks
• Each mobile node is aware of its own geographic location, obtained either from the Global Positioning System (GPS) or from distributed localization services.
• Each node is aware of its immediate neighbors (those within its radio range) and their locations through beacon exchanges.
• Mobile sensor nodes move according to the Random Waypoint Mobility Model (RWMM).
• This paper focuses on two types of QoS requirements:
  • Packet delivery ratio.
  • End-to-end delay.
Reinforcement Learning
Figure: A reinforcement learning model.
Reinforcement learning is based on the Markov Decision Process (MDP).
An MDP models an agent with a tuple (S, A, P, R):
• S is the set of states,
• A is the set of actions,
• P(s' | s, a) is the transition model, which gives the probability of entering state s' after executing action a at state s,
• R(s, a, s') is the reward obtained when the agent executes a at s and enters s'.
The goal of solving an MDP is to find an optimal policy π : S → A that maps states to actions such that the cumulative reward is maximized.
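To make the tuple (S, A, P, R) concrete, the sketch below encodes a small hypothetical MDP in Python and recovers a greedy policy with value iteration. The state names, actions, probabilities, rewards, the discount factor `gamma`, and the choice of value iteration are all assumptions made for illustration; none of them come from the paper.

```python
# Hypothetical two-state MDP; every name and number here is an illustrative assumption.
S = ["far", "near"]          # set of states
A = ["forward", "wait"]      # set of actions

# P[(s, a)] maps each next state s' to the probability P(s' | s, a).
P = {
    ("far", "forward"): {"near": 0.8, "far": 0.2},
    ("far", "wait"): {"far": 1.0},
    ("near", "forward"): {"near": 1.0},
    ("near", "wait"): {"near": 1.0},
}

def R(s, a, s_next):
    """Reward obtained when the agent executes a at s and enters s_next."""
    return 1.0 if (s == "far" and s_next == "near") else 0.0

def value_iteration(gamma=0.9, sweeps=100):
    """Return (policy, V): a greedy policy pi: S -> A and the state values."""
    V = {s: 0.0 for s in S}

    def q(s, a):
        # Expected discounted return of taking a at s, then following V.
        return sum(p * (R(s, a, s2) + gamma * V[s2]) for s2, p in P[(s, a)].items())

    for _ in range(sweeps):
        V = {s: max(q(s, a) for a in A) for s in S}
    policy = {s: max(A, key=lambda a: q(s, a)) for s in S}
    return policy, V

policy, V = value_iteration()
print(policy["far"])  # forward
```

In this toy model the only rewarded transition is from "far" to "near", so the policy that maximizes cumulative reward chooses "forward" in state "far".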
Q-learning
Q-learning is a model-free method that computes the function Q(s, a) to find an optimal decision policy.
Each time an action a is executed, the agent receives an immediate reward r from the environment.
• Q(s, a) denotes the quality of action a at state s; α is the learning rate, and γ models the weight of future rewards.
• Q(s', a') is the expected future reward at state s' when taking action a'.
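The symbols above fit the textbook Q-learning rule, Q(s, a) ← Q(s, a) + α [r + γ max over a' of Q(s', a') − Q(s, a)]. The following is a minimal sketch of that standard rule, not necessarily the exact variant used in RL-QRP; the function name `q_update`, the dictionary-based table, and the default values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, next_actions, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update to Q[(s, a)] and return the new value."""
    # Best estimated future reward reachable from s_next (0.0 if no actions exist).
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    # Move Q(s, a) toward the target r + gamma * best_next at learning rate alpha.
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
v1 = q_update(Q, "s0", "a0", r=1.0, s_next="s1", next_actions=["a0", "a1"])
Q[("s1", "a1")] = 2.0   # pretend a good action at s1 has since been learned
v2 = q_update(Q, "s0", "a0", r=1.0, s_next="s1", next_actions=["a0", "a1"])
```

The second update is larger than the first because the estimate for the next state now contributes to the target through γ.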
Design of RL-QRP
QoS route computation and selection are based on a distributed reinforcement learning algorithm.
Each sensor node calculates its routes independently.
The Q-value Q(s, a) stands for the quality (the progress made) of action a at state s.
Figure: Reinforcement learning based routing model.
QoS Support Consideration
Each node checks the QoS requirement of the data packet and its own Q-value table. The node then checks whether it can make a certain amount of progress for the data packet. If so, it forwards the packet to the neighboring node with the highest Q-value; if not, the packet is dropped or sent with "best effort".
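The forwarding decision described above can be sketched as follows. The slides do not say how a node decides whether it "can make progress", so comparing the best neighbor's Q-value against a threshold `required_progress` is an assumption, as are the function name, the `allow_best_effort` flag, and the returned mode labels.

```python
def choose_next_hop(q_table, neighbors, required_progress, allow_best_effort=True):
    """Return (next_hop, mode) for one packet; mode is 'qos', 'best_effort' or 'drop'.

    q_table maps a neighbor id to the Q-value of forwarding to it (assumed layout).
    """
    if not neighbors:
        return None, "drop"
    # Candidate next hop: the neighbor with the highest Q-value.
    best = max(neighbors, key=lambda n: q_table.get(n, 0.0))
    if q_table.get(best, 0.0) >= required_progress:
        return best, "qos"            # QoS requirement can be met
    if allow_best_effort:
        return best, "best_effort"    # forward anyway, without a QoS guarantee
    return None, "drop"

q_table = {"n1": 0.2, "n2": 0.7}
decision = choose_next_hop(q_table, ["n1", "n2"], required_progress=0.5)
```

With these illustrative values the packet is forwarded to "n2" in QoS mode; raising `required_progress` above 0.7 would turn the same call into a best-effort forward or a drop.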
Local Information Exchange
Local information is exchanged with 1-hop neighboring sensor nodes using beacons. The exchange consists of:
• Position information exchange.
• Q-values exchange.
Q-learning Implementation
• State: S = {si}, i = 1, 2, ..., N, where N is the number of sensor nodes. Each node is a state s ∈ S.
• Action: A = {a(sj | si)}, si, sj ∈ S. Executing a(sj | si) means that a packet is forwarded from state si to sj, provided si and sj are within each other's communication range.
• Reward function: Rn = prg(Pn), where Rn is the reward for executing the action and describes the progress made in forwarding data packet Pn.
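As an illustration of the state and action definitions above, the sketch below derives the valid actions a(sj | si) from node positions. The node coordinates, the single shared `radio_range`, and the function name `build_actions` are hypothetical; the progress function prg is not specified in the slides and is therefore not modeled here.

```python
import math

def build_actions(positions, radio_range):
    """Return {si: [sj, ...]}: the nodes sj that si may forward a packet to.

    Assumes every node has the same radio range, so 'within each other's
    communication range' reduces to a symmetric distance check.
    """
    actions = {}
    for si, pi in positions.items():
        actions[si] = [
            sj for sj, pj in positions.items()
            if sj != si and math.dist(pi, pj) <= radio_range
        ]
    return actions

# Each node is a state; coordinates are in arbitrary units.
positions = {"s1": (0, 0), "s2": (3, 4), "s3": (10, 0)}
actions = build_actions(positions, radio_range=6.0)
print(actions)  # {'s1': ['s2'], 's2': ['s1'], 's3': []}
```

Node s3 is out of range of both other nodes, so it has no forwarding action in this example.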
Q-learning Implementation
Tsisj is the experienced delay between nodes si and sj.
The reward of an action is implemented using an ACK scheme: when node sj receives a packet from node si, sj acknowledges the packet by sending an ACK packet.
By calculating the 1-hop delay and the ratio of the number of ACKs received to the number of data packets sent, si can estimate the link properties between si and sj.
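The ACK-based link estimation can be sketched as below. The class name `LinkStats` and its methods are hypothetical, and measuring the 1-hop delay as the time between sending a packet and receiving its ACK is an assumption; the slides do not state how the delay is timed or how the estimates enter the reward.

```python
from dataclasses import dataclass, field

@dataclass
class LinkStats:
    """Per-neighbor statistics kept by node si for the link si -> sj."""
    sent: int = 0
    acked: int = 0
    delays: list = field(default_factory=list)

    def on_send(self):
        self.sent += 1

    def on_ack(self, send_time, ack_time):
        # Record one acknowledged packet and its experienced delay.
        self.acked += 1
        self.delays.append(ack_time - send_time)

    def ack_ratio(self):
        """Number of ACKs received divided by number of data packets sent."""
        return self.acked / self.sent if self.sent else 0.0

    def mean_delay(self):
        """Average experienced delay, or None if no ACK has arrived yet."""
        return sum(self.delays) / len(self.delays) if self.delays else None

link = LinkStats()
for _ in range(4):          # four data packets sent to sj
    link.on_send()
for sent_at, acked_at in [(0.0, 0.010), (1.0, 1.020), (2.0, 2.030)]:
    link.on_ack(sent_at, acked_at)   # three of them are acknowledged
```

Here `link.ack_ratio()` is 0.75 and `link.mean_delay()` is about 0.02 seconds; a node could keep one such record per neighbor.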
Learning-Based Routing Algorithm
Performance Evaluation
Fig: Average end-to-end delay to the sink node.
Fig: Average packet delivery ratio to the sink node.
Fig: The impact of node mobility on average packet delivery ratio.
Fig: The impact of network traffic load on average end-to-end delay.
Limitation
• RL-QRP neglects many common QoS requirements, such as network lifetime, throughput, and connectivity.
• Each sensor node does not consider the interactions between itself and other sensor nodes; this approach is not sufficient to achieve global optimization.
• Sensor nodes should consider their interactions with both the environment and the other nodes in the network, and cooperatively calculate the QoS routes within a multi-agent reinforcement learning (MaRL) framework.
THANK YOU