[IEEE 2010 Second International Conference on Network Applications Protocols and Services (NETAPPS) - Alor Setar, Kedah, Malaysia (2010.09.22-2010.09.23)] 2010 Second International

Enhancing QoS Protection in MPLS Networks

Sabri M. Hanshi, Wajdi Al-Khateeb International Islamic University Malaysia Department of Electrical and Computer

Kullayiah of Engineering Jalan Sungai Pusu, Gombak, Selangor, Malaysia

Abstract- MPLS recovery mechanisms are increasing in popularity because they can provide fast restoration and high QoS assurance. QoS is important for interactive voice and video applications and for specific clients. However, a link failure incurs delay and packet loss for the traffic passing through the failed link, so the network has to restore that traffic by switching it to an alternative path. This study applies QoS objectives to redirect the protected traffic at an acceptable level of quality, comparable to that before the failure took place. The proposed scheme sets up more than one alternative path in advance in order to achieve fast rerouting, and the selection criteria are the required bandwidth and the end-to-end delay. We also propose a traffic splitter that splits the protected traffic after a failure when the available bandwidth in a single alternative path is not enough to deliver the traffic. Finally, the alternative path selection is updated based on current network resource availability. To verify the efficiency of the proposed algorithm, the MPLS network simulator MNS-2 was used as the test platform.

I. INTRODUCTION

Internet Protocol (IP) networks mainly deliver best-effort services such as web browsing and e-mail. However, once real-time and mission-critical services that require diverse degrees of QoS guarantees are added, the restoration time of the Interior Gateway Protocol (IGP) becomes a problem that has to be addressed. When a link or node failure occurs, the IGP needs several seconds or even minutes to recalculate the shortest paths and bypass the failure.

A fault-tolerant, or survivable, network must maintain an acceptable level of service performance during network failures. It therefore relies on software or hardware support to detect a failure quickly and switch to a pre-defined backup path or link. Backup paths can be implemented at multiple layers, including wavelength-division multiplexing (WDM), Synchronous Optical Network/Synchronous Digital Hierarchy (SONET/SDH), and MPLS.

Multi-Protocol Label Switching (MPLS) based recovery mechanisms offer faster recovery than IGP protocols and will also be useful if IP networks are to evolve beyond best-effort services. In addition to achieving fast recovery, MPLS-based recovery has other key characteristics that make it attractive to service providers [1]:

• Efficient and flexible use of resources.

• Restoration times that match the requirements of the user's application can be achieved.

• Increasing network reliability and availability.

• Protection paths can be selected based on the traffic service.

• Allows protection of end-to-end paths as well as path segments.

• Takes into account recovery actions of lower layers.

• Minimizes loss of data and packet reordering during recovery operation.

• Minimizes state and signaling overhead.

MPLS-based recovery is becoming more significant for several reasons [1]. First, it improves network reliability by enabling a faster response to faults than is possible with network-layer methods alone, while still providing the visibility of the network afforded by the network layer. Second, traditional IP rerouting may be too slow for a core MPLS network, which requires recovery times shorter than those achieved by IP routing protocols. Furthermore, MPLS-based recovery schemes can support the traffic engineering goal of optimal use of resources, and they can provide bandwidth protection for specific flows such as VoIP. However, network operators want the fastest, most stable and best protection mechanism that can be provided at a reasonable cost. MPLS recovery mechanisms should therefore give operators the flexibility to select the specific types of traffic that are protected, and thus more control over that trade-off. In addition, MPLS-based recovery can provide different levels of protection for different classes of service, according to their service requirements [1].

II. PROTECTION AND RESTORATION IN MPLS

Protection in MPLS can be categorized by the scope of recovery into local and global repair. A local recovery scheme protects a primary path segment in the vicinity of the fault against a single link or node failure, whereas a global recovery scheme protects an end-to-end primary path against any link or node failure. To avoid the delay of propagating the failure notification signal along the working path, a local recovery scheme provides a backup path as close as possible to the point of failure. Its main advantage is that it shortens the service disruption during a failure.

2010 Second International Conference on Network Applications, Protocols and Services

978-0-7695-4177-8/10 $26.00 © 2010 IEEE

DOI 10.1109/NETAPPS.2010.24


The authors of [2] evaluated the performance of MPLS recovery schemes by testing three recovery models: the Makam, Haskin and simple-dynamic models.

In the Haskin model, a reversed backup path is established between the PML (Protection Merging LSR) and the PIL (Protection Ingress LSR) in the reverse direction of the working path. This first part of the backup path starts at the LSR closest to the failed link. The second part is established between the PIL and the PML along a path that is disjoint from the working path. The two parts are linked to create the backup path. In case of failure, the node that detects the failure redirects incoming traffic to the backup path.

In Makam's model, a backup path is established between the PIL and the PML along a transmission path that is disjoint from the working path. In case of failure, the node that detects the failure transmits a fault indication signal (FIS) towards its PIL (the ingress LSR). Once the ingress LSR receives this notification, it switches the affected traffic over to the backup path [2].

In the simple-dynamic scheme, once a node detects a failure, a new backup path to the PML is established along the shortest path that does not use any working-path resources.

Performance evaluation of the above schemes

The metrics used in [2] to compare the Haskin, Makam and simple-dynamic schemes are packet loss, packet reordering and resource utilization. The advantage of the Haskin scheme is that packet loss during a node failure is very small. On the other hand, Haskin's scheme introduces packet reordering when the affected traffic is switched back to the working path after the failed node or link comes back up.

The Makam scheme has no problem with packet reordering during failure. However, it introduces packet loss because the PIL cannot execute the protection switching until it receives the FIS from the node that has detected the failure. In addition, the number of lost packets increases in proportion to the distance between the PIL and the failed LSR or link.
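As a rough illustration of why this loss grows with distance, the following Python sketch estimates the packets lost before the PIL switches over. The linear model, the function name and all parameter values are assumptions made here for illustration; they are not taken from [2].

```python
def makam_packet_loss(rate_pps, detect_s, hops_to_pil, per_hop_delay_s):
    """Estimate packets lost before the PIL switches to the backup path.

    Assumed first-order model: traffic keeps flowing towards the failed
    segment during failure detection plus the time the FIS needs to
    travel hop by hop back to the PIL.
    """
    notify_s = hops_to_pil * per_hop_delay_s
    return rate_pps * (detect_s + notify_s)


# Hypothetical numbers: 1000 packets/s, 10 ms detection, 2 ms per hop.
for hops in (1, 4, 8):
    print(hops, makam_packet_loss(1000, 0.010, hops, 0.002))
```

Under this assumed model the loss is a fixed detection term plus a term that is linear in the hop count between the PIL and the point of failure.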

The simple-dynamic scheme also has no problem with packet reordering, but it introduces packet loss during switchover to the backup path, because an LSR that detects a failure cannot perform protection switching until it has created a backup path. Furthermore, the amount of packet loss decreases in proportion to the distance to the failed node or link.

III. THE FAILURE IMPACT

The guaranteed quality of service (QoS) of the traffic is a crucial aspect of evaluating the failure impact. Recovery time, packet loss and packet reordering are the most common forms of QoS impact and are discussed in many works, such as [3], [4] and [5]. The following table presents the level of protection and the corresponding recovery time for each class.

TABLE I Level of protection of different traffic classes

Failure Recovery Time in MPLS networks

In MPLS-based networks, the usual method of recovering from a failure is to use an alternative path that is disjoint from the working path. This path can be established and used in different ways: backup paths can be pre-established (pre-routed and pre-signalled) or established dynamically after the failure. Furthermore, resources can be allocated in advance, as proposed in some research. The complete recovery cycle starts when a failure is detected and finishes when, after the failure has been fully repaired, the traffic is returned to the initial working path (the normalization process).

IV. QoS IN MPLS PROTECTION

QoS routing algorithms can be enhanced by adding new objectives so that they compute suitable working and backup paths at the same time, and by selecting the most suitable backup scheme once the working path has been assigned.

In [6], a backup path decision module was introduced to choose the most appropriate backup scheme based on the traffic class. Although that work makes it possible to combine different backup techniques, the module takes no decision about selecting the working path. It also involves developing a module that is very costly in terms of computing time. Therefore, new objectives have to be added to the QoS routing algorithms so that they compute both the working and backup paths while reducing the failure impact and the failure probability. The two-step routing algorithms proposed in [7] aim to select the backup path with the lower probability of failure and sometimes allow better resource consumption. The main idea is to establish the working path and the backup path in two steps: first calculate the working path (the shortest path meeting the QoS constraints), then calculate the backup path (the shortest disjoint path). Once a failure has been detected, the affected traffic is rerouted to the pre-established backup path. The proposed model then computes a new backup path based on the currently available network resources; if the newly established path is better than the first backup path, the affected traffic is rerouted again from the first backup path to the new one.
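The two-step idea (working path first, then the shortest disjoint backup path) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the graph representation, the function names and the use of link-disjointness are choices made here, and the QoS constraint check on the working path is omitted for brevity.

```python
import heapq


def shortest_path(graph, src, dst, banned_links=frozenset()):
    """Dijkstra over an adjacency dict {node: {neighbor: cost}}.

    Links listed in banned_links (each a frozenset({u, v})) are skipped.
    Returns the node list of the cheapest path, or None if unreachable.
    """
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, {}).items():
            if frozenset((u, v)) in banned_links:
                continue
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def two_step_paths(graph, src, dst):
    """Step 1: working path. Step 2: shortest link-disjoint backup path."""
    working = shortest_path(graph, src, dst)
    if working is None:
        return None, None
    used = frozenset(frozenset(link) for link in zip(working, working[1:]))
    backup = shortest_path(graph, src, dst, banned_links=used)
    return working, backup


# Hypothetical four-node topology with symmetric link costs.
graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
print(two_step_paths(graph, "A", "D"))
```

Computing the two paths one after the other is simple, but it can fail to find a disjoint pair that a joint computation would find; the sketch returns None for the backup path in that case.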

Level of protection    Recovery Time (TREC)
Very low               > 1 min
Low                    200 ms – 1 min
Medium                 50 ms – 200 ms
High                   20 ms – 50 ms
Very High              < 20 ms
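The mapping in Table I can be expressed as a small lookup. The Python sketch below is illustrative only: the function name is hypothetical, and the treatment of the exact boundary values (for example, whether 50 ms counts as High or Medium) is an assumption, since the table does not specify it.

```python
def protection_level(t_rec_ms):
    """Map a recovery time in milliseconds to the level in Table I.

    Boundary handling is an assumption: each range includes its lower
    bound, and exactly 1 min still counts as "Low".
    """
    if t_rec_ms < 20:
        return "Very High"
    if t_rec_ms < 50:
        return "High"
    if t_rec_ms < 200:
        return "Medium"
    if t_rec_ms <= 60_000:
        return "Low"
    return "Very low"


for t in (10, 30, 100, 1_000, 120_000):
    print(t, "ms ->", protection_level(t))
```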


MPLS-based multi-path protection frameworks have been proposed in various research works. A local fast reroute model was presented in [8], in which bypass tunnels are established to back up all protected paths rather than one particular protected LSP. That work used the max-flow min-cut theorem to establish all possible backup paths for each node. Figure 1 illustrates the setup steps for the bypass tunnel (BPT). When the LSP setup request arrives at LSR2, LSR2 forwards the request in one of three directions based on the setup request message. Therefore, LSR1 sets the bypass tunnel to LSR1, LSR2 and LSR3. When LSR1 receives the setup request from downstream, it assigns one of them. When a link or node failure is detected, the LSR has to specify the bypass tunnel to which the protected traffic of the affected LSPs is rerouted. First, the LSP with the highest priority is chosen from all the affected LSPs; then the bypass tunnel is selected to redirect its traffic based on its QoS constraints. If no residual bandwidth is available in any bypass tunnel, the LSR sends an FIS back to the ingress LSR.


Multi-path routing was presented in [9] in order to minimize the effect of failure. The proposed scheme combines the (k, n) Threshold Sharing Scheme (TSS) with multi-path routing. The main concern of that work is to guarantee continued network operation with no packet loss and no recovery delay, and with reasonable network resource utilization. Arriving packets are divided at the ingress LSR into n pieces, called shadows, any k of which can reconstruct the original message at the egress LSR. The shadows are sent over maximally disjoint LSPs obtained using multipath routing. Consequently, if a failure occurs in one of the LSPs, the reconstruction process at the egress router is not affected, no fault notification needs to be sent, and no backup path needs to be set up.
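To make the (k, n) property concrete, the Python sketch below uses Shamir's polynomial secret sharing, a well-known (k, n) threshold scheme. Whether [9] uses exactly this construction is not stated here, so the code should be read as an illustration of the threshold idea only; the function names and the choice of prime are assumptions.

```python
import random

PRIME = 2**61 - 1  # Mersenne prime; all arithmetic is modulo this value


def make_shadows(secret, k, n, rng=random):
    """Split an integer secret (< PRIME) into n shadows; any k suffice.

    The random module is used for brevity and is not suitable for
    cryptographic use.
    """
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    shadows = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shadows.append((x, y))
    return shadows


def reconstruct(shadows):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shadows):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shadows):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


shadows = make_shadows(123456789, k=3, n=5)
# Losing two of the five shadows (e.g. two failed LSPs) is tolerated.
print(reconstruct(shadows[:3]) == 123456789)
```

With n = 5 shadows sent over five LSPs and k = 3, the egress can still rebuild the message after up to n - k = 2 of the LSPs fail.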

The concept of QoS protection (QoSP) is proposed in [6]. The proposed mechanism aims to enhance current QoS routing. It is divided into modules in order to achieve scalability and transparency. Three modules are presented: the working path routing module (WRM), the backup path decision module (BDM) and the backup routing module (BRM). The main module is the BDM, in which QoSP selects the recovery model (global, local or reverse) most suitable for the protected traffic on this working path, based on QoS metrics. The result of this module is the best backup path protection method. The scheme uses three QoS metrics: packet loss, restoration time and resource utilization.

MPLS recovery schemes are used by large network operators to assure high QoS in case of failure by pre-establishing backup paths. Current networks use simple methods for calculating feasible backup paths. However, these paths may use the network resources inefficiently and block other traffic from being serviced, thereby reducing profitability. Therefore, this study introduces a novel approach that allows network operators to maximize their network QoS performance by locating alternative paths more efficiently.

V. THE PROPOSED SCHEME

The proposed QoS protection scheme aims to guarantee the QoS requirements of the protected traffic by redirecting the traffic flow to the best alternative path or paths, based on the resources available in the alternative paths. Several functions have been built to add QoS objectives to the protection scheme. Some of them collect network information, such as the required bandwidth, the number of available alternative paths and the unused bandwidth in all paths. Others make the alternative path decision; for example, the best path is selected based on the information collected by the functions above. We also propose a traffic splitter to handle the case in which the unused bandwidth of a single path is not enough to reroute the protected traffic at a suitable QoS level.

The proposed scheme provides several features that introduce QoS into the protection scheme. It establishes more than one alternative path in advance in order to achieve fast rerouting in case of failure, which reduces the recovery time and packet loss. Moreover, using more than one alternative path increases system availability and reliability. The scheme uses QoS parameters such as bandwidth and end-to-end delay to select the best alternative path, so the protected traffic is delivered over a path that meets its QoS requirements. The scheme also introduces the traffic splitter to overcome bandwidth unavailability in individual paths: in case of failure, if no single path has sufficient capacity, the protected traffic is split and distributed according to the bandwidth available in each path. Network information is recalculated periodically so that rerouting is based on the current state of the network. This function is important for coping with changes in network availability, as some resources become unavailable because of failures or new traffic passing through a path while others become available.
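One possible reading of "split and distributed according to the bandwidth availability in each path" is a proportional split, sketched below in Python. The proportional rule, the function name and the path names in the example are assumptions made for illustration; the text does not specify the exact splitting rule.

```python
def split_traffic(needed_bw, available_bw):
    """Split needed_bw across paths in proportion to their unused bandwidth.

    available_bw maps a path name to its unused bandwidth (same unit as
    needed_bw). Returns {path: share}, or None if the paths together
    cannot carry the traffic.
    """
    total = sum(available_bw.values())
    if total <= 0 or total < needed_bw:
        return None
    # Since needed_bw <= total, each share stays within that path's capacity.
    return {path: needed_bw * bw / total for path, bw in available_bw.items()}


# Hypothetical example: 0.9 Mbps of protected traffic, two alternative paths.
print(split_traffic(0.9, {"LSP2": 0.4, "LSP3": 0.8}))
```

A proportional split never assigns a path more than its unused bandwidth, which is why the only feasibility check needed is on the total.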

Figure 1 Alternative path selection in BPT

Figure 2 An example of establishing global, local and reverse backup


Figure 3 presents the flow diagram of the proposed backup path selection mechanism. The following lines describe the steps the proposed scheme takes to select a suitable path for the affected traffic.

1. The required bandwidth of the protected traffic is computed in order to obtain the needed bandwidth (needed_bw) that must be available in one or more alternative paths to reroute the protected traffic once a failure occurs ((2) in Figure 3).

2. The available bandwidth in each alternative path is calculated along the path, based on the available bandwidth estimation equation proposed by Strauss, Katabi and Kaashoek (2003).

3. At this point, the alternative paths whose available bandwidth is sufficient to reroute the protected traffic are identified (4). This bandwidth must be greater than the needed bandwidth of the protected traffic:

available bandwidth of the alternative path > needed_bw    (1)

Therefore, the number of alternative paths that meet the above condition is obtained.

4. If more than one alternative path meets this condition, go to step (6); otherwise go to step (9).

5. The latency, or end-to-end delay, is computed for each alternative path assigned in (4). The idea of this step is to offer a backup path with specifications close to those of the working path. In this step, the alternative path with the lower delay is assigned as the backup path. One advantage of choosing a path with lower delay is that it helps to avoid out-of-order packets at the destination egress LSR.

6. In case of failure, the protected traffic is redirected to the path assigned in (7).

7. The alternative path that provides the required bandwidth for the protected traffic is selected as the backup path for this traffic (11).

8. If the protected traffic stops being sent through the network (12), the proposed scheme stops working (18).

9. Backup path selection is updated periodically, and the selection parameters are recalculated based on the available resources. Therefore, the alternative path selected as the backup path is reassigned based on the current network state (13).
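The selection logic in the steps above can be sketched in Python as follows. This is a simplified reading, not the authors' implementation: the dictionary keys, the function name and the return values are hypothetical, and the fallback to splitting follows the traffic splitter described earlier in this section rather than a specific block of Figure 3.

```python
def select_backup(needed_bw, paths):
    """Pick backup path(s) for the protected traffic.

    paths: list of dicts with hypothetical keys 'name', 'avail_bw', 'delay'.
    Returns ('single', name) when one path suffices, ('split', [names])
    when the traffic has to be split, or ('none', None).
    """
    # Condition (1): available bandwidth must exceed the needed bandwidth.
    feasible = [p for p in paths if p["avail_bw"] > needed_bw]
    if feasible:
        # Among the feasible paths, prefer the lowest end-to-end delay.
        best = min(feasible, key=lambda p: p["delay"])
        return "single", best["name"]
    # No single path is enough: fall back to splitting if the total suffices.
    if sum(p["avail_bw"] for p in paths) >= needed_bw:
        return "split", [p["name"] for p in paths if p["avail_bw"] > 0]
    return "none", None


# Hypothetical alternative paths (bandwidth in Mbps, delay in ms).
paths = [
    {"name": "LSP1", "avail_bw": 0.3, "delay": 10},
    {"name": "LSP2", "avail_bw": 0.8, "delay": 25},
    {"name": "LSP3", "avail_bw": 0.9, "delay": 15},
]
print(select_backup(0.5, paths))
```

Calling the function again with refreshed 'avail_bw' and 'delay' values corresponds to the periodic update in step 9.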

VI. SIMULATION RESULTS

Figure 4 illustrates the network topology, which was implemented in the NS2 network simulator with MNS2 (the MPLS module) for NS2.26 [10] [11]. The topology is formed by 26 MPLS nodes (LSR1 to LSR26) and 2 IP nodes (n0 and n27), distributed as shown in Figure 4. The topology links the source to the destination through the MPLS domain. All links in the MPLS domain are duplex links with the same capacity of 1 Mbps, except the link connecting LSR9 and LSR10, whose capacity is 4nMbps. The links n0 - LSR1 and LSR10-n0 are assigned m Mbps.

Figure 4 The Tested Network Topology

Figure 3 The setup of the bypass tunnel in the branch router


Each node in the topology is an LSR, and drop-tail is used as the LSRs' queue management scheme. Since TCP has a built-in mechanism for rate adaptation, UDP is used in this study in order to observe the network effect accurately. Three traffic flows pass through the network (RT1, RT2 and SBT), of which RT1 is the protected traffic.

The simulation time is 15 seconds; the failure occurs at 4.0 seconds and the repair at 6.0 seconds. The simulation compares the proposed QoS protection scheme for the protected traffic RT1 with the fast reroute scheme using pre-established tunnels proposed by Lai et al. (2008), referred to as BPT. Two case studies have been designed to test both protection schemes, and the comparison is based on the packet loss and the receiving throughput of the protected traffic. Moreover, the network resource availability is changed during the failure time in order to observe how each protection scheme behaves when faced with these changes.

Network state changes during failure and only one path is available

Figure 5 illustrates the throughput of RT1 at the receiving point. At the failure point, the throughput fell for both schemes and quickly returned to its original level. However, because the SBT flow passes through LSP1 (1_11_12_13_14_15_16_17_18_10), the available capacity in this path became less than the capacity required to deliver the protected traffic at an acceptable QoS level. The receiving rate of the protected traffic fell in both schemes, but in the proposed scheme it returned to the original rate faster than in the BPT scheme. This was because the proposed model triggered the ingress node to redirect the affected traffic to another alternative path (LSP3) that had enough capacity, based on updated information. In contrast, BPT kept delivering the affected traffic over LSP1.

Figure 6 shows that the packet loss of the protected traffic rises as a function of the distance of the failure from the ingress LSR. The packet losses of BPT protection are higher than those of the proposed method. The reason is that, when network resource availability changes, the proposed method switches the protected traffic over to another path that has enough capacity.

VII. CONCLUSION

When a failure occurs, the network has to restore the traffic by switching the affected traffic to another path. In this study, QoS objectives are therefore incorporated into the recovery scheme so that the protected traffic is redirected with the same quality as before the failure took place. To support high-quality service, several requirements must be met, such as sufficient bandwidth, fast rerouting mechanisms, and rerouting based on updated network information. The proposed QoS scheme serves the traffic based on information about network resources; alternative path decisions are therefore made according to changes in network resource availability at failure time. The proposed protection scheme makes a significant contribution to the field of recovery schemes in terms of its alternative path decision parameters and its rerouting method.

REFERENCES

[1] Paul Meyers, Natalie Degrande, Sven Van den Bosc. (2009) Alcatel-Lucent Telecom Review. [Online]. http://www1.alcatel-lucent.com

[2] V. Sharma, Metanoia, F. Hellstrand. Framework for Multi-Protocol Label Switching (MPLS)-based Recovery.

[3] Gaeil Ahn & Woojk Chun, "Simulator for MPLS path restoration and performance evaluation," in ATM (ICATM 2001) and High Speed Intelligent Internet Symposium, 2001. Joint 4th IEEE International Conference on, Seoul, South Korea, 2001, pp. 32-36.

[4] Changcheng Huang, Vishal Sharma, Ken Owens, Srinivas Makam, "Building Reliable MPLS Networks Using a Path Protection Mechanism," IEEE Communications Magazine, pp. 156-162, Mar. 2002.

[5] D.Adami, C.Callegari, D.Ceccarelli, S.Giordano, M.Pagano, "Design and development of MPLS-based recovery strategies in NS2," in Global Telecommunications Conference, 2006. GLOBECOM '06. IEEE, Nov. 27 -2006, pp. 1-5.

[6] Dimitry Haskin, Ram Krishnan, "A Method for Setting an Alternative Label Switched Paths to Handle Fast Reroute," draft-haskin-mpls-fast-reroute-05, 2001.

[7] Jose L. Marzo, Eusebi Calle, Caterina Scoglio and Tricha Anjali, "QoS On-Line Routing and MPLS Multilevel Protection: a Survey," IEEE Communications Magazine, October 2003.

[8] Calle, Eusebi, "Enhanced fault recovery methods for protected traffic services in GMPLS networks," thesis, 2004.

[9] Wei Kuang Lai, Zhen Chang Zheng, Chen-Da Tsai, "Fast reroute with pre-established bypass tunnel in MPLS," Computer Communications 31, pp. 1660-1671, 2008.

Figure 5 Throughput at receiving point

Figure 6 Packet losses of RT1 as a function of distance from ingress LSR


[10] Sahel Alouneh, Anjali Agarwal, Abdeslam En-Nouaary, "A Novel Approach for Fault Tolerance in MPLS Networks," in Innovations in Information Technology, conference, 2006, pp. 1-5.

[11] Gaeil Ahn & Woojk Chun, "Design and Implementation of MPLS network simulator (MNS) supporting QoS," in 14th International conference on Information networking, july 2001, pp. 694-699.

[12] Gaeil Ahn, Woojk Chun, "Overview of MPLS network simulator design and implementation," in , 2001.

[13] Jacob Strauss, Dina Katabi, Frans Kaashoek, "A Measurement Study of Available Bandwidth Estimation tools," 2003.
