Upload
dangbao
View
225
Download
1
Embed Size (px)
Citation preview
NEXT GENERATION NETWORK (NGN)
AVAILABILITY & RESILIENCE RESEARCH
Progress Report No. 1
The University of Canterbury Team 28 November 2005
Progress Report
Canterbury Team has been working on the NGN resiliency research project and would
like to present our primary achievements and proposed solutions according to the
promised outputs in the proposal.
Output1 Watching Brief
See Appendix 1, which list the selected papers and reports related to NGN resiliency
research with Canterbury team’s brief comments. We strongly suggest that those
papers can be modified and applied to NGN project, even for the future NGN
projects.
Output 2 Simulator Review
We propose to develop a NGN simulator by using a network simulation tool with a
possibility of gradual adding NGN elements and functionality. The purpose of the
simulator is to make possible an analysis of requirements for modelling and
evaluating resiliency schemes and service availability in NGN.
The problem complexity requires a higher level abstraction approach without the
actual implementation of detailed elements and protocols. Only simplified protocols
are used in the simulation network, and only the functionalities needed for simulating
the NGN resiliency are implemented. We proposed that a simplified network model
should include a PSTN emulator, a SIP emulator, a Softswitch and the appropriate
Gateways (e.g., Media Gateway, Signaling Gateway) between the PSTN network and
IP backbone network; MPLS enabled core network with RSVP-TE signaling protocol;
VoIP Protocols Stacks such as SIP, SIGTRAN, H323, H.248/MEGACO, MGCP,
NCS, RTP/RTCP will to be added gradually to the NGN simulation.
We are doing and will continue our simulation work in the following steps,
Step 1 Simulator Selection
For the proper NGN simulator, we are doing a survey on the most popular simulators
(e.g., Opnet, NS2, OMNet++, Jsim, Qualnet ) at this moment, our purpose is to
identify their strengths, weakness and also explore their programming potentials to
further develop resiliency modelling and VoIP in NGN scenarios.
So far, Canterbury team are doing a simple simulation, which is dedicated to scenarios
with failure/recovery conditions. The analysis is performed in terms of MPLS means
in Opnet. It is worth to observe behaviors of the traffic in failure conditions with
different scenarios i.e. with and without protective features applies. In addition, we
already found that the performance of recovery mechanism is connected with
topology (e.g., length of shortest and 2nd shortest paths, number of disjoint paths etc.)
as well as MPLS configuration (signaling timers, static and dynamic LSPs, location of
switching point reacted to the failure path, node and link).
Here we show a glimpse of Opnet functionality and abilities, which are suited to NGN
resiliency project.
1. Genetic topology with MPLS enabled IP routers setup
The basic scenario involves configuration of simple core network with attached end
nodes, the core nodes is enabled with MPLS, i.e., the IP routers support MPLS
functionality. To indicate MPLS core, the routers are marked with icons relevant for
LERs (node_0, node_1) and LSRs (node_2, node_3, node_4) as Figure 1.
Figure 1 MPLS setup with dynamic LSPs configured
2. MPLS parameters
Several general MPLS parameters need to be configured within MPLS configuration
object. Fundamentally, they involve definitions of FECs and traffic trunk profiles in Figure
2.
Figure 2 MPLS configuration object attributes
3. Action upon failure conditions
Network failure situations cause special conditions for the network to cope with.
The failure and recovery side effects become very dangerous for traffic streams
when there is no other mechanism than simple routing to handle them. Therefore,
MPLS feature of resiliency engineering is exposed here, as it provides great
advantage and promise for proper traffic handling upon failure.
In order to investigate network performance in failure conditions, the
Failure/Recovery config object in Opnet is used to introduce link breakdown. In
order to show effects of failure, voice application performance will be analyzed. We
proposed that there will be two scenarios compared i.e. with OSPF routing, another
with voice traffic pushed to flow explicit way and with MPLS recovery by backup
paths employment.
For example , Link node_1 node_2 fails at 480 s of simulation time. (The
configuration is shown in Figure 3
Figure 3. Failure/Recovery configuration for simple MPLS model scenario
The failing link is contained within the one of the shortest path chosen by
OSPF. However it is not predictable which flows go through each of the two shortest
paths. Thus it is relevant to make MPLS forward traffic of interest on the same
path initially. This practice provides the same initial conditions. When failure occurs
OSPF uses routing to redirect traffic. The second scenario uses established LSP to
forward voice and when the link on the path fails, the LSP uses backup LSP
configured to take over the traffic.
Within the MPLS core there are two static LSPs setup – one is to ensure that the
traffic is directed through the failing shortest path links (node_0 node_2 node_1).
Setup is shown in Figure 4.
a) b)
Figure 4 Failure recovery scenario setup:
a) LSPs direct traffic through the failing shortest path links,
b) MPLS recovery setup acting upon failures by means of backup LSPs,
Additionally, Opnet also provide function on. Fast Reroute Status within its Recovery
Parameters (Figure 5, 6).
Figure 5 Router with MPLS interface configuration for local bypass LSP
Figure 6. Primary LSP configuration for fast reroute protection
The main results related to resiliency need to be measured is the recovery time in
failure condition, it can be divided two factors, i.e., LSP recovery time and LSP
Traffic reroute time.
• LSP recovery time (sec) - the time taken to recover the LSP through a
new route. It expresses the difference between time when LSP failed and the
time when the ingress node was able to reroute the LSP through a different
route. This scenario takes place in cases where there is no alternative local
tunnel to take over the traffic and the conditions allow the LSP to recover.
• Traffic reroute time (sec) - the time taken to switch traffic away from
failed LSP. It is the difference between time when LSP failed and time when the
ingress node or the Point of Local Recovery (PLR), switches the traffic away
from failed LSP. This scenario occurs when there is ingress backup or local
tunnel LSPs established or when the main LSP cannot be recovered.
So far, we found that resiliency is a crucial issue providing reliability in failure
conditions and MPLS provides means for protecting not only single nodes or links but
also full paths. All these scenarios can be implemented in OPNET modeller.
Step 2: Topology Investigation
After step1, we will use the selected simulator to perform specific work on Telecom
NGN case. In this step, based on the given Telecom topology (e.g., 9 tier one nodes,
29 tier two nodes, existing and possible new links), different topologies will be
generated and studied and their outcomes will be analyzed and evaluated according to
the following suggested measure metrics:
• Average node degree, it is a well-known and common factor used when
evaluating topologies, which characterizes how densely a topology is
connected and also provides information regarding equipment complexity and
cost.
• Number of shortest paths, which is used for MPLS routing computation
• Number of 2nd shortest paths, which plays a key role for protection
mechanisms and recovery/failure conditions in MPLS.
• Length of 2nd shortest path, which influences the capabilities for MPLS
traffic engineering and alternative route selection.
• Number of node- and link-disjoint paths, which can provide estimates for
protection mechanism planning.
Based on the above metrics, we will compare different possible topologies and
develop a new solution approach to find the optimal topology for the Telecom NGN,
considering different node and link failure scenarios (single and multiple failures).
Step 3: Traffic Configuration and Behavior Study
Service availability behavior analysis needs examination of current and future traffic
patterns. The base scenario involves two traffic patterns that exhibit different behavior
such as VoIP and HTTP (or an application suggested by Telecom/Alcatel). The
additional background traffic of best effort type can be added in further scenarios. The
simulation will be carried out with realistic traffic matrix and distribution gathered
through existing network measurements, if available.
In addition, the traffic start delay provides time for convergence of routing and
forwarding tables at routers. It should be noted that the convergence time depends on
the routing algorithm and traffic distribution. Therefore we will study VoIP end-to-end
delay, packet loss and jitter behaviors for varying different traffic distributions and
routing algorithms such as OSPF.
Step 4: MPLS-TE Recovery Mechanisms and Signaling Protocols
We will use OPNET to set up MPLS elements (LER, LSR, LSP), parameters (FECs)
and related attributes like traffic engineering in the core network based on Telecom
case. MPLS-TE is armed with traffic engineering and multiple protection and
restoration schemes such as protection switching at ingress, local rerouting and
dynamic LSP recovery, which provide reliability in failure conditions and its repair
coverage may comprise nodes and links as well as full paths.
The performance of the recovery mechanism is related to topology-related factors
such as the length of shortest and 2nd shortest paths, number of disjoint paths etc. The
main objective is to model and investigate MPLS fast restoration potential and
performance from the following two perspectives:
• MPLS recovery mechanisms, which are crucial internal components of VoIP
service availability practices. Service resilience will take into account the
existing routing protocols such as OSPF/ IS-IS. Furthermore we will study
their extension to enhance routing performance in the MPLS enabled network.
• MPLS signaling protocol. A simplified version of Constrained-based Routing
over Label Distribution Protocol (CR-LDP) and the RSVP protocol could be
researched in failure/recovery conditions with different factors accounted for
such as topology and traffic distribution.
With the above considerations in mind, service availability can be evaluated in term
of recovery time, which is the time taken to recover the LSP through a new route. The
expected simulation outcome would be a sensitivity analysis of recovery time and its
relation to factors such as topology and traffic distribution in the following scenarios:
• LSP with no protection, where LSPs do not have any pre-defined backups
and traffic needs to be rerouted by standard routing protocols e.g., OSPF. The
recovery time would include signaling delay, reroute convergence delay etc.
• LSP with recovery feature, where there are pre-defined back-up explicitly-
routed LSPs, best suited to the traffic pattern. The recovery time here indicates
how much time should pass before the signaling protocol attempts to establish
a new LSP and switch the traffic.
Additionally, if time permits, some work will be devoted to GMPLS, which has been
developed by the IETF to extend the classical IP MPLS standard to address the
provisioning of paths at the fiber, wavelength, and SONET/SDH levels of the core
network. LSPs at each level of this hierarchy can therefore potentially benefit from
path protection approaches. It is suggested that GMPLS will be the future in NGN.
Therefore, we will explore MPLS/GMPLS with its label stacking capability to
implement an integrated two-layer restoration scheme. In addition, MPLS signaling
protocols need to be applied across the IP layer and the optical layer in a coordinated
manner.
Output 3 is cancelled and Outcome 4 and 5 will be partly considered
in other parts
Output 6 Service Availability Model Definition
Some papers already give us some good ideas of how to define and calculate network
and service availability in PSTN and IP network. We strongly suggest that the
recovery time regarding to different failure scenarios will be the most critical part in
the service availability modelling. In the paper and our further simulation work, we
will predict and measure the recovery time and try to find the efficient methods to
decrease recovery time, in other words, to improve the service resiliency in NGN. In
addition, we want to highlight that the recovery time also can be somehow translated
to packet loss, delay and jitter, which are also need to be considered in service
availability definition.
Output 7, 8 Network Component Selection and Prototype
Development
So far, we are still wondering what is the ‘network component selection’ means from
the Telecom and Alcatel point of view. Are they the new elements will be involved in
the future NGN network?
Also we suggest that output7 and 8 will be strongly combined and performed parallel
with output 2.
In addition, some new ideas are also can be applied to this project, for example, using
escalation strategies to propose multilayer resilience schemes in NZ NGN, which
includes
• DWDW layer -- protection schemes
• IP/MPLS layer -- restoration schemes
• Application and Middleware layers - Grid resilience schemes- checkpointing,
service migration and replication
• Treat all these schemes with a system point of view, and explore how to
coordinate them, i.e., which layer solutions are used for which type of failures
and how to trigger them
• Resiliency management and signaling model- 'GMPLS concept' - peer to peer
and overlay model
References
[1] Marek Małowidzki " Network Simulators : A Developer’s Perspective. Website
searching"
[2] G. Flores-Lucio, M. Paredes-Farrera, E. Jammeh, M. Fleury, and M. Reed,
"OPNET-Modeler and NS-2: Comparing the Accuracy of Network Simulators for
Packet-Level Analysis using a Network Testbed" 3rd WEAS Int. Conf. on Simulation,
Modelling and Optimization (ICOSMO 2003), Crete, vol. 2, pp. 700-707, 2003.
NOTE: An addendum to this paper is also available, clarifying some points made in
the paper with regards to OPNET Modeler.
[3] S. Wu, M. Riyadh, and M. Mannan, “OPNET Implementation of Megaco/H.248”
Doverspike, R. and J. Yates (2001). "Challenges for MPLS in optical network
restoration." Communications Magazine, IEEE 39(2): 89-96.
[4] M. Durvy, C. Diot, N. Taft, P. Thiran. "Network Availability Based Service
Differentiation". Proceedings of IWQoS 2003. Monterey (USA). June 2003.
[5] Christophe Diot and Gianluca Iannaccone and Athina Markopoulou and Chen-
Nee Chuah and Supratik Bhattacharyya (2003). "Service Availability in IP Networks.
".Sprint ATL Research Report Nr. RR03-ATL-071888. Sprint ATL. July 2003.
[6] Kellerer, B. and Neises, J. (2002) "Modelling of Service Availability" Euromicro
Conference, 2002. Proceedings. 28th
[7] Ruoshan Kone; Huaibei Zhou; "End-to-end availability analysis of physical
network" Computer and Information Technology, 2004. CIT '04. The Fourth
International Conference on 14-16 Sept. 2004 Page(s):668 – 673
[8] Alcatel 5020 Softswitch Open Call Control Engine for your Next Generation Service Offering. Alcatel website [9] Alcatel 7510 MG Media Gateway, Release 2.1. Alcatel website
[10] Paul Drew, MetaSwitch; Chris Gallon, Next-Generation VoIP Network
Architecture, Fujitsu Date: March 2003
[11] Chris Gallon, Quality of Service for Next Generation Voice Over IP Networks ,
Fujitsu Date: February 2003
[12] Georgios Ellinas, et al. (2002) "Restoration in Layered Architectures with a
WDM Mesh Optical Layer"
[13] P. Demeester, et al. "Resilience in a Multi-Layer Network "
[14] Oki, E., N. Matsuura, et al. (2002). "A disjoint path selection scheme with shared
risk link groups in GMPLS networks." Communications Letters, IEEE 6(9): 406-408
[15] Schollmeier, G. and C. Winkler (2004). "Providing sustainable QoS in next-
generation networks." Communications Magazine, IEEE 42(6): 102-107
[16] Elwalid, A., D. Mitra, et al. (2003). "Routing and protection in GMPLS
networks: from shortest paths to optimized designs." Lightwave Technology, Journal
of 21(11): 2828-2838.