Faculty of Economic Sciences, Communication and IT Computer Science Karlstad University Studies 2010:12 Johan Eklund On Switchover Performance in Multihomed SCTP



Johan Eklund. On Switchover Performance in Multihomed SCTP

Licentiate thesis

Karlstad University Studies 2010:12
ISSN 1403-8099
ISBN 978-91-7063-298-3

© The Author

Distribution:

Karlstad University

Faculty of Economic Sciences, Communication and IT

Computer Science
651 88 Karlstad
Sweden
+46 54 700 10 00

www.kau.se

Printed at: Universitetstryckeriet, Karlstad 2010

On Switchover Performance in Multihomed SCTP

JOHAN EKLUND
Department of Computer Science, Karlstad University

Abstract

The emergence of real-time applications, like Voice over IP and video conferencing, in IP networks implies a challenge to the underlying infrastructure. Several real-time applications have requirements on timeliness as well as on reliability and are accompanied by signaling applications to set up, tear down and control the media sessions. Since neither of the traditional transport protocols responsible for end-to-end transfer of messages was found suitable for signaling traffic, the Stream Control Transmission Protocol (SCTP) was standardized. The focus for the protocol was initially on telephony signaling applications, but it was later widened to serve as a general purpose transport protocol. One major new feature to enhance robustness in SCTP is multihoming, which enables more than one path within the same association.

In this thesis we evaluate some of the mechanisms affecting transmission performance in case of a switchover between paths in a multihomed SCTP session. The major part of the evaluation concerns a failure situation, where the current path is broken. In case of failure, the endpoint does not get an explicit notification, but has to react upon missing acknowledgements. The challenge is to distinguish path failure from temporary congestion to decide when to switch to an alternate path. A too fast switchover may be spurious, which could reduce transmission performance, while a too late switchover also results in reduced transmission performance. This implies a tradeoff which involves several protocol as well as network parameters and we elaborate among these to give a coherent view of the parameters and their interaction. Further, we present a recommendation on how to tune the parameters to meet telephony signaling requirements, still without violating fairness to other traffic.

We also consider another angle of switchover performance, the startup on the alternate path. Since the available capacity is usually unknown to the sender, the transmission on a new path is started at a low rate and then increased as acknowledgements of successful transmissions return. In case of switchover in the middle of a media session the startup phase after a switchover could cause problems to the application. In multihomed SCTP the availability of the alternate path makes it feasible for the end-host to estimate the available capacity on the alternate path prior to the switchover. Thus, it would be possible to implement a more efficient startup scheme. In this thesis we combine different switchover scenarios with relevant traffic. For these combinations, we analytically evaluate and quantify the potential performance gain from utilizing an ideal startup mechanism as compared to the traditional startup procedure.

Keywords: Computer networking, transport protocols, multihoming, performance evaluation, signaling traffic.


Acknowledgments

First of all I would like to thank my supervisor Anna Brunstrom. Her support, comments on, and contribution to my work together with encouragement during my studies have been invaluable for me. Further, I would like to thank my co-supervisors, Karl-Johan Grinnemo, for valuable contribution to my work and for comments on my writing, and Johan Garcia especially for support when I have got stuck on technical issues.

I also want to thank the people in the Distributed Systems and Communications Research Group. Thanks for support and interesting discussions. Further, I would like to thank my colleagues at the Department of Computer Science, for a pleasant and friendly working environment.

I am also grateful for the financial support given by Vinnova, the Knowledge Foundation of Sweden and by the European Regional Development Fund. In some of the projects Ericsson Research, Tieto and the IT-foundation Compare have been active as industrial partners. I acknowledge this with gratitude.

Last, but not least I would like to thank my family, my wife Maria and my children Gustaf, Henrik and Hanna for their love, support and for continuous capability of accepting my preoccupied mind when returning home after, sometimes long, days of work. I am indebted to my family!

Thank you all!

Karlstad, May 2010

Johan Eklund


List of Appended Papers

This thesis is based on the work presented in the following four papers. References to the papers will be made using the alphabetic symbols associated with the papers.

Paper A Johan Eklund and Anna Brunstrom. Performance of Network Redundancy Mechanisms in SCTP. Technical Report 2005:48, Karlstad University Studies, Sweden, September 2005.

Paper B Johan Eklund, Anna Brunstrom and Karl-Johan Grinnemo. On the Relation Between SACK Delay and SCTP Failover Performance for Different Traffic Distributions. In Proceedings of The Fifth International Conference on Broadband Communications, Networks and Systems (Broadnets 2008), London, UK, September 2008.

Paper C Johan Eklund, Karl-Johan Grinnemo, Stephan Baucke, and Anna Brunstrom. Tuning SCTP Failover for Carrier Grade Telephony Signaling. In Computer Networks, Elsevier, Volume 54, Issue 1, January 2010, pp. 133-149.

Paper D Johan Eklund, Karl-Johan Grinnemo and Anna Brunstrom. Theoretical Analysis of an Ideal Startup Scheme in Multihomed SCTP. To appear in Proceedings of Networked Services and Applications. Engineering, Control and Management Conference (EUNICE 2010), Trondheim, Norway, June 2010.

Some of the papers have been subjected to minor editorial changes.

Comments on My Participation

For all the papers, except paper C, I am responsible for carrying out the experiments, and for most of the written material and ideas. In paper C, I am responsible for all experiments and ideas except the ones in Section 5.2 and Section 6.


Other Papers

Apart from the papers included in the thesis, I have co-authored the following paper:

• Johan Eklund and Anna Brunstrom. Impact of SACK delay and Link Delay on Failover Performance in SCTP. In Proceedings of the third IASTED International Conference on Communications and Computer Networks, Lima, Peru, October 2006.


Contents

Abstract
Acknowledgments
List of Appended Papers
Comments on My Participation
Other Papers

Introductory Summary
  1 Introduction
  2 Research Objective
  3 Research Problems
    3.1 Tuning of Parameters and Mechanisms Impacting Failover in SCTP
    3.2 Improving Startup on the Alternate Path
  4 Research Methodology
  5 Main Contributions
  6 Summary of Papers
    Paper A
    Paper B
    Paper C
    Paper D
  7 Future Work

Paper A: Performance of Network Redundancy Mechanisms in SCTP
  1 Introduction
  2 Experimental Setup
    2.1 Scope
    2.2 Network Scenario
    2.3 Experimental Parameters and Data Gathering
  3 Experimental Results
    3.1 Overview
    3.2 Detailed Analysis
    3.3 Comparison
  4 Conclusions and Future Work
    4.1 Conclusions
    4.2 Future Work

Paper B: On the Relation Between SACK Delay and SCTP Failover Performance

Paper C: Tuning SCTP Failover for Carrier Grade Telephony Signaling

Paper D: Theoretical Analysis of an Ideal Startup Scheme in Multihomed SCTP


Introductory Summary


1 Introduction

The evolution of the Internet over the last decades, from being a restricted research network with limited capabilities, to a world-wide public network with billions of users is a tremendous success and the outstanding example of the popularity of computer networks. From being an interesting source of static information and a way to send text messages between end hosts, computer networks today offer a broad spectrum of dynamic services such as online gaming and video conferencing. This evolution has partly been made possible by the Internet Protocol (IP) [24], which is a major foundation of the underlying technology of computer networks like the Internet. IP offers a generic, unreliable, so-called best effort service. On top of IP, protocols are usually added to offer additional services. The protocols responsible for the transfer of data between end hosts of a connection and the delivery of data to the application are called transport protocols.

Traditional services offered over computer networks can be separated into two categories, offering reliable or unreliable delivery. The applications utilizing reliable service require confirmation of data delivery or of failure. Confirmations are achieved by having the receiving endpoint transmit acknowledgements back to the sender upon reception of correct data packets. The cost of the reliability guarantee is that the service is not optimized with respect to timing, and the service is sometimes slow. This service has traditionally been offered mainly by the connection-oriented transport protocol, the Transmission Control Protocol (TCP) [25]. The other service category is used by applications without any reliability requirements, but with more focus on timeliness. This service has been offered by the second major transport protocol, the User Datagram Protocol (UDP) [23], which usually offers a faster, but unreliable delivery of data.

Over the last years so-called real-time applications, such as Voice over IP (VoIP) and video conferencing, have been deployed over IP networks. To set up, tear down and control the quality of the media sessions, these applications are usually supplemented by signaling applications. Signaling applications have requirements both on reliability and on timeliness [21] and generate traffic of different intensity during the session. Since neither the service offered by TCP nor by UDP is sufficient for signaling application demands, the SIGnalling TRANsport (SIGTRAN) working group of the Internet Engineering Task Force (IETF) [12] designed a new reliable transport protocol. The name of the new protocol is Stream Control Transmission Protocol (SCTP) [27], and it was designed primarily with telephony signaling application requirements in focus. During the design process it was found that more applications could benefit from a protocol like this, and the IETF decided to standardize SCTP as a general purpose transport protocol. One requirement to be accepted as a protocol for deployment in public networks was to make the protocol TCP-friendly, i.e. to share network resources in a fair way with competing reliable transports. To fulfill the fairness ambition, SCTP has inherited the congestion control [1] from TCP. Some other inherited features are the delayed ACK mechanism [2] and selective acknowledgement (SACK) [19].

One of the requirements on networks for signaling traffic is robustness. To enhance robustness, SCTP supports multihoming. The multihoming feature of SCTP enables more than one network path in the same association, the name used in SCTP for a connection between hosts. Multihoming enables continuous transfer even if one path goes down. Since transport protocols like SCTP exist only in the endpoints, the switch from one path to another is performed only after an endpoint has detected a failure. The way for SCTP to detect a path failure is by reacting upon missing acknowledgements. On the other hand, the reason for a missing acknowledgement is not always a failure. The packet may have been delayed due to temporary congestion. This situation poses a challenge for SCTP in that it must decide whether to regard a path as unavailable or just temporarily disturbed. To master this challenge SCTP possesses a tunable failover mechanism.

SCTP was not originally designed to accept dynamic addition and deletion of IP addresses during transfer. To increase robustness and flexibility this was found to be a beneficial feature. Thus, this feature has now been standardized and is called the dynamic address reconfiguration (DAR) extension [28]. DAR was primarily added to SCTP to enable switching or upgrading of network components, such as network interface cards, without interrupting the transfer, a so-called hotswap. Moreover, this feature opens up the possibility for handover between networks for mobile terminals. To enable efficient hotswap and handover, the startup scheme on the new path has to be considered.

The above mentioned scenarios, failover, hotswap and handover, imply three different situations with different characteristics and performance metrics. Still, the same basic mechanisms are used in all scenarios. In this thesis, we use the term switchover as a common name for these scenarios.

The major part of the work presented in this thesis focuses on identification and evaluation of mechanisms and parameters affecting SCTP failover performance. To meet the requirements of signaling applications on timeliness, we present a stricter configuration of the protocol parameters impacting SCTP failover performance as compared to the default recommendations. Further, we identify and evaluate some additional parameters that may impact failover performance and identify the interaction between the different parameters. The evaluations are summarized in a table of recommended tunings and considerations to improve failover performance without negatively affecting concurrent traffic.

Additionally, we do some initial work to improve the startup on the alternate path after a switchover. We present an analytical model for calculating the switchover time for real-time traffic. Based on this model and on models for bulk traffic, we identify and quantify the traffic scenarios and traffic patterns for which an improved startup scheme may be most beneficial.

2 Research Objective

The objective of this thesis is to evaluate and recommend tuning and/or modification of parameters and mechanisms impacting switchover performance in multihomed SCTP environments. A switchover usually implies a performance degradation in terms of reduced throughput and/or increased transfer times and may result in additional network traffic. The aim of the work in this thesis is to reduce the negative impact on the applications. The focus is primarily on signaling traffic, but also other traffic with real-time requirements will benefit from optimized tuning and from improved mechanisms. To accomplish this objective, we focus on two different aspects of the problem.

Firstly, we consider the tuning of the failover mechanism in SCTP. More specifically:

Which are the mechanisms and parameters affecting SCTP failover performance, how do they interact, and how should they be tuned to optimize failover performance without violating fairness to other traffic?

Secondly, we consider the throughput degradation due to the slow-start phase after a switchover to an alternate path.

Which are the scenarios when a switchover may occur, which are the relevant traffic types for each scenario and for which of these combinations is the performance degradation from the slow-start mechanism important?

In the next section we further elaborate the above mentioned research problems. Moreover, the relation between our work and previous work regarding these problems is presented.

3 Research Problems

As mentioned in the introduction, SCTP was introduced as a way to meet the requirements of telecommunication signaling applications, but later standardized as a general purpose transport protocol suitable also for other types of traffic. Besides reliability requirements, signaling traffic has timing requirements. A signaling message that arrives too late is no longer valid. Thus, to fulfill both these requirements in a failure situation, it is essential for SCTP to detect a path failure early and to switch to an alternate path. Before the failover it is important to distinguish path failure from temporary path problems to reduce the risk of so-called spurious failovers. To make the failover as well as the other switchover scenarios as transparent to the application as possible, it is further attractive for SCTP to be able to utilize the available capacity of the new path as soon as possible. Therefore an improved startup mechanism is expected to be beneficial.

3.1 Tuning of Parameters and Mechanisms Impacting Failover in SCTP

SCTP failover performance is impacted by several protocol parameters [27]. The primary parameter is the Path Max Retrans (PMR), which defines the number of consecutive timeouts to accept before abandoning the current path.

PMR is closely coupled to the retransmission timeout (RTO) [15, 27], which regulates the time to wait for an acknowledgement before retransmitting the packet. The RTO is adjusted dynamically during the transfer, to smoothly adapt to the network conditions. Still, the boundaries for the RTO are configurable by setting the maximum (RTOmax) and the minimum (RTOmin) values for the RTO. RFC 4960 gives some general recommendations concerning PMR and concerning the boundaries for the RTO interval [27]. However, these recommendations have been shown to be too liberal [10, 17] to meet the signaling application demands of a failover time of maximum 2 seconds [14]. Still, a stricter tuning of these parameters has to be conducted carefully for two reasons:

• Not to increase the risk of spurious failovers, i.e. switches to an alternate path in a non-failure situation.

• Not to increase the risk of spurious timeouts.

Some research has been conducted on reducing the default value of five consecutive timeouts for PMR to shorten the failover process. In his Ph.D. thesis Caro [16] presents results indicating that an aggressive tuning of PMR to 0 or 1 is beneficial for the total transfer time of large files. Signaling traffic, on the other hand, has quite different characteristics compared to bulk traffic and PMR has also been studied in relation to this traffic [17, 5]. A study by Grinnemo et al. finds a PMR value of 3 to be reasonable for signaling traffic [10]. The RTO boundaries in SCTP have been inherited directly from TCP [22], with the lower bound set to 1 second and the upper bound to 60 seconds. Especially the lower bound could imply an increased failover time in case of short RTTs. Modifications of the RTO interval have been studied before by Jungmaier et al. [17] and by Fallon et al. [5]. An alternate tuning of the RTO interval has to be conducted carefully, not to increase the risk of spurious timeouts, and requires knowledge about network characteristics.

Another protocol parameter which could possibly impact the failover performance is the delayed SACK [2], where a SACK may be delayed at the receiver for a specified time before transmission. The default value of the SACK delay of 200 ms is also directly inherited from TCP [1]. This default value is suitable for bulk traffic but not necessarily for traffic with other characteristics.

To provide a more efficient failover Nishida has suggested an additional state for the sending SCTP host, called Potentially Failed (PF), as an intermediate state between active and inactive for a path [20]. If an endpoint fails to receive acknowledgements from one destination address it sets this destination in the PF state. No data is sent to a destination in the PF state, and the transfer is immediately switched to an alternate destination. In the PF state, the proposal recommends an increased intensity of heartbeats to be sent to the destination to probe reachability, and if a heartbeat is acknowledged the destination state is set back to active and the transfer is switched back. The authors of this proposal argue that this modification of the protocol leads to a faster failover process without ignoring the default recommendations in RFC 4960.
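The PF idea described above can be illustrated with a small state machine. This is a minimal sketch only, not the proposal's specification: the class and function names are hypothetical, and two details are assumptions made for illustration, namely that a single timeout moves a destination to the PF state and that a destination becomes inactive once its error count exceeds PMR.

```python
from enum import Enum


class PathState(Enum):
    ACTIVE = "active"
    POTENTIALLY_FAILED = "potentially failed"
    INACTIVE = "inactive"


class Destination:
    """Per-destination reachability state (hypothetical helper class)."""

    def __init__(self, pmr=5):
        self.pmr = pmr            # Path Max Retrans
        self.error_count = 0      # consecutive timeouts seen so far
        self.state = PathState.ACTIVE

    def on_timeout(self):
        # Assumption: the first missing acknowledgement already moves the
        # destination to PF; more than PMR errors marks it inactive.
        self.error_count += 1
        if self.error_count > self.pmr:
            self.state = PathState.INACTIVE
        else:
            self.state = PathState.POTENTIALLY_FAILED

    def on_heartbeat_ack(self):
        # An acknowledged heartbeat restores the destination to active.
        self.error_count = 0
        self.state = PathState.ACTIVE


def pick_destination(destinations):
    """Return the first destination that may carry data (active only)."""
    for dest in destinations:
        if dest.state is PathState.ACTIVE:
            return dest
    return None
```

With a primary and an alternate destination in the list, one timeout on the primary makes `pick_destination` return the alternate, and an acknowledged heartbeat switches the transfer back.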

In case of a retransmission timeout, the congestion window is collapsed to a small size and the RTO timer is doubled. In case of several consecutive timeouts, this results in an exponential increase of the RTO timer. This increase was introduced to reduce the risk of spurious timeouts [22]. Still, this dramatically increased RTO has an impact on the time to detect a failure. In his PhD thesis, Grinnemo [9] proposes a slightly relaxed backoff of the RTO timer. In this thesis we further elaborate on this proposal.
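The effect of the exponential backoff on failure detection can be made concrete with a short calculation. The sketch below is illustrative and rests on stated assumptions: the function name is hypothetical, the path is assumed to be abandoned after PMR + 1 consecutive timeouts, the RTO is assumed to double after every timeout up to RTOmax, and the time between the actual break and the start of the first timer is ignored.

```python
def failure_detection_time(rto_initial, pmr, rto_min=1.0, rto_max=60.0):
    """Sum of the PMR + 1 consecutive timeouts before the path is abandoned.

    All times are in seconds. The defaults for rto_min and rto_max are the
    bounds of 1 s and 60 s mentioned in the text.
    """
    # Clamp the starting RTO to the configured interval.
    rto = min(max(rto_initial, rto_min), rto_max)
    total = 0.0
    for _ in range(pmr + 1):
        total += rto
        rto = min(rto * 2, rto_max)  # exponential backoff, capped at rto_max
    return total
```

Under these assumptions, an RTO of 1 s and the default PMR of 5 give 1 + 2 + 4 + 8 + 16 + 32 = 63 s, while PMR = 3 gives 15 s and PMR = 1 gives 3 s. This shows why reducing PMR alone, with RTOmin kept at 1 s, does not bring the detection time below the 2 second target.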

From the above description it is evident that optimization of the failover performance is complex. The various protocol parameters may interact differently under different traffic conditions. Further, tuning of the protocol parameters has, in some cases, to be conducted with regard to network parameters. Still, after optimization of the system, the fairness aspect to other traffic has to be regarded. In this thesis, we elaborate among these parameters to identify the impact of and the interaction effects between different parameters in relation to network conditions and traffic distributions to optimize failover performance. One goal is to present general recommendations on parameter tuning to make SCTP meet telecommunication requirements. Further, we aim at studying effects from modifying the relaxed backoff mechanism in relation to the recommended parameters above, to further improve the failover performance.

3.2 Improving Startup on the Alternate Path

A switchover may occur in three different scenarios. The basic scenario is the failover scenario,described above. The second scenario is the hotswap scenario, which is a planned switch fromone path to another, and the third scenario is the handover scenario.

Irrespective of switchover scenario, the traffic has to go through the slow-start phase on the new path. This may imply a throughput degradation which may disturb the application. One way of overcoming the negative effect is by removing the slow-start phase. This approach has been studied for TCP by Liu et al. in a proposal called Jump Start [18]. The outcome of using Jump Start depends on the available capacity in the network: if there is spare capacity in the network Jump Start will not restrict the transfer, while Jump Start may aggravate the situation in a congested network. Another recent proposal to reduce the impact of the slow-start phase, presented by Chu et al., is to increase the initial congestion window [4]. In SCTP the new path is usually known and connected before the switchover, which implies that it would also be possible for the sending endpoint to obtain information concerning network conditions on the alternate path prior to the switchover. An estimate of the conditions on the alternate path could be obtained by applying some of the well known bandwidth estimation techniques [13, 29]. This estimate could enable an alternate startup mechanism to reduce the negative impact of the switchover. An improved startup mechanism in SCTP has been of interest to the research community [6, 8, 3]. Most of these studies have been focusing on specific traffic and on specific scenarios. To be able to evaluate the performance gain from an alternate startup mechanism it is important to match the different switchover scenarios to relevant traffic and to compare slow start to the alternate mechanism. From this it is possible to identify the scenarios where the performance gain is most beneficial and to quantify the improvement.

In this thesis we aim at presenting some basic information on scenarios where an improved startup may be most beneficial. This analysis is based on an analytical approach. Particularly, we aim at quantifying the improvement by an ideal startup mechanism compared to utilizing the traditional slow-start mechanism. To do this we present a theoretical model on how to calculate the potential transfer delay for real-time traffic in a slow-start scenario.
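To give a feeling for the gap between slow start and an ideal startup, the following sketch counts the round trips needed to send a burst of packets on a new path. It is not the analytical model of the thesis: the function names, the initial window of two packets, the strict doubling of the window every round trip, and the fixed path capacity expressed in packets per round trip are all simplifying assumptions made for illustration.

```python
import math


def rounds_slow_start(n_packets, capacity, initial_cwnd=2):
    """Round trips needed when the window doubles per RTT up to capacity."""
    cwnd = initial_cwnd
    sent = 0
    rounds = 0
    while sent < n_packets:
        sent += min(cwnd, capacity)      # packets sent in this round trip
        cwnd = min(cwnd * 2, capacity)   # idealized slow-start growth
        rounds += 1
    return rounds


def rounds_ideal(n_packets, capacity):
    """Round trips needed when sending at path capacity from the start."""
    return math.ceil(n_packets / capacity)
```

For example, with 100 packets and a capacity of 32 packets per round trip, the slow-start sketch needs 7 round trips (2, 4, 8, 16, 32, 32, 32 packets) whereas the ideal startup needs 4. The difference grows with the capacity of the path, which is in line with the observation that high-load traffic is the most affected.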

4 Research Methodology

The studies forming the basis for this thesis were mainly conducted as real experiments. Real experiments in a research context could either imply utilization of real implementations in a lab environment or in a real network. The most realistic results are naturally achieved in a real network. However, the network conditions are usually fluctuating, which makes the experiments hard to repeat. Further, the analysis of the results is complex in a real network and the impact of the studied parameters is sometimes not evident. Thus, to be able to study effects of different parameter tunings, a mix of real entities and a simulated network can be preferable, usually referred to as emulation. Most of the studies in this thesis are based on emulation experiments, where a network emulator, Dummynet [26], has been used to emulate network behavior in a controlled way. Further, a few studies are based on simulations, where results are produced in a single machine running software imitating both endpoint and network behavior.

The studies concerning startup behavior are based on analytical methods, where the behavior of endpoints and networks has been mathematically modeled. This is a beneficial method to produce general results and to display and understand general behavior like best case or worst case behavior. The weakness of analytical methods is that they usually embrace many simplifications. This may sometimes reduce the validity of the results, and the usual continuation after having achieved an analytical result is to verify the result in a real experiment.

5 Main Contributions

The main contributions of this thesis address SCTP switchover performance from two differentdirections:

1. We investigate parameters impacting SCTP failure detection, and suggest some improvements. In particular:

– we verify the Linux kernel implementation of SCTP and conclude that it performs in line with the standard. Further, we identify that the failover performance is dependent on the status in the network at the time of failure. This contribution is presented in paper A.

– we evaluate the impact of the acknowledgement delay (SACK delay) in relation to different traffic distributions. We identify that the SACK delay may have a severely degrading impact on failover performance for traffic that consists of small individual messages transmitted at low intensity. This contribution is presented in paper B.

– we present a coherent investigation concerning protocol and network parameters impacting failover performance. Further, we present a recommendation on how to configure the parameters to meet the requirements of telephony signaling applications. This contribution is given in paper C.

2. We analytically quantify the impact of slow-start and compare the results to an ideal startup mechanism in case of a switchover in SCTP. We identify that there is a degrading impact by using slow-start and that the impact varies between different switchover scenarios and traffic patterns. The degrading impact is most severe for real-time traffic generating high traffic loads, like a high-definition video conference, in a handover scenario. This contribution is presented in paper D.


6 Summary of Papers

Paper A – Performance of Network Redundancy Mechanisms in SCTP

This paper provides a background on SCTP multihoming and on the algorithms and parameters in the failover mechanism. Further, we experimentally evaluate the impact of a path failure in a multihomed SCTP association on signaling messages. The performance metrics are (i) the maximum transfer time for an individual message and (ii) the time between occurrence of a break on the primary path until the traffic is running smoothly on the alternate path, the so-called failover time. The experiment is partly performed to verify the Linux Kernel implementation of SCTP (LK-SCTP) [11] and to evaluate the impact of PMR. The results verify that there is a clear relation between PMR and the failover time. Additionally it becomes evident that the status in the network at the time of failure has an impact on the failover time if the traffic is sparse.

Paper B – On the Relation Between SACK Delay and SCTP Failover Performance for Different Traffic Distributions

This paper provides a comprehensive evaluation of the impact of SACK delay on SCTP failover performance under different traffic intensities. The results show a clear relation between the traffic distribution and the impact of the SACK delay. For low intensity traffic, SACKs are more frequently delayed. This delay increases the measured RTTs and thus prevents the RTO timer from reaching a low value. The increased RTO in turn prolongs the failover time in case of failure. We notice severe negative effects for low intensity traffic, where the failover time may be increased by more than 60% when keeping the SACK delay at the default value of 200 ms compared to disabling the SACK delay. Reducing the SACK delay to a small value above zero reduces the negative impact significantly, at marginal cost in the form of increased network traffic. If the traffic is intense our results show limited impact of SACK delay.
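The way a delayed SACK keeps the RTO high can be seen from the standard RTO calculation used by TCP and SCTP (smoothed RTT plus four times the RTT variation, with gains 1/8 and 1/4). The sketch below is an illustration under stated assumptions, not a reproduction of the experiments in the paper: the function name is hypothetical, and the path RTT of 20 ms and the lowered RTOmin of 80 ms used in the example are made-up values.

```python
ALPHA = 1 / 8   # gain for the smoothed RTT
BETA = 1 / 4    # gain for the RTT variation


def rto_after_samples(rtt_samples, rto_min=1.0, rto_max=60.0):
    """RTO in seconds after feeding a sequence of RTT measurements."""
    if not rtt_samples:
        raise ValueError("at least one RTT sample is required")
    srtt = rttvar = None
    for r in rtt_samples:
        if srtt is None:
            # First measurement initializes both estimators.
            srtt = r
            rttvar = r / 2
        else:
            rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
            srtt = (1 - ALPHA) * srtt + ALPHA * r
    rto = srtt + 4 * rttvar
    # Clamp to the configured RTO interval.
    return min(max(rto, rto_min), rto_max)
```

Feeding fifty samples of 0.02 s (no SACK delay) with `rto_min=0.08` yields an RTO of 0.08 s, whereas fifty samples of 0.22 s (every SACK delayed by 200 ms) yield an RTO slightly above 0.22 s. With the default RTOmin of 1 s both cases are clamped to 1 s, so in this sketch the effect only becomes visible once RTOmin is tuned below the inflated RTT.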

Paper C – Tuning SCTP Failover for Carrier Grade Telephony Signaling

This paper presents a comprehensive study concerning configuration of parameters impacting failover performance in a multihomed SCTP environment. The goal of the configuration is to make SCTP compliant with the timing requirements of telephony signaling applications. The considered SCTP parameters are the number of acceptable consecutive timeouts, PMR, the interval for the RTO timer (RTOmin and RTOmax) and the SACK delay. In the paper we present how tuning of these parameters has to take into account network parameters like the RTT and the router buffers. The paper provides practically usable configuration recommendations. Additionally, as a complementary way to optimize SCTP failover performance, the paper discusses relaxing the exponential backoff of the retransmission timer.

Paper D – Theoretical Analysis of an Ideal Startup Scheme in Multihomed SCTP

This paper broadens the concept of switchover performance from focusing on improving failure detection to include the startup on the alternate path. This wider view covers not only the failover scenario, but is applicable to all switchover scenarios. In case of switchover in a multihomed association, the association has to go through the slow-start phase again. This may cause a sudden drop in throughput and may increase message delays. By estimating the available bandwidth on the alternate path prior to a switchover, it is possible to utilize a more efficient startup scheme. In this paper we analytically compare and quantify the degrading impact of slow start in relation to an ideal startup mechanism. We combine the three different switchover scenarios with relevant traffic and point out the scenarios where the slow-start mechanism could have a seriously degrading impact.
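The cost of slow start relative to an ideal startup can be illustrated with a simplified round-based model. This is a sketch under stated assumptions, not the analytical model of the paper: the congestion window doubles every RTT, there is no loss and no delayed SACK, the path carries a fixed number of packets per RTT, and the ideal scheme sends at that capacity from the first RTT. The function names and the example numbers are hypothetical.

```python
import math


def rounds_slow_start(n_packets, capacity, init_cwnd=2):
    """RTT rounds needed to send n_packets when cwnd doubles each round."""
    sent, cwnd, rounds = 0, init_cwnd, 0
    while sent < n_packets:
        sent += min(cwnd, capacity)
        cwnd = min(cwnd * 2, capacity)
        rounds += 1
    return rounds


def rounds_ideal(n_packets, capacity):
    """RTT rounds needed when the sender uses the full capacity at once."""
    return math.ceil(n_packets / capacity)


if __name__ == "__main__":
    n, cap = 100, 32  # hypothetical burst size and path capacity (packets per RTT)
    print("slow start:", rounds_slow_start(n, cap), "RTTs")
    print("ideal:     ", rounds_ideal(n, cap), "RTTs")
```

In this model the gap between the two schemes grows with the path capacity and shrinks for small transfers, which matches the intuition that slow start hurts most when a high-rate session is switched over.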

7 Future Work

For future work we intend to exploit the analytical results concerning an improved startup mechanism on the alternate path, presented in paper D in this thesis. For the scenarios where an ideal startup mechanism has proven most beneficial, alternatives to the slow-start mechanism will be further investigated. An estimated available capacity always reflects the momentary conditions in the network, and the conditions may change rapidly in a dynamic environment. Therefore, we will pay specific attention to the robustness of alternate startup mechanisms in relation to rapidly changing environments. Moreover, the mechanism has to respect fairness to other traffic as well as network stability. Our intention is to find an improved startup mechanism that can be implemented in a real network.

A proposal to extend TCP to embrace multihoming, called Multipath TCP (MpTCP) [7], is under standardization by the IETF. The proposal has several similarities to the multihoming feature of SCTP, and the results achieved for multihomed SCTP may also be utilized for the development of MpTCP.


References

[1] M. Allman, V. Paxson, and E. Blanton. RFC 5681: TCP Congestion Control. Internet Engineering Task Force, September 2009.

[2] R. Braden. RFC 1122: Requirements for Internet Hosts - Communication Layers. Internet Engineering Task Force, October 1989.

[3] C. Casetti, C-F. Chiasserini, R. Fracchia, and M. Meo. AISLE: Autonomic Interface SeLEction for Wireless Users. In Proceedings of International Symposium on a World of Wireless Mobile and Multimedia Networks (WOWMOM’06), Washington, DC, USA, June 2006.

[4] J. Chu and Y. Cheng. Increasing TCP’s Initial Window, Internet-Draft, draft-hkchu-tcpm-initcwnd-00, work in progress. Internet Engineering Task Force, February 2010.

[5] S. Fallon, P. Jacob, Y. Qiao, L. Murphy, and E. Fallon. SCTP Switchover Performance Issues in WLAN Environments. In Proceedings of 5th IEEE Consumer Communications and Networking Conference (CCNC 2008), Las Vegas, Nevada, US, January 2008.

[6] J. Fitzpatrick, S. Murphy, M. Atiquzzaman, and J. Murphy. ECHO: A Quality of Service Based Endpoint Centric Handover Scheme for VoIP. In Proceedings of the Wireless Communications and Networking Conference, 2008, Las Vegas, USA, April 2008.

[7] A. Ford, C. Raiciu, and M. Handley. TCP Extensions for Multipath Operation with Multiple Addresses, Internet-Draft, draft-ietf-mptcp-architecture-00, work in progress. Internet Engineering Task Force, March 2010.

[8] R. Fracchia, C. Casetti, C-F. Chiasserini, and M. Meo. A WiSE extension of SCTP for wireless networks. In Proceedings of International Conference on Communications (ICC2005), Seoul, South Korea, May 2005.

[9] K-J. Grinnemo. Transport Services for Soft Real-Time Applications in IP Networks. PhD thesis, Karlstad University (KaU), May 2006.

[10] K-J. Grinnemo and A. Brunstrom. Performance of SCTP-controlled Failovers in M3UA-based SIGTRAN Networks. In Proceedings of Advanced Simulation Technologies Conference 2004 (ASTC’04), Hyatt Regency Crystal City, Arlington, Virginia, USA, April 2004.

[11] http://lksctp.sourceforge.net/. SCTP for the Linux Kernel, 2010.

[12] http://www.ietf.org/. Internet Engineering Task Force (IETF) Home Page, 2010.

[13] N. Hu and P. Steenkiste. Estimating Available Bandwidth Using Packet Pair Probing. Technical Report 2002:166, Division for Information Technology, Carnegie Mellon University, September 2002.

[14] ITU-T. Q.706: Specifications of Signaling System No. 7 – Message Transfer Part Signaling Performance. ITU-T, March 1993.

[15] V. Jacobson. Congestion avoidance and control. In Proceedings of Symposium on Communications architectures and protocols (SIGCOMM ’88), New York, USA, August 1988.

[16] A. Caro Jr. End-to-End Fault Tolerance using Transport Layer Multihoming. PhD thesis, University of Delaware, August 2005.

[17] A. Jungmaier, E. Rathgeb, and M. Tuexen. On the Use of SCTP in Failover Scenarios. In Proceedings of the 6th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, US, July 2002.

[18] D. Liu, M. Allman, S. Jin, and L. Wang. Congestion Control without a startup phase. In Proceedings of Fifth International Workshop on Protocols for FAST Long-Distance Networks (PFLDnet 2007), Los Angeles, California, US, February 2007.

[19] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanov. RFC 2018: TCP Selective Acknowledgement Options. Internet Engineering Task Force, October 1996.

[20] Y. Nishida. Quick Failover Algorithm in SCTP, Internet-Draft, draft-nishida-tsvwg-sctp-failover-00, work in progress. Internet Engineering Task Force, February 2010.

[21] L. Ong, M. Garcia, H. Schwarzbauer, L. Coene, H. Lin, I. Juhasz, M. Holdrege, and C. Sharp. RFC 2719: Framework architecture for signaling transport. Internet Engineering Task Force, October 1999.

[22] V. Paxson and M. Allman. RFC 2988: Computing TCP’s Retransmission Timer. Internet Engineering Task Force, November 2000.

[23] J. Postel. RFC 768: User Datagram Protocol. Internet Engineering Task Force, August 1980.

[24] J. Postel. RFC 791: Internet Protocol. Internet Engineering Task Force, September 1981.

[25] J. Postel. RFC 793: Transmission Control Protocol. Internet Engineering Task Force, September 1981.

[26] L. Rizzo. Dummynet: A simple approach to the evaluation of network protocols. ACM Computer Communication Review, 27(1):31–41, January 1997.

[27] R. Stewart. RFC 4960: Stream Control Transmission Protocol. Internet Engineering Task Force, September 2007.

[28] R. Stewart, Q. Xie, M. Tuexen, S. Maruyama, and M. Kozuka. RFC 5061: Stream Control Transmission Protocol Dynamic Address Reconfiguration (DAR). Internet Engineering Task Force, September 2007.

[29] H. Yihua and J. Brassil. NATALIE: An Adaptive, Network-Aware Traffic Equalizer. In Proceedings of the International Conference on Communications (ICC’07), Glasgow, UK, June 2007.

Karlstad University Studies
ISSN 1403-8099
ISBN 978-91-7063-298-3

On Switchover Performance in Multihomed SCTP

The emergence of real-time applications, like Voice over IP and video conferencing, in IP networks implies a challenge to the underlying infrastructure. Several real-time applications have requirements on timeliness as well as on reliability and are accompanied by signaling applications to set up, tear down and control the media sessions. Since neither of the traditional transport protocols responsible for end-to-end transfer of messages was found suitable for signaling traffic, the Stream Control Transmission Protocol (SCTP) was standardized. The focus for the protocol was initially on telephony signaling applications, but it was later widened to serve as a general purpose transport protocol. One major new feature to enhance robustness in SCTP is multihoming, which enables more than one path within the same association.

In this thesis we evaluate some of the mechanisms affecting transmission performance in case of a switchover between paths in a multihomed SCTP session. The major part of the evaluation concerns a failure situation, where the current path is broken. In case of failure, the endpoint does not get an explicit notification, but has to react upon missing acknowledgements. The challenge is to distinguish path failure from temporary congestion to decide when to switch to an alternate path. A too fast switchover may be spurious, which could reduce transmission performance, while a too late switchover also results in reduced transmission performance. This implies a tradeoff which involves several protocol as well as network parameters, and we elaborate on these to give a coherent view of the parameters and their interaction. Further, we present a recommendation on how to tune the parameters to meet telephony signaling requirements without violating fairness to other traffic.

We also consider another angle of switchover performance, the startup on the alternate path. Since the available capacity is usually unknown to the sender, the transmission on a new path is started at a low rate and then increased as acknowledgements of successful transmissions return. In case of switchover in the middle of a media session, the startup phase after the switchover could cause problems to the application. In multihomed SCTP the availability of the alternate path makes it feasible for the end-host to estimate the available capacity on the alternate path prior to the switchover. Thus, it would be possible to implement a more efficient startup scheme. In this thesis we combine different switchover scenarios with relevant traffic. For these combinations, we analytically evaluate and quantify the potential performance gain from utilizing an ideal startup mechanism as compared to the traditional startup procedure.