17
management, multiprocessor architectures, high performance stor- age systems, video server architectures, and computer networks. Dr. Katz received a B.S. degree at Cornell University, and an M.S. and a Ph.D. at the University of California at Berkeley, all in com- puter science. His e-mail address is [email protected] and his WWW home page is http://www.cs.berkeley.edu/~randy.

management, multiprocessor architectures, high performance …cseweb.ucsd.edu/classes/fa01/cse222/papers/balakrishnan... · 2001. 10. 4. · management, multiprocessor architectures,

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

  • management, multiprocessor architectures, high performance stor-age systems, video server architectures, and computer networks.

    Dr. Katz received a B.S. degree at Cornell University, and an M.S.and a Ph.D. at the University of California at Berkeley, all in com-puter science. His e-mail address is [email protected] and hisWWW home page is http://www.cs.berkeley.edu/~randy.

  • [11] K. Fall and S. Floyd. Simulation-based Comparisonsof Tahoe, Reno, and Sack TCP. Computer Communi-cations Review, 1996.

    [12] J. C. Hoe. Start-up Dynamics of TCP’s CongestionControl and Avoidance Schemes. Master’s thesis,Massachusetts Institute of Technology, 1995.

    [13] V. Jacobson. Congestion Avoidance and Control. InProc. ACM SIGCOMM 88, August 1988.

    [14] V. Jacobson and R. T. Braden. TCP Extensions forLong Delay Paths. RFC, Oct 1988. RFC-1072.

    [15] P. Karn. The Qualcomm CDMA Digital Cellular Sys-tem. In Proc. 1993 USENIX Symp. on Mobile andLocation-Independent Computing, pages 35–40,August 1993.

    [16] P. Karn and C. Partridge. Improving Round-Trip TimeEstimates in Reliable Transport Protocols. ACMTransactions on Computer Systems, 9(4):364–373,November 1991.

    [17] S. Keshav and S. Morgan. SMART Retransmission:Performance with Overload and Random Losses. InProc. Infocom ’97, 1997.

    [18] S. Lin and D. J. Costello. Error Control Coding: Fun-damentals and Applications. Prentice-Hall, Inc., 1983.

    [19] Mathis, M. and Mahdavi, J. and Floyd, S. andRomanow, A. TCP Selective AcknowledgmentOptions, 1996. RFC-2018.

    [20] S. McCanne and V. Jacobson. The BSD Packet Filter:A New Architecture for User-Level Packet Capture. InProc. Winter ’93 USENIX Conference, San Diego, CA,January 1993.

    [21] Metricom, Inc. http://www.metricom.com, 1997.

    [22] S. Nanda, R. Ejzak, and B. T. Doshi. A Retransmis-sion Scheme for Circuit-Mode Data on WirelessLinks. IEEE Journal on Selected Areas in Communi-cations, 12(8), October 1994.

    [23] G. T. Nguyen, R. H. Katz, B. D. Noble, andM. Satyanarayanan. A Trace-based Approach forModeling Wireless Channel Behavior. In Proc. WinterSimulation Conference, Dec 1996.

    [24] J. B. Postel. Transmission Control Protocol. RFC,Information Sciences Institute, Marina del Rey, CA,September 1981. RFC-793.

    [25] S. Seshan, H. Balakrishnan, and R. H. Katz. Handoffsin Cellular Wireless Networks: The Daedalus Imple-

    mentation and Experience. Kluwer Journal on Wire-less Personal Communications, January 1997.

    [26] W. R. Stevens. TCP/IP Illustrated, Volume 1. Addison-Wesley, Reading, MA, Nov 1994.

    [27] AT&T WaveLAN: PC/AT Card Installation and Oper-ation. AT&T manual, 1994.

    [28] R. Yavatkar and N. Bhagwat. Improving End-to-EndPerformance of TCP over Mobile Internetworks. InMobile 94 Workshop on Mobile Computing Systemsand Applications, December 1994.

    Hari Balakrishnan (S ’95 / ACM S ’95) is a Ph.D. candidate inComputer Science at the University of California at Berkeley. Hisresearch interests are in the areas of computer networks, wirelessand mobile computing, and distributed computing and communi-cation systems. His current research is in the area of reliable datatransport over heterogeneous networking technologies.

    Hari received a B. Tech. degree in Computer Science and Engi-neering from the Indian Institute of Technology, Madras, in 1993and an M.S. degree in Computer Science from Berkeley in 1995.He received best student paper awards at the Winter Usenix ’95and at the ACM Mobicom ’95 conferences, and is the recipient ofa research grant from the Okawa Foundation. On the WWW, hisURL is http://www.cs.berkeley.edu/~hari and his e-mail address [email protected].

    Venkata N. Padmanabhan (IEEE S ’94 / ACM S ’94) is a Ph.D.candidate in Computer Science at the University of California atBerkeley. He received his B.Tech. degree from the Indian Instituteof Technology, Delhi in 1993 and his M.S. degree from the Univer-sity of California at Berkeley in 1995, both in Computer Science.

    Venkat has done research in the areas of Computer Networking,Mobile Computing and Operating Systems. The focus of his cur-rent work is network support for efficient Web access, and datatransport over asymmetric networks. He received the best studentpaper award at the Usenix ‘95 conference. He may be reached viae-mail at [email protected] and on the Web at http://www.cs.berkeley.edu/~padmanab.

    Srinivasan Seshan (ACM M ’92) received a B.S. in ElectricalEngineering, an M.S., and a Ph.D. in Computer Science from theUniversity of California at Berkeley in 1990, 1993 and 1995respectively. Since 1995, he has been a research staff member atthe IBM T.J. Watson Research Center. His research interestsinclude computer networks, mobile computing and distributedcomputing. His e-mail address is [email protected] and hisWWW home page is at http://www.research.ibm.com/people/s/srini.

    Randy H. Katz (F ’96, ACM F ’96) is a professor of computer sci-ence at the University of California at Berkeley, and is a principalinvestigator in the Bay Area Research Wireless Access Network(BARWAN) project. He has taught at Berkeley since 1983, withthe exception of 1993 and 1994 when he was a program managerand deputy director of the Computing Systems Technology Officeat the Defense Department’s Advanced Research Projects Agency.He has written over 130 technical publications on CAD, database

  • less connection, resulting in poor end-to-end throughput.Using a SMART-based selective acknowledgment mecha-nism for the wireless hop yields good throughput. However,the throughput is still slightly less than that for a well-tunedlink-layer scheme that does not split the connection. Thisdemonstrates that splitting the end-to-end connection is nota requirement for good performance.

    3. The SMART-based selective acknowledgment scheme weused is quite effective in dealing with a high packet loss ratewhen employed over the wireless hop or by a sender in aLAN environment. In the WAN experiments, the SACKscheme based on the IETF Draft resulted in significantlyimproving end-to-end performance, although its perfor-mance was not as good as in the best link schemes. Fromour results we conclude that selective acknowledgmentschemes are very useful in the presence of lossy links, espe-cially when losses occur in bursts.

    4. End-to-end schemes, while not as effective as local tech-niques in handling wireless losses, are promising since sig-nificant performance gains can be achieved without anyextensive support from intermediate nodes in the network.The explicit loss notification scheme we evaluated resultedin a throughput improvement of more than a factor of twoover TCP-Reno, with comparable goodput values.

    7. Future Work

    Our experiments with various SACK and ELN mechanismsdemonstrate the significant benefits of such schemes, asdescribed in Section 5. We are in the process of evaluatingprotocol enhancements based on these ideas in the presenceof both network congestion and wireless losses in differentnetwork topologies, especially in networks with multiplewireless hops. In addition, we are evaluating the perfor-mance of several of the protocols described in this paperunder other patterns of loss derived from traces in [23].

    We are investigating the impact of large variations in con-nection round-trip times and the impact of bandwidth andlatency asymmetry on transport performance [5]. Largeround-trip variations are common in networks like the Met-ricom Ricochet wireless network [21], especially in thepresence of bidirectional traffic. Bandwidth asymmetry isprevalent in many cable and satellite networks with low-bandwidth return channels.

    8. Acknowledgments

    We are grateful to Steven McCanne and the anonymousreviewers for ACM SIGCOMM ’96 and IEEE/ACM Trans-actions on Networking for several comments and sugges-tions that helped improve the quality of this paper. We thankSally Floyd and Vern Paxson for useful discussions onSACKs and related topics.

    This work was supported by DARPA contract DAAB07-95-C-D154, by the State of California under the MICRO pro-gram, and by the Hughes Aircraft Corporation, Metricom,Fuji Xerox, Daimler-Benz, Hybrid Networks, and IBM.Hari is partially supported by a research grant from theOkawa Foundation.

    9. References

    [1] E. Ayanoglu, S. Paul, T. F. LaPorta, K. K. Sabnani,and R. D. Gitlin. AIRMAIL: A Link-Layer Protocolfor Wireless Networks. ACM ACM/Baltzer WirelessNetworks Journal, 1:47–60, February 1995.

    [2] A. Bakre and B. R. Badrinath. Handoff and SystemSupport for Indirect TCP/IP. In Proc. Second UsenixSymp. on Mobile and Location-Independent Comput-ing, April 1995.

    [3] A. Bakre and B. R. Badrinath. I-TCP: Indirect TCP forMobile Hosts. In Proc. 15th International Conf. onDistributed Computing Systems (ICDCS), May 1995.

    [4] H. Balakrishnan. An Implementation of TCP SelectiveAcknowledgments. ftp://daedalus.cs.berkeley.edu/pub/tcpsack/, 1996.

    [5] H. Balakrishnan, V. N. Padmanabhan, and R.H. Katz.The Effects of Asymmetry on TCP Performance. InProc. ACM MOBICOM ’97, September 1997.

    [6] H. Balakrishnan, V. N. Padmanabhan, S. Seshan,M. Stemm, and R.H. Katz. TCP Behavior of a BusyWeb Server: Analysis and Improvements. In Proc.IEEE INFOCOM, March 1998. (To appear).

    [7] H. Balakrishnan, S. Seshan, and R.H. Katz. ImprovingReliable Transport and Handoff Performance in Cellu-lar Wireless Networks. ACM Wireless Networks, 1(4),December 1995.

    [8] R. Caceres and L. Iftode. Improving the Performanceof Reliable Transport Protocols in Mobile ComputingEnvironments. IEEE Journal on Selected Areas inCommunications, 13(5), June 1995.

    [9] R. Caceres and V. N. Padmanabhan. Fast and ScalableHandoffs in Wireless Internetworks. In Proc. 1st ACMConf. on Mobile Computing and Networking, Novem-ber 1996.

    [10] A. DeSimone, M. C. Chuah, and O. C. Yue. Through-put Performance of Transport-Layer Protocols overWireless LANs. In Proc. Globecom ’93, December1993.

  • A small amount of buffering and retransmission from basestations prevents packet loss during the short handoffperiod. In [9], the buffering happens at the mobile host’s oldbase station, which forwards packets to the new base stationat the time of handoff. In [25], one or more base stations inthe vicinity join a multicast group corresponding to themobile host and receive all packets destined to it, in antici-pation of a handoff. When the handoff happens, the newbase station is readily able to forward the buffered and thenewly arriving packets without introducing any reordering,thereby preventing unnecessary invocations of TCP fastretransmissions. Experimental results reported in [25] indi-cate that such fast handoffs have a minimal adverse effecton TCP performance, even when the handoff frequency is ashigh as once per second.

    In contrast to the above schemes that operate at the networklayer, handoffs in a split-connection context, such as in I-TCP [3], involve the transfer of transport-layer state fromthe old base station to the new one. This results in signifi-cantly higher latency; for example, [2] reports I-TCP hand-off latencies of the order of hundreds of milliseconds in aWaveLAN-based network.

    5.2 Implementation Strategies for ELN

    Section 3.1 described the ELN mechanism by which thetransport protocol can be made aware of losses unrelated tonetwork congestion and react appropriately to such losses.In this section, we outline possible implementation strate-gies and policies for this mechanism.

    A simple strategy for implementing ELN would be to do soat the receiver, as we did for the results presented in thispaper. In this method, the corruption of a packet at the link-layer, indicated by a CRC error, is passed up to the transportlayer, which sends an ELN message with the duplicateacknowledgments for the lost packet. In practice, it may behard to determine the connection that a corrupted packetbelongs to, since the header could itself be corrupted: thiscan be handled by protecting the TCP/IP header using anFEC scheme. However, there are circumstances in whichentire packets, including link-level headers, are droppedover a wireless link. In such circumstances, the base stationgenerates ELN messages to the sender (in-band, as part ofthe acknowledgment stream) when it observes duplicateTCP acknowledgments arriving from the mobile host.

    We expect Explicit Loss Notifications to be useful in thecontext of multi-hop wireless networks, and are exploringthis in on-going work. Such networks (e.g., Metricom’s Ric-ochet network [21]) typically use packet radio units to routepackets to and from a wired infrastructure. Here, in order toimplement ELN, periodic messages are exchanged betweenadjacent packet radio units about queue lengths and thisinformation is used as a heuristic to distinguish betweencongestion and packet corruption, especially when entire

    packets (including headers) are corrupted or dropped over awireless link. This, coupled with a simple link-level schemeto convey NACK information about missing packets, is suf-ficient to generate ELN messages to the source.

    5.3 Selective Acknowledgment Issues

    Our experience with the IETF SACK scheme highlightssome weaknesses with it both when sender window sizesare small. This situation can be improved by enhancing thesender’s loss recovery algorithm as follows. In general, thearrival of one duplicate acknowledgment at the receiverindicates that one segment has successfully reached thereceiver. Rather than wait for three duplicate acknowledg-ments and perform a fast retransmission, the sender nowtransmits a new segment from beyond the “right edge” ofthe current window upon the arrival of the first and secondduplicate acks. This probes the network for sustained con-gestion and generates duplicate acknowledgments. Notethat we have not violated standard congestion control proce-dures by doing this: we only send out a segment when onehas left the data pipe, following the principle of conserva-tion of packets [13]. This enhancement can coexist withSACKs to further avoid timeouts, since the arrival of anacknowledgment with a SACK block indicating the recep-tion of the newly transmitted segment is a strong indicatorthat the original segment was lost, independent of whetherthree duplicate acknowledgments arrive or not. Thus, thismechanism will improve performance when the sender’swindow is small and losses occur, and is further exploredand described in [6].

    6. Conclusions

    In this paper, we have presented a comparative analysis ofseveral techniques to improve the end-to-end performanceof TCP over lossy, wireless hops. We categorize these tech-niques as end-to-end, link-layer or split-connection based.We use the end-to-end throughput, and the wired and wire-less goodputs as metrics for comparison.

    Our results lead to the following conclusions:

    1. A reliable link-layer protocol that uses knowledge of TCP(LL-TCP-AWARE) to shield the sender from duplicateacknowledgments arising from wireless losses gives a 10-30% higher throughput than one (LL) that operates indepen-dently of TCP and does not attempt in-order delivery ofpackets. Also, the former avoids redundant retransmissionsby both the sender and the base station, resulting in a highergoodput. Of the schemes we investigated, the TCP-awarelink-layer protocol with selective acknowledgements per-forms the best.

    2. The split-connection approach, with standard TCP usedfor the wireless hop, shields the sender from wireless losses.However, the sender often stalls due to timeouts on the wire-

  • the loss of multiple packets in succession. We are in the pro-cess of experimenting with a temporal burst-loss modelbased on average lengths of fades and other causes of wire-less losses. The parameters of this model are derived from atrace-based modeling and characterization of the WaveLANnetwork [23].

    4.6 Performance at Different Error Rates

    In this section, we present the results of several experimentsperformed across a range of bit-error rates, for some of theprotocols described earlier — E2E (the baseline case), LL-TCP-AWARE, LL-SMART-TCP-AWARE, E2E-SMART,E2E-IETF-SACK, and SPLIT-SMART. We chose the bestperforming protocols from each category, as well as someother protocols (e.g., E2E-IETF-SACK) to illustrate someinteresting effects.

    Figure 12 shows the performance of these protocols for an 8MByte end-to-end transfer in a LAN environment, acrossexponentially-distributed error rates ranging from 1 errorevery 16 KB to 1 error every 256 KB, in increasing powersof two. We find that the overall qualitative results and con-clusions are similar to those presented earlier for the 64 KBerror rate. At low error rates (128 KB and 256 KB points inthe graph), all the protocols shown perform almost equallywell in improving TCP performance. At the 16 KB errorrate, the performance of the TCP-aware link-layer schemesis about 1.75-2 times better than E2E-SMART and about 9times better than TCP Reno.

    Another interesting point to note is the relative performanceof E2E-IETF-SACK and E2E-SMART, especially at thehigh error rates. The congestion window does not growlarger than a few packets in the steady state at these errorrates where there are multiple losses in many windows.E2E-IETF-SACK does not retransmit any packet usingSACK information unless it receives three duplicate

    acknowledgments (to overcome potential reordering ofpackets in the network), which implies that no fast retrans-missions are triggered if the number of packets in the win-dow is less than four or five4. The sender’s congestionwindow is often smaller than this, resulting in timeouts anddegraded performance. In contrast, our implementation ofE2E-SMART assumes no reordering of packets (which isjustified in the LAN case) and retransmits the lost packetwhen the first duplicate acknowledgment with loss informa-tion arrives. This reduces the number of timeouts and resultsin better end-to-end performance. In Section 5.3, we outlinea scheme in which the IETF protocol can be modified towork well even when the sender’s congestion window is notlarge enough to provide enough duplicate acknowledg-ments.

    5. Discussion

    In this section, we present a discussion of some miscella-neous issues. We discuss the effects of handoff on TCP per-formance, some implementation strategies and policies forthe ELN mechanism introduced in Section 3.1, and someissues related to SMART-based and IETF selectiveacknowledgment schemes.

    5.1 Wireless Handoffs

    Wireless networks are usually organized in a cellular topol-ogy where each cell includes a base station that acts as arouter between the wireless subnet and a wireline backbone.Mobile hosts typically communicate with fixed hosts via thebase station in the cell they are currently located in. Exam-ples of networks organized in this fashion include cellulartelephone networks and wireless local-area networks.

    As a mobile host moves, it may get out of the range of itscurrent base station but still be within the range of otherneighboring base stations. To maintain the mobile host’sconnectivity, a handoff procedure is invoked to re-route traf-fic to and from the mobile host via the new base station.However, depending on the details of the handoff algo-rithms, this procedure could lead to packet losses and reor-dering, which in turn could cause significant deterioration inthe performance of ongoing TCP transfers [8].

    Several proposals have been made for achieving fast hand-offs. Two examples include multicast-based handoffs [25]and hierarchical handoffs [9]. In both these schemes, hand-offs are made fast by restricting updates to the immediatevicinity of the mobile host. As a result the handoff latencyin a WaveLAN-based wireless local-area network is of theorder of 10-30 ms.

    4. This depends on whether delayed acknowledgments are used.

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    16 32 64 128 256

    Figure 12. Performance of six protocols (LAN case)across a range of bit-error rates, ranging from 1 errorevery 16 KB to 1 every 256 KB shown on a log-scale.

    E2EE2E-IETF-SACK

    LL-SMART-TCP-AWARE

    LL-TCP-AWARE

    SPLIT-SMART

    E2E-SMART

    Bit-error rate (1 error every x KBytes, average)

    Thr

    ough

    put

    (Mbp

    s)

  • space at the base station (64 KB in our experiments) isbounded3. In the WAN case, the throughput of the SPLITapproach is about 0.58 Mbps which is better than the 0.31Mbps that the E2E approach achieves (Figure 6), but not asgood as several other protocols described earlier. The largecongestion window size of the wired sender in SPLITenables a higher bandwidth utilization over the wired net-work, compared to an end-to-end TCP connection where thecongestion window size fluctuates rapidly.

    As expected, the throughput for the SPLIT-SMART schemeis much higher. It is about 1.3 Mbps in the LAN case andabout 1.1 Mbps in the WAN case. The SMART-based selec-tive acknowledgment scheme operating over the wirelesslink performs very well, especially since no reordering ofpackets occurs over this hop. However, there are a few timeswhen both the original transmission and the first retransmis-sion of a packet get lost, which sometimes results in acoarse timeout (as described in Section 3.1). This explainsthe difference in throughput between the SPLIT-SMARTscheme and the LL-SMART-TCP-AWARE scheme(Figure 3).

    In summary, while the split-connection approach results ingood throughput if the wireless connection uses specialmechanisms, the performance is worse than that of a well-tuned, TCP-aware link-layer protocol (LL-TCP-AWARE orLL-SMART-TCP-AWARE). Moreover, the link-layer proto-col preserves the end-to-end semantics of TCP acknowledg-ments. This demonstrates that the end-to-end connectionneed not be split at the base station in order to achieve goodperformance.

    3. A larger buffer at the base station will not necessarily improveperformance for two reasons: (1) we measure performance interms of receiver throughput, which is limited by the small conges-tion window size of the wireless connection, and (2) a long enoughtransfer will still fill up the buffer.

    4.5 Reaction to Burst Errors

    In this section, we report the results of some experimentsthat illustrate the benefit of selective acknowledgments inhandling burst losses. We consider two of the best perform-ing local protocols: LL-TCP-AWARE (Snoop) and LL-SMART-TCP-AWARE (Snoop with SMART-based selec-tive acknowledgments). LL-TCP-AWARE recovers from asingle loss by retransmitting the lost packet when two dupli-cate acknowledgments arrive for it. It also keeps track of thenumber of expected duplicate acknowledgments and thenext expected new acknowledgment after this local retrans-mission. If this loss is part of a burst, the first new acknowl-edgment to arrive after the duplicates will be less than thenext expected new one; this causes an immediate retrans-mission of the lost segment. This is similar to the mecha-nism used by E2E-NEWRENO (Section 3.1). LL-SMART-TCP-AWARE uses the additional useful information pro-vided by the SMART scheme — the sequence number ofthe segment that caused the duplicate acknowledgment —to accurately determine losses and recover from them.

    Table 5 shows the performance of the two protocols forbursts of lengths 2, 4, and 6 packets. These errors are gener-ated at an average rate of one every 64 KBytes of data, and2, 4, or 6 packets are destroyed in each case. Selectiveacknowledgments improve the performance of LL-SMART-TCP-AWARE over LL-TCP-AWARE by up to 30% in thepresence of burst errors. While this is a fairly simplisticburst-error model, it does illustrate the problems caused by

    Figure 11. Congestion window sizes as a function of time for the wired and wireless parts of the split TCP connection.The wired sender never sees any losses and maintains a 64 KB congestion window. However, the wireless TCP connec-

    tion’s congestion window fluctuates rapidly.

    08192

    16384245763276840960491525734465536

    0 20 40 60 80 100 120Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)0

    819216384245763276840960491525734465536

    0 20 40 60 80 100 120Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)

    Wired Wireless

    BurstLength

    LL-TCP-AWARE (Mbps)

    LL-SMART-TCP-AWARE (Mbps)

    2 1.25 1.28

    4 1.02 1.20

    6 0.84 1.10

    Table 5. Throughputs of LL-TCP-AWARE and LL-SMART-TCP-AWARE at different burst lengths. This

    illustrates the benefits of SACKs, even for a high-performance, TCP-aware link protocol.

  • we used, the throughput was about 0.8 Mbps, significantlyhigher than the 0.31 Mbps throughput of TCP Reno. How-ever, this is still about 35% worse than LL-OPT. Eventhough SACKs allow the sender to often recover from mul-tiple losses without timing out, the sender’s congestion win-dow decreases every time there is a packet dropped on thewireless link, causing it to remain small.

    In summary, E2E-NEWRENO is better than E2E, especiallyfor large socket buffer sizes. Adding ELN to TCP improvesthroughput significantly by successfully preventing unnec-essary fluctuations in the transmission window. Finally,SACKs provide significant improvement over TCP Reno,but perform about 10-15% worse than the best link-layerschemes in the LAN experiments, and about 35% worse inthe WAN experiments. These results suggest that an end-to-end protocol that has both ELN and SACKs will result ingood performance, and is an area of current work.

    4.4 Split-Connection Protocols

    The main advantage of the split-connection approaches isthat they isolate the TCP source from wireless losses. TheTCP sender of the second, wireless connection performs allthe retransmissions in response to wireless losses.

    Figure 9 and Table 4 show the throughput and goodput forthe split connection approach in the LAN and WAN envi-ronments. We report the results for two cases: when thewireless connection uses TCP Reno (labeled SPLIT) andwhen it uses the SMART-based selective acknowledgmentscheme described earlier (labeled SPLIT-SMART). We seethat the throughput achieved by the SPLIT approach (0.6Mbps) is quite low, about the same as that for end-to-endTCP Reno (labeled E2E in Figure 6). The reason for this isapparent from Figures 10 and 19, which show the progressof the data transfer and the size of the congestion windowfor the wired and wireless connections. We see that thewired connection neither has any retransmissions nor anytimeouts, resulting in a wired goodput of 100%. However, it(eventually) stalls whenever the sender of the wireless con-nection experiences a timeout, since the amount of buffer

    SPLIT SPLIT-SMART

    Thr

    ough

    put

    (Mbp

    s)

    Figure 9. Performance of split-connection protocols: bit error rate = 1.9x10-6 (1 error/65536 bytes).

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    2

    Thr

    ough

    put

    (% o

    f m

    axim

    um)

    0

    10

    20

    30

    40

    50

    60

    70

    80

    90

    100LAN: Absolute Percentage of max.

    WAN: Absolute Percentage of max.

    97.3100.00.60

    97.299.90.58

    97.2100.0

    1.30 97.699.81.10

    Figure 10. Packet sequence trace for the wired and wireless parts of the SPLIT protocol. The wireless part has two rowsof horizontal dots: the top one shows the times of fast retransmissions and the bottom one the times of the timeout-

    based ones.

    01e+062e+063e+064e+065e+066e+067e+068e+069e+06

    0 20 40 60 80 100 120

    Sequ

    ence

    Num

    ber

    (byt

    es)

    Time (sec)

    01e+062e+063e+064e+065e+066e+067e+068e+069e+06

    0 20 40 60 80 100 120

    Sequ

    ence

    Num

    ber

    (byt

    es)

    Time (sec)

    Wired Wireless

    Fast retransmissions

    Coarse timeouts

    SPLIT SPLIT-SMART

    LAN (8 KB) 0.54 (97.4%,100%) 1.30 (97.6%,100%)

    LAN (32 KB) 0.60 (97.3%,100%) 1.30 (97.2%,100%)

    WAN (32 KB) 0.58 (97.2%,100%) 1.10 (97.6%,100%)

    Table 4. Summary of results for the split-connectionschemes at an average error rate of 1 every 64 KB.

  • the relative performance. This is because in situations thatE2E suffers a coarse timeout for a loss, the probability thatE2E-NEWRENO does not, increases with the number ofoutstanding packets in the network.

    Explicit Loss Notification: One way of eliminating the longdelays caused by coarse timeouts is to maintain as large awindow size as possible. E2E-NEWRENO remains in fastrecovery if the new acknowledgment is only partial, butreduces the window size to half its original value upon thearrival of the first new acknowledgment. The E2E-ELN andE2E-ELN-RXMT protocols use ELN information(Section 3.1) to prevent the sender from reducing the size ofthe congestion window in response to a wireless loss. Boththese schemes perform better than E2E-NEWRENO, andover two times better than E2E. This is a result of thesender’s explicit awareness of the wireless link, whichreduces the number of coarse timeouts (Figure 7) and rapidwindow size fluctuations (Figure 8). The E2E-ELN-RXMTprotocol performs only slightly better than E2E-ELN whenthe socket buffer size is 32 KB. This is because there is usu-ally enough data in the pipe to trigger a fast retransmissionfor E2E-ELN. The performance benefits of E2E-ELN-RXMT are more pronounced when the socket buffer size issmaller, as the numbers for the 8 KB socket buffer size indi-cate (Table 3). This is because E2E-ELN-RXMT does notwait for three duplicate acknowledgments before retrans-

    mitting a packet, if it has ELN information for it. The maxi-mum socket buffer size of 8 KB limits the number ofunacknowledged packets to a small number at any point intime, which reduces the probability of three duplicateacknowledgments arriving after a loss and triggering a fastretransmission.

    Despite explicit awareness of wireless losses, timeoutssometimes occur in the ELN-based protocols. This is aresult of our implementation of the ELN protocol, whichdoes not convey information about multiple wireless-relatedlosses to the sender. Since it is coupled with only cumula-tive acknowledgments, the sender is unaware of the occur-rence of multiple wireless-related losses in a window; weplan to couple SACKs and ELN together in future work.Section 5.2 discusses some possible implementation strate-gies and policies for ELN.

    Selective acknowledgments: We experimented with twodifferent SACK schemes. In the LAN case, we used a sim-ple SACK scheme based on a subset of the SMART pro-posal. This protocol was the best of the end-to-end protocolsin this situation, achieving a throughput of 1.25 Mbps (incontrast, the best local scheme, LL-SMART-TCP-AWARE,obtained a throughput of 1.39 Mbps).

    In the WAN case, we based our SACK implementation [4]on RFC 2018. For the exponentially-distributed loss pattern

    Figure 7. Packet sequence traces for E2E (TCP Reno) and E2E-ELN. The top row of horizontal dots shows the timeswhen fast retransmissions occur; the bottom row shows the coarse timeouts.

    01e+062e+063e+064e+065e+066e+067e+068e+069e+06

    0 50 100 150 200 250

    Sequ

    ence

    Num

    ber

    (byt

    es)

    Time (sec)

    01e+062e+063e+064e+065e+066e+067e+068e+069e+06

    0 50 100 150 200 250

    Sequ

    ence

    Num

    ber

    (byt

    es)

    Time (sec)

    E2E E2E-ELN

    Fast retransmissions

    Coarse timeouts

    Fast retransmissions

    Coarse timeouts

    Figure 8. Congestion window size as a function of time for E2E (TCP Reno) and E2E-ELN. This figure clearly showsthe utility of ELN in preventing rapid fluctuations, thereby maintaining a larger average congestion window size.

    08192

    16384245763276840960491525734465536

    0 50 100 150 200 250

    Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)

    08192

    16384245763276840960491525734465536

    0 50 100 150 200 250

    Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)

    E2E E2E-ELN

  • one. The 10% LAN degradation is almost entirely due to theexcessive retransmissions over the wireless link and to thesmaller average congestion window size compared to LL-TCP-AWARE. Another important point to note is that LLsuccessfully prevents coarse timeouts from happening at thesource. Figure 5 shows the sequence traces of TCP transfersfor LL-TCP-AWARE and LL.

    In summary, our results indicate that a simple link-layerretransmission scheme does not entirely avoid the adverseeffects of TCP fast retransmissions and the consequent per-formance degradation. An enhanced link-layer scheme thatuses knowledge of TCP semantics to prevent duplicateacknowledgments caused by wireless losses from reachingthe sender and locally retransmits packets achieves signifi-cantly better performance.

    4.3 End-To-End Protocols

    The performance of the various end-to-end protocols issummarized in Figure 6 and Table 3. The performance ofTCP Reno, the baseline E2E protocol, highlights the prob-lems with TCP over lossy links. At a 2.3% packet loss rate(as explained in Section 4.2), the E2E protocol achieves athroughput of less than 50% of the maximum (i.e., through-put in the absence of wireless losses) in the local-area andless than 25% of the maximum in the wide-area experi-ments. However, all the end-to-end protocols achieve good-

    puts close to the optimal value of 97.7%. The primaryreason for the low throughput is the large number of time-outs that occur during the transfer (Figure 7). The resultingaverage window size during the transfer is small, preventingthe “data pipe” from being kept full and reducing the effec-tiveness of the fast retransmission mechanism (Figure 8).

    The modified end-to-end protocols improve throughput byretransmitting packets known to have been lost on the wire-less hop earlier than they would have been by the baselineE2E protocol, and by reducing the fluctuations in windowsize. The E2E-NEWRENO, E2E-ELN, E2E-SMART andE2E-IETF-SACK protocols each use new TCP options andmore sophisticated acknowledgment processing techniquesto improve the speed and accuracy of identifying andretransmitting lost packets, as well as by recovering frommultiple losses in a single transmission window withouttiming out. The remainder of this section discusses the ben-efits of three techniques — partial acknowledgments,explicit loss notifications, and selective acknowledgments.

    Partial acknowledgments: E2E-NEWRENO, which usespartial acknowledgment information to recover from multi-ple losses in a window at the rate of one packet per round-trip time, performs between 10 and 25% better than E2Eover a LAN and about 2 times better than E2E in the WANexperiments. The performance improvement is a function ofthe socket buffer size — the larger the buffer size, the better

    Thr

    ough

    put

    (Mbp

    s)

    E2E E2E-NEWRENO E2E-SMART E2E-ELN E2E-ELNRXMT

    Figure 6. Performance of end-to-end protocols: bit error rate = 1.9x10-6 (1 error/65536 bytes).

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    Thr

    ough

    put

    (% o

    f m

    axim

    um)

    0

    10

    20

    30

    40

    50

    60

    70

    80

    90

    E2E-IETF-SACK

    LAN: Absolute Percentage of max.WAN: Absolute Percentage of max.

    97.597.50.70

    97.397.30.31

    97.797.30.89

    97.597.50.64

    97.297.21.25

    97.597.50.80

    97.597.51.12 97.5

    97.50.93

    97.697.60.64

    97.597.50.95

    97.497.40.72

    E2EE2E-

    NEWRENO E2E-SMARTE2E-IETF-

    SACK E2E-ELNE2E-ELN-

    RXMT

    LAN (8 KB) 0.55 (97.0,96.0) 0.66 (97.3,97.3) 1.12 (97.6,97.6) 0.68 (97.3,97.3) 0.69 (97.3,97.2) 0.86 (97.4,97.3)

    LAN (32 KB) 0.70 (97.5,97.5) 0.89 (97.7,97.3) 1.25 (97.2,97.2) 1.12 (97.5,97.5) 0.93 (97.5,97.5) 0.95 (97.5,97.5)

    WAN (32 KB) 0.31 (97.3,97.3) 0.64 (97.5,97.5) N.A. 0.80 (97.5,97.5) 0.64 (97.6,97.6) 0.72 (97.4,97.4)

    Table 3. This table summarizes the results for the end-to-end schemes for an average error rate of one every 65536bytes of data. The numbers in the cells follow the same convention as in Table 2.

  • control mechanisms. These fast retransmissions result inreduced goodput; about 90% of the lost packets are retrans-mitted by both the source and the base station.

    The effects of this interaction are much more pronounced inthe wide-area experiments — the throughput difference isabout 30% in this case. The cause for the more pronounceddeterioration in performance is the higher bandwidth-delayproduct of the wide-area connection. The LL scheme causesthe sender to invoke congestion control procedures oftendue to duplicate acknowledgments and causes the averagewindow size of the transmitter to be lower than for LL-TCP-AWARE. This is shown in Figure 4, which compares thecongestion window size of LL and LL-TCP-AWARE as afunction of time. Note that the number of outstanding data

    bytes in the network is the minimum of the congestion win-dow and the receiver advertised window. This is bounded bythe receiver’s socket buffer size. In the congestion windowgraphs for each protocol, the receiver socket buffer is 32KB.

    In the wide area, the bandwidth-delay product is about23000 bytes (1.35 Mbps * 135 ms), and the congestion win-dow drops below this value several times during each TCPtransfer. On the other hand, the LAN experiments do notsuffer from such a large throughput degradation becauseLL’s lower congestion-window size is usually still largerthan the connection’s delay-bandwidth product of about1900 bytes (1.5 Mbps * 10 ms). Therefore, the LL schemecan maintain a nearly full “data pipe” between the senderand receiver in the local connection but not in the wide area

    LL LL-TCP-AWARE LL-SMARTLL-SMART-TCP-

    AWARE

    LAN (8 KB) 1.20 (95.6%,97.9%) 1.29 (97.6%,100%) 1.29 (96.1%,98.9%) 1.37 (97.6%,100%)

    LAN (32 KB) 1.20 (95.5%,97.9%) 1.36 (97.6%,100%) 1.29 (95.5%,98.3%) 1.39 (97.7%,100%)

    WAN (32 KB) 0.82 (95.5%,98.4%) 1.19 (97.6%,100%) 0.93 (95.3%,99.4%) 1.22 (97.6%,100%)

    Table 2. This table summarizes the results for the link-layer schemes for an average error rate of one every 65536bytes of data. Each entry is of the form: throughput (wireless goodput, wired goodput). Throughput is measured in

    Mbps. Goodput is expressed as a percentage.

    LL-TCP-AWARE

    Figure 4. Congestion window size for link-layer protocols in wide area tests. The horizontal dashed line in the LL graphshows the 23000 byte WAN bandwidth-delay product.

    LL

    08192

    16384245763276840960491525734465536

    0 10 20 30 40 50 60 70 80Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)

    08192

    16384245763276840960491525734465536

    0 10 20 30 40 50 60 70 80Con

    gest

    ion

    Win

    dow

    (by

    tes)

    Time (sec)

    Figure 5. Packet sequence traces for LL-TCP-AWARE and LL. No coarse timeouts occur in either case. For LL-TCP-AWARE, the horizontal row of dots shows the times of wireless link retransmissions. For LL, the top row shows sender

    fast retransmission times and the bottom row shows both local wireless and sender retransmissions.

    Wired retransmissions

    Wireless retransmissionsWireless retransmissions

    LL-TCP-AWARE LL

    0

    1e+06

    2e+06

    3e+06

    4e+06

    5e+06

    6e+06

    7e+06

    8e+06

    9e+06

    0 10 20 30 40 50 60 70 80

    Sequ

    ence

    Num

    ber

    (byt

    es)

    Time (sec)

    0

    1e+06

    2e+06

    3e+06

    4e+06

    5e+06

    6e+06

    7e+06

    8e+06

    9e+06

    Sequ

    ence

    Num

    ber

    (byt

    es)

    0 10 20 30 40 50 60 70 80Time (sec)

  • to focus on the effectiveness of the mechanisms in handlingsuch losses. The WAN experiments are performed across 16Internet hops with minimal congestion2 in order to study theimpact of large delay-bandwidth products.

    Each run in the experiment consists of an 8 MByte transferfrom the source to receiver across the wired net and theWaveLAN link. We chose this rather long transfer size inorder to limit the impact of transient behavior at the start ofa TCP connection. During each run, we measure thethroughput at the receiver in Mbps, and the wired and wire-less goodputs as percentages. In addition, all packet trans-missions on the Ethernet and WaveLan are recorded foranalysis using tcpdump [20], and the sender’s TCP codeinstrumented to record events such as coarse timeouts,retransmission times, duplicate acknowledgment arrivals,congestion window size changes, etc. The rest of this sec-tion presents and discusses the results of these experiments.

    4.2 Link-Layer Protocols

    Traditional link-layer protocols operate independently ofthe higher-layer protocol, and consequently, do not neces-sarily shield the sender from the lossy link. In spite of localretransmissions, TCP performance could be poor for tworeasons: (i) competing retransmissions caused by an incom-patible setting of timers at the two layers, and (ii) unneces-sary invocations of the TCP fast retransmission mechanismdue to out-of-order delivery of data. In [10], the effects ofthe first situation are simulated and analyzed for a TCP-liketransport protocol (that closely tracks the round-trip time toset its retransmission timeout) and a reliable link-layer pro-

    2. WAN experiments across the US were performed between 10pm and 4 am, PST and we verified that no congestion lossesoccurred in the runs reported.

    tocol. The conclusion was that unless the packet loss rate ishigh (more than about 10%), competing retransmissions bythe link and transport layers often lead to significant perfor-mance degradation. However, this is not the dominatingeffect when link layer schemes, such as LL, are used withTCP Reno and its variants. These TCP implementationshave coarse retransmission timeout granularities that aretypically multiples of 500 ms, while link-layer protocolstypically have much finer timeout granularities. The realproblem is that when packets are lost, link-layer protocolsthat do not attempt in-order delivery across the link (e.g.,LL) cause packets to reach the TCP receiver out-of-order.This leads to the generation of duplicate acknowledgmentsby the TCP receiver, which causes the sender to invoke fastretransmission and recovery. This can potentially causedegraded throughput and goodput, especially when thedelay-bandwidth product is large.

    Our results substantiate this claim, as can be seen by com-paring the LL and LL-TCP-AWARE results (Figure 3 andTable 2). For a packet size of 1400 bytes, a bit error rate of1.9x10-6 (1/65536 bytes) translates to a packet error rate ofabout 2.2 to 2.3%. Therefore, an optimal link-layer protocolthat recovers from errors locally and does not compete withTCP retransmissions should have a wireless goodput of97.7% and a wired goodput of 100% in the absence of con-gestion. In the LAN experiments, the throughput differencebetween LL and LL-TCP-AWARE is about 10%. However,the LL wireless goodput is only 95.5%, significantly lessthan LL-TCP-AWARE’s wireless goodput of 97.6%, whichis close to the maximum achievable goodput. When a lossoccurs, the LL protocol performs a local retransmission rel-atively quickly. However, enough packets are typically intransit to create more than 3 duplicate acknowledgments.These duplicates eventually propagate to the sender andtrigger a fast retransmission and the associated congestion

    LL LL-TCP-AWARE LL-SMART LL-SMART-TCP-AWARE

    Thr

    ough

    put

    (Mbp

    s)

    LAN: AbsoluteWireless GoodputWired Goodput

    Figure 3. Performance of link-layer protocols: bit-error rate = 1.9x10-6 (1 error/65536 bytes), socket buffer size = 32KB. For each case there are two bars: the thick one corresponds to the scale on the left and denotes the throughput in

    Mbps; the thin one corresponds to the scale on the right and shows the throughput as a percentage of the maximum, i.e.in the absence of wireless errors (1.5 Mbps in the LAN environment and 1.35 Mbps in the WAN environment).

    Throughput

    0

    0.2

    0.4

    0.6

    0.8

    1

    1.2

    1.4

    1.6

    1.8

    2Percentage of max.

    WAN: Absolute Percentage of max.

    Thr

    ough

    put

    (% o

    f m

    axim

    um)

    0

    10

    20

    30

    40

    50

    60

    70

    80

    90

    100

    95.597.91.20

    95.598.40.82

    97.6100.01.36 97.6

    100.01.19

    95.598.31.29

    95.399.40.93

    97.7100.01.39 97.6

    100.01.22

  • based on selective acknowledgements but not suppressingduplicate acknowledgments at the base station.

    We added TCP awareness to both the LL and LL-SMARTprotocols, resulting in the LL-TCP-AWARE and LL-SMART-TCP-AWARE schemes. The LL-TCP-AWAREprotocol is identical to the snoop protocol, while the LL-SMART-TCP-AWARE protocol uses SMART-based tech-niques for further optimization using selective repeat. LL-SMART-TCP-AWARE is the best link-layer protocol in ourexperiments — it performs local retransmissions based onselective acknowledgments and shields the sender fromduplicate acknowledgments caused by wireless losses.

    3.3 Split-Connection Schemes

    Like I-TCP, our SPLIT scheme uses an intermediate host todivide a TCP connection into two separate TCP connec-tions. The implementation avoids data copying in the inter-mediate host by passing the pointers to the same bufferbetween the two TCP connections. A variant of the SPLITapproach we investigated, SPLIT-SMART, uses a SMART-based selective acknowledgment scheme on the wirelessconnection to perform selective retransmissions. There islittle chance of reordering of packets over the wireless con-nection since the intermediate host is only one hop awayfrom the final destination.

    4. Experimental Results

    In this section, we describe the experiments we performedand the results we obtained, including detailed explanationsfor observed performance. We start by describing the exper-imental testbed and methodology. We then describe the per-formance of the various link-layer, end-to-end and split-connection schemes.

    4.1 Experimental Methodology

    We performed several experiments to determine the perfor-mance and efficiency of each of the protocols. The protocolswere implemented as a set of modifications to the BSD/OSTCP/IP (Reno) network stack. To ensure a fair basis forcomparison, none of the protocols implementations intro-

    duce any additional data copying at intermediate pointsfrom sender to receiver.

    Our experimental testbed consists of IBM ThinkPad laptopsand Pentium-based personal computers running BSD/OS2.1 from BSDI. The machines are interconnected using a 10Mbps Ethernet and 915 MHz AT&T WaveLANs [27], ashared-medium wireless LAN with a raw signalling band-width of 2 Mbps. The network topology for our experimentsis shown in Figure 2. The peak throughput for TCP bulktransfers is 1.5 Mbps in the local area testbed and 1.35Mbps in the wide area testbed in the absence of congestionor wireless losses. These testbed topologies represent typi-cal scenarios of wireless links and mobile hosts, such as cel-lular wireless networks. In addition, our experiments focuson data transfer to the mobile host, which is the commoncase for mobile applications (e.g., Web accesses).

    In order to measure the performance of the protocols undercontrolled conditions, we generate errors on the lossy linkusing an exponentially-distributed bit-error model. Thereceiving entity on the lossy link generates an exponentialdistribution for each bit-error rate and changes the TCPchecksum of the packet if the error generator determinesthat the packet should be dropped. Losses are generated inboth directions of the wireless channel, so TCP acknowl-edgments are dropped too. The TCP data packet size in ourexperiments is 1400 bytes. We first measure and analyze theperformance of the various protocols at an average error rateof one every 64 KBytes (this corresponds to a bit-error rateof about 1.9x10-6 ). Note that since the exponential distribu-tion has a standard deviation equal to its mean, there areseveral occasions when multiple packets are lost in closesuccession. We then report the results of some burst errorsituations, where between two and six packets are droppedin every burst (Section 4.5). Finally, we investigate the per-formance of many of these protocols across a range of errorrates from one every 16 KB to one every 256 KB.

    The choice of the exponentially-distributed error model ismotivated by our desire to understand the precise dynamicsof each protocol in response to a wireless loss, and is not anattempt to empirically model a wireless channel. While theactual performance numbers will be a function of the exacterror model, the relative performance is dependent on howthe protocol behaves after one or more losses in a singleTCP window. Thus, we expect our overall conclusions to beapplicable under other patterns of wireless loss as well.Finally, we believe that though wireless errors are generatedartificially in our experiments, the use of a real testbed isstill valuable in that it introduces realistic effects such aswireless bandwidth limitation, media access contention,protocol processing delays, etc., which are hard to modelrealistically in a simulation.

    In our experiments, we attempt to ensure that losses are onlydue to wireless errors (and not congestion). This allows us

    TCP Source

    10 Mbps Ethernet

    TCP Receiver

    2 Mbps WaveLAN(lossy link)

    (Pentium-based PCrunning BSD/OS)

    Base Station(Pentium PCrunning BSD/OS)

    (Pentium laptoprunning BSD/OS)

    Figure 2. Experimental topology. There were an addi-tional 16 Internet hops between the source and base sta-

    tion during the WAN experiments.

  • the wireless link. As described in the rest of this section,each protocol reacts to these losses in different ways andgenerates messages that result in loss recovery. Althoughthis figure only shows data packets being lost, our experi-ments have wireless errors in both directions.

    3.1 End-To-End Schemes

    Although a wide variety of TCP versions are used on theInternet, the current de facto standard for TCP implementa-tions is TCP Reno [26]. We call this the E2E protocol, anduse it as the standard basis for performance comparison.

    The E2E-NEWRENO protocol improves the performanceof TCP-Reno after multiple packet losses in a window byremaining in fast recovery mode if the first new acknowl-edgment received after a fast retransmission is “partial”, i.e,is less than the value of the last byte transmitted when thefast retransmission was done. Such partial acknowledge-ments are indicative of multiple packet losses within theoriginal window of data. Remaining in fast recovery modeenables the connection to recover from losses at the rate ofone segment per round trip time, rather than stall until acoarse timeout as TCP-Reno often would [11, 12].

    The E2E-SMART and E2E-IETF-SACK protocols addSMART-based and IETF selective acknowledgmentsrespectively to the standard TCP Reno stack. This allowsthe sender to handle multiple losses within a window of out-standing data more efficiently. However, the sender stillassumes that losses are a result of congestion and invokescongestion control procedures, shrinking its congestionwindow size. This allows us to identify what percentage ofthe end-to-end performance degradation is associated withstandard TCP’s handling of error detection and retransmis-sion. We used the SMART-based scheme [17] only for theLAN experiments. This scheme is well-suited to situationswhere there is little reordering of packets, which is true forone-hop wireless systems such as ours. Unlike the schemeproposed in [17], we do not use any special techniques todetect the loss of a retransmission. The sender retransmits apacket when it receives a SMART acknowledgment only ifthe same packet was not retransmitted within the last round-trip time. If no further SMART acknowledgments arrive, thesender falls back to the coarse timeout mechanism torecover from the loss. We used the IETF selective acknowl-edgement scheme both for the LAN and the WAN experi-ments. Our implementation is based on the RFC and takesappropriate congestion control actions upon receivingSACK information [4].

    The E2E-ELN protocol adds an Explicit Loss Notification(ELN) option to TCP acknowledgments. When a packet isdropped on the wireless link, future cumulative acknowl-edgments corresponding to the lost packet are marked toidentify that a non-congestion related loss has occurred.Upon receiving this information with duplicate acknowl-

    edgments, the sender may perform retransmissions withoutinvoking the associated congestion-control procedures. Thisoption allows us to identify what percentage of the end-to-end performance degradation is associated with TCP’sincorrect invocation of congestion control algorithms whenit does a fast retransmission of a packet lost on the wirelesshop. The E2E-ELN-RXMT protocol is an enhancement ofthe previous one, where the sender retransmits the packet onreceiving the first duplicate acknowledgement with the ELNoption set (as opposed to the third duplicate acknowledge-ment in the case of TCP Reno), in addition to not shrinkingits window size in response to wireless losses.

    In practice, it might be difficult to identify which packetsare lost due to errors on a lossy link. However, in our exper-iments we assume sufficient knowledge at the receiver aboutwireless losses to generate ELN information. We describesome possible implementation policies and strategies for theELN mechanism in Section 5.2.

    3.2 Link-Layer Schemes

    Unlike TCP for the transport layer, there is no de facto stan-dard for link-layer protocols. Existing link-layer protocolschoose from techniques such as Stop-and-Wait, Go-Back-N,Selective Repeat and Forward Error Correction to providereliability. Our base link-layer algorithm, called LL, usescumulative acknowledgments to determine lost packets thatare retransmitted locally from the base station to the mobilehost. To minimize overhead, our implementation of LLleverages off TCP acknowledgments instead of generatingits own. Timeout-based retransmissions are done by main-taining a smoothed round-trip time estimate, with a mini-mum timeout granularity of 200 ms to limit the overhead ofprocessing timer events. This still allows the LL scheme toretransmit packets several times before a typical TCP Renotransmitter would time out. LL is equivalent to the snoopagent that does not suppress any duplicate acknowledg-ments, and does not attempt in-order delivery of packetsacross the link (unlike protocols proposed in [15], [22]).

    While the use of TCP acknowledgments by our LL protocolrenders it atypical of traditional ARQ protocols, we believethat it still preserves the key feature of such protocols: theability to retransmit packets locally, independently of andon a much faster time scale than TCP. Therefore, we expectthe qualitative aspects of our results to be applicable to gen-eral link-layer protocols.

    We also investigated a more sophisticated link-layer proto-col (LL-SMART) that uses selective retransmissions toimprove performance. The LL-SMART protocol performsthis by applying a SMART-based acknowledgment schemeat the link layer. Like the LL protocol, LL-SMART usesTCP acknowledgments instead of generating its own andlimits its minimum timeout to 200 ms. LL-SMART isequivalent to the snoop agent performing retransmissions

  • RFCs. Recently, there has been renewed interest in add-ing SACKs to TCP. Two relevant proposals are therecent RFC on TCP SACKs [19] and the SMARTscheme [17].

    The SACK RFC proposes that each acknowledgmentcontain information about up to three non-contiguousblocks of data that have been received successfully bythe receiver. Each block of data is described by its start-ing and ending sequence number. Due to the limitednumber of blocks, it is best to inform the sender aboutthe most recent blocks received. The RFC does not spec-ify the sender behavior, except to require that standardTCP congestion control actions be performed whenlosses occur.

    An alternate proposal, SMART, uses acknowledgmentsthat contain the cumulative acknowledgment and thesequence number of the packet that caused the receiverto generate the acknowledgment (this information is asubset of the three-blocks scheme proposed in the RFC).

    The sender uses this information to create a bitmask ofpackets that have been delivered successfully to thereceiver. When the sender detects a gap in the bitmask, itimmediately assumes that the missing packets have beenlost without considering the possibility that they simplymay have been reordered. Thus this scheme trades offsome resilience to reordering and lost acknowledgmentsin exchange for a reduction in overhead to generate andtransmit acknowledgments.

    3. Implementation Details

    This section describes the protocols we have implementedand evaluated. Table 1 summarizes the key ideas in eachscheme and the main differences between them. Figure 1shows a typical loss situation over the wireless link. Here,the TCP sender is in the middle of a transfer across a two-hop network to a mobile host. At the depicted time, thesender’s congestion window consists of 5 packets. Of thefive packets in the network, the first two packets are lost on

    Name Category Special Mechanisms

    E2E end-to-end standard TCP-Reno

    E2E-NEWRENO end-to-end TCP-NewReno

    E2E-SMART end-to-end SMART-based selective acks

    E2E-IETF-SACK end-to-end IETF selective acks

    E2E-ELN end-to-end Explicit Loss Notification (ELN)

    E2E-ELN-RXMT end-to-end ELN with retransmit on first dupack

    LL link-layer none

    LL-TCP-AWARE link-layer duplicate ack suppression

    LL-SMART link-layer SMART-based selective acks

    LL-SMART-TCP-AWARE link-layer SMART and duplicate ack suppression

    SPLIT split-connection none

    SPLIT-SMART split-connection SMART-based wireless connection

    Table 1. Summary of protocols studied in this paper.

    1 2 3 4

    4 3

    2

    1

    5

    5

    congestion window = 5

    Figure 1. A typical loss situation

    TCP Source

    Base Station

    TCP ReceiverLossy Link

    Packets Storedat Sender

    Packets in Flight

    Acknowledgments Returning

  • our conclusions in Section 6, and mention some future workin Section 7.

    2. Related Work

    In this section, we summarize some protocols that havebeen proposed to improve the performance of TCP overwireless links. We also briefly describe some proposedmethods to add SACKs to TCP.

    • Link-layer protocols: There have been several propos-als for reliable link-layer protocols. The two mainclasses of techniques employed by these protocols are:error correction, using techniques such as forward errorcorrection (FEC), and retransmission of lost packets inresponse to automatic repeat request (ARQ) messages.The link-layer protocols for the digital cellular systemsin the U.S. — both CDMA [15] and TDMA [22] — pri-marily use ARQ techniques. While the TDMA protocolguarantees reliable, in-order delivery of link-layerframes, the CDMA protocol only makes a limitedattempt and leaves eventual error recovery to the (reli-able) transport layer. Other protocols like the AIRMAILprotocol [1] employ a combination of FEC and ARQtechniques for loss recovery.

    The main advantage of employing a link-layer protocolfor loss recovery is that it fits naturally into the layeredstructure of network protocols. The link-layer protocoloperates independently of higher-layer protocols anddoes not maintain any per-connection state. The mainconcern about link-layer protocols is the possibility ofadverse effect on certain transport-layer protocols suchas TCP, as described in Section 1. We investigate this indetail in our experiments.

    • Split connection protocols [3, 28]: Split connectionprotocols split each TCP connection between a senderand receiver into two separate connections at the basestation — one TCP connection between the sender andthe base station, and the other between the base stationand the receiver. Over the wireless hop, a specializedprotocol tuned to the wireless environment may be used.In [28], the authors propose two protocols — one inwhich the wireless hop uses TCP, and another in whichthe wireless hop uses a selective repeat protocol (SRP)on top of UDP. They study the impact of handoffs onperformance and conclude that they obtain no significantadvantage by using SRP instead of TCP over the wire-less connection in their experiments. However, ourexperiments demonstrate benefits in using a simpleselective acknowledgment scheme with TCP over thewireless connection.

    Indirect-TCP [2] is a split-connection solution that usesstandard TCP for its connection over the wireless link.Like other split-connection proposals, it attempts to sep-

    arate loss recovery over the wireless link from thatacross the wireline network, thereby shielding the origi-nal TCP sender from the wireless link. However, as ourexperiments indicate, the choice of TCP over the wire-less link results in several performance problems. SinceTCP is not well-tuned for the lossy link, the TCP senderof the wireless connection often times out, causing theoriginal sender to stall. In addition, every packet incursthe overhead of going through TCP protocol processingtwice at the base station (as compared to zero times for anon-split-connection approach), although extra copiesare avoided by an efficient kernel implementation.Another disadvantage of split connections is that theend-to-end semantics of TCP acknowledgments is vio-lated, since acknowledgments to packets can now reachthe source even before the packets actually reach themobile host. Also, since split-connection protocolsmaintain a significant amount of state at the base stationper TCP connection, handoff procedures tend to be com-plicated and slow. Section 5.1 discusses some issuesrelated to cellular handoffs and TCP performance.

    • The Snoop Protocol [7]: The snoop protocol introducesa module, called the snoop agent, at the base station. Theagent monitors every packet that passes through the TCPconnection in both directions and maintains a cache ofTCP segments sent across the link that have not yet beenacknowledged by the receiver. A packet loss is detectedby the arrival of a small number of duplicate acknowl-edgments from the receiver or by a local timeout. Thesnoop agent retransmits the lost packet if it has it cachedand suppresses the duplicate acknowledgments. In ourclassification of the protocols, the snoop protocol is alink-layer protocol that takes advantage of the knowl-edge of the higher-layer transport protocol (TCP).

    The main advantage of this approach is that it suppressesduplicate acknowledgments for TCP segments lost andretransmitted locally, thereby avoiding unnecessary fastretransmissions and congestion control invocations bythe sender. The per-connection state maintained by thesnoop agent at the base station is soft, and is not essentialfor correctness. Like other link-layer solutions, thesnoop approach could also suffer from not being able tocompletely shield the sender from wireless losses.

    • Selective Acknowledgments: Since standard TCP usesa cumulative acknowledgment scheme, it often does notprovide the sender with sufficient information to recoverquickly from multiple packet losses within a singletransmission window. Several studies [e.g., 11] haveshown that TCP enhanced with selective acknowledg-ments performs better than standard TCP in such situa-tions. SACKs were added as an option to TCP by RFC1072 [14]. However, disagreements over the use ofSACKs prevented the specification from being adopted,and the SACK option was removed from later TCP

  • attempt to make the lossy link appear as a higher qualitylink with a reduced effective bandwidth. As a result, most ofthe losses seen by the TCP sender are caused by congestion.Examples of this approach include wireless links with reli-able link-layer protocols such as AIRMAIL [1], split con-nection approaches such as Indirect-TCP [3], and TCP-aware link-layer schemes such as the snoop protocol [7].The second class of techniques attempts to make the senderaware of the existence of wireless hops and realize thatsome packet losses are not due to congestion. The sendercan then avoid invoking congestion control algorithms whennon-congestion-related losses occur — we describe some ofthese techniques in Section 3. Finally, it is possible for awireless-aware transport protocol to coexist with link-layerschemes to achieve good performance.

    We classify the many schemes into three basic groups,based on their fundamental philosophy: end-to-end propos-als, split-connection proposals and link-layer proposals. Theend-to-end protocols attempt to make the TCP sender han-dle losses through the use of two techniques. First, they usesome form of selective acknowledgments (SACKs) to allowthe sender to recover from multiple packet losses in a win-dow without resorting to a coarse timeout. Second, theyattempt to have the sender distinguish between congestionand other forms of losses using an Explicit Loss Notifica-tion (ELN) mechanism. At the other end of the solutionspectrum, split-connection approaches completely hide thewireless link from the sender by terminating the TCP con-nection at the base station. Such schemes use a separate reli-able connection between the base station and the destinationhost. The second connection can use techniques such asnegative or selective acknowledgments, rather than juststandard TCP, to perform well over the wireless link. Thethird class of protocols, link-layer solutions, lie between theother two classes. These protocols attempt to hide link-related losses from the TCP sender by using local retrans-missions and perhaps forward error correction [e.g., 18]over the wireless link. The local retransmissions use tech-niques that are tuned to the characteristics of the wirelesslink to provide a significant increase in performance. Sincethe end-to-end TCP connection passes through the lossylink, the TCP sender may not be fully shielded from wire-less losses. This can happen either because of timer interac-tions between the two layers [10], or more likely because ofTCP’s duplicate acknowledgments causing sender fastretransmissions even for segments that are locally retrans-mitted. As a result, some proposals to improve TCP perfor-mance use mechanisms based on the knowledge of TCPmessaging to shield the TCP sender more effectively andavoid competing and redundant retransmissions [7].

    In this paper, we evaluate the performance of several end-to-end, split-connection and link-layer protocols using end-to-end throughput and goodput as performance metrics, in both

    LAN and WAN configurations. In particular, we seek toanswer the following specific questions:

    1. What combination of mechanisms results in best per-formance for each of the protocol classes?

    2. How important is it for link-layer schemes to be awareof TCP algorithms to achieve high end-to-end through-put?

    3. How useful are selective acknowledgments in dealingwith lossy links, especially in the presence of burstlosses?

    4. Is it important for the end-to-end connection to be splitin order to effectively shield the sender from wirelesslosses and obtain the best performance?

    We answer these questions by implementing and testing thevarious protocols in a wireless testbed consisting of PentiumPC base stations and IBM ThinkPad mobile hosts communi-cating over a 915 MHz AT&T Wavelan, all running BSD/OS 2.1. For each protocol, we measure the end-to-endthroughput, and goodputs for the wired and (one-hop) wire-less paths. For any path (or link), goodput is defined as theratio of the actual transfer size to the total number of bytestransmitted over that path. In general, the wired and wirelessgoodputs differ because of wireless losses, local retransmis-sions and congestion losses in the wired network. Thesemetrics allow us to determine the end-to-end performanceas well as the transmission efficiency across the network.While we used a wireless hop as the lossy link in our exper-iments, we believe our results are applicable in a wider con-text to links where significant losses occur for reasons otherthan congestion. Examples of such links include high-speedmodems and cable modems.

    We show that a reliable link-layer protocol with someknowledge of TCP results in very good performance. Ourexperiments indicate that shielding the TCP sender fromduplicate acknowledgments caused by wireless lossesimproves throughput by 10-30%. Furthermore, it is possibleto achieve good performance without splitting the end-to-end connection at the base station. We also demonstrate thatselective acknowledgments and explicit loss notificationsresult in significant performance improvements. Forinstance, the simple ELN scheme we evaluated improvedthe end-to-end throughput by a factor of more than twocompared to TCP Reno, with comparable goodput values.

    The rest of this paper is organized as follows. Section 2briefly describes some proposed solutions to the problem ofreliable transport protocols over wireless links. Section 3describes the implementation details of the different proto-cols in our wireless testbed, and Section 4 presents theresults and analysis of several experiments. Section 5 dis-cusses some miscellaneous issues related to handoffs, ELNimplementation and selective acknowledgments. We present

  • A Comparison of Mechanisms for Improving TCP Performance overWireless Links

    Hari Balakrishnan, Venkata N. Padmanabhan, Srinivasan Seshan and Randy H. Katz1

    {hari,padmanab,ss,randy}@cs.berkeley.eduComputer Science Division, Department of EECS, University of California at Berkeley

    Abstract

    Reliable transport protocols such as TCP are tuned to per-form well in traditional networks where packet losses occurmostly because of congestion. However, networks withwireless and other lossy links also suffer from significantlosses due to bit errors and handoffs. TCP responds to alllosses by invoking congestion control and avoidance algo-rithms, resulting in degraded end-to-end performance inwireless and lossy systems. In this paper, we compare sev-eral schemes designed to improve the performance of TCPin such networks. We classify these schemes into threebroad categories: end-to-end protocols, where loss recoveryis performed by the sender; link-layer protocols, that pro-vide local reliability; and split-connection protocols, thatbreak the end-to-end connection into two parts at the basestation. We present the results of several experiments per-formed in both LAN and WAN environments, usingthroughput and goodput as the metrics for comparison.

    Our results show that a reliable link-layer protocol that isTCP-aware provides very good performance. Furthermore,it is possible to achieve good performance without splittingthe end-to-end connection at the base station. We also dem-onstrate that selective acknowledgments and explicit lossnotifications result in significant performance improve-ments.

    Index Terms: Computer networks, wireless networks, TCP,link-layer protocols, internetworking.

    1. Introduction

    The increasing popularity of wireless networks indicatesthat wireless links will play an important role in future inter-networks. Reliable transport protocols such as TCP [24, 26]have been tuned for traditional networks comprising wiredlinks and stationary hosts. These protocols assume conges-tion in the network to be the primary cause for packet lossesand unusual delays. TCP performs well over such networksby adapting to end-to-end delays and congestion losses. TheTCP sender uses the cumulative acknowledgments it

    1. Web page URL http://daedalus.cs.berkeley.edu.Srinivasan Seshan is now at IBM T.J. Watson Research Center,Hawthorne, NY ([email protected]).

    receives to determine which packets have reached thereceiver, and provides reliability by retransmitting lostpackets. For this purpose, it maintains a running average ofthe estimated round-trip delay and the mean linear deviationfrom it. The sender identifies the loss of a packet either bythe arrival of several duplicate cumulative acknowledg-ments or the absence of an acknowledgment for the packetwithin a timeout interval equal to the sum of the smoothedround-trip delay and four times its mean deviation. TCPreacts to packet losses by dropping its transmission (conges-tion) window size before retransmitting packets, initiatingcongestion control or avoidance mechanisms (e.g., slowstart [13]) and backing off its retransmission timer (Karn’sAlgorithm [16]). These measures result in a reduction in theload on the intermediate links, thereby controlling the con-gestion in the network.

    Unfortunately, when packets are lost in networks for rea-sons other than congestion, these measures result in anunnecessary reduction in end-to-end throughput and hence,in sub-optimal performance. Communication over wirelesslinks is often characterized by sporadic high bit-error rates,and intermittent connectivity due to handoffs. TCP perfor-mance in such networks suffers from significant throughputdegradation and very high interactive delays [8].

    Recently, several schemes have been proposed to the allevi-ate the effects of non-congestion-related losses on TCP per-formance over networks that have wireless or similar high-loss links [3, 7, 28]. These schemes choose from a variety ofmechanisms, such as local retransmissions, split-TCP con-nections, and forward error correction, to improve end-to-end throughput. However, it is unclear to what extent eachof the mechanisms contributes to the improvement in per-formance. In this paper, we examine and compare the effec-t iveness of these schemes and their variants, andexperimentally analyze the individual mechanisms and thedegree of performance improvement due to each.

    There are two different approaches to improving TCP per-formance in such lossy systems. The first approach hidesany non-congestion-related losses from the TCP sender andtherefore requires no changes to existing sender implemen-tations. The intuition behind this approach is that since theproblem is local, it should be solved locally, and that thetransport layer need not be aware of the characteristics ofthe individual links. Protocols that adopt this approach

    To appear, IEEE/ACM Transactions on Networking, December 1997. This is a much-extended and revised version of a paperthat appeared at ACM SIGCOMM, 1996.